PlayTube CMS. Sitemap-based pagination (listing has no GET paging),
JSON-LD VideoObject metadata, pornstar/category pills, " Clips"
categories mapped to studio. Direct mp4 (cdnde.com/okcdn.ru), tokens
time-bound and portable cross-IP, so mobile plays direct.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
trafficdeposit poster tokens live ~1h (hour-bucketed), so stored URLs can't persist.
New GET /proxy/sxyprn-thumb/{post_id}: resolves the current og:image from the live
/post/<id> page (cache resolved poster URL ~40min), streams bytes with Referer +
long client Cache-Control (URL is stable per post_id → client disk-caches the image,
backend fetches each post ~once). Deleted posts ("Post Not Found") → 404.
Scene grid now emits /proxy/sxyprn-thumb/<id> for sxyprn sources (derived from
page_url) instead of the dead stored trafficdeposit URL. Verified: live post → 200
image, deleted → 404, grid emits resolver URL. Backend-only, no OTA.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CORRECTION: trafficdeposit thumbnail tokens are hour-bucketed and valid only ~1h
(verified 2026-06-10: stored ts=11:00 dead at 12:27, current ts=13:00 loads). Earlier
"~weekly rot" read was wrong. Storing/periodically-refreshing sxyprn thumbnail URLs
is futile — they expire within the hour. Default the refresh job OFF (kept in code).
The dead-marking sweep (Post Not Found → dead_at) it performed was still valid. Live
sxyprn thumbnails need on-demand resolution at serve time (future work).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CORRECTION to earlier "unrecoverable" call: the /post/<id> page is alive (200) and
DOES expose the scene's own fresh-signed poster via og:image / <video poster>
(post-id embedded, current timestamp) — only the STORED thumbnail URL had rotted.
Search/listings don't re-surface old posts (0 overlap), but per-post fetch works.
scripts/refresh_sxyprn_thumbs.py: iterate live sxyprn sources, fetch post page,
extract fresh og:image, UPDATE thumbnail_url (verified: refreshed URLs return 200).
_job_refresh_sxyprn_thumbs: every 12h refresh the 1200 least-recently-updated sources
(cycles the ~19k catalog within the expiry window). Pairs with the scene_resolver
overwrite fix so refreshed thumbnails stick.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
_upsert_playback_sources only set thumbnail_url when the existing value was NULL,
so signed CDN thumbnails that ROT (sxyprn/trafficdeposit tokens expire ~weekly →
404) were never replaced even when a fresh re-scrape captured a valid URL — making
the rot permanent (bug 2026-06-10). Always overwrite thumbnail_url/animated_thumbnail_url
with the freshly-scraped value when present; other fields keep fill-if-null. Lets
the regular performer-driven ingest self-heal thumbnails for re-crawled scenes.
(Note: old sxyprn backlog can't be bulk-refreshed — search/listings don't re-surface
those posts, verified 0 overlap — so it's forward-looking; old sxyprn-only scenes
fall back to the clean placeholder.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
bug-report c25e9b55: long-press scene actions were in Polish — translate menu,
banner and confirm dialogs to English. Thumb 'error' state (e.g. expired sxyprn
thumbnail 404) now shows the same 🎬 placeholder as 'empty' instead of a ⚠ broken
glyph (bug 2026-06-10).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
sxyprn thumbnails are time-signed on trafficdeposit CDN and ROT — the signed asset
404s after ~weeks and can't be re-signed/refreshed server-side (bug 2026-06-10,
~15k sxyprn-only scenes showed broken thumbs). In the light-list slim-thumbnail pick,
prefer a thumbnail from any non-trafficdeposit source; fall back to sxyprn only when
it's the scene's sole thumbnail (recent ones still load; dead ones now render a clean
placeholder client-side instead of a broken image).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
GoonClient attaches a stable per-install device id (SecureStore, lazy UUID) on
every request so server-side user state is scoped per device. On first launch
after update, call /me/adopt-legacy once (SecureStore flag) to claim the previous
shared state onto this device — the instance owner should relaunch first.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Public instance has no accounts, so all user state was GLOBAL in DB — new users
saw/overwrote each other's (and Jan's) favorites, watched badges and blacklists
(bug 2026-06-10). Add device_id (VARCHAR 64) to 9 state tables with composite PK
(device_id, entity_id); app sends X-Device-Id header (get_device_id dep). All
favorites/scene-favorites/blacklist/watch + scene&movie list/detail (is_favorite,
watched, blacklist-hide) now filter by device. Existing rows backfilled to
'legacy-shared'; POST /me/adopt-legacy reassigns them to the caller once. Old
clients (no header) map to legacy-shared so they keep working until OTA updates.
Migration 0022: add col, backfill, composite PK. Verified on prod: 967 progress
rows preserved, device isolation holds (new device sees none of legacy state).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
bug-report 5a6844db: the hold-to-preview animated gesture did nothing useful.
Replace it with a long-press action menu on scene tiles:
- Ukryj scenę → POST /scenes/{id}/hide
- Oznacz jako duplikat → enter selection mode; tapping another tile merges the
long-pressed scene INTO the tapped one (POST /scenes/{keep}/merge/{drop}).
SceneActionsProvider holds the selection state + a bottom banner, so it works across
all 5 scene-list screens via the shared SceneTile (no per-screen wiring). Selecting
mode highlights tappable tiles and badges the pending duplicate. Animated thumbnails
kept only as a still-fallback image; has_animated_thumbnail filter removed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The hold-to-preview gesture is being removed (did nothing useful), and no client
sends this filter. Remove the Query param, its EXISTS filter, and the pure-default
count guard reference.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
POST /scenes/{id}/hide — marks all playback_sources dead so the scene drops out
of has_playback lists (reversible via dead_at; row kept for dedup/refs).
POST /scenes/{keep_id}/merge/{drop_id} — merges drop into keep via scene_merge
(moves refs/performers/tags/fingerprints/playback). Backs the new tile long-press
menu (hide / mark-duplicate) replacing the dead animated-preview gesture.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
make_token embedded the current timestamp in the expiry, so every /scenes fetch
produced a DIFFERENT proxied URL for the same thumbnail → expo-image (keyed by URI)
cache-missed and re-downloaded every list load / app launch. Add stable_bucket_sec:
quantize the expiry base to a window so the URL is identical across requests.
_wrap_image_proxy uses a 7-day bucket → thumbnails disk-cache for a week instead of
re-fetching constantly. Answers "czy miniatury są cache'owane" — now yes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
xhamster: move from WebView fallback to server-side native HLS. The scene page
is fetchable server-side and the xhcdn master m3u8 (variants + segments) is
time-bound, not IP-bound (verified cross-IP), so mobile plays the HLS direct
with zero proxy bandwidth. New tubes/xhamster.py pulls the master m3u8 from
SSR HTML and returns type='m3u8' mobile_direct; registry remaps xhamstercom
off _vps_blocked_fallback.
hqporner: add flyflv to the player-iframe host whitelist. hqporner rotated
some players to flyflv.com; the CDN host was already whitelisted but the iframe
host was not, so those scenes returned no stream.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to 1a4bf258 feedback (a627637b + 0264a3ff): the flexWrap chip list ate
too much vertical space and tapping navigated away to TagScenes. Rework: single-row
horizontal scroll of toggle-chips that filter the performer's scenes IN-PLACE
(performer_ids + tags in one listScenes query, no navigation). Selected chip is
highlighted with a ✕ affordance; tap again clears. One line tall instead of N rows.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Final Polish-char print crashed with UnicodeEncodeError on Windows cp1252 stdout
AFTER a successful publish, making exit code 1 misleading. Reconfigure stdout/stderr
to UTF-8 up front.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
bug-report 1a4bf258: "Re-scrape mógłby zniknąć, za to tagi/kategorie by mogły".
Re-scrape was a dev-only bulk thumbnail/tag enrich — noise on the performer page
(per-scene enrich already happens on SceneDetail). Removed it; kept Search.
New GET /performers/{id}/tags aggregates scene_tags across the performer's
live-playback scenes (top N). PerformerScenes renders them as chips → tap navigates
to TagScenes. Search button widened to full row.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A merged scene often aggregates several uploads from ONE tube (re-encodes / 4K
dups). bug-report aa79a995 "why 2 links, both porntrex?" = same scene std + 4K
(porntrex 2591377 + 2593449 "...in 4K"). In the UI these are indistinguishable
links to one hoster (same extractor). Keep one best per origin: prefer duration
matching the scene → any duration → first (origin-asc stable). Dead already filtered.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Chrome-DevTools investigation of bug-report 827a50a1 (sxyland "long loading,
then webview, no autoplay") showed sxyland embeds playmogo.com/e/<id> — a
DoodStream clone (doodcdn.io infra, pass_md5 protocol, get_slides) behind an
INVISIBLE Cloudflare Turnstile (not an interactive CAPTCHA; auto-passes in a
real browser/WebView from a residential IP). The sxyland page itself is NOT
Turnstile-gated — VPS curl pulls the playmogo iframe URL straight from the HTML.
sxylandcom was wired to _vps_blocked_fallback → phone loaded the entire sxyland
page in WebView (ads, click-to-play, no autoplay = the reported symptom), and the
playmogo embed never reached the phone's dood resolver. _embed_iframe (which
already lists sxyland in its docstring) extracts the playmogo embed and emits it
as type='hoster' → PlayerScreen routes playmogo URLs to doodstream.ts (resolveDoodStream),
which resolves phone-side (phone IP passes invisible Turnstile) → direct mp4 → autoplay.
Mobile unchanged (hoster→dood path already exists for xmoviesforyou/siska). Backend-only.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
--playback-only restricts to scenes with live playback (app-visible dupes only).
Progress print every 500 merges for long global runs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
merge_scenes never reassigned playback_sources → ON DELETE CASCADE dropped them
with the absorbed scene. Cross-source (canonical) merges rarely had tube playback
so it hid, but tube-dup merges silently LOST playback links. Add _move_playback_sources
(global unique (origin,page_url) guarantees no collision on reassign).
+ merge_exact_title_duration.py: catches missing-merge dupes bulk_dedup misses
(same performer + identical normalized title + identical duration_sec, no phash).
Bad Bella had 25 such pairs (bug-report ef92809d "duplikat, te same miniatury").
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
paradisehill multipart movies passed all N parts to Alert.alert, but Android's
native AlertDialog renders at most 3 buttons → a 35-part movie showed 3 (bug-report
2ebd0690 2026-06-07). Backend correctly returns all 35; the cap was client-side.
Reuse PlaybackQualityModal (now scrollable + title + preserveOrder props, hides
bogus "1p" for non-resolution labels). Also add the missing `raw` field to the
StreamLink type (backend sends it; part_label lives there).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Android FlatList defaults removeClippedSubviews=true, which detaches off-viewport
subviews; expo-image frequently fails to re-render them when they scroll back in →
blank thumbnails (bug-report f181d382 2026-06-07, recurring). Disable on all heavy
image grids: scene grids (Scenes/Site/Studio/Tag/Performer) + movie poster grids.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
'--workers 3' set limit=3 because the bare '3' also hit the isdigit() branch.
Skip flag-value positions when scanning for a positional LIMIT.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
xvideos renders the scene's models as `<a href="/models/slug">...<span class="name">
Display Name</span>...`. The old _MODEL_RE wanted text immediately after the anchor
`>` and never matched current markup → browse-scraped scenes landed with 0 performers
(bug-report 2026-06-07: "no actors, but Rebecca Johnson is on the page"). New regex
captures slug + nested span.name, bounded within the anchor. + backfill script for the
~11.9k existing zero-performer xvideos scenes (54% have a real /models/ link; resolver
merges names to canonical by name_normalized).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
captureMessage('mobile boot OK', info) fired an event every launch → 171 events
/13 users polluting the Sentry issue list. Diagnostic served its purpose (SDK
confirmed sending). addBreadcrumb keeps boot context attached to real errors
without creating standalone issues.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
_candidate used OR logic (studio OR date±7d OR dur±30s) → 938,950 pairs;
Etap-2 scoring at ~110/s never finished in 1800s → bulk_dedup_performers HUNG
every run, orphan thread leaked until restart. Require AND: same studio plus
(date±2d OR dur±30s). 939k→16k pairs, full run 213s. Real cross-source dup of
one master shares studio + near date/duration; rare studio_id-mismatch pairs
skipped on purpose — a job that COMPLETES beats one that times out merging nothing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
hqporner search post-filter kept a scene if its slug contained ANY query token
(>=3 chars). For multi-word performer names this matched on a single common token
(e.g. "anna","mia"), so the performer-driven ingest attributed the scene to EVERY
performer sharing that token — scenes accumulated up to 503 wrong performers
(hqporner = 5659 of 5897 scenes with >30 performers; bug-reports 2026-06-07).
Switch ANY->ALL: the slug must contain every query token, requiring a full name
match before attribution. Single-word names still work. Precision over recall —
144 wrong performers is far worse than missing a few loose matches.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
User feedback (2026-06-07, report 4bdca61e) on the prior mute change: the always-
visible "Tap for sound" pill is redundant — the 🔇/🔊 toggle in the top controls
is enough. Removed the pill (+ its styles); video still starts muted and the
speaker toggle unmutes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Completes the literal-tag_id perf fix — the planner's MCV stats on tag_id are what
let it pick the index-walk for common tags. Default target (100) covers only the
top ~100 tags; 1000 extends correct cardinality estimates to mid-tier tags.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tag-filtered scene lists (e.g. blowjob + has_playback) took 4-12s. Root cause:
the filter joined scene_tags->tags on slug, so the actual tag_id was opaque to
the planner at plan time. It fell back to average per-tag cardinality
(8.4M/11541 ≈ 726) instead of the real 273k, chose to materialize ALL matching
scene_tags + check playback per row, then top-N sort.
Fix: resolve slug->tag_id in the app and filter on a LITERAL tag_id (no slug
join). With a constant, the planner uses MCV stats, knows the tag is huge, and
walks ix_scenes_created_at_desc probing scene_tags/playback per scene, stopping
at the page limit. Verified: blowjob list 3300ms -> 18ms (EXPLAIN), HTTP 4-12s ->
47ms. Unknown slug short-circuits to empty. (Pairs with the raised tag_id
statistics target so mid-tier tags also get correct estimates.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Scene list returned the full SceneOut per item (nested tags/external_refs + all
playback_sources with page_url/embed/stream/quality) though SceneTile only reads
the thumbnail + title/duration/performer/studio, and SceneDetail re-fetches the
full scene via /scenes/{id}. Added light=True to _build_scenes_out_batch: skip the
tags + external_refs queries entirely and collapse playback_sources to one slim
entry (thumbnail_url + animated_thumbnail_url only).
Result: default list payload 78KB->48KB (-38%), ~28ms cached, less DB work per
list. Verified on emulator: grid thumbnails/durations/titles render unchanged.
No mobile change (tile reads the same fields); server-side, no OTA.
NOTE: the separate slow path — common-tag-filtered lists (4-12s; query expands all
matching scene_tags before sort/limit) — is structural (needs a denormalized
(tag_id, created_at) index) and deferred. VACUUM ANALYZE + raised tag_id stats
applied but the planner still can't avoid the materialization.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
audit_false_merges only auto-fixes n>=3 (majority disambiguates the outlier); n=2
was "needs human review" — but the merge-review UI is gone, nobody triages 500+.
Measured: of 535 n=2 duration-divergent scenes, ALL have a canonical scene.duration_sec
(TPDB/StashDB) and 531 have exactly one source matching canonical (±20%) + one >2x off
→ unambiguous false-merge. Kill the off source (works both directions since canonical is
corroborated by the matching keeper, unlike the Omar-case the n>=3 audit guards against).
Applied: 529 sources marked dead (4 ambiguous skipped). Reversible (dead_at).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
bug-report 2026-06-03 ("ten sam czas, ta sama miniaturka, czemu się nie mergują"):
duplicate scenes not merged at ingest. Exact phash alone is noisy here (95% are
collisions on shared thumbnails/intro frames — different scenes; bulk_dedup scorer
correctly gives 0 auto-merge). The safe subset is exact-phash AND same duration
(±3s) AND shared performer/title — near-certain same scene. Same-duration is key:
it excludes the false-merge pattern (short-clip-vs-full has DIFFERING durations).
- scripts/merge_phash_exact_dupes.py: one-off, dry-run by default, per-pair re-fetch
(handles clusters). Applied: 30 merged.
- bulk_dedup: add `_pairs_exact_phash` (SQL O(N log N), not the O(N²) Hamming scan)
+ strategy "phash_exact" — gated by the normal scorer (surfaces review candidates,
no risky auto-merge), schedulable for ongoing exact-collision review.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
bug-report 2026-06-01 (48d6cc6b): scene shows canonical duration from TPDB
(real 22min studio scene) but the only live playback_source is a short tube
teaser (xnxx 21s) → "shows 22m, plays <1m". When ALL live sources are a tiny
fraction (<15%) of a known canonical (>300s), the scene has no real playback;
mark those sources dead → scene becomes orphan → hidden (has_playback=false),
consistent with the orphan-hiding policy. Reversible (dead_at), conservative
(skips scenes with any unknown-duration or full-length live source).
Applied on prod: 182 sources dead across 174 scenes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The generic request<T>() always called res.json(), which throws on a 204 No
Content body. mark-dead endpoints (scene + movie "Mark as invalid"/broken)
return 204, so the call threw AFTER the backend had already marked the source
dead → user saw a "Failed" alert and the list didn't refresh, even though the
mark succeeded server-side (bug-reports 2026-05-28 Voe, 2026-06-03 scene
1e8dc190). Return undefined for 204 before parsing JSON.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The umbrella Source.name for all direct tube scrapers (deep-crawl, browse-latest,
performer-driven) was "pornapp" — a misleading leftover from the removed external
porn-app API. It read like a dependency on a third-party "pornapp" service; it is
not — these are our own scrapers hitting 25+ tubes directly (kind=scraper,
origin tube:<sitetag>). Renamed to "tube-scraper" via a single SCRAPER_SOURCE_NAME
constant; DB row renamed in place (UPDATE name, same id) so all ingest_runs +
external_records history stays linked. No behavior change — external_id keying
(sitetag:url) and dedup are unaffected.
NOTE: playback_sources.origin "pornapp:<sitetag>" prefix is a separate legacy
format (resolve_playback parses it) and is intentionally left untouched.
Verified on prod: row renamed (0 stray "pornapp"), new runs land on "tube-scraper".
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mypornerleak embeds luluvids.top (+ cdnstream.top/cdnvids.top) which are
luluvid/streamwish forks on new TLDs, all confirmed P.A.C.K.E.R.-JWPlayer. They
were missing from PACKER_HOSTS, so isPackerHoster() returned false → the phone-
side packer resolver never ran → WebView fallback landed on luluvids.top's
"disable Adblock and enable popup" wall (bug-report 2026-06-07, scene 75aa3316).
filemoon variant (bysezoxexe.com) was already covered.
Verified on emulator (live OTA): mypornerleak source → luluvids.top resolves
phone-side → native ExoPlayer PLAYING (position advancing), no adblock wall.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
pornxp.ph serves direct <source> mp4 (360/720/1080p) on st.pornxp.sh whose path
token is IP-bound to whoever fetched the PAGE (verified 2026-06-07: VPS-resolved
URL → 403 cross-IP). Backend resolve was therefore impossible, so pornxpph fell
to the WebView fallback which black-screened (bug-report fd06cd86).
Fix: resolve on-device (same pattern as getfileResolver/doodstream) — the phone
fetches the page, so tokens bind to the phone IP and play natively. New
pornxpResolver.ts extracts the <source> mp4s into multi-quality StreamLinks;
SceneDetail short-circuits tube:pornxpph to it before backend resolve, feeding
the existing quality-picker + native player.
Verified on emulator (live OTA): pornxpph scene → quality picker (1080/720/360)
→ native playback PLAYING (no WebView, no ads, no black screen).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two bug-report fixes (2026-06-07):
- sxyprn returns HTTP 200 "Post Not Found" for deleted posts (soft-404), so the
extractor returned None → resolve treated it as transient and never marked the
source dead, leaving a dead link offered forever. Now raise HosterDead on the
marker so resolve marks it dead.
- Scene playback sources were ordered alphabetically by origin, so a WebView-
fallback hoster (fpoxxx, IP-bound + ad-heavy) ranked above a working native
source (freshporno) on the same scene. Add is_vps_blocked_fallback() and sort
native-resolve origins ahead of WebView-fallback ones.
Verified on prod: sxyprn dead URL → HosterDead; scene sources reorder
freshpornoorg before fpoxxx.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Scenes/movies now start with sound OFF; user enables audio via a control
(UX request). NativeVideoPlayer: useVideoPlayer starts muted=true + speaker
toggle in top controls + always-visible "Tap for sound" pill while muted.
WebView path: injected autoplay sets muted=true (also makes muted autoplay
reliable per browser policy → faster CDN extraction); host player controls
handle unmute when the WebView is the actual surface.
Verified on emulator against the live runtime-1.1 OTA bundle: video starts
muted (pill shown), tap unmutes (pill clears).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
_job_bulk_dedup_performers called run_bulk_dedup(strategy="performers") without
the cross_source_only guard whose docstring exists precisely to prevent this OOM.
At current catalog scale the unguarded path materializes N²/2 pairs per prolific
performer into a list → worker hit 6GB RSS and was OOM-killed every 12h (05:00/
17:00), taking down concurrent tpdb/stashdb/movie ingests as killed_by_restart
(0 new movies). Verified in prod: 05:00 run now completes (885k pairs scored, no
OOM) and ingests succeed (stashdb +241, tpdb +175).
Also wrap in _run_with_timeout like tpdb/stashdb (job had no hard-timeout).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
TPDB taxonomy emits numbered-duplicate tags (name "Bubble Butt2"); slugify
yields "bubble-butt2" (no separator before digit), so resolve_tag created a
separate tag alongside "bubble-butt". Tube scenes inherited the dup via
scene-merge → 75 pairs, ~10k scene_tags on the wrong tag.
- resolve_tag: canonicalize "<base>2" -> "<base>" when base exists (handles
current + future; trailing-"2"+alpha guard leaves milf-30/teen18 intact)
- scripts/merge_dup2_tags.py: one-off bulk merge (scene_tags + movie_tags +
blacklist) and taxonomy-count refresh
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Postgres parallel workers (e.g. sitemap_index) need >64MB shared memory;
Docker's default /dev/shm cap raised DiskFull ("No space left on device").
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
porndish-only scenes had no tags and no description — the scraper only derived a
title from the URL slug. The scene page (g1/bimber WP theme) carries both: a
<p class="entry-tags"> list of /video2/<slug>/ links (the "#" tags the user sees,
categories + co-performers) and a prose description <p> in .entry-content.
Override _fetch_scene_metadata in PornDishScraper to pull both from one page
fetch. Extend the base hook to accept an optional 4th return element
(description) and thread it into RawScene.description — backward compatible with
the existing 3-tuple (pornhat). Strips leading embed-button labels
("Video Player N", "Server N") from the prose. Verified on live scenes: clean
tag lists + real descriptions.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
porndish scenes resolve only to playmogo.com embeds, which are DoodStream clones
(doodcdn.io + pass_md5 + Cloudflare Turnstile). The mobile resolver already
supported playmogo, but DoodStream is flaky from a single shot: the embed is
sometimes Turnstile-gated (no pass_md5), and the pass_md5 endpoint intermittently
returns the literal string "RELOAD" (stale/consumed token) instead of a base URL.
The old code built "RELOAD<suffix>?token=..." -> ExoPlayer "no extractors" ->
WebView -> loading forever (bug 62e78c9a).
Wrap resolveDoodStream in a 3-attempt retry that re-fetches the embed (fresh
token) on retryable failures (gate / RELOAD / empty / stale token), and reject a
non-http pass_md5 body as retryable instead of building a garbage URL. Verified
cross-IP that the pass_md5 -> base -> final flow yields 206 video/mp4 when not
gated; real carrier IPs are gated far less than the test proxy. Strict
improvement: worst case is the existing WebView fallback, best case native play.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Movies: the seekplayer-engine family (easyvidplayer/player4me/seekplayer/
embedseek/upns, ~322k sources) returns a time-bound master.m3u8 on a CDN with a
valid IP-SAN cert that plays cross-IP. Mark it mobile_direct in resolve, and make
MovieDetailScreen prefer direct_url with a proxy fallback (mirrors the scene
path) — previously every movie streamed through the VPS proxy. Paradisehill
multipart parts now go direct too. Device-verified: ExoPlayer plays the raw CDN
direct, zero proxy traffic, no flicker.
Scenes: the three blacklist NOT EXISTS clauses were appended to every filtered
list and evaluated per-row even when all blacklist tables are empty (~3.4s tax on
a deep mega-tag walk). Skip them when the tables are empty (cached check) —
mega-tag list 6.7s -> 3.3s, and every filtered list benefits.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Device logs (not assumptions) pinned the real cause of the hdporngg/fullmovies
flicker: the backend returns a get_file URL, but get_file is bound to the IP that
loaded the *page*. The backend (VPS) loads the page, so the get_file is VPS-bound;
the phone fetching that get_file gets HTTP 410 -> ExoPlayer errors -> falls back to
the proxy via nav.replace (the "flicker"), and ends up streaming through the proxy.
(My earlier "stateless/portable" test was from the VPS — same IP as the page load —
so it wrongly showed 206.)
Fix: when the direct_url is a get_file, the phone re-fetches the *page* itself
(resolveGetFilePage on source.page_url) so the get_file is bound to the phone IP,
picks the requested quality skipping 4K (dead on fpvcdn), follows to the CDN, and
hands ExoPlayer a working URL. On failure it keeps the original (proxy fallback).
Verified on device: [getfile] page-resolve -> get_file 206 -> ExoPlayer PLAYING,
position advancing, no error/proxy/flicker, real video frame rendered.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hdporn.gg/fullmovies.xxx return an unresolved get_file direct_url that 302-redirects
to fpvcdn.com with the requester IP baked in. The backend can't resolve it (would
bind fpvcdn to the VPS IP -> mobile 403), so the phone must follow the redirect. But
ExoPlayer errors on that cross-domain get_file->fpvcdn redirect (drops Referer / won't
complete it) -> the native player falls back to the proxy via nav.replace, which the
user sees as a screen-reload "flicker" before playback (and means it's actually playing
through the VPS proxy, not direct).
Fix: resolve the get_file 302 in JS on the phone (so fpvcdn binds to the phone IP)
before navigating to the player, and hand ExoPlayer the final fpvcdn URL directly —
no redirect, no error, no flicker, no proxy. Uses the same redirect:'manual' +
Location-header pattern as the doodstream resolver (works on RN Android). On resolve
failure it keeps the original get_file URL (current behaviour with proxy fallback).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>