Commit graph

1 commit

Author SHA1 Message Date
jtrzupek
b5d9473898 feat(scripts): merge tube dupes by thumbnail asset-id (hdporn.gg/fullmovies.xxx family)
These sibling platforms share one video-id space and ingest the same video under
different titles, so bulk_dedup misses the duplicates (titles differ, no phash). Match on
the asset-id in the thumbnail path (/<bucket>000/<id>/) on img.hdporn.gg|fullmovies.xxx
plus identical duration, then merge. Hard host restriction + duration guard: the bare
number is reused for unrelated videos on other CDNs (verified via dry-run), so cross-host
and different-duration groupings are excluded. Runs scoped (studio id) or global; dry-run
by default. Reports 205b17d9 / 5a2944cb. Ran on Parasited: 43 pairs merged.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-14 14:18:44 +02:00
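
The matching rule described in the commit message can be sketched roughly as below. This is an illustration only, not the committed script: the record fields (`id`, `thumb`, `duration`), the function names, the exact host list (the message's `img.hdporn.gg|fullmovies.xxx` is read here as two `img.` hosts) and the regular expression for `/<bucket>000/<id>/` are all assumptions. Nothing is merged; the sketch only reports candidate groups, in the spirit of a dry run.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Assumption: the two thumbnail hosts of the platform family.
ALLOWED_HOSTS = {"img.hdporn.gg", "img.fullmovies.xxx"}

# Assumption: the path contains /<bucket>000/<id>/, both parts numeric.
ASSET_PATH = re.compile(r"/\d*000/(\d+)/")


def asset_key(thumb_url, duration):
    """Return (asset_id, duration) for an allowed-host thumbnail, else None."""
    parsed = urlparse(thumb_url or "")
    host = (parsed.hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        # Host guard: the bare number is reused on other CDNs.
        return None
    match = ASSET_PATH.search(parsed.path)
    if match is None or duration is None:
        return None
    return (match.group(1), duration)


def find_merge_groups(videos):
    """Group video ids that share the same asset-id and the same duration."""
    groups = defaultdict(list)
    for video in videos:
        key = asset_key(video["thumb"], video["duration"])
        if key is not None:
            groups[key].append(video["id"])
    # Only groups with at least two members are merge candidates.
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    sample = [
        {"id": 1, "thumb": "https://img.hdporn.gg/12000/12345/a.jpg", "duration": 600},
        {"id": 2, "thumb": "https://img.fullmovies.xxx/12000/12345/b.jpg", "duration": 600},
        {"id": 3, "thumb": "https://img.hdporn.gg/12000/12345/c.jpg", "duration": 601},
        {"id": 4, "thumb": "https://cdn.example.com/12000/12345/d.jpg", "duration": 600},
    ]
    for ids in find_merge_groups(sample):
        print("would merge:", ids)
```

In the sample, records 1 and 2 form a group; record 3 is excluded by the duration guard and record 4 by the host restriction, even though all four carry the same bare number.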