goon/app
jtrzupek f014a901de feat(scheduler): periodic title+duration dedup (missing-merge tube dupes)
Targets missing-merge duplicates (same performer + identical normalized title + identical duration to the second) that bulk_dedup misses: tube re-scrapes and cross-tube re-ingests, e.g. porn00 pulling a video already present from xnxx (reports 28fe8181/32df33b1). Extracted the proven merge_exact_title_duration logic into app/scheduler/title_duration_dedup.py (the script is now a thin wrapper) and wired a 12h scheduler job (GOON_SCHED_TITLE_DEDUP_HOURS) that is playback-only, i.e. limited to what users actually see. The signal is near-certain (two different videos are very unlikely to share an identical normalized title AND an exact duration); candidates with no shared performer are not merged (over-match guard). Verified: the job registers (jobs=14), and the backlog is currently 0 after the one-shot global merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 11:20:48 +02:00
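The matching rule described in the commit message (same normalized title, same duration to the second, at least one shared performer) can be sketched roughly as below. This is an illustrative sketch only, not the contents of app/scheduler/title_duration_dedup.py: the `Scene` record, `normalize_title`, `find_duplicate_groups`, and `dedup_interval_hours` are hypothetical names, the normalization rule is assumed, and treating GOON_SCHED_TITLE_DEDUP_HOURS as an override of a 12-hour default is an assumption based on the message.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    # Hypothetical record shape; the real models are not shown in this listing.
    id: int
    title: str
    duration_s: int
    performers: frozenset


def normalize_title(title: str) -> str:
    # Assumed normalization: collapse whitespace runs, trim, lowercase.
    return re.sub(r"\s+", " ", title).strip().lower()


def find_duplicate_groups(scenes):
    """Return lists of scene ids that look like the same video."""
    # Bucket by (normalized title, exact duration in seconds).
    buckets = defaultdict(list)
    for scene in scenes:
        buckets[(normalize_title(scene.title), scene.duration_s)].append(scene)

    groups = []
    for bucket in buckets.values():
        # Over-match guard: only cluster scenes that share a performer.
        # Greedy pass; it does not join clusters bridged by a later scene.
        clusters = []
        for scene in bucket:
            for cluster in clusters:
                if any(scene.performers & other.performers for other in cluster):
                    cluster.append(scene)
                    break
            else:
                clusters.append([scene])
        groups.extend([s.id for s in c] for c in clusters if len(c) > 1)
    return groups


def dedup_interval_hours(env) -> float:
    # Assumption: the env var overrides a 12-hour default interval.
    return float(env.get("GOON_SCHED_TITLE_DEDUP_HOURS", "12"))
```

For example, two scenes titled "Sample  Clip" and "sample clip", both 600 seconds long and sharing a performer, would land in one group, while a third scene with the same title and duration but no performer in common would be left alone.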
api feat(api): per-device saved searches (keyword favorites) 2026-06-16 13:52:18 +02:00
connectors fix(latestpornvideo): revive search via /actor/ listing + metadata 2026-06-16 23:20:02 +02:00
extractors fix(extractors): 4k69 direct okcdn extraction (replaces WebView fallback) 2026-06-14 11:39:36 +02:00
models feat(api): per-device saved searches (keyword favorites) 2026-06-16 13:52:18 +02:00
normalize feat(ingest): SQL phash match, tag inference + backfill, clip-store skip, browse tubes, watchdog 2026-06-01 15:07:35 +02:00
resolve fix(ingest): race-safe scene_tags insert (ON CONFLICT) — GOON-M 2026-06-19 11:09:06 +02:00
scheduler feat(scheduler): periodic title+duration dedup (missing-merge tube dupes) 2026-06-19 11:20:48 +02:00
templates feat(seo): public HTML SEO router + templates; add CLAUDE.md; ignore .nimbalyst 2026-05-31 16:29:59 +02:00
__init__.py Initial commit 2026-05-20 10:10:22 +02:00
auth.py Initial commit 2026-05-20 10:10:22 +02:00
config.py feat(scheduler): periodic title+duration dedup (missing-merge tube dupes) 2026-06-19 11:20:48 +02:00
db.py Initial commit 2026-05-20 10:10:22 +02:00
ingest.py fix(ingest): strip NUL bytes from raw payloads before Postgres write 2026-06-11 19:48:22 +02:00
main.py feat(api): per-device saved searches (keyword favorites) 2026-06-16 13:52:18 +02:00