fix(scheduler): bulk_dedup performers cross_source_only + hard-timeout (OOM)

_job_bulk_dedup_performers called run_bulk_dedup(strategy="performers") without
the cross_source_only guard, which (per its docstring) exists precisely to
prevent this OOM. At current catalog scale the unguarded path materializes
N²/2 pairs per prolific performer into a list. The worker hit 6GB RSS and was
OOM-killed every 12h (05:00/17:00), taking down the concurrent
tpdb/stashdb/movie ingests, which ended as killed_by_restart (0 new movies).
Verified in prod: the 05:00 run now completes (885k pairs scored, no OOM) and
the ingests succeed (stashdb +241, tpdb +175).
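
To illustrate why the guard matters, here is a minimal sketch of the two shapes of pair generation. It is not the actual run_bulk_dedup code: the Record shape and both function names are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations, product
from typing import Iterator

# Hypothetical record shape for this sketch: (source, record_id).
Record = tuple[str, str]


def all_pairs(records: list[Record]) -> list[tuple[Record, Record]]:
    # Unguarded shape: every unordered pair, N*(N-1)/2 of them, held in memory at once.
    return list(combinations(records, 2))


def cross_source_pairs(records: list[Record]) -> Iterator[tuple[Record, Record]]:
    # Guarded shape: only pairs that span two different sources, yielded lazily.
    by_source: dict[str, list[Record]] = defaultdict(list)
    for rec in records:
        by_source[rec[0]].append(rec)
    for src_a, src_b in combinations(sorted(by_source), 2):
        yield from product(by_source[src_a], by_source[src_b])


records = [("tpdb", f"t{i}") for i in range(3)] + [("stashdb", f"s{i}") for i in range(2)]
print(len(all_pairs(records)))                      # 10 pairs for 5 records
print(sum(1 for _ in cross_source_pairs(records)))  # 6 cross-source pairs (3 x 2)
```

The generator never holds the full pair list, so peak memory stays bounded by the per-source groups rather than by the pair count.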

Also wrap in _run_with_timeout like tpdb/stashdb (job had no hard-timeout).
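
The body of _run_with_timeout is not part of this diff. As a rough sketch of what such a wrapper can look like, assuming a thread-based design and a hypothetical timeout_s parameter (the real helper may differ on both counts):

```python
import concurrent.futures
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_timeout_sketch(fn: Callable[[], T], *, label: str, timeout_s: float) -> T:
    # Hypothetical stand-in for the scheduler helper; name and signature are assumptions.
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1, thread_name_prefix=label)
    future = pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(f"{label} exceeded {timeout_s}s") from None
    finally:
        # Do not wait for the worker: a Python thread cannot be killed, so this
        # only stops the caller from waiting. The job itself keeps running.
        pool.shutdown(wait=False)


def slow_job() -> str:
    time.sleep(0.3)  # stands in for a long dedup pass
    return "done"


print(run_with_timeout_sketch(slow_job, label="demo", timeout_s=1.0))
```

A thread-based wrapper only bounds how long the scheduler waits. Reclaiming the memory of a runaway job would need a separate process that can be terminated.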

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
jtrzupek 2026-06-07 11:00:19 +02:00
parent fad72e9cd6
commit 9d0cb7f26e


@@ -210,8 +210,16 @@ def _job_bulk_dedup_performers() -> None:
     log.info("[scheduler] bulk_dedup performers starting")
     try:
         from app.scheduler.bulk_dedup import run_bulk_dedup
-        bc = run_bulk_dedup(strategy="performers", dry_run=False)
-        log.info("[scheduler] bulk_dedup performers done: %s", bc)
+        # cross_source_only=True: without this flag, pairwise generates N²/2 pairs per prolific
+        # performer, materialized into a list → worker OOM-killed every 12h (6GB RSS on a
+        # 7.6GB box, 2026-06-06), also killing the concurrent tpdb/stashdb ingests.
+        # The flag narrows the work to cross-source candidates (TPDB↔StashDB) with a candidate pre-filter.
+        # Timeout wrap like tpdb/stashdb: the job has no hard timeout of its own.
+        _run_with_timeout(
+            lambda: run_bulk_dedup(strategy="performers", dry_run=False, cross_source_only=True),
+            label="bulk-dedup-performers",
+        )
+        log.info("[scheduler] bulk_dedup performers done")
     except Exception:
         log.exception("[scheduler] bulk_dedup performers failed")