Opt-in remediation for the duration-inconsistent scenes found by the audit.
Scope is deliberately narrow and reversible:
- only scenes with >=3 duration-bearing sources AND max/min ratio > 3x
- anchored on scene.duration_sec (the canonical value), never the median of
sources (a median is wrong when several bogus short clips outvote the real
full-length source)
- marks dead ONLY sources less than half as long as the canonical (>2x
shorter): a falsely merged source is almost always a short SEO clip/preview.
Sources longer than the canonical are left alone, since an over-long outlier
more often means the canonical duration itself is too low (so killing the long
source would drop the real video); those stay for manual review.
- guards that at least one live source remains on every scene it touches
- dry-run by default; pass --yes to apply; sets dead_at (reversible) instead
of deleting rows
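The selection rules above can be sketched roughly as follows. This is an illustrative sketch, not the script itself: the `Source` dataclass, `pick_short_outliers`, and `apply_marks` are hypothetical names, and two details are assumptions (only live sources are counted toward the >=3 threshold, and a scene is skipped entirely when the guard would otherwise leave it with no live source).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Source:
    # Hypothetical stand-in for a playback source row.
    id: int
    duration_sec: Optional[float]
    dead_at: Optional[str] = None  # None means the source is still live


def pick_short_outliers(canonical_sec: Optional[float],
                        sources: List[Source]) -> List[Source]:
    """Return the sources that would be marked dead for one scene."""
    # Assumption: only live sources with a positive duration are considered.
    timed = [s for s in sources
             if s.dead_at is None and s.duration_sec and s.duration_sec > 0]
    if not canonical_sec or canonical_sec <= 0 or len(timed) < 3:
        return []
    durations = [s.duration_sec for s in timed]
    if max(durations) / min(durations) <= 3.0:
        return []
    # Anchor on the canonical duration, never the median of the sources.
    # Only sources more than 2x SHORTER qualify; longer ones are left alone.
    doomed = [s for s in timed if s.duration_sec * 2.0 < canonical_sec]
    live = [s for s in sources if s.dead_at is None]
    # Guard (assumed behaviour): skip the scene rather than empty it.
    if len(live) - len(doomed) < 1:
        return []
    return doomed


def apply_marks(doomed: List[Source], now_iso: str, yes: bool = False) -> int:
    """Dry-run unless yes=True; sets dead_at instead of deleting anything."""
    if not yes:
        return 0
    for s in doomed:
        s.dead_at = now_iso
    return len(doomed)
```

With a canonical duration of 7200s and sources of 7200s, 7100s, 60s, and 45s, only the two short clips are selected. With a canonical of 60s and one 7200s source, nothing is selected, which leaves the long outlier for manual review.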
First run marked 514 short-clip sources dead across 228 scenes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Read-only data-quality audit for scene merges made before the 2026-05-12
scoring hardening (which now caps weak-signal aggregator matches at 0.85 and
tightens the duration bump to <=3s). The auto-merge candidate log does not
record which external_ref was attached, so a merge cannot be reversed from the
log alone. Instead this detects false merges by their effect: a scene that
absorbed a different video ends up with playback_sources of inconsistent
durations (e.g. a 60s clip alongside a 2h source).
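The effect-based signal can be illustrated with a small sketch. The function name `duration_ratio` is hypothetical, and ignoring missing or zero durations is an assumption about how sources without duration data would be handled.

```python
from typing import Iterable, Optional


def duration_ratio(durations: Iterable[Optional[float]]) -> Optional[float]:
    """Max/min ratio over the durations that are actually known."""
    # Assumption: sources with no duration (None or 0) carry no signal.
    timed = [d for d in durations if d is not None and d > 0]
    if len(timed) < 2:
        return None  # a single duration cannot be inconsistent
    return max(timed) / min(timed)
```

For the example above, a 60s clip next to a 2h (7200s) source gives a ratio of 120, while a cleanly merged scene stays close to 1.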
Reports counts + severity buckets by max/min duration ratio, can list the worst
offenders with a per-source breakdown, and can export suspects to JSON. Mutates
nothing: remediation (detaching the outlier source or marking it dead) is left
as an explicit, separately decided step because short durations can be
legitimate (previews) and n=2 scenes are ambiguous about which source is
canonical.
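A report of this shape could be assembled along the lines below. This is a sketch under stated assumptions: the bucket thresholds and labels in `BUCKETS` are hypothetical (the actual cut-offs are not recorded here), as are the function names and the JSON field names.

```python
import json
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical severity thresholds on the max/min duration ratio.
BUCKETS = [(10.0, "severe"), (3.0, "moderate"), (1.5, "mild")]


def bucket_for(ratio: float) -> str:
    """Map a ratio to the first bucket whose threshold it exceeds."""
    for threshold, label in BUCKETS:
        if ratio > threshold:
            return label
    return "consistent"


def build_report(scene_durations: Dict[int, List[Optional[float]]],
                 worst: int = 5) -> dict:
    """Read-only: counts per bucket plus the worst offenders by ratio."""
    suspects = []
    for scene_id, durations in scene_durations.items():
        timed = [d for d in durations if d is not None and d > 0]
        if len(timed) < 2:
            continue
        ratio = max(timed) / min(timed)
        label = bucket_for(ratio)
        if label != "consistent":
            suspects.append({
                "scene_id": scene_id,
                "ratio": round(ratio, 2),
                "bucket": label,
                "durations": timed,  # per-source breakdown
            })
    suspects.sort(key=lambda s: s["ratio"], reverse=True)
    counts = Counter(s["bucket"] for s in suspects)
    return {"counts": dict(counts), "worst": suspects[:worst]}


def export_json(report: dict) -> str:
    """Serialize the report; nothing is written back to the data."""
    return json.dumps(report, indent=2, sort_keys=True)
```

Sorting by ratio puts the most obviously wrong merges first, so a reviewer can decide on remediation for those before looking at the ambiguous n=2 cases.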
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>