Read-only data-quality audit for scene merges made before the 2026-05-12
scoring hardening (which now caps weak-signal aggregator matches at 0.85 and
tightened the duration bump to <=3s). The auto-merge candidate log does not
record which external_ref was attached, so a merge cannot be reversed from the
log alone. Instead this detects false merges by their effect: a scene that
absorbed a different video ends up with playback_sources of inconsistent
durations (e.g. a 60s clip alongside a 2h source).
Reports counts + severity buckets by max/min duration ratio, can list the worst
offenders with a per-source breakdown, and can export suspects to JSON. Mutates
nothing — remediation (detach/mark-dead the outlier source) is left as an
explicit, separately-decided step because short durations can be legitimate
(previews) and n=2 scenes are ambiguous about which source is canonical.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>