goon/app
jtrzupek 1ca503b7be feat(ingest): add xnxx browse scraper (JSON-LD only, alongside search)
Browse over /best/<YYYY-MM>/<page> (SSR; xnxx has no clean /new/ and its homepage is
JS-rendered) for a latest-feed freshness signal next to the performer-driven search
scraper. JSON-LD VideoObject only — xnxx detail (unlike its xvideos twin) doesn't
expose /models/ or /tags/ in SSR, so performers/tags come via canonical merge + the
search scraper. The title is run through html.unescape (JSON-LD ships &comma;/&excl; entities).
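
A minimal sketch of the JSON-LD step described above, assuming the page embeds a `<script type="application/ld+json">` block holding a schema.org VideoObject whose `name` carries HTML entities. The function name `extract_video_object`, the regex, and the sample markup are hypothetical illustrations, not code from this repository; the actual connector may parse the page differently.

```python
import html
import json
import re

# Hypothetical pattern: matches any <script type="application/ld+json"> block.
LD_JSON_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def extract_video_object(page_html):
    """Return the first JSON-LD VideoObject with an unescaped title, or None."""
    for match in LD_JSON_RE.finditer(page_html):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the whole page
        candidates = data if isinstance(data, list) else [data]
        for item in candidates:
            if isinstance(item, dict) and item.get("@type") == "VideoObject":
                item = dict(item)  # copy so the parsed structure stays untouched
                # Entities such as &comma; and &excl; survive JSON parsing,
                # so they are decoded here with the standard library.
                item["name"] = html.unescape(item.get("name", ""))
                return item
    return None


# Made-up sample markup for illustration only.
sample = '''<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "VideoObject",
 "name": "Hello&comma; world&excl;", "uploadDate": "2026-06-01"}
</script></head></html>'''

video = extract_video_object(sample)
print(video["name"])  # prints: Hello, world!
```

Returning None when no VideoObject is found lets a caller skip the page instead of storing a partial record, which fits the commit's preference for no scraper over a flaky one.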

xhamster and sxyprn intentionally left search-only: xhamster Cloudflare-blocks the
VPS on listing pages (1KB challenge), sxyprn has no clean SSR listing (IP-bound) —
a flaky browse scraper would be worse than the working search + 168h watchdog.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 15:52:32 +02:00
..
api feat(sources): remove 0dayxx + pornditt + pornhat entirely 2026-06-22 12:23:29 +02:00
connectors feat(ingest): add xnxx browse scraper (JSON-LD only, alongside search) 2026-06-24 15:52:32 +02:00
extractors feat(sources): remove 0dayxx + pornditt + pornhat entirely 2026-06-22 12:23:29 +02:00
models feat(sources): 0-5★ ranking on Sites (freshness/metadata/plays) + playback telemetry 2026-06-22 10:00:59 +02:00
normalize feat(ingest): SQL phash match, tag inference + backfill, clip-store skip, browse tubes, watchdog 2026-06-01 15:07:35 +02:00
resolve fix(ingest): race-safe scene_tags insert (ON CONFLICT) — GOON-M 2026-06-19 11:09:06 +02:00
scheduler feat(sources): 0-5★ ranking on Sites (freshness/metadata/plays) + playback telemetry 2026-06-22 10:00:59 +02:00
templates feat(seo): public HTML SEO router + templates; add CLAUDE.md; ignore .nimbalyst 2026-05-31 16:29:59 +02:00
__init__.py Initial commit 2026-05-20 10:10:22 +02:00
auth.py Initial commit 2026-05-20 10:10:22 +02:00
config.py feat(sources): 0-5★ ranking on Sites (freshness/metadata/plays) + playback telemetry 2026-06-22 10:00:59 +02:00
db.py Initial commit 2026-05-20 10:10:22 +02:00
ingest.py fix(ingest): strip NUL bytes from raw payloads before Postgres write 2026-06-11 19:48:22 +02:00
main.py feat(sources): 0-5★ ranking on Sites (freshness/metadata/plays) + playback telemetry 2026-06-22 10:00:59 +02:00