The filtered scene-list endpoints (default feed sends min_duration_sec=60, plus
has_playback / tag / q filters) took ~4.5s — and an idle server. Profiling showed
the entire cost was the bounded COUNT subquery over the EXISTS filters: Postgres
would not reliably early-terminate at the cap under psycopg bound params, scanning
the whole matching set (~858k for has_playback). Counting over the PK and using a
literal LIMIT helped some cases but the plan stayed unstable.
Fix: stop computing an exact count for filtered lists entirely. The mobile client
paginates by has_more (per_page+1 fetch), never by total — total is only the "N+"
UI counter. Derive total as a lower bound from the page + has_more after the fetch.
This removes the count query from every filtered request.
Result (end-to-end, authenticated): default feed 4.5s -> ~0.1s, has_playback
4.4s -> ~0.1s, q/studio/normal-tag filters all <0.3s. Also added index
scene_tags(tag_id, scene_id) (PK led with scene_id, so tag->scenes did a seq scan).
Remaining: a single enormous tag (e.g. "anal", ~163k scenes) ordered by recency
still gathers-all-then-sorts in the fetch (~5s); normal tags are <0.5s. Tracked
in #22 for a denormalized recency-ordered approach.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The bounded count for filtered scene lists ran `SELECT count(*) FROM (SELECT
scenes.* ... LIMIT 1001)` because the base query selects the full Scene entity.
Counting over all columns made the planner pick a far worse plan via psycopg
bound params (~4s for has_playback) than the same logic over the PK (~30-400ms).
Count semantics are unchanged — we only need rows to exist — so count over
`base.with_only_columns(Scene.id)`.
Partial: this fixes the count leg. The main ordered fetch on filtered lists
(has_playback / tags) can still pick a gather-all-then-sort plan under bound
params (fast with literal binds, slow parameterized) — tracked separately.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Filtered /scenes (tag/origin/q/studio/performer) ran exhaustive COUNT with
stub-filter EXISTS over 1.7M rows: TAG 5.1s, ORIGIN 4.9s, SEARCH 3.1s.
Mobile relied on `loaded < total` for infinite-scroll, making exact count
mandatory and ruling out approximate shortcuts.
Backend:
- SceneListOut gains has_more (bool) and total_capped (bool), both optional
for backward compat with old mobile
- Filtered count uses LIMIT _COUNT_CAP+1 (1000) subquery — cost is
O(min(matches, cap)) instead of O(all). Measured: TAG 5.1s→664ms,
SEARCH 3.1s→138ms, ORIGIN 4.9s→1.07s (also fixes SiteScenes showing
global count ~1M instead of per-site count)
- has_more from fetching per_page+1 rows (essentially free); extra row
stripped before serialisation
- Pure-default list (no filters at all) keeps TTL-cached full count
Mobile:
- getNextPageParam uses has_more ?? fallback to loaded<total
- Display shows "{total}+" when total_capped=true (5 screens)
Verified on emulator: tag "Big Tits" → "1000 scenes" loaded, no 500s,
backward compat confirmed (old APK works against new backend).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Counts for /tags, /performers, /studios and /favorites were computed live
per-request by aggregating scene_tags / scene_performers with an EXISTS to
playback_sources. As the catalog grew to ~1.7M scenes (6.3M scene_tags) this
ran ~4.3s for /tags?order=popular (x2 incl. the total count) and ~950ms for
the default /scenes count, making those screens load in several seconds.
- migration 0019: add scene_count (+ DESC index) to tags/performers/studios
- background job _job_refresh_taxonomy_counts (every 3h) recomputes the counts
in one UPDATE..FROM each (IS DISTINCT FROM to skip unchanged rows)
- /tags, /performers, /studios scenes path now read the column + ORDER BY the
indexed scene_count; for_movies paths keep live aggregation (small tables)
- favorites read denormalized scene_count instead of a grouped EXISTS aggregate
- /scenes default count: 10-min in-process TTL cache (header is approximate)
Measured: /tags?order=popular&per_page=500 ~8s -> 66ms incl. serialization.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- app/api/seo.py (+ app/templates/seo/*): publiczny HTML SEO router (programmatic
entity long-tail: performer/studio/scene/landing/2257), bez api-key. Importowany
przez main.py — wymagany do uruchomienia, dotąd untracked. Opsec-clean (brak
VPS IP/sekretów).
- CLAUDE.md: instrukcje projektu (dotąd untracked).
- .gitignore: .nimbalyst/ (lokalne tracker-tooling, nie dla OSS repo).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Dotąd nie dało się docelować sceny konkretnego hostera — search faworyzuje
xnxx/xvideos (dominują bazę), brak filtra po źródle. Diagnostyka per-hoster
(test cookie-fix, luluvid, porntrex) wymagała trafienia sceny danego tube'a.
- /scenes?origin=<substr> — exists() na PlaybackSource.origin ilike, np.
'hqporner' łapie tube:hqpornercom
- ScenesFilterModal: sekcja "Source / hoster" (TextInput) w FilterState.origin
- ScenesScreen: filter.origin → listScenes; liczone do activeCount
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Goon — self-hosted aggregator for adult-content scene metadata.
Indexes scenes from TPDB, StashDB, and 30+ public adult tube sites.
Cross-source deduplication via perceptual hash + Levenshtein distance.
FastAPI backend + APScheduler worker + React Native (Expo) mobile client.
FOSS, ad-free, donation-funded. See README for details.