perf(scenes): literal tag_id in filter — 4-12s tag lists -> ~20ms
Tag-filtered scene lists (e.g. blowjob + has_playback) took 4-12s.

Root cause: the filter joined scene_tags->tags on slug, so the actual tag_id was opaque to the planner at plan time. It fell back to average per-tag cardinality (8.4M/11541 ≈ 726) instead of the real 273k, chose to materialize ALL matching scene_tags + check playback per row, then top-N sort.

Fix: resolve slug->tag_id in the app and filter on a LITERAL tag_id (no slug join). With a constant, the planner uses MCV stats, knows the tag is huge, and walks ix_scenes_created_at_desc probing scene_tags/playback per scene, stopping at the page limit.

Verified: blowjob list 3300ms -> 18ms (EXPLAIN), HTTP 4-12s -> 47ms. Unknown slug short-circuits to empty.

(Pairs with the raised tag_id statistics target so mid-tier tags also get correct estimates.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
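The root-cause arithmetic in the commit message can be reproduced with a few lines of Python. This is only an illustration of the numbers quoted above, not the planner's actual code: the constants are the rounded figures from the message, and the function names are hypothetical. With the rounded 8.4M the average comes out near 728 rather than the quoted 726, which suggests the real row count was slightly lower.

```python
# Rounded figures quoted in the commit message (assumed approximate).
TOTAL_SCENE_TAG_ROWS = 8_400_000  # rows in scene_tags ("8.4M")
DISTINCT_TAGS = 11_541            # distinct tag_id values
ACTUAL_ROWS_FOR_TAG = 273_000     # real row count for the popular tag ("273k")


def average_rows_per_tag(total_rows: int, distinct_tags: int) -> float:
    """Per-tag estimate when the tag_id is unknown at plan time: a flat average."""
    return total_rows / distinct_tags


def underestimate_factor(actual_rows: int, estimated_rows: float) -> float:
    """How many times larger the real row count is than the estimate."""
    return actual_rows / estimated_rows


estimate = average_rows_per_tag(TOTAL_SCENE_TAG_ROWS, DISTINCT_TAGS)
factor = underestimate_factor(ACTUAL_ROWS_FOR_TAG, estimate)
print(f"estimated rows per tag: {estimate:.0f}")
print(f"underestimate factor: {factor:.0f}x")
```

An estimate that is off by a factor of several hundred is enough to make "collect every matching scene_tags row, then sort" look cheaper than walking the created_at index and stopping at the page limit.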
parent d52641774d
commit 43f7e1f7b2
1 changed files with 25 additions and 9 deletions
@@ -8,7 +8,7 @@ from typing import Annotated
 
 from fastapi import APIRouter, Depends, HTTPException, Query, status
 from pydantic import BaseModel
-from sqlalchemy import distinct, exists, func, literal_column, select
+from sqlalchemy import distinct, exists, false, func, literal_column, select
 from sqlalchemy.exc import IntegrityError
 from sqlalchemy.orm import Session
 
@@ -182,13 +182,29 @@ def list_scenes(
     tag_slug_list = _split_csv(tags)
     # AND across tags: a scene must have ALL selected tags. Each slug → its own
     # exists(): selecting more filters narrows the results, as users intuitively expect.
+    #
+    # PERF (2026-06-07): resolve slug→tag_id in the app and filter on a LITERAL
+    # tag_id (NO JOIN on Tag.slug). With a literal, the planner knows the tag's
+    # cardinality from statistics (MCV) → for popular tags (blowjob ~273k scenes) it
+    # picks an index walk over ix_scenes_created_at_desc instead of materializing all
+    # scene_tags. The slug JOIN hid tag_id from the planner → it used the average
+    # (8.4M/11541≈726) → bad plan (4-12s). With a literal: ~20ms. See also _build... light mode.
+    if tag_slug_list:
+        id_by_slug = dict(
+            session.execute(
+                select(Tag.slug, Tag.id).where(Tag.slug.in_(tag_slug_list))
+            ).all()
+        )
     for slug in tag_slug_list:
+        tag_id = id_by_slug.get(slug)
+        if tag_id is None:
+            base = base.where(false())  # unknown slug → no results
+            break
         base = base.where(
             exists(
                 select(1)
                 .select_from(SceneTag)
-                .join(Tag, Tag.id == SceneTag.tag_id)
-                .where(SceneTag.scene_id == Scene.id, Tag.slug == slug)
+                .where(SceneTag.scene_id == Scene.id, SceneTag.tag_id == tag_id)
             )
         )
 
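The resolve-then-filter pattern from this change can be tried in isolation. The sketch below is not the project's code: it swaps SQLAlchemy and PostgreSQL for the standard-library sqlite3 module so it runs anywhere, and the schema, sample rows, and the helper names resolve_tag_ids and list_scene_ids are all hypothetical. It shows only the query shape (one lookup for slug→id, then one EXISTS per tag on a constant tag_id, with an early exit for unknown slugs); SQLite's planner will not reproduce the PostgreSQL timing effect.

```python
import sqlite3


def resolve_tag_ids(conn, slugs):
    """Map each known slug to its tag id with a single lookup query."""
    if not slugs:
        return {}
    placeholders = ",".join("?" for _ in slugs)
    rows = conn.execute(
        f"SELECT slug, id FROM tags WHERE slug IN ({placeholders})", slugs
    ).fetchall()
    return dict(rows)


def list_scene_ids(conn, slugs, limit=20):
    """Return ids of the newest scenes that carry ALL requested tags."""
    id_by_slug = resolve_tag_ids(conn, slugs)
    clauses = []
    for slug in slugs:
        tag_id = id_by_slug.get(slug)
        if tag_id is None:
            return []  # unknown slug: short-circuit to an empty result
        # tag_id comes from our own tags table; int() keeps the inlined literal safe.
        clauses.append(
            "EXISTS (SELECT 1 FROM scene_tags st "
            f"WHERE st.scene_id = s.id AND st.tag_id = {int(tag_id)})"
        )
    where = " AND ".join(clauses) or "1=1"
    sql = f"SELECT s.id FROM scenes s WHERE {where} ORDER BY s.created_at DESC LIMIT ?"
    return [row[0] for row in conn.execute(sql, (limit,))]


# Hypothetical in-memory data set for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tags (id INTEGER PRIMARY KEY, slug TEXT UNIQUE);
    CREATE TABLE scenes (id INTEGER PRIMARY KEY, created_at INTEGER);
    CREATE TABLE scene_tags (scene_id INTEGER, tag_id INTEGER);
    INSERT INTO tags VALUES (1, 'outdoor'), (2, 'interview');
    INSERT INTO scenes VALUES (10, 100), (11, 200), (12, 300);
    INSERT INTO scene_tags VALUES (10, 1), (11, 1), (11, 2), (12, 2);
    """
)

print(list_scene_ids(conn, ["outdoor"]))
print(list_scene_ids(conn, ["outdoor", "interview"]))
print(list_scene_ids(conn, ["outdoor", "missing"]))
```

The sketch returns early for an unknown slug instead of appending an always-false predicate as the diff does with false(); both give an empty page, and the early return also skips the round trip to the database.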