goon/app/connectors/direct_scrapers
jtrzupek 210aec0536 feat(scrapers): extract tags + description from porndish scene pages
Porndish-only scenes had no tags and no description: the scraper only derived a
title from the URL slug. The scene page (g1/bimber WP theme) carries both: a
<p class="entry-tags"> list of /video2/<slug>/ links (the "#" tags the user sees,
categories and co-performers) and a prose description <p> in .entry-content.
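
As a rough illustration of the page structure described above, the tag list could be read with the standard-library HTML parser. This is only a sketch, not the code in porndish.py: `EntryTagsParser` and `extract_tags` are hypothetical names, and the assumption that the link text may carry a leading "#" is a guess based on the description.

```python
from html.parser import HTMLParser


class EntryTagsParser(HTMLParser):
    """Collect the text of /video2/<slug>/ links inside <p class="entry-tags">.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self._in_tags = False   # inside <p class="entry-tags">
        self._in_link = False   # inside a matching <a> within that <p>
        self.tags = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "p" and "entry-tags" in (attrs.get("class") or "").split():
            self._in_tags = True
        elif tag == "a" and self._in_tags and "/video2/" in (attrs.get("href") or ""):
            self._in_link = True
            self.tags.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_tags = False
        elif tag == "a":
            self._in_link = False

    def handle_data(self, data):
        if self._in_link:
            self.tags[-1] += data


def extract_tags(html):
    """Return cleaned tag names; a leading '#' is dropped if present (assumption)."""
    parser = EntryTagsParser()
    parser.feed(html)
    parser.close()
    return [t.strip().lstrip("#").strip() for t in parser.tags if t.strip()]
```

Links outside the `entry-tags` paragraph, or links that do not point at `/video2/`, are ignored, so navigation and related-post links do not leak into the tag list.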

Override _fetch_scene_metadata in PornDishScraper to pull both from one page
fetch. Extend the base hook to accept an optional 4th return element
(description) and thread it into RawScene.description; this stays backward
compatible with the existing 3-tuple (pornhat). Leading embed-button labels
("Video Player N", "Server N") are stripped from the prose. Verified on live
scenes: clean tag lists and real descriptions.
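
The backward-compatible hook and the label stripping might look roughly like the sketch below. It is not the code in _search_base.py: `split_metadata` and `strip_embed_labels` are hypothetical names, the first three tuple elements are treated as opaque because their meaning is not stated here, and the regular expression is an assumed reading of "Video Player N" / "Server N".

```python
import re

# Assumed pattern: one or more leading "Video Player <n>" / "Server <n>" labels.
_LABEL_RE = re.compile(r"^(?:\s*(?:Video Player|Server)\s*\d+\s*)+", re.IGNORECASE)


def strip_embed_labels(text):
    """Drop leading embed-button labels from a prose description."""
    return _LABEL_RE.sub("", text).strip()


def split_metadata(result):
    """Normalize a hook result to (first_three, description).

    A 3-tuple (the older shape) yields description=None; a 4-tuple
    carries the description as its last element.
    """
    if len(result) == 3:
        return tuple(result), None
    if len(result) == 4:
        return tuple(result[:3]), result[3]
    raise ValueError(f"expected 3 or 4 elements, got {len(result)}")
```

Callers that only know the 3-tuple shape keep working, since a missing fourth element simply becomes `None` rather than an unpacking error.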

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 21:32:10 +02:00
__init__.py feat(deep-crawl): xvideos browse source (capped) + per-tube page cap 2026-06-03 11:16:44 +02:00
_browse_base.py feat(deep-crawl): eporner via JSON API as SSR-rich source (Phase 2b alternative) 2026-06-03 10:37:20 +02:00
_search_base.py feat(scrapers): extract tags + description from porndish scene pages 2026-06-06 21:32:10 +02:00
base.py Initial commit 2026-05-20 10:10:22 +02:00
eporner.py Initial commit 2026-05-20 10:10:22 +02:00
eporner_api.py feat(deep-crawl): eporner via JSON API as SSR-rich source (Phase 2b alternative) 2026-06-03 10:37:20 +02:00
fpoxxx.py Initial commit 2026-05-20 10:10:22 +02:00
freshporno.py Mobile 0.1.9: OTA enable, WebView cookie-dismiss fix, porndoe connector 2026-05-22 11:20:57 +02:00
fullmovies.py feat(ingest): SQL phash match, tag inference + backfill, clip-store skip, browse tubes, watchdog 2026-06-01 15:07:35 +02:00
hdporn92.py Initial commit 2026-05-20 10:10:22 +02:00
hdporngg.py feat(ingest): SQL phash match, tag inference + backfill, clip-store skip, browse tubes, watchdog 2026-06-01 15:07:35 +02:00
hqporner.py Initial commit 2026-05-20 10:10:22 +02:00
latestleaks.py Initial commit 2026-05-20 10:10:22 +02:00
latestpornvideo.py Initial commit 2026-05-20 10:10:22 +02:00
mypornerleak.py Initial commit 2026-05-20 10:10:22 +02:00
perverzija.py Initial commit 2026-05-20 10:10:22 +02:00
porn00.py Initial commit 2026-05-20 10:10:22 +02:00
porn4days.py Initial commit 2026-05-20 10:10:22 +02:00
porndish.py feat(scrapers): extract tags + description from porndish scene pages 2026-06-06 21:32:10 +02:00
pornditt.py Initial commit 2026-05-20 10:10:22 +02:00
porndoe.py Mobile 0.1.9: OTA enable, WebView cookie-dismiss fix, porndoe connector 2026-05-22 11:20:57 +02:00
pornhat.py Initial commit 2026-05-20 10:10:22 +02:00
pornhub.py Initial commit 2026-05-20 10:10:22 +02:00
porntrex.py Initial commit 2026-05-20 10:10:22 +02:00
pornxp.py Initial commit 2026-05-20 10:10:22 +02:00
redtube.py Initial commit 2026-05-20 10:10:22 +02:00
shyfap.py Initial commit 2026-05-20 10:10:22 +02:00
siska.py Initial commit 2026-05-20 10:10:22 +02:00
sxyland.py Initial commit 2026-05-20 10:10:22 +02:00
sxyprn.py Initial commit 2026-05-20 10:10:22 +02:00
watchporn.py Initial commit 2026-05-20 10:10:22 +02:00
xhamster.py Initial commit 2026-05-20 10:10:22 +02:00
xmoviesforyou.py Initial commit 2026-05-20 10:10:22 +02:00
xnxx.py Initial commit 2026-05-20 10:10:22 +02:00
xvideos.py Initial commit 2026-05-20 10:10:22 +02:00
xvideos_browse.py feat(deep-crawl): xvideos browse source (capped) + per-tube page cap 2026-06-03 11:16:44 +02:00
xxxfreewatch.py Initial commit 2026-05-20 10:10:22 +02:00
yesporn.py feat(deep-crawl): xvideos browse source (capped) + per-tube page cap 2026-06-03 11:16:44 +02:00
youporn.py Initial commit 2026-05-20 10:10:22 +02:00
zerodayxx.py Initial commit 2026-05-20 10:10:22 +02:00