Revisited siska re-enable (user fa4083a2).

Findings:
(1) Fresh siska videos (videoID 227xxx) embed playmogo + luluvid and ARE phone-resolvable. Updated the siska.py scene regex and extractor path to the current video.php?videoID= format (the old /<slug>/ format is gone).
(2) BUT siska's ?s=<query> search is broken site-side: it returns the latest videos regardless of query (angela white == riley reid == homepage), so as a performer-driven BaseSearchScraper it always yields 0 results (the title token filter rejects everything).

Reviving siska would require converting it to a browse/latest scraper, which changes the character of the ingest; left as a decision. Old self-player videos (player.siska.video -> cfglobalcdn) are dead. Scraper stays disabled.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
26 lines
1.1 KiB
Python
"""siska.video: direct HTML scrape.

Search: `https://siska.video/page/<n>/?s=<q>` (still works).
Scene URL: `https://siska.video/video.php?videoID=<n>` (changed 2026-05+, formerly `/<slug>/`).

The new format has no title words in the URL (the slug is just the videoID number), so for
`slug` (which `_search_base` uses for the query token filter and for title derivation) we take
`title='...'` from the same <a>. Fresh videos embed playmogo + luluvid, so the phone resolves
them phone-side (_embed_iframe returns type='hoster'). Re-enabled 2026-06-20 (user fa4083a2).
"""
from __future__ import annotations

import re

from app.connectors.direct_scrapers._search_base import BaseSearchScraper


class SiskaScraper(BaseSearchScraper):
    sitetag = "siskavideo"
    _search_url_template = "https://siska.video/page/{page}/?s={query}"
    # <a title=' Scene Title ' href='https://siska.video/video.php?videoID=227110' ...>
    # `slug` = the title (the token filter and title derivation operate on it;
    # the videoID number contains no words).
    _scene_url_re = re.compile(
        r"<a\s+title='(?P<slug>[^']*)'\s+href='(?P<url>https://siska\.video/video\.php\?videoID=\d+)'",
        re.IGNORECASE,
    )
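The snippet below is a minimal, standalone sketch of how the scene regex above pulls a title and URL out of a listing fragment, and why the title (rather than the numeric videoID) is what a token filter needs. The sample HTML, `extract_scenes`, and `title_matches_query` are hypothetical illustrations; the real filtering lives in `_search_base`, whose implementation is not shown here and may differ.

```python
import re

# Same pattern as `_scene_url_re` in the scraper above.
SCENE_URL_RE = re.compile(
    r"<a\s+title='(?P<slug>[^']*)'\s+href='(?P<url>https://siska\.video/video\.php\?videoID=\d+)'",
    re.IGNORECASE,
)

# Hypothetical listing fragment, modelled on the comment in the class body.
SAMPLE_HTML = (
    "<div class='item'>"
    "<a title=' Example Scene Title ' "
    "href='https://siska.video/video.php?videoID=227110' class='thumb'>"
    "</a></div>"
)


def extract_scenes(html: str) -> list[tuple[str, str]]:
    """Return (title, url) pairs; padding spaces around the title are stripped."""
    return [
        (m.group("slug").strip(), m.group("url"))
        for m in SCENE_URL_RE.finditer(html)
    ]


def title_matches_query(title: str, query: str) -> bool:
    """Hypothetical token filter: every query token must occur in the title."""
    title_tokens = set(title.lower().split())
    return all(tok in title_tokens for tok in query.lower().split())


scenes = extract_scenes(SAMPLE_HTML)
```

With a filter of this shape, a search page that ignores the query and returns only the latest videos would rarely contain a matching title, which is consistent with the zero-result behaviour described in the commit message.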