hqfap migrated its JSON-LD contentUrl (and the *.workers.dev mirror) to /upload/videos/video_down.mp4, which serves a FIXED ~3.04MB file for EVERY scene regardless of declared length (verified 5/5 scenes at 14-47min all = 3.04MB, 2026-06-21). It is a placeholder/'server down' clip, not the content; the browser's own player streamed the same stub via MediaSource. We were handing users that 3MB stub (reports c382d441/ef10b946).

Now reject the video_down.mp4 contentUrl and return no source, so scenes fall through to other sources or show no playback instead of a fake clip. Real older scenes (cdnde.com / okcdn.ru direct mp4) still resolve. This also makes the proxy-fallback question moot: there is no source to proxy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
79 lines
3.1 KiB
Python
"""hqfap.com: direct stream extractor.

The scene page (SSR, behind Cloudflare, so fetched with curl_cffi via fetch_tube_html)
carries a JSON-LD VideoObject whose `contentUrl` is a direct mp4. The catalogue spans
two hosting generations:

- newer scenes: `v4.cdnde.com/...?video=<b64>&time=<epoch>&ip=<addr>`. The `ip` param
  is NOT enforced (cross-IP test 2026-06-10: local ISP and a Hetzner VPS both got 206);
  the token is time-bound, so resolving on demand yields a fresh URL,
- older scenes: `vd*.okcdn.ru/?expires=...&srcIp=...&sig=...` (ok.ru), also portable
  cross-IP (206 from an IP other than the fetcher's).

Mobile plays direct (mobile_direct auto-detect in playback.py), zero proxy/WebView.
"""
from __future__ import annotations

import json
import logging
import re

from app.extractors._fetch import fetch_tube_html
from app.extractors._models import StreamSource

log = logging.getLogger(__name__)

_JSONLD_RE = re.compile(
    r'<script[^>]+type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)
# Fallback for when the JSON-LD block does not parse as JSON (trailing comma etc.).
_CONTENT_URL_RE = re.compile(r'"contentUrl"\s*:\s*"([^"]+)"')
_QUALITY_RE = re.compile(r"_(\d{3,4})p\.mp4", re.IGNORECASE)


def extract(page_url: str, *, timeout: float = 60.0) -> list[StreamSource] | None:
    html = fetch_tube_html(page_url, timeout=timeout)

    content_url: str | None = None
    # Take the contentUrl of the first VideoObject found in any JSON-LD block.
    for m in _JSONLD_RE.finditer(html):
        raw = m.group(1).strip()
        if not raw:
            continue
        try:
            data = json.loads(raw)
        except (json.JSONDecodeError, ValueError):
            continue
        items = data if isinstance(data, list) else [data]
        for obj in items:
            if isinstance(obj, dict) and obj.get("@type") == "VideoObject":
                content_url = (obj.get("contentUrl") or "").strip() or None
                break
        if content_url:
            break
    if not content_url:
        # No parseable VideoObject: fall back to a raw regex over the whole page.
        rm = _CONTENT_URL_RE.search(html)
        content_url = rm.group(1).strip() if rm else None
    if not content_url or not content_url.startswith("http"):
        log.warning("hqfap: no contentUrl in JSON-LD for %s", page_url)
        return None

    # hqfap migrated: `/upload/videos/video_down.mp4` (plus the *.workers.dev mirror)
    # serves a FIXED ~3MB placeholder for EVERY scene, regardless of declared length
    # (5/5 scenes = 3.04MB at 14-47min, verified 2026-06-21; the browser's MediaSource
    # played the same stub; "server down" user reports c382d441/ef10b946). It is NOT
    # real video, so treat it as no source (better nothing than a 3MB "server down" clip).
    # Real older scenes (cdnde.com / okcdn.ru direct mp4) pass through normally.
    if "/upload/videos/video_down.mp4" in content_url:
        log.info("hqfap: stub video_down.mp4 (placeholder, no real video) on %s", page_url)
        return None

    qm = _QUALITY_RE.search(content_url)
    quality = f"{qm.group(1)}p" if qm else None
    return [
        StreamSource(
            link=content_url,
            quality=quality,
            type="mp4",
            referer="https://hqfap.com/",
        )
    ]
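The selection logic in `extract` can be exercised offline, without `fetch_tube_html` or `StreamSource`. The following is a minimal sketch under stated assumptions, not part of the module: `pick_content_url`, `quality_of`, and `page` are hypothetical helpers, and the `example.invalid` URLs are made up for illustration. It copies the three regexes and the stub check, then runs them against inline HTML strings.

```python
import json
import re
from typing import Optional

JSONLD_RE = re.compile(
    r'<script[^>]+type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)
CONTENT_URL_RE = re.compile(r'"contentUrl"\s*:\s*"([^"]+)"')
QUALITY_RE = re.compile(r"_(\d{3,4})p\.mp4", re.IGNORECASE)
STUB_PATH = "/upload/videos/video_down.mp4"


def pick_content_url(html: str) -> Optional[str]:
    """Hypothetical helper: usable contentUrl, or None if missing or the stub."""
    url = None
    for m in JSONLD_RE.finditer(html):
        try:
            data = json.loads(m.group(1).strip())
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            continue
        for obj in data if isinstance(data, list) else [data]:
            if isinstance(obj, dict) and obj.get("@type") == "VideoObject":
                url = (obj.get("contentUrl") or "").strip() or None
                break
        if url:
            break
    if not url:
        # Regex fallback for JSON-LD that does not parse (e.g. trailing comma).
        m = CONTENT_URL_RE.search(html)
        url = m.group(1).strip() if m else None
    if not url or not url.startswith("http") or STUB_PATH in url:
        return None
    return url


def quality_of(url: str) -> Optional[str]:
    """Hypothetical helper: '720p' from '..._720p.mp4', else None."""
    m = QUALITY_RE.search(url)
    return f"{m.group(1)}p" if m else None


def page(url: str) -> str:
    """Build a made-up scene page carrying one JSON-LD VideoObject."""
    body = json.dumps({"@type": "VideoObject", "contentUrl": url})
    return f'<script type="application/ld+json">{body}</script>'


real = pick_content_url(page("https://cdn.example.invalid/scene_720p.mp4?time=1"))
stub = pick_content_url(page("https://example.invalid/upload/videos/video_down.mp4"))
broken = pick_content_url(
    '<script type="application/ld+json">'
    '{"@type": "VideoObject", "contentUrl": "https://cdn.example.invalid/a.mp4",}'
    "</script>"
)
print(real, quality_of(real) if real else None)
print(stub)
print(broken)
```

The stub page yields `None`, mirroring how `extract` returns no source for `video_down.mp4`, while the page with the trailing comma is still resolved through the regex fallback.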