feat(extractors): native HLS for xhamster; hqporner flyflv player

xhamster: move from WebView fallback to server-side native HLS. The scene page
is fetchable server-side and the xhcdn master m3u8 (variants + segments) is
time-bound, not IP-bound (verified cross-IP), so mobile plays the HLS direct
with zero proxy bandwidth. New tubes/xhamster.py pulls the master m3u8 from
SSR HTML and returns type='m3u8' mobile_direct; registry remaps xhamstercom
off _vps_blocked_fallback.

hqporner: add flyflv to the player-iframe host whitelist. hqporner rotated
some players to flyflv.com; the CDN host was already whitelisted but the iframe
host was not, so those scenes returned no stream.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
jtrzupek 2026-06-09 09:35:58 +02:00
parent 7f36865b5a
commit 3e8a221981
3 changed files with 96 additions and 5 deletions


@ -39,6 +39,7 @@ from app.extractors.tubes import (
pornhat,
porntrex,
sxyprn,
+xhamster,
yespornvip,
)
@ -84,10 +85,6 @@ _REGISTRY: dict[str, Callable[[str], list[StreamSource] | None]] = {
# flashvars `video_url` → `get_file` 302 → CDN time-bound signed URL
# (`expires`+`md5`, NOT IP-bound) → mobile plays direct, zero VPS bandwidth.
"porntrexcom": porntrex.extract,
-# VPS-blocked tubes: KVS / Cloudflare blocks the Hetzner IP, but they work from a residential
-# IP (confirmed with Chrome DevTools MCP 2026-05-15). Mobile WebView + INJECTED_JS
-# (PlayerScreen.tsx:805) scans <video>.src + XHR and catches the URL once the player JS has decoded it.
-"xhamstercom": _vps_blocked_fallback.extract,
# pornditt: KVS like yespornvip (function/0 + license). The VPS reaches it → resolve
# server-side (decode + follow 302 → portable twa.tgprn.com CDN). Previously the WebView
# fallback caught the VAST preroll (trafostatic) instead of the content. See pornditt.py/_kvs.py.
@ -114,6 +111,13 @@ _REGISTRY: dict[str, Callable[[str], list[StreamSource] | None]] = {
# xxxfreewatch: DELISTED 2026-05-18. 790 solo-orphan scenes, 0% match, CF-walled from the VPS.
"latestleaksco": _embed_iframe.extract,
"mypornerleakcom": _embed_iframe.extract,
+# xhamster: 2026-06-08 SWITCHED from _vps_blocked_fallback to native server-side HLS.
+# Re-test (DevTools + cross-IP): the VPS fetches the scene page without a CF challenge, the master
+# m3u8 is in the SSR HTML, manifest + segments are time-bound (portable, not IP-bound). Mobile plays
+# HLS direct, multi-quality, zero VPS proxy/WebView/ads. See tubes/xhamster.py.
+# ~155k solo scenes upgraded from an ad-laden WebView to native. Previously the WebView fallback
+# loaded the ad-heavy page from the phone IP (it worked, but worse UX + VAST preroll).
+"xhamstercom": xhamster.extract,
# PornHat: dedicated extractor. Only `<source>` from the player area (skip sidebar
# trailer URLs `_preview*.mp4`), dedupe by filename. Get_file 302 → CDN, proxy
# follow_redirects=True required (fix in stream_proxy.py).


@ -48,7 +48,7 @@ _PLAYER_IFRAME_RE = re.compile(r'<iframe[^>]+src=["\']([^"\']+)', re.IGNORECASE)
# smartpop, popcash, reebr) → ad. No match = fail safe (return None);
# we do not try to run it as a hoster because it is an ad-redirect → pop-under.
_VIDEO_IFRAME_HOST_RE = re.compile(
-r"//(?:[a-z0-9-]+\.)?(?:mydaddy|hqwo|hqporner)\.[a-z]{2,4}/",
+r"//(?:[a-z0-9-]+\.)?(?:mydaddy|hqwo|hqporner|flyflv)\.[a-z]{2,4}/",
re.IGNORECASE,
)
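
The effect of the widened whitelist can be checked in isolation. The sketch below copies the old and new patterns from this hunk; the two iframe URLs and the helper `is_video_iframe` are made up for illustration and are not taken from hqporner or from the repository.

```python
import re

# Patterns copied from the hunk above (old vs. new).
OLD_HOST_RE = re.compile(
    r"//(?:[a-z0-9-]+\.)?(?:mydaddy|hqwo|hqporner)\.[a-z]{2,4}/",
    re.IGNORECASE,
)
NEW_HOST_RE = re.compile(
    r"//(?:[a-z0-9-]+\.)?(?:mydaddy|hqwo|hqporner|flyflv)\.[a-z]{2,4}/",
    re.IGNORECASE,
)

# Hypothetical iframe src values, for illustration only.
FLYFLV_IFRAME = "https://flyflv.com/player/abc123"
AD_IFRAME = "https://ads.example.net/pop/1"


def is_video_iframe(pattern: re.Pattern, src: str) -> bool:
    """Return True when the iframe host is on the player whitelist."""
    return pattern.search(src) is not None


if __name__ == "__main__":
    for name, pattern in (("old", OLD_HOST_RE), ("new", NEW_HOST_RE)):
        print(name, is_video_iframe(pattern, FLYFLV_IFRAME), is_video_iframe(pattern, AD_IFRAME))
```

With the old pattern the flyflv iframe falls through to the fail-safe `return None` path, which matches the "no stream" symptom described in the commit message; unknown hosts stay rejected under both patterns.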


@ -0,0 +1,87 @@
"""xhamster.com: native server-side HLS extractor.
2026-06-08: a re-test with Chrome DevTools + cross-IP CORRECTS the assumption behind `_vps_blocked_fallback`.
Previously `xhamstercom` went through the WebView fallback (assumption: Cloudflare blocks the Hetzner
IP). The re-test showed:
1. The VPS fetches the scene page (HTTP 200, NO Cloudflare challenge; the block has been lifted).
2. The master HLS URL sits in the SSR HTML in plain text: `video-nss.xhcdn.com/<token>,<expiry>/media=hls4/
multi=.../...m3u8`. `<expiry>` is a UNIX timestamp, so the token is TIME-BOUND, not IP-bound.
3. Cross-IP test (Hetzner VPS): master m3u8 200, variant playlist 200, .m4s segment
206 video/mp4. The whole chain is PORTABLE: mobile plays HLS direct from a residential IP,
zero VPS proxy bandwidth.
We therefore resolve SERVER-SIDE like porntrex/freshporno: fetch page (curl_cffi chrome) →
extract the master m3u8 → return it as type='m3u8' mobile_direct. ExoPlayer does adaptive
multi-quality from a single master URL.
NB `sources.standard.av1/h264` in the HTML are ENCRYPTED hex blobs (the player decrypts them in JS),
useless server-side, which is why we take HLS, not mp4.
"""
from __future__ import annotations

import logging
import re

from app.extractors._fetch import _DEFAULT_IMPERSONATE, _DEFAULT_UA, _HAS_CURL_CFFI, fetch_tube_html
from app.extractors._models import HosterDead, StreamSource

log = logging.getLogger(__name__)

_BASE = "https://xhamster.com"
# Master HLS on xhcdn (video-nss.xhcdn.com / other subdomains as fallback). JSON in the HTML
# escapes slashes (`https:\/\/...`), so unescape before matching.
_M3U8_RE = re.compile(r"https://[a-z0-9.\-]*xhcdn\.com/[^\"'\\ ]+?\.m3u8", re.IGNORECASE)
# Markers of a deleted scene (the page exists but has no video) → HosterDead.
_DEAD_MARKERS = (
    "this video has been deleted",
    "this video was deleted",
    "video is no longer available",
    "has been removed",
)
def extract(page_url: str, *, timeout: float = 60.0) -> list[StreamSource] | None:
    html = ""
    if _HAS_CURL_CFFI:
        from curl_cffi import requests as _cf_requests

        session = _cf_requests.Session(impersonate=_DEFAULT_IMPERSONATE)
        try:
            resp = session.get(
                page_url,
                headers={"User-Agent": _DEFAULT_UA, "Accept": "text/html,application/xhtml+xml"},
                timeout=timeout,
                allow_redirects=True,
            )
            html = resp.text if resp.status_code < 400 else ""
        except Exception as e:
            log.info("xhamster: page fetch failed %s: %s", page_url, e)
            html = ""
    if not html:
        # fetch_tube_html raises TubePageError for 404/410 (caller → dead_at).
        html = fetch_tube_html(page_url, timeout=timeout)
    # JSON-escaped slashes → plain, so the regex can catch the master URL.
    unescaped = html.replace("\\/", "/")
    m = _M3U8_RE.search(unescaped)
    if not m:
        low = unescaped.lower()
        if any(marker in low for marker in _DEAD_MARKERS):
            raise HosterDead(f"xhamster: scene deleted {page_url}")
        log.info("xhamster: no HLS master URL on %s", page_url)
        return None
    master = m.group(0)
    return [
        StreamSource(
            link=master,
            type="m3u8",
            quality=None,  # HLS master = adaptive multi-quality (ExoPlayer picks)
            referer=_BASE + "/",
            # Master + variants + segments are time-bound (not IP/cookie-bound),
            # verified cross-IP 2026-06-08 → mobile plays direct, zero VPS proxy.
            raw={"mobile_direct_ok": True},
        )
    ]
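
The two ideas the extractor relies on, unescaping JSON-escaped slashes before matching and the expiry timestamp embedded in the URL path, can be sketched standalone. The regex is copied from the file above. The sample HTML, the token value, and the helpers `find_master` and `expiry_of` are hypothetical. The sketch assumes the `<token>,<expiry>` layout from the module docstring, that is, the expiry is the decimal field after the comma in the first path segment; the real URL layout may differ.

```python
import re
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlsplit

# Same pattern as _M3U8_RE in the extractor.
M3U8_RE = re.compile(r"https://[a-z0-9.\-]*xhcdn\.com/[^\"'\\ ]+?\.m3u8", re.IGNORECASE)

# Made-up SSR fragment with JSON-escaped slashes (not a real token or URL).
SAMPLE_HTML = (
    r'{"hls":"https:\/\/video-nss.xhcdn.com\/tok123,1780990558'
    r'\/media=hls4\/multi=x\/master.m3u8"}'
)


def find_master(html: str) -> Optional[str]:
    """Unescape `\\/` to `/`, then return the first xhcdn master URL, if any."""
    m = M3U8_RE.search(html.replace("\\/", "/"))
    return m.group(0) if m else None


def expiry_of(master_url: str) -> Optional[datetime]:
    """Read the UNIX expiry from the first path segment (assumed `<token>,<expiry>`)."""
    first_segment = urlsplit(master_url).path.lstrip("/").split("/", 1)[0]
    _token, sep, expiry = first_segment.partition(",")
    if not sep or not expiry.isdecimal():
        return None
    return datetime.fromtimestamp(int(expiry), tz=timezone.utc)


if __name__ == "__main__":
    url = find_master(SAMPLE_HTML)
    print(url)
    print(expiry_of(url) if url else None)
```

Matching against the raw HTML without the unescape step finds nothing, because `https:\/\/` never contains the literal `https://` the pattern starts with. A caller could compare `expiry_of(...)` with the current time to decide whether a cached master URL should be re-resolved; that caching policy is not part of this commit.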