# cs.money worker (Python)

The browser/Cloudflare layer for the cs.money scraper. .NET stays the **C2** (orchestration, proxy/IP allocation, DB, the sweep loop); this worker is the only component that drives a browser and defeats Cloudflare, because the effective anti-bot tooling (`nodriver`/`undetected-chromedriver`, TLS impersonation) only exists in Python/Go, not .NET.

## Why nodriver

.NET Selenium got insta-challenged by Cloudflare's managed challenge because `msedgedriver` controls the browser through the WebDriver protocol, leaving `navigator.webdriver` and chromedriver `cdc_` artifacts that Cloudflare keys on. `nodriver` drives a normal Chromium directly over CDP (no chromedriver) and patches those tells, so it passes where Selenium loops.

## Local setup

```powershell
cd worker
py -m venv .venv
.venv\Scripts\Activate.ps1
pip install -r requirements.txt
```

If auto-detect can't find a browser, set `BROWSER_PATH` to Chrome or Edge (`C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe`).

## The pull fleet

`csmoney_worker.py` holds one warm nodriver session and loops: poll the .NET C2 for a job (a skin+wear search), scrape that search's sell-orders via in-page fetch, and post the items back. The C2 (`BlueLaminate.C2`) picks the stalest skin+wear from the catalogue, and on result persists to `cs_money_listings` + `price_history` (`Source = "csmoney"`), stamping that band's per-site checkpoint (the `csmoney` row in `skin_condition_sweeps`). The checkpoint is per-site, so a band CSFloat already swept is still due for a cs.money sweep.

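
The poll→scrape→post loop described above can be sketched roughly as follows. This is a minimal illustration, not the actual `blworker` code: the names `pull_loop`, `claim`, `scrape`, and `post` are hypothetical, and the idle back-off when the C2 has no job is an assumption the text does not spell out.

```python
import time
from typing import Callable, Optional


def pull_loop(
    claim: Callable[[], Optional[dict]],      # ask the C2 for a job, None if nothing is due
    scrape: Callable[[dict], list],           # scrape one skin+wear search, return its items
    post: Callable[[dict, list], None],       # post the items for that job back to the C2
    *,
    idle_seconds: float = 5.0,                # assumed back-off when no job is available
    max_jobs: Optional[int] = None,           # None = run forever (the real worker's mode)
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run the pull loop and return how many jobs were completed."""
    done = 0
    while max_jobs is None or done < max_jobs:
        job = claim()
        if job is None:
            # Nothing due right now: wait, then poll again.
            sleep(idle_seconds)
            continue
        items = scrape(job)
        post(job, items)
        done += 1
    return done
```

Passing the three steps in as callables keeps the loop independent of any one market, which mirrors the idea that each market script only supplies its own scrape step.
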
Run the C2 (needs Postgres migrated), then the worker:

```powershell
# terminal 1: the C2 (from repo root)
dotnet run --project BlueLaminate\BlueLaminate.C2   # serves http://localhost:5080

# terminal 2: the worker
cd worker; .venv\Scripts\Activate.ps1
$env:WORKER_TOKEN="dev-worker-token"   # must match the C2's WorkerToken
python csmoney_worker.py
```

The worker warms the session (you clear Cloudflare once), then runs continuously. Scale out by starting more workers (each with its own `PROXY`).

## Layout

Both market scripts are thin: each subclasses `blworker.Worker` and fills in only its own scrape + cookie-consent steps. Everything shared lives in the `blworker/` package:

| file | responsibility |
| --- | --- |
| `blworker/config.py` | `Settings`: every env knob, parsed once |
| `blworker/log.py` | stdout logging, human or `LOG_JSON=1` (for Loki) |
| `blworker/proxy.py` | IPRoyal forwarder + session/password helpers |
| `blworker/c2.py` | `C2Client`: claim a job, post a result |
| `blworker/runtime.py` | `Worker` base: proxy/browser bring-up, the poll→scrape→post loop, Cloudflare IP rotation, graceful shutdown |
| `csmoney_worker.py` / `skinland_worker.py` | the per-market scrape strategies |

To add a market: subclass `Worker`, set `name`/`jobs_path`/`default_market_url`, implement `scrape_job` + `describe_job` (+ `dismiss_consent` if it has a banner), and call `run(YourWorker)`.

## skin.land worker

`skinland_worker.py` is the same pull model for **skin.land** (also Cloudflare-walled). It shares all the proxy/Cloudflare/C2 plumbing with the cs.money worker via `blworker`; only the scrape differs. The C2 hands out jobs from its **`/skinland/jobs`** group (the `skinland` rows in `skin_condition_sweeps`, so a band cs.money/CSFloat already swept is still due here) and on result persists to `skin_land_listings` + `price_history` (`Source = "skinland"`).

How it scrapes (learned during discovery):

- A job's target is the market **page URL**, e.g. `https://skin.land/market/csgo/ak-47-redline-field-tested/`. The slug is just `{weapon}-{skin}-{wear}` kebab-cased; the C2 builds it from the catalogue, no lookup.
- skin.land is a Nuxt SSR app. The page embeds an internal numeric `skin_id`; the worker resolves it once from the `__NUXT__` payload (the skin object whose `url` == the slug), caches it per slug, then pages the clean JSON API `GET https://app.skin.land/api/v2/obtained-skins?skin_id={id}&page={n}` (a Laravel paginator `{data:[…offers], meta:{current_page,last_page,…}}`), walking to `last_page`.
- Each offer carries a full-precision `item_float`, `final_withdrawal_price`, and the Steam `item_link`. skin.land exposes **no paint seed**, so listings aren't fingerprinted to a `SkinInstance` (no cross-market roll-up / dupe detection here). StatTrak and Souvenir are separate pages (`stattrak-`/`souvenir-` slugs); v1 sweeps the base page per skin+wear.

Run it alongside (or instead of) the cs.money worker; it points at the same C2:

```powershell
cd worker; .venv\Scripts\Activate.ps1
$env:WORKER_TOKEN="dev-worker-token"
python skinland_worker.py
```

Under Docker it's the `skinland-worker` service (same image, `WORKER_SCRIPT=skinland_worker.py`):

```powershell
docker compose up --build --scale skinland-worker=5
```
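
For reference, the slug rule and the page walk from "How it scrapes" can be sketched like this. It is an illustration under stated assumptions, not the worker's actual code: `market_slug`, `walk_offers`, and `fetch_page` are hypothetical names, the kebab-casing rule (lowercase, runs of non-alphanumerics become `-`) is a guess that merely reproduces the example URL, and the HTTP call is injected because the real worker fetches from inside the browser session.

```python
import re
from typing import Callable, Iterator


def market_slug(weapon: str, skin: str, wear: str) -> str:
    """Build `{weapon}-{skin}-{wear}` in kebab case (assumed rule, see lead-in)."""
    parts = (
        re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")
        for part in (weapon, skin, wear)
    )
    return "-".join(parts)


# fetch_page(skin_id, page) is assumed to return the decoded paginator body:
# {"data": [...offers], "meta": {"current_page": n, "last_page": m, ...}}
FetchPage = Callable[[int, int], dict]


def walk_offers(fetch_page: FetchPage, skin_id: int) -> Iterator[dict]:
    """Yield every offer for one skin_id, walking pages 1..last_page."""
    page = 1
    while True:
        body = fetch_page(skin_id, page)
        yield from body.get("data", [])
        # If meta is missing, treat the current page as the last one.
        last_page = int(body.get("meta", {}).get("last_page", page))
        if page >= last_page:
            break
        page += 1
```

Reading `last_page` from every response, rather than only the first, lets the walk adapt if listings are added or sold mid-sweep.
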