myAi/Jobs/cv-search-job at e38f40732f31360cdae768e57a0051bef51f4dc8 - myAi

AI/myAi

Fork 0

Files

T

History

claude e38f40732f

Build and Push Docker Images Staging / build (push) Successful in 5m20s

Details

feat(providers): add headless browser scraping via Playwright for SPA job sites

ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS
bundle. This change equips cv-search-job with a headless Chromium
(Playwright 1.60) so it can fully render SPA pages before extracting
job links.

- Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig,
  and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag
  is included in the session provider-config snapshot
- Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL
  (remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true
- HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync;
  plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial
  content rather than failing outright
- Dockerfile: download Playwright Chromium in the SDK build stage via
  npx; copy browser binaries to the final image; install Chromium system
  libs (Ubuntu noble t64 variants)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-29 13:42:52 +03:00