AI/myAi - myAi - Gitea: Git with a cup of tea

AI/myAi

Fork 0

Commit Graph

Author	SHA1	Message	Date
claude	e38f40732f	feat(providers): add headless browser scraping via Playwright for SPA job sites Build and Push Docker Images Staging / build (push) Successful in 5m20s Details ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS bundle. This change equips cv-search-job with a headless Chromium (Playwright 1.60) so it can fully render SPA pages before extracting job links. - Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig, and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag is included in the session provider-config snapshot - Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL (remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true - HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync; plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial content rather than failing outright - Dockerfile: download Playwright Chromium in the SDK build stage via npx; copy browser binaries to the final image; install Chromium system libs (Ubuntu noble t64 variants) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 13:42:52 +03:00
claude	7c09f5a871	feat(cv-search-data): add JobProviders table to cvSearch schema New JobProviderEntity persists provider config (name, URL template, link filter, initial keywords, max results, display order) in the DB instead of appsettings. Migration seeds three disabled defaults: ejobs.ro, bestjobs.eu, and linkedin.com. Closes #35 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 11:46:34 +03:00

Author

SHA1

Message

Date

claude

e38f40732f

feat(providers): add headless browser scraping via Playwright for SPA job sites

Build and Push Docker Images Staging / build (push) Successful in 5m20s

Details

ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS
bundle. This change equips cv-search-job with a headless Chromium
(Playwright 1.60) so it can fully render SPA pages before extracting
job links.

- Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig,
  and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag
  is included in the session provider-config snapshot
- Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL
  (remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true
- HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync;
  plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial
  content rather than failing outright
- Dockerfile: download Playwright Chromium in the SDK build stage via
  npx; copy browser binaries to the final image; install Chromium system
  libs (Ubuntu noble t64 variants)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-29 13:42:52 +03:00

claude

7c09f5a871

feat(cv-search-data): add JobProviders table to cvSearch schema

New JobProviderEntity persists provider config (name, URL template,
link filter, initial keywords, max results, display order) in the DB
instead of appsettings. Migration seeds three disabled defaults:
ejobs.ro, bestjobs.eu, and linkedin.com.

Closes #35

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-29 11:46:34 +03:00

2 Commits