Commit Graph

262 Commits

Author SHA1 Message Date
claude 8f58708cd9 Revert "Suppress environment prefix in email subjects on Production"
Build and Push Docker Images Staging / build (push) Successful in 1m35s
This reverts commit 06dd0140d6.
2026-06-08 21:45:45 +03:00
claude 06dd0140d6 Suppress environment prefix in email subjects on Production
[ENV_NAME] prefix is now only prepended in non-production environments
(Development, Staging, etc.). Production emails get a clean subject line.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 21:45:29 +03:00
claude 0aee7c4ed6 Changes
Build and Push Docker Images Staging / build (push) Successful in 45s
2026-06-08 21:31:35 +03:00
claude cd661fe613 Merge pull request 'Staging to Production' (#51) from main into production
Merge staging to production
2026-06-08 18:28:46 +00:00
claude 6f1d8992ab Merge pull request 'Link pageFetcher.PageFetches to cvSearch.JobSearchSessions' (#50) from feature/page-fetch-session-link into main
Build and Push Docker Images Staging / build (push) Successful in 17m39s
Merge PR #50: Link pageFetcher.PageFetches to cvSearch.JobSearchSessions
2026-06-08 16:57:07 +00:00
claude 2d9ffc9c2b Link pageFetcher.PageFetches to cvSearch.JobSearchSessions
Adds nullable JobSearchSessionId to PageFetchEntity and FetchPageRequest.
cv-search-job passes session.Id on every fetch so all Playwright page
loads for a job search session can be traced back to their session.
Includes index on JobSearchSessionId for efficient lookup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 19:56:13 +03:00
claude 9fbad722fc Merge pull request 'Add Email and ClientIpAddress audit fields to cvSearch.JobSearchSessions and JobSearchResults' (#48) from feature/job-search-audit-fields into main
Build and Push Docker Images Staging / build (push) Successful in 39s
Merge PR #48: Add Email and ClientIpAddress audit fields to cvSearch
2026-06-08 16:21:46 +00:00
claude 473c36d65f Store match-time ClientIpAddress on cvSearch.JobSearchTokens
Captures the IP when the user submits the CV match form and stores it on
the token, giving a full audit trail: token holds the match-site IP,
session holds the email link-click IP.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 19:20:02 +03:00
claude 292d19d5ed Populate JobText from fetched page content in JobSearchResults
Previously always stored empty string; now stores the full page text
returned by page-fetcher-api, which is already in scope at save time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 19:13:10 +03:00
claude d56729de42 Add Email and ClientIpAddress audit fields to cvSearch.JobSearchSessions and JobSearchResults
Captures client IP at job-search link-click time and threads it through to the session.
Both Email and ClientIpAddress are copied from session to each result row during processing.
Closes #47

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 19:11:50 +03:00
claude 79a3dec679 Merge pull request 'Add Email and ClientIpAddress audit fields to cvMatcher.Results' (#46) from feature/result-email-and-ip into main
Build and Push Docker Images Staging / build (push) Successful in 16m43s
Merge PR #46: Add Email and ClientIpAddress audit fields to cvMatcher.Results
2026-06-08 15:58:18 +00:00
claude 02d2b1e510 Add Email and ClientIpAddress audit fields to cvMatcher.Results
Threads the caller's email and client IP through the match pipeline so
every Results row records who triggered the match and from where.
Closes #45

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 18:56:36 +03:00
claude 3c3451b198 Merge branch 'main' into staging
Build and Push Docker Images Staging / build (push) Successful in 17m15s
2026-06-08 18:43:56 +03:00
claude a83f6f705f Remove UseHeadlessBrowser from JobProvider — all fetches now go via page-fetcher-api
page-fetcher-api always uses Playwright (networkidle by default), so the
per-provider flag that chose between headless and plain HTTP is obsolete.

- Removed from JobProviderEntity, CvSearchDbContext, JobProviderConfig, JobTokenService
- HtmlJobSearcher no longer passes WaitFor (uses page-fetcher-api default)
- EF migration drops the column from cvSearch.JobProviders

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 18:43:42 +03:00
claude b68cf942a8 Merge branch 'main' into staging
Build and Push Docker Images Staging / build (push) Successful in 28m0s
2026-06-08 18:37:00 +03:00
claude 61805e2fb5 Merge pull request 'feat: page-fetcher-api centralised Playwright page fetcher' (#44) from feature/page-fetcher-api into main
Merge feature/page-fetcher-api into main
2026-06-08 15:36:44 +00:00
claude dcfc50ff32 Fix Docker builds: upgrade Refit to 11.0.1, add page-fetcher-api-models to Dockerfiles
- Refit 10.1.6 signing certificate was revoked; upgraded to 11.0.1 in Directory.Packages.props
- cv-matcher-api/Dockerfile and cv-search-job/Dockerfile were missing COPY steps
  for page-fetcher-api-models (added in this feature branch)

All 8 images now build cleanly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 18:35:41 +03:00
claude b1ed1cb201 Rename EmailApi.Models.* namespace to Email.Models.* in email-api-models
Removes the spurious Api segment to match the pattern used by all other
models projects: CvMatcher.Models.*, Rag.Models.*, PageFetcher.Models.*.

Updated all consumers: email-api, api, cv-search-job.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 18:06:38 +03:00
claude e1f171168e Align email-api and page-fetcher-api namespaces to Api.* convention
Fixes inconsistency where email-api used EmailApi.* and page-fetcher-api
used PageFetcherApi.*, while cv-matcher-api and rag-api use the generic
Api.* namespace. All four API projects now follow the same pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 18:04:03 +03:00
claude ae2bc9b902 Move SmtpSettings and PageFetcherSettings into their respective models projects
Settings classes now live in the -models project alongside DTOs and client
interfaces, eliminating the Settings/ folder from both API projects.

- SmtpSettings: email-api/Settings/ → email-api-models/Settings/ (namespace EmailApi.Models.Settings)
- PageFetcherSettings: page-fetcher-api/Settings/ → page-fetcher-api-models/Settings/ (namespace PageFetcher.Models.Settings)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 18:00:44 +03:00
claude 30a8df431f Move PageFetcherSettings back to page-fetcher-api/Settings/, matching SmtpSettings pattern
Server-side-only settings (internal config not needed by callers) belong in
the API project itself, not in the models project. PageFetcherSettings
(DefaultWaitFor, TimeoutSeconds, MaxTextChars) mirrors SmtpSettings in
email-api/Settings/ — callers never reference these.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 17:56:21 +03:00
claude 95b0cfa0a9 Move PageFetcherSettings to page-fetcher-api-models, consistent with EmailApiSettings pattern
Settings class now lives in Apis/page-fetcher-api-models/Settings/ with
namespace PageFetcher.Models.Settings, matching how EmailApiSettings is
placed in email-api-models/Settings/.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 17:54:08 +03:00
claude 20b13647de Move PageFetcherSettings to Settings/ folder, consistent with email-api pattern
Settings classes belong in Settings/ with namespace PageFetcherApi.Settings,
not Services/. Matches the SmtpSettings placement in email-api.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 17:51:51 +03:00
claude df011f2a03 Fix PageFetcherApi BaseUrl default to use Docker service name, not container name
Use http://page-fetcher-api:8080 (the Compose service key) for Docker DNS
resolution, consistent with all other internal service URLs (rag-api,
email-api, cv-matcher-api).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 17:49:00 +03:00
claude 3414c61cea Commit 2026-06-08 17:47:17 +03:00
claude 898dd09d50 feat: add page-fetcher-api — centralised Playwright page fetcher
Introduces page-fetcher-api, a new internal ASP.NET Core service that
centralises all web-page fetching through a single Playwright (headless
Chromium) browser instance. All fetches are persisted to the pageFetcher
SQL schema for auditing.

New projects:
- Apis/page-fetcher-api-models: FetchPageRequest, FetchPageResponse, IPageFetcherApiClient
- Apis/page-fetcher-data: PageFetchDbContext, PageFetchEntity, InitialSchema migration (schema: pageFetcher)
- Apis/page-fetcher-api: PlaywrightBrowserService (singleton), PageFetcherService, PageController

Changes to existing services:
- cv-matcher-api: JobTextExtractor now calls IPageFetcherApiClient instead of HttpClient
- cv-search-job: HtmlJobSearcher uses IPageFetcherApiClient (removes inline Playwright);
  CvSearchJobTask fetches individual job pages and applies keyword pre-filter before
  LLM call; passes pre-fetched JobDescription to cv-matcher-api to skip re-fetch
- common: add PageFetcherApiSettings
- docker-compose.yml, build.yml: add new service + env vars for callers

Closes #43

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 17:43:56 +03:00
claude 4de6f1db45 Merge branch 'main' into staging
Build and Push Docker Images Staging / build (push) Successful in 7m25s
2026-06-08 17:30:49 +03:00
claude 1222a86eb7 Fix file:// URL bug in HtmlJobSearcher — skip non-HTTP(S) URLs
After resolving relative hrefs against the base search URL, some ejobs.ro
links were producing file:/// URIs (e.g. file:///user/locuri-de-munca/...).
These were sent to cv-matcher-api and rejected with HTTP 400, causing 0 matches.

Added a scheme guard after URI resolution to skip any URL that is not
http:// or https://, preventing malformed URLs from reaching the matcher.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 17:30:45 +03:00
claude 2e9069cbdb Fix file:// URL bug in HtmlJobSearcher — skip non-HTTP(S) URLs
Build and Push Docker Images Staging / build (push) Successful in 35s
After resolving relative hrefs against the base search URL, some ejobs.ro
links were producing file:/// URIs (e.g. file:///user/locuri-de-munca/...).
These were sent to cv-matcher-api and rejected with HTTP 400, causing 0 matches.

Added a scheme guard after URI resolution to skip any URL that is not
http:// or https://, preventing malformed URLs from reaching the matcher.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 16:57:52 +03:00
claude c89df975bd Add searched location to job search results email
Build and Push Docker Images Staging / build (push) Successful in 14m42s
Show the candidate's location in the scan summary block of the results email
alongside keywords and providers, for both en and ro templates.

- CvSearchEmailSender.SendResultsAsync accepts location and passes it to BuildScanSummary
- BuildScanSummary passes {{location}} to the template (falls back to '-' when absent)
- CvSearchJobTask passes session.Location to SendResultsAsync
- New migration AddLocationToScanSummaryTemplate updates both language variants of
  email.search-results.scan-summary to include a 'Location / Locație căutată' row

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 15:54:38 +03:00
claude 709c0ac4c3 Merge pull request 'Fix job search: location filtering, keyword quality, anchor filter bypass' (#42) from feature/job-search-location-keywords into main
Merge PR #42: Fix job search — location filtering, keyword quality, anchor filter bypass
2026-06-08 12:51:16 +00:00
claude 99e5cfb76b Fix job search: location filtering, keyword quality, anchor filter bypass
Closes #41

- Add RequireKeywordInAnchor per-provider flag (default true); set false for
  ejobs.ro and bestjobs.eu so Stage 2 anchor-text filter is skipped — their
  search URL already filters by relevance server-side
- Update AI system prompts (en + ro) to extract concise job-board-friendly
  keywords (role title + key tech, not abstract concepts) and candidate location
- Propagate location through JobMatchResponse -> CreateJobSearchTokenRequest ->
  JobSearchTokenEntity -> JobSearchSessionEntity
- Add {location} and {location-slug} substitution in HtmlJobSearcher
- Update provider SearchUrlTemplates to include location:
    ejobs.ro:    /locuri-de-munca/{location-slug}?q={keywords}
    bestjobs.eu: /ro/locuri-de-munca-in-{location-slug}?keywords={keywords}
    linkedin.com: ?keywords={keywords}&location={location}
- Three new migrations: AddRequireKeywordInAnchorAndLocation,
  ImproveKeywordsAndAddLocation, AddLocationToProviders

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 15:45:45 +03:00
claude 91b2baa445 Fix email-api middleware order: API key check before swagger
Build and Push Docker Images Staging / build (push) Successful in 17m14s
UseInternalApiKeyProtection was registered after UseSwaggerInDevelopment,
allowing unauthenticated access to /swagger. Swapped order to match
rag-api and cv-matcher-api.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 22:41:29 +03:00
claude 0f64cb8d99 Fix email-api Dockerfile: add missing shared-data COPY
email-data references shared-data but the email-api Dockerfile never
copied it into the build context, causing MSB9008 during Docker build.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 22:30:25 +03:00
claude b67e926c5f Fix Serilog email sink: configure in code, not JSON config
Serilog.Settings.Configuration cannot deserialize NetworkCredential or
MailKit's SecureSocketOptions from JSON, causing an InvalidOperationException
in the binder and preventing containers from starting.

Fix: remove Email from the WriteTo JSON array entirely and wire it in code
inside ConfigureJsonSerilog using a dedicated SerilogEmail:* config section.
The sink is skipped when From/To/Host are absent, so local dev is unaffected.

Also renames the docker-compose env vars from the verbose
Serilog__WriteTo__2__Args__* prefix to the clean SerilogEmail__* prefix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 22:25:26 +03:00
claude f7d856147e Escalate provider fetch failures to Error for alert emails
HTTP and Playwright fetch failures in HtmlJobSearcher are now logged at
Error so that Serilog's email sink triggers an alert when a job provider
is unreachable. Per-URL match failures remain at Warning (expected noise).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 22:03:46 +03:00
claude 8679bd1efd Fix Serilog email sink config for v4 API breaking changes
Serilog.Sinks.Email v4 renamed all configuration parameters from their
v2 names. The old names were silently ignored, so no error alert emails
were ever sent.

Parameter renames applied across all 6 appsettings.json and docker-compose:
  fromEmail → from
  toEmail   → to
  mailServer → host
  networkCredential → credentials
  enableSsl: true → connectionSecurity: StartTls
  emailSubject → subject
  outputTemplate → body
  batchPostingLimit / period removed (v4 batching uses a separate overload)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 21:57:06 +03:00
claude 1bcf95d8d4 Add download rate limit policy to template and docker-compose
Build and Push Docker Images Staging / build (push) Failing after 1m38s
5 requests / 1 min per IP. docker-compose.yml wired with ${VAR:-default}.
Staging and production .env files updated locally (gitignored).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 20:40:25 +03:00
claude 73f67d1342 Protect FileDownloadController with reCAPTCHA v3 and rate limiting
- Require captchaToken query param on initial (non-range) download requests
- Range requests (HTTP resume) bypass captcha — they are continuations of an already-validated download
- Add download rate limit policy: 5 requests / 1 min per IP (configured in .env)
- Inject ICaptchaVerifier; action name is file_download

UI change required: execute grecaptcha.execute(siteKey, {action: 'file_download'})
before triggering the download and append ?captchaToken=<token> to the URL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 20:37:44 +03:00
claude 650505c08d Merge pull request 'Fix Outlook email layout and move all HTML/prompts out of code' (#38) from main into staging
Build and Push Docker Images Staging / build (push) Successful in 29m52s
Merge main into staging
2026-06-01 17:30:18 +00:00
claude 4066ab5f3f Remove duplicate html.job-search.* rows from email.Templates
These templates belong to the myAi schema (myai-data) and are read by
CvMatcherController via ITemplateService. The email-data copies were
never read by any code — removing them to avoid confusion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 20:22:29 +03:00
claude 7a316b4a45 Move hardcoded HtmlPage shell into html.job-search.shell DB template
The job-search status page HTML wrapper was baked into a static helper
method in CvMatcherController. Extracted to a new template key
html.job-search.shell (*) with {{title}} and {{message}} placeholders.
Added to AddTemplates seed and a new AddHtmlJobSearchShell migration
for existing DBs. Controller now calls _templates.Render() for all paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 20:17:58 +03:00
claude 808a4901d9 Add keywords field to AI CV-match system prompt
The LLM JSON shape was missing the keywords array so res.Keywords was
always empty, causing "none detected" in job search emails. Both en/ro
prompts now include "keywords" in the required JSON shape so the LLM
extracts relevant job-search terms from the CV/job pair.

Note: the cvMatcher.CvMatchResults cache must be cleared on existing DBs
so cached responses (which lack keywords) are not served.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 20:13:00 +03:00
claude b5b654532c Fix HTML shell templates to use table-based layout (Outlook-safe)
Replace div/CSS-class approach with nested table layout so the 600px
container is enforced via HTML attributes, not a <style> block that
Outlook strips. Also removes border-radius and display:inline-block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 20:06:55 +03:00
claude 2838885e22 Fix email templates for Outlook compatibility and move HTML out of code
- Replace div-based layouts with table-based HTML throughout (max-width/border-radius/display:inline-block ignored by Outlook)
- email.match.body: width:100% table with per-cell borders and fixed 130px label column
- email.match.job-search-footer: table-based button with bgcolor attribute
- email.search-results.empty: div replaced with full-width table
- email.search-results.body: remove div wrapper around items
- Add email.search-results.scan-summary and email.search-results.item templates
- CvSearchEmailSender: remove all hardcoded HTML; render via IEmailTemplateService

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 20:01:58 +03:00
claude 8f90a4cfda Reduce email match table width to 500px max-width, centered
- Changed table width from 100% to max-width: 500px with margin: 0 auto
- Applies to both English and Romanian email.match.body templates
- Table now narrower and centered in email

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-06-01 19:18:26 +03:00
claude 978dd3a069 Update email templates to HTML format and fix EmailApiEmailSender
- Convert email.match.body, email.match.job-search-footer, email.search-results.body, and email.search-results.empty templates from plain text to proper HTML format in InitialSchema migration
- Update EmailApiEmailSender.BuildMatchEmailBody() to work with HTML templates instead of plain text
- Add WebUtility.HtmlEncode() for security when inserting dynamic content (summary)
- Templates now use semantic HTML tags (table, h2, h3, ul, li, p, div, hr, a) instead of plain text with newlines
- All 32 email template variants (16 keys × 2 languages) and 8 html.job-search.* templates seeded via migration

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-06-01 19:16:02 +03:00
claude f9530b168f Restore AddHtmlShellTemplates migration with copyright symbol fix
- Restored email.html-shell.start and email.html-shell.end templates to InitialSchema migration
- Fixed copyright symbol: changed © to &copy; HTML entity (avoids encoding issues in database)
- These templates wrap plain text email bodies in proper HTML structure
- Migration runs after InitialSchema, seeding the HTML wrapper templates

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-06-01 18:37:26 +03:00
claude 9cb38e5bc8 Create separate migration for HTML shell templates
Build and Push Docker Images Staging / build (push) Successful in 34s
- Remove html-shell entries from InitialSchema Seed() method
- Create new AddHtmlShellTemplates migration to insert html-shell templates
- Prevents duplicate key errors from having same data in two migrations
- InitialSchema seeds 32 templates (16 keys × 2 languages)
- AddHtmlShellTemplates seeds 2 html-shell templates (start, end)
- Total: 34 templates after both migrations run

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-06-01 17:55:10 +03:00
claude d4c05d7d44 Add HTML email shell templates to email InitialSchema migration
- Add email.html-shell.start: Opening HTML wrapper with blue header and MyAi.ro branding
- Add email.html-shell.end: Closing HTML wrapper with footer
- These templates wrap HTML email bodies before sending via SmtpEmailDispatcher
- Language key set to '*' (language-agnostic)
- Ensures email shell templates are seeded automatically on fresh database initialization

Fixes the "Email template not found: key='email.html-shell.start'" error that prevented email sending.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-06-01 17:36:40 +03:00