Tai Huynh
Engineering notes← back to demo

How Resume Tailor actually works.

One Vercel serverless function. No separate backend. Six steps from raw text to a structured, validated ATS analysis. Below: the full pipeline, the design decisions that aren’t obvious, and the gaps that are honestly disclosed.

Pipeline at a glance
Browser / API client
        │
        │  POST /api/tailor
        │  { jobDescription, resume }
        ▼
  ┌─────────────────────────────────┐
  │  1. Rate limit (Upstash Redis)  │
  │     15 req / IP / day           │
  └───────────────┬─────────────────┘
                  │
  ┌───────────────▼─────────────────┐
  │  2. Zod input validation        │
  │     JD: 50–15,000 chars         │
  │     Resume: 50–30,000 chars     │
  └───────────────┬─────────────────┘
                  │
  ┌───────────────▼─────────────────┐
  │  3. Keyword cross-check         │
  │     Deterministic, zero-LLM     │
  │     → roughScore seed for prompt│
  └───────────────┬─────────────────┘
                  │
  ┌───────────────▼─────────────────┐
  │  4. Build prompt                │
  │     JD + resume wrapped in      │
  │     <JOB_DESCRIPTION>/<RESUME>  │
  │     tags as DATA, not commands  │
  └───────────────┬─────────────────┘
                  │
  ┌───────────────▼─────────────────┐
  │  5. Gemini 2.5 Flash call       │
  │     responseMimeType: JSON      │
  │     responseSchema (OpenAPI)    │
  │     maxOutputTokens: 4096       │
  └───────────────┬─────────────────┘
                  │
  ┌───────────────▼─────────────────┐
  │  6. Zod output validation       │
  │     Reject malformed model resp │
  └───────────────┬─────────────────┘
                  │
        ┌─────────▼──────────┐
        │  TailorResponse    │
        │  matchScore 0–100  │
        │  matchedKeywords   │
        │  missingKeywords   │
        │  tailoredBullets   │
        │  summary           │
        └────────────────────┘
Step by step
  1. 01

    Rate limit

    Upstash sliding window — 15 requests per IP per 24 hours. Graceful no-op when Upstash is not configured, so the demo runs without Redis. The counter key is prefixed rl:resume to avoid collision with sibling projects in the same database.

  2. 02

    Zod input validation

    Every field validated at the API boundary before any expensive work begins. jobDescription: 50–15,000 chars; resume: 50–30,000 chars. Integer size caps prevent prompt-stuffing and control token costs. Vitest spec covers the boundary conditions.

  3. 03

    Deterministic keyword pre-check

    A zero-dependency tokeniser runs before the Gemini call. It strips stop words, recognises tech bigrams (e.g. "machine learning", "ci/cd", "next.js"), and cross-checks the job description vocabulary against the resume. The result is a rough score integer that seeds the prompt — this gives the model a concrete calibration point to reason from rather than producing unconstrained numbers.

  4. 04

    Prompt construction with data isolation

    The job description and resume are wrapped in XML-style delimiters (<JOB_DESCRIPTION>, <RESUME>). The system instruction explicitly tells the model to treat the enclosed text as untrusted user data and to ignore any directives embedded in it. This is the primary prompt-injection defence. The instruction portion of the prompt contains no user-controlled text.

  5. 05

    Gemini 2.5 Flash structured output

    responseMimeType: "application/json" forces the model into JSON mode. responseSchema (a hand-built OpenAPI v3 subset) constrains the exact shape of the output. Gemini requires nullable: true instead of a null type union; no $ref or anyOf. maxOutputTokens is capped at 4096 to bound cost and response time. temperature: 0.2 for reproducible scoring.

  6. 06

    Zod output validation — never trust the model

    Even with a responseSchema, Gemini can return unexpected types, missing fields, or out-of-range numbers. Every LLM response is parsed through tailorOutputSchema before the data is returned to the client. A malformed model response returns a typed AI_OUTPUT_INVALID error — no partial data leaks, no crashes.

  7. 07

    Typed error codes, no stack traces

    The API surface exposes a small union of error codes: INVALID_INPUT, RATE_LIMIT, AI_ERROR, AI_OUTPUT_INVALID, INTERNAL. Error messages are user-friendly strings. Server-side detail (model errors, validation traces) is console.error'd but never forwarded to the client.

Security stance

What is defended — and what is not.

Prompt injection

JD and resume wrapped in XML data tags; model explicitly told content is untrusted user input. No user text in the instruction block.

Input size cap

Zod enforces hard char limits before any processing. Prevents prompt-stuffing and token-cost attacks.

LLM output validation

All model output validated with Zod. Malformed/out-of-range responses rejected before reaching the client.

Rate limiting

Upstash sliding window, 15/day per IP. No-ops gracefully without Redis so the demo is still usable.

No stack traces to client

Typed error codes only. Internal errors logged server-side.

maxDuration + maxOutputTokens

Route budget capped at 60s; model output capped at 4,096 tokens.

Known gaps

Naming the gap is part of taking security and quality seriously.

  • [01]

    The keyword extractor is lexical (no NLP embeddings) — it can miss semantically equivalent phrases ("ML engineer" vs "machine learning specialist").

  • [02]

    Score calibration is model-dependent; a job description written with unusual vocabulary may produce a lower rough score that biases the model's final integer.

  • [03]

    No caching of identical {JD, resume} pairs — duplicate submissions burn tokens. A Redis hash of the inputs could eliminate redundant calls cheaply.

  • [04]

    Rate limit is per-IP, not per-account — VPN rotation can bypass it. The demo is not the production build.

  • [05]

    The model can still produce plausible but factually incorrect rewrites. Human review before submission is essential.

Next step

Want this for your product?

This demo is a portfolio piece, but the architecture is the same one I deploy for clients — production rate limits, custom scoring tuned to your ATS, and deeper integrations with HR platforms on request. If you’re building a hiring tool or a resume coaching product, email me with what you’re working on. I reply within 24 hours.