How Resume Tailor actually works.
One Vercel serverless function. No separate backend. Six steps from raw text to a structured, validated ATS analysis. Below: the full pipeline, the design decisions that aren’t obvious, and the gaps that are honestly disclosed.
Browser / API client
│
│ POST /api/tailor
│ { jobDescription, resume }
▼
┌─────────────────────────────────┐
│ 1. Rate limit (Upstash Redis) │
│ 15 req / IP / day │
└───────────────┬─────────────────┘
│
┌───────────────▼─────────────────┐
│ 2. Zod input validation │
│ JD: 50–15,000 chars │
│ Resume: 50–30,000 chars │
└───────────────┬─────────────────┘
│
┌───────────────▼─────────────────┐
│ 3. Keyword cross-check │
│ Deterministic, zero-LLM │
│ → roughScore seed for prompt│
└───────────────┬─────────────────┘
│
┌───────────────▼─────────────────┐
│ 4. Build prompt │
│ JD + resume wrapped in │
│ <JOB_DESCRIPTION>/<RESUME> │
│ tags as DATA, not commands │
└───────────────┬─────────────────┘
│
┌───────────────▼─────────────────┐
│ 5. Gemini 2.5 Flash call │
│ responseMimeType: JSON │
│ responseSchema (OpenAPI) │
│ maxOutputTokens: 4096 │
└───────────────┬─────────────────┘
│
┌───────────────▼─────────────────┐
│ 6. Zod output validation │
│ Reject malformed model resp │
└───────────────┬─────────────────┘
│
┌─────────▼──────────┐
│ TailorResponse │
│ matchScore 0–100 │
│ matchedKeywords │
│ missingKeywords │
│ tailoredBullets │
│ summary │
└────────────────────┘- 01
Rate limit
Upstash sliding window — 15 requests per IP per 24 hours. Graceful no-op when Upstash is not configured, so the demo runs without Redis. The counter key is prefixed rl:resume to avoid collision with sibling projects in the same database.
- 02
Zod input validation
Every field validated at the API boundary before any expensive work begins. jobDescription: 50–15,000 chars; resume: 50–30,000 chars. Integer size caps prevent prompt-stuffing and control token costs. Vitest spec covers the boundary conditions.
- 03
Deterministic keyword pre-check
A zero-dependency tokeniser runs before the Gemini call. It strips stop words, recognises tech bigrams (e.g. "machine learning", "ci/cd", "next.js"), and cross-checks the job description vocabulary against the resume. The result is a rough score integer that seeds the prompt — this gives the model a concrete calibration point to reason from rather than producing unconstrained numbers.
- 04
Prompt construction with data isolation
The job description and resume are wrapped in XML-style delimiters (<JOB_DESCRIPTION>, <RESUME>). The system instruction explicitly tells the model to treat the enclosed text as untrusted user data and to ignore any directives embedded in it. This is the primary prompt-injection defence. The instruction portion of the prompt contains no user-controlled text.
- 05
Gemini 2.5 Flash structured output
responseMimeType: "application/json" forces the model into JSON mode. responseSchema (a hand-built OpenAPI v3 subset) constrains the exact shape of the output. Gemini requires nullable: true instead of a null type union; no $ref or anyOf. maxOutputTokens is capped at 4096 to bound cost and response time. temperature: 0.2 for reproducible scoring.
- 06
Zod output validation — never trust the model
Even with a responseSchema, Gemini can return unexpected types, missing fields, or out-of-range numbers. Every LLM response is parsed through tailorOutputSchema before the data is returned to the client. A malformed model response returns a typed AI_OUTPUT_INVALID error — no partial data leaks, no crashes.
- 07
Typed error codes, no stack traces
The API surface exposes a small union of error codes: INVALID_INPUT, RATE_LIMIT, AI_ERROR, AI_OUTPUT_INVALID, INTERNAL. Error messages are user-friendly strings. Server-side detail (model errors, validation traces) is console.error'd but never forwarded to the client.
What is defended — and what is not.
JD and resume wrapped in XML data tags; model explicitly told content is untrusted user input. No user text in the instruction block.
Zod enforces hard char limits before any processing. Prevents prompt-stuffing and token-cost attacks.
All model output validated with Zod. Malformed/out-of-range responses rejected before reaching the client.
Upstash sliding window, 15/day per IP. No-ops gracefully without Redis so the demo is still usable.
Typed error codes only. Internal errors logged server-side.
Route budget capped at 60s; model output capped at 4,096 tokens.
Naming the gap is part of taking security and quality seriously.
- [01]
The keyword extractor is lexical (no NLP embeddings) — it can miss semantically equivalent phrases ("ML engineer" vs "machine learning specialist").
- [02]
Score calibration is model-dependent; a job description written with unusual vocabulary may produce a lower rough score that biases the model's final integer.
- [03]
No caching of identical {JD, resume} pairs — duplicate submissions burn tokens. A Redis hash of the inputs could eliminate redundant calls cheaply.
- [04]
Rate limit is per-IP, not per-account — VPN rotation can bypass it. The demo is not the production build.
- [05]
The model can still produce plausible but factually incorrect rewrites. Human review before submission is essential.
Want this for your product?
This demo is a portfolio piece, but the architecture is the same one I deploy for clients — production rate limits, custom scoring tuned to your ATS, and deeper integrations with HR platforms on request. If you’re building a hiring tool or a resume coaching product, email me with what you’re working on. I reply within 24 hours.