Engineering notes

How this analyser
actually works.

One Vercel serverless function with a 60-second budget. Reviews go in, structured JSON comes back from Gemini, pure math does the rest. Below: all ten steps, the non-obvious decisions, and the gaps that are honestly disclosed.

Pipeline at a glance

┌─────────────────────────────────────────────────────────────────┐
  │                        POST /api/analyze                        │
  └────────────────────────────┬────────────────────────────────────┘
                               │
                    ┌──────────▼──────────┐
                    │   Rate limit (IP)    │  25 req/day — Upstash
                    │   rl:reviews prefix  │  sliding window
                    └──────────┬──────────┘
                               │ ok
                    ┌──────────▼──────────┐
                    │  Zod input validate  │  { reviews: string[] }
                    │  1–200 · ≤2000 each  │  hard caps server-side
                    └──────────┬──────────┘
                               │ valid
                    ┌──────────▼──────────┐
                    │  Build prompt        │  reviews in DATA blocks
                    │  (prompt.ts)         │  injection-resistant
                    └──────────┬──────────┘
                               │
               ┌───────────────▼───────────────┐
               │  Gemini 2.5 Flash              │
               │  responseSchema (structured)   │  batched: ≤50/call
               │  temperature=0, max 8k tokens  │
               └───────────────┬───────────────┘
                               │ raw JSON
                    ┌──────────▼──────────┐
                    │  Zod output validate │  llmOutputSchema
                    │  reject on bad shape │  score clamped [-1,1]
                    └──────────┬──────────┘
                               │ typed LlmOutput
                    ┌──────────▼──────────┐
                    │  Aggregate (pure fn) │  distribution, avgScore
                    │  (aggregate.ts)      │  overall verdict, themes
                    └──────────┬──────────┘
                               │ AnalysisResult
                    ┌──────────▼──────────┐
                    │  JSON response       │  typed AnalyzeResponse
                    └─────────────────────┘

  Client export path:
  AnalysisResult → csv.ts → escapeCsvCell() → download
                 → JSON.stringify()         → download

Step by step

01
Rate limiting
Upstash sliding-window rate limiter — 25 requests per IP per day, keyed under prefix rl:reviews. Degrades gracefully to a no-op when Upstash is not configured, so the demo works in development without Redis.
02
Input validation (zod)
The request body is zod-parsed before any AI call. reviews must be a string array with 1–200 items, each ≤ 2000 characters. Oversized arrays or strings are rejected with a 400 before anything reaches the model. Same caps are enforced client-side for immediate feedback, but the server is the authoritative check.
03
Prompt injection defence
Review text is untrusted user data. Each review is wrapped in numbered [REVIEW N] / [/REVIEW N] delimiters inside a clearly labelled DATA SECTION. The system instruction explicitly tells the model not to follow directives embedded inside reviews. This is defence-in-depth — large models are still susceptible to sophisticated injections, but naive "ignore previous instructions" payloads are cleanly rejected.
04
Gemini 2.5 Flash — structured output
A single batched call (or multiple calls for >50 reviews) using responseSchema — an OpenAPI-subset object schema. Structured output means the model produces JSON directly without wrapping prose, with schema-enforced field types and enum constraints on sentiment values. Temperature is 0 for determinism. maxOutputTokens is 8192 — generous but capped.
05
Batching strategy
Reviews are chunked to ≤50 per Gemini call. For a 200-review batch that is 4 calls, all sequential (Vertex AI on the free tier has per-minute token limits). Themes across batches are merged by case-insensitive label matching and re-sorted by count. A future improvement would be a second "theme consolidation" call to merge semantically similar labels (e.g. "delivery speed" vs "shipping time").
06
LLM output validation (zod)
The raw JSON from Gemini is zod-parsed against llmOutputSchema. Scores are clamped to [-1, 1], theme labels are trimmed, example quotes are capped at 120 characters. Any structural mismatch (missing field, wrong type, empty items array) returns a typed PARSE_ERROR to the client — never a raw exception stack.
07
Aggregation (pure math)
aggregate.ts is a pure function — no network calls, no side effects. It computes distribution counts, average score (rounded to 3dp), and overall verdict (positive if >60% positive and avgScore > 0.2; negative if >60% negative and avgScore < -0.2; mixed otherwise). This is 100% deterministic and covered by Vitest.
08
CSV export — formula injection escaping
When exporting to CSV, every string cell is passed through escapeCsvCell(). Cells starting with = + - @ are prefixed with a single quote, per OWASP's CSV injection guidance. This prevents a reviewer from embedding a malicious Excel formula like =HYPERLINK("http://evil.com","Click") that would execute when the analyst opens the file in a spreadsheet. The escaping is tested in Vitest.
09
No stack traces to the client
All catch blocks log the full error server-side with console.error and return a typed error code (AI_ERROR, PARSE_ERROR, INTERNAL_ERROR) with a human-readable message. The raw exception — which could contain model output, prompt internals, or GCP credentials path — never leaves the server.
10
maxDuration + token cap
The route declares export const maxDuration = 60, giving a 60-second wall-clock budget on Vercel Pro. Gemini is called with maxOutputTokens: 8192 — enough for 200 short reviews but prevents runaway generation. For the 4-batch worst case, each call gets a natural per-call timeout from the Vertex SDK.

Security stance

Input capped server-side (zod). LLM output zod-validated. Rate limited by IP. Reviews treated as untrusted data in the prompt. No stack traces to client. CSV export escapes formula-injection characters.

What this does not cover: Prompt injection from a sophisticated adversary is not fully mitigated — delimiter-based separation is heuristic. The model could theoretically be manipulated by a sufficiently crafted review. Full mitigation requires a separate classification pass or fine-tuned intent detection. Flagging the gap rather than hiding it.

Theme quality

Gemini identifies themes within each batch independently. Cross-batch merging is by case-insensitive string match — “shipping speed” and “delivery time” would not merge. A second consolidation call or embedding-based clustering would improve quality for large review sets. Out of scope for the demo.

Cost & limits

Gemini 2.5 Flash, temperature 0, max 8k tokens per call. 200 reviews in 4 batches ≈ 4 calls ≈ ~$0.003 total. Rate limit: 25/day/IP. No caching of results (results are deterministic enough that caching by input hash would save tokens for repeat runs — future work).

Tests

Vitest covers csv.ts (formula-injection escaping, CSV parsing, export format) and aggregate.ts (distribution math, score averaging, overall verdict logic, theme sorting). Run with pnpm test.

Next step

Want this for your product?

This demo is a portfolio piece, but the architecture ships in client builds — wired to internal review platforms, app store scrapes, NPS exports. If you have a customer feedback problem and need structured insight at scale, email me.

Email me →← Back to demo

Rate limiting

Input validation (zod)

Prompt injection defence

Gemini 2.5 Flash — structured output

Batching strategy

LLM output validation (zod)

Aggregation (pure math)

CSV export — formula injection escaping

No stack traces to the client

maxDuration + token cap

Want this for your product?