Engineering notes

How this analyser
actually works.

One Vercel serverless function with a 60-second budget. Reviews go in, structured JSON comes back from Gemini, pure math does the rest. Below: all ten steps, the non-obvious decisions, and the gaps that are honestly disclosed.

Pipeline at a glance
┌─────────────────────────────────────────────────────────────────┐
  │                        POST /api/analyze                        │
  └────────────────────────────┬────────────────────────────────────┘
                               │
                    ┌──────────▼──────────┐
                    │   Rate limit (IP)    │  25 req/day — Upstash
                    │   rl:reviews prefix  │  sliding window
                    └──────────┬──────────┘
                               │ ok
                    ┌──────────▼──────────┐
                    │  Zod input validate  │  { reviews: string[] }
                    │  1–200 · ≤2000 each  │  hard caps server-side
                    └──────────┬──────────┘
                               │ valid
                    ┌──────────▼──────────┐
                    │  Build prompt        │  reviews in DATA blocks
                    │  (prompt.ts)         │  injection-resistant
                    └──────────┬──────────┘
                               │
               ┌───────────────▼───────────────┐
               │  Gemini 2.5 Flash              │
               │  responseSchema (structured)   │  batched: ≤50/call
               │  temperature=0, max 8k tokens  │
               └───────────────┬───────────────┘
                               │ raw JSON
                    ┌──────────▼──────────┐
                    │  Zod output validate │  llmOutputSchema
                    │  reject on bad shape │  score clamped [-1,1]
                    └──────────┬──────────┘
                               │ typed LlmOutput
                    ┌──────────▼──────────┐
                    │  Aggregate (pure fn) │  distribution, avgScore
                    │  (aggregate.ts)      │  overall verdict, themes
                    └──────────┬──────────┘
                               │ AnalysisResult
                    ┌──────────▼──────────┐
                    │  JSON response       │  typed AnalyzeResponse
                    └─────────────────────┘

  Client export path:
  AnalysisResult → csv.ts → escapeCsvCell() → download
                 → JSON.stringify()         → download
Step by step
  1. 01

    Rate limiting

    Upstash sliding-window rate limiter — 25 requests per IP per day, keyed under prefix rl:reviews. Degrades gracefully to a no-op when Upstash is not configured, so the demo works in development without Redis.

  2. 02

    Input validation (zod)

    The request body is zod-parsed before any AI call. reviews must be a string array with 1–200 items, each ≤ 2000 characters. Oversized arrays or strings are rejected with a 400 before anything reaches the model. Same caps are enforced client-side for immediate feedback, but the server is the authoritative check.

  3. 03

    Prompt injection defence

    Review text is untrusted user data. Each review is wrapped in numbered [REVIEW N] / [/REVIEW N] delimiters inside a clearly labelled DATA SECTION. The system instruction explicitly tells the model not to follow directives embedded inside reviews. This is defence-in-depth — large models are still susceptible to sophisticated injections, but naive "ignore previous instructions" payloads are cleanly rejected.

  4. 04

    Gemini 2.5 Flash — structured output

    A single batched call (or multiple calls for >50 reviews) using responseSchema — an OpenAPI-subset object schema. Structured output means the model produces JSON directly without wrapping prose, with schema-enforced field types and enum constraints on sentiment values. Temperature is 0 for determinism. maxOutputTokens is 8192 — generous but capped.

  5. 05

    Batching strategy

    Reviews are chunked to ≤50 per Gemini call. For a 200-review batch that is 4 calls, all sequential (Vertex AI on the free tier has per-minute token limits). Themes across batches are merged by case-insensitive label matching and re-sorted by count. A future improvement would be a second "theme consolidation" call to merge semantically similar labels (e.g. "delivery speed" vs "shipping time").

  6. 06

    LLM output validation (zod)

    The raw JSON from Gemini is zod-parsed against llmOutputSchema. Scores are clamped to [-1, 1], theme labels are trimmed, example quotes are capped at 120 characters. Any structural mismatch (missing field, wrong type, empty items array) returns a typed PARSE_ERROR to the client — never a raw exception stack.

  7. 07

    Aggregation (pure math)

    aggregate.ts is a pure function — no network calls, no side effects. It computes distribution counts, average score (rounded to 3dp), and overall verdict (positive if >60% positive and avgScore > 0.2; negative if >60% negative and avgScore < -0.2; mixed otherwise). This is 100% deterministic and covered by Vitest.

  8. 08

    CSV export — formula injection escaping

    When exporting to CSV, every string cell is passed through escapeCsvCell(). Cells starting with = + - @ are prefixed with a single quote, per OWASP's CSV injection guidance. This prevents a reviewer from embedding a malicious Excel formula like =HYPERLINK("http://evil.com","Click") that would execute when the analyst opens the file in a spreadsheet. The escaping is tested in Vitest.

  9. 09

    No stack traces to the client

    All catch blocks log the full error server-side with console.error and return a typed error code (AI_ERROR, PARSE_ERROR, INTERNAL_ERROR) with a human-readable message. The raw exception — which could contain model output, prompt internals, or GCP credentials path — never leaves the server.

  10. 10

    maxDuration + token cap

    The route declares export const maxDuration = 60, giving a 60-second wall-clock budget on Vercel Pro. Gemini is called with maxOutputTokens: 8192 — enough for 200 short reviews but prevents runaway generation. For the 4-batch worst case, each call gets a natural per-call timeout from the Vertex SDK.

Next step

Want this for your product?

This demo is a portfolio piece, but the architecture ships in client builds — wired to internal review platforms, app store scrapes, NPS exports. If you have a customer feedback problem and need structured insight at scale, email me.