
Scaling Next.js to 100k RPS: Edge, ISR & Caching Patterns

A practical, architecture-first guide to getting sub-50ms global TTFB and sustaining 100k RPS with Next.js (Edge, ISR, CDN patterns, caching headers, and chaos tests).

Kuldeep
Principal Systems Architect – Ex-Vercel
December 8, 2025
10 min read

Goal: predictable sub-50ms global TTFB and sustainable 100k RPS for marketing pages and high-read APIs using Next.js.

This post covers:

  • Edge-first architecture (Edge runtime + CDN)
  • Content caching strategies (Cache-Control, ISR, On-Demand ISR)
  • Data caching & stale-while-revalidate patterns
  • Instrumentation (RUM + Synthetic)
  • Load and chaos testing

1 — Architecture overview (Edge-first)

Principle: serve as much as possible from the network (Edge/CDN) and generate only what's necessary near the user.

A recommended stack:

  • Next.js 15 (App Router, edge & node runtimes)
  • Vercel Edge (or Cloud CDN + edge compute)
  • Redis (managed, at the edge or application layer)
  • Postgres (primary data; use read replicas)
  • RUM / tracing (OpenTelemetry + RUM)

Diagram (simple):

  • CDN/Edge -> Edge Function (Next.js edge runtime) -> Cache or Origin
  • Fast path: CDN hit -> HTML/asset served directly
  • Slow path: Edge function computes SSR or revalidates, stores to CDN.
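The fast/slow split above can be sketched as a single handler. In this minimal sketch, a `Map` stands in for the CDN cache and `renderPage` is a stand-in for edge SSR (both are hypothetical, not Next.js APIs):

```typescript
// Fast/slow path sketch. A Map stands in for the CDN cache;
// `renderPage` is a placeholder for edge SSR.
const cdnCache = new Map<string, string>();

async function renderPage(url: string): Promise<string> {
  return `<html><!-- rendered ${url} --></html>`;
}

async function handle(url: string): Promise<string> {
  const hit = cdnCache.get(url);      // fast path: CDN hit, serve directly
  if (hit) return hit;
  const html = await renderPage(url); // slow path: compute at the edge
  cdnCache.set(url, html);            // store so the next request is a hit
  return html;
}
```

Everything that follows in this post is about maximizing how often requests take the fast path.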

2 — Static + ISR patterns

Use static generation for marketing pages and ISR where content changes periodically.

Example page with ISR in App Router (Next.js 15):

// app/(marketing)/page.tsx
export const revalidate = 60; // seconds

export default async function Page() {
  const res = await fetch("https://api.example.com/marketing", { next: { revalidate: 60 } });
  const data = await res.json();
  return <MarketingLanding data={data} />;
}

On-demand revalidation (serverless API):

// pages/api/revalidate.ts
import type { NextApiRequest, NextApiResponse } from 'next'

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  if (req.method !== 'POST') return res.status(405).end()
  const { secret, path } = req.body
  if (secret !== process.env.REVALIDATE_SECRET) return res.status(401).end()
  try {
    await res.revalidate(path)
    return res.json({ revalidated: true })
  } catch {
    return res.status(500).json({ revalidated: false })
  }
}

Use on-demand revalidation for editorial workflows and large catalog updates.
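Since the webhook endpoint is reachable from the internet, the shared secret deserves care. One hardening option (an alternative to the plain comparison above, not something Next.js requires) is to verify an HMAC signature of the request body in constant time; `verifySignature` is an illustrative helper:

```typescript
import { timingSafeEqual, createHmac } from "node:crypto";

// Webhook verification sketch for a revalidation endpoint:
// the CMS signs the raw body with the shared secret, and we
// recompute and compare the signature in constant time.
function verifySignature(body: string, signature: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(body).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signature);
  // Length check first: timingSafeEqual throws on unequal lengths.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

This avoids both timing side channels and putting the secret itself on the wire.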


3 — Cache-Control & CDN configuration

Set cache-control carefully for the three layers:

  • Browser (short)
  • CDN/Edge (longer)
  • Origin (very short or bypass)

Example headers for marketing HTML delivered from Edge:

Cache-Control: public, max-age=0, s-maxage=60, stale-while-revalidate=86400

Explanation:

  • max-age=0 — browsers don't cache; they revalidate against the CDN on every navigation.
  • s-maxage=60 — the CDN keeps a copy for 60s.
  • stale-while-revalidate=86400 — serve stale for up to a day while revalidation happens in the background (great UX).

For assets (images, fonts):

Cache-Control: public, max-age=31536000, immutable
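When several routes need different policies, composing the header from named directives keeps the three layers legible. A minimal sketch; `cacheControl` is an illustrative helper, not a Next.js or CDN API:

```typescript
// Build a Cache-Control header string from named directives.
interface CacheOptions {
  maxAge?: number;               // browser TTL in seconds
  sMaxage?: number;              // CDN/shared-cache TTL in seconds
  staleWhileRevalidate?: number; // serve-stale window in seconds
  immutable?: boolean;           // content never changes at this URL
}

function cacheControl(opts: CacheOptions): string {
  const parts = ["public", `max-age=${opts.maxAge ?? 0}`];
  if (opts.sMaxage !== undefined) parts.push(`s-maxage=${opts.sMaxage}`);
  if (opts.staleWhileRevalidate !== undefined)
    parts.push(`stale-while-revalidate=${opts.staleWhileRevalidate}`);
  if (opts.immutable) parts.push("immutable");
  return parts.join(", ");
}

// Marketing HTML: short CDN TTL, long stale window.
const htmlHeader = cacheControl({ maxAge: 0, sMaxage: 60, staleWhileRevalidate: 86400 });
// Fingerprinted assets: cache for a year, never revalidate.
const assetHeader = cacheControl({ maxAge: 31536000, immutable: true });
```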

4 — Edge middleware & routing

Use middleware for auth gating and small A/B experiments at the Edge — keep logic tiny and deterministic.

// middleware.ts
import { NextResponse } from 'next/server'

export function middleware(req) {
  const country = req.geo?.country || 'US'
  if (country === 'FR') {
    // route to country-specific page
    return NextResponse.rewrite(new URL('/fr', req.url))
  }
  return NextResponse.next()
}

Keep heavy CPU/IO work out of middleware.
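For A/B experiments at the edge, "tiny and deterministic" means the bucketing itself should be a pure function of a stable identifier, so the same visitor always sees the same variant with no storage lookup. A sketch using an FNV-1a hash; `variantFor` is illustrative, not a library API:

```typescript
// Deterministic A/B bucketing sketch for edge middleware.
// FNV-1a over a stable visitor id: cheap, dependency-free,
// and the same input always maps to the same variant.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function variantFor(visitorId: string, variants: string[]): string {
  return variants[fnv1a(visitorId) % variants.length];
}
```

In middleware you would feed this a cookie value or similar stable id, then rewrite to the chosen variant's path.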


5 — Data caching & stale-while-revalidate

For API endpoints, combine Redis / edge-cache with stale-while-revalidate. Pattern:

  1. Try edge/redis cache
  2. If miss, fetch origin with a short timeout
  3. Return cached stale copy if fetch fails, and schedule background refresh

Pseudo:

const cached = await redis.get(key)
if (cached) return cached
const fresh = await fetchWithTimeout(origin)
await redis.set(key, fresh, { EX: 60 }) // short TTL
return fresh
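Making the stale-fallback step (step 3) concrete requires keeping entries past their freshness window rather than letting the TTL evict them. A runnable sketch, with an in-memory Map standing in for Redis and `fetchOrigin` as a placeholder for the real origin fetch:

```typescript
// Stale-while-revalidate cache sketch. A Map stands in for Redis;
// entries carry a timestamp so we can distinguish fresh from stale.
interface Entry { value: string; fetchedAt: number }

const FRESH_MS = 60_000; // entries younger than 60s are fresh
const store = new Map<string, Entry>();

async function swrGet(
  key: string,
  fetchOrigin: () => Promise<string>
): Promise<string> {
  const entry = store.get(key);
  if (entry && Date.now() - entry.fetchedAt < FRESH_MS) {
    return entry.value; // fast path: fresh hit, no origin call
  }
  try {
    const value = await fetchOrigin(); // miss or stale: go to origin
    store.set(key, { value, fetchedAt: Date.now() });
    return value;
  } catch {
    if (entry) return entry.value; // origin down: serve the stale copy
    throw new Error(`no cached value for ${key} and origin failed`);
  }
}
```

With real Redis you'd store the timestamp alongside the value (or use two keys) so the stale copy survives beyond the freshness TTL.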

6 — Observability: RUM + Synthetic + Alerts

Track:

  • LCP / INP / CLS via RUM (field data) — note INP replaced FID as a Core Web Vital in 2024
  • Synthetic Lighthouse (CI)
  • TTFB and backend traces (OpenTelemetry)

Create CI checks:

  • Lighthouse CI gating on PRs
  • RUM SLOs fed to your alerts
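The gating logic itself can be trivial: pull a percentile from field samples and compare against the budget. A sketch using the nearest-rank percentile method; `ttfbGate` and the 50ms default are illustrative, not from any CI tool:

```typescript
// CI gate sketch: fail the check when the p75 of field TTFB
// samples exceeds the budget. Nearest-rank percentile method.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

function ttfbGate(samplesMs: number[], budgetMs = 50): boolean {
  return percentile(samplesMs, 75) <= budgetMs;
}
```

Wire this into the PR check alongside Lighthouse CI so regressions are caught before deploy, not by alerts after.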

7 — Load & chaos testing

Load test at >= 5× expected peak. Tools:

  • k6 for steady-state load
  • Vegeta for spike testing
  • Chaos: terminate origin instances while CDN serves stale pages to validate resilience

Example k6 scenario shape (sustaining 100k RPS requires distributed cloud runners):

  • Warm caches
  • Ramp to target over 10m
  • Run sustained window
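The three phases above translate into a stage plan you can feed to your load tool's ramping executor. A sketch; `rampPlan` and the specific warm-up fraction are illustrative choices, not k6 APIs:

```typescript
// Ramp plan sketch for a cache-first load test:
// warm caches at low rate, ramp to target, then hold.
interface Stage { durationSec: number; targetRps: number }

function rampPlan(targetRps: number): Stage[] {
  return [
    { durationSec: 120, targetRps: Math.floor(targetRps * 0.1) }, // warm caches
    { durationSec: 600, targetRps },                              // ramp over 10m
    { durationSec: 1800, targetRps },                             // sustained window
  ];
}
```

The warm-up stage matters most: ramping straight to target against cold caches tests your origin, not your edge.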

8 — Operational checklist

  • Identify fast-path pages and mark as SSG/ISR
  • Configure CDN cache policies (s-maxage + stale-while-revalidate)
  • Implement on-demand revalidation webhooks for CMS
  • Add RUM and synthetic Lighthouse checks
  • Run 5× load tests and validate error budgets
  • Create runbooks for cache invalidation & incident recovery

9 — Key takeaways

  • Prioritize CDN/Edge for static content.
  • Use ISR + on-demand revalidation for frequently updated content.
  • Instrument front- and back-ends with RUM + tracing.
  • Practice with chaos/load tests — cache-first systems must be validated under failure.

These patterns are a starting point; hardening them into a production-ready RFC (caching headers, infra config, and a deploy checklist) will depend on the specifics of your stack.

Enjoyed this post?

Join 10,000+ developers receiving our latest engineering deep dives and tutorials directly in their inbox.