BACKEND SYSTEM

Distributed URL Shortener

Production-Grade URL Shortening Platform

[Architecture diagram: User, API, Redis, PostgreSQL, Analytics. Labels: cache hit (99.98%, 302 in 4.13 ms), cache miss, async analytics. Hot path: database untouched on a cache hit.]
Redirect p99: 4.13 ms
Cache hit rate: 99.98%
Throughput: 100+ RPS
ID block size: 1,000

The Problem

Shortening a URL is trivial; serving redirects at speed while surviving abuse and capturing analytics is not. The system had to keep redirect latency near-zero, protect itself from hot-key floods and scrapers, and record every click without slowing the redirect path.

Architecture & System Design

  1. Redirect hot path: request → Redis Lua token-bucket rate limiter → Redis lookup → 302 redirect. The database is never touched on a warm-cache hit.

  2. Base62 short codes are generated from PostgreSQL sequence blocks of 1,000 IDs per instance, dispensed in memory via an AtomicLong, which eliminates per-request database calls for ID generation.

  3. Cache-aside Redis caching with a 1-hour TTL: cache misses fall back to PostgreSQL, then warm Redis. The cache self-heals after restarts with no operator intervention.

  4. Click analytics are fire-and-forget: events are enqueued into a bounded LinkedBlockingQueue (capacity 10,000), and a scheduler batch-inserts them into PostgreSQL in 500-row chunks, so analytics load stays off the redirect path.

  5. A Redis Lua script enforces a two-tier quota atomically: 100 req/min for anonymous IPs and 1,000 req/min for authenticated users. Abuse turns into cheap 429 responses rather than database load.

  6. Micrometer and Prometheus metrics are exposed through an authenticated /actuator/prometheus endpoint; GitHub Actions CI with Docker-based health checks validates each commit.
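
The Base62 step in item 2 can be sketched as follows. This is a minimal illustration, not the project's actual encoder: the class name Base62 and the alphabet ordering (digits, then uppercase, then lowercase) are assumptions.

```java
// Hypothetical helper: maps a non-negative numeric ID to a Base62 short code and back.
final class Base62 {
    // Assumed alphabet ordering; any fixed permutation of these 62 characters works.
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    private static final int BASE = ALPHABET.length(); // 62

    private Base62() {}

    static String encode(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        if (id == 0) {
            return String.valueOf(ALPHABET.charAt(0));
        }
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            // Least significant digit first; reversed at the end.
            sb.append(ALPHABET.charAt((int) (id % BASE)));
            id /= BASE;
        }
        return sb.reverse().toString();
    }

    static long decode(String code) {
        long id = 0;
        for (int i = 0; i < code.length(); i++) {
            int digit = ALPHABET.indexOf(code.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + code.charAt(i));
            }
            id = id * BASE + digit;
        }
        return id;
    }
}
```

Because the code is just the ID written in base 62, six characters already cover 62^6 (about 56.8 billion) IDs, which is why sequential IDs yield short codes.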

Engineering Decisions

Redis in the read path, not as decoration

Redirects are a 99%-read workload, so the design optimizes for the cache hit: sub-millisecond Redis lookup with PostgreSQL as durable fallback. Cache-aside (vs write-through) keeps writes simple and tolerates cold starts without complex invalidation.
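
The cache-aside lookup can be sketched roughly as below. To stay self-contained, this sketch replaces Redis with an in-memory map and PostgreSQL with a plain function; the names CacheAsideResolver, Cache, and InMemoryCache are hypothetical and not taken from the project.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a cache-aside lookup: check the cache, fall back to the
// durable store on a miss, then warm the cache for the next request.
final class CacheAsideResolver {
    // Stand-in for the Redis operations the lookup needs (assumption).
    interface Cache {
        Optional<String> get(String key);
        void put(String key, String value, Duration ttl);
    }

    static final Duration TTL = Duration.ofHours(1); // 1-hour TTL from the design

    private final Cache cache;
    private final Function<String, Optional<String>> database; // stand-in for PostgreSQL

    CacheAsideResolver(Cache cache, Function<String, Optional<String>> database) {
        this.cache = cache;
        this.database = database;
    }

    Optional<String> resolve(String shortCode) {
        Optional<String> cached = cache.get(shortCode);
        if (cached.isPresent()) {
            return cached; // hot path: the database is not touched
        }
        Optional<String> stored = database.apply(shortCode);
        stored.ifPresent(url -> cache.put(shortCode, url, TTL)); // warm the cache
        return stored;
    }

    // In-memory cache with expiry, used here only so the sketch runs without Redis.
    static final class InMemoryCache implements Cache {
        private record Entry(String value, long expiresAtNanos) {}

        private final Map<String, Entry> entries = new ConcurrentHashMap<>();

        @Override
        public Optional<String> get(String key) {
            Entry entry = entries.get(key);
            if (entry == null) {
                return Optional.empty();
            }
            if (System.nanoTime() - entry.expiresAtNanos() >= 0) {
                entries.remove(key, entry); // expired: behave like a miss
                return Optional.empty();
            }
            return Optional.of(entry.value());
        }

        @Override
        public void put(String key, String value, Duration ttl) {
            entries.put(key, new Entry(value, System.nanoTime() + ttl.toNanos()));
        }
    }
}
```

Starting with an empty cache is the cold-start case: the first request for each code pays one database read, and later requests are served from the cache until the TTL expires. Unknown codes are not cached in this sketch, so each of them reaches the database.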

Analytics must never block a redirect

Click recording is fully decoupled from the response path. Increment in Redis, flush in batches. A slow analytics write can cost a data point — it can never cost a user-facing millisecond.
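
One way to get this decoupling, following the bounded-queue design in the architecture list, is sketched below. The class ClickBuffer, the ClickEvent record, and the batchWriter callback (standing in for the PostgreSQL batch insert) are hypothetical names; in the described system the flush would be driven by a scheduler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch: fire-and-forget click recording with batched flushing.
final class ClickBuffer {
    record ClickEvent(String shortCode, long epochMillis) {}

    private final BlockingQueue<ClickEvent> queue;
    private final int batchSize;
    private final Consumer<List<ClickEvent>> batchWriter; // stand-in for a batch INSERT

    ClickBuffer(int capacity, int batchSize, Consumer<List<ClickEvent>> batchWriter) {
        this.queue = new LinkedBlockingQueue<>(capacity); // e.g. 10,000 in the design
        this.batchSize = batchSize;                       // e.g. 500 in the design
        this.batchWriter = batchWriter;
    }

    // Called on the redirect path. offer() never blocks: when the queue is full
    // the event is dropped and false is returned.
    boolean record(String shortCode) {
        return queue.offer(new ClickEvent(shortCode, System.currentTimeMillis()));
    }

    // Called off the redirect path (for example by a scheduled task).
    // Drains the queue in chunks of at most batchSize and returns the event count.
    int flush() {
        int total = 0;
        List<ClickEvent> batch = new ArrayList<>(batchSize);
        while (queue.drainTo(batch, batchSize) > 0) {
            batchWriter.accept(List.copyOf(batch));
            total += batch.size();
            batch.clear();
        }
        return total;
    }
}
```

The non-blocking offer() is what makes the trade explicit: under overload the redirect still returns immediately and the click is lost, rather than the request waiting for queue space.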

Batch ID allocation for high throughput and short codes

Allocating 1,000-ID blocks per instance trades small gaps in the ID space (the unused remainder of a block is lost when an instance crashes) for high throughput: no database round-trip per generated short code, short Base62 codes, and no distributed locking.
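
A minimal sketch of block allocation is shown below, assuming the database hands out the first ID of each new block. The class name IdBlockAllocator and the nextBlockStart supplier are hypothetical; in a real deployment the supplier might wrap a nextval call on a PostgreSQL sequence that increments by the block size.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch: dispense IDs from an in-memory block and fetch a new block
// from the database only when the current one is exhausted.
final class IdBlockAllocator {
    // Immutable pairing of a cursor with its block end, so an ID is always checked
    // against the end of the block it was drawn from.
    private record Block(AtomicLong cursor, long endExclusive) {}

    private final long blockSize;
    private final LongSupplier nextBlockStart; // stand-in for the sequence call (assumption)
    private volatile Block current = new Block(new AtomicLong(0), 0); // empty: forces first fetch

    IdBlockAllocator(long blockSize, LongSupplier nextBlockStart) {
        this.blockSize = blockSize;
        this.nextBlockStart = nextBlockStart;
    }

    long nextId() {
        while (true) {
            Block block = current;
            long id = block.cursor().getAndIncrement();
            if (id < block.endExclusive()) {
                return id; // common case: no lock and no database call
            }
            refill(block);
        }
    }

    // Only the first thread to notice exhaustion fetches a new block; the others
    // see that 'current' already changed and simply retry.
    private synchronized void refill(Block exhausted) {
        if (current == exhausted) {
            long start = nextBlockStart.getAsLong();
            current = new Block(new AtomicLong(start), start + blockSize);
        }
    }
}
```

If the process dies partway through a block, the remaining IDs in that block are never issued. That is the crash gap described above, bounded by the block size.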

Tech Stack

Java 17, Spring Boot 3, PostgreSQL 16, Redis 7, Docker Compose, GitHub Actions, Flyway, JJWT, BCrypt, Micrometer, Prometheus, JUnit 5, Mockito, Testcontainers

Lessons Learned

  • Design for the dominant access pattern — a 99%-read system should look nothing like a CRUD app; put Redis in the read path, not as a nice-to-have.
  • Every external call on a hot path is a liability. Move everything non-essential off it — analytics, metrics, logging all run async.
  • Lua scripts in Redis are the right tool for atomic rate limiting: each script runs atomically on the server, so there are no race conditions between checking and updating a quota and no distributed-lock overhead.
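
The token-bucket arithmetic behind the rate limiter can be illustrated in plain Java. This is a single-process sketch only: in the described system the same check-and-update runs inside a Redis Lua script, and the synchronized keyword here plays the role of that script's atomicity. The class name TokenBucket, the factory methods, and the injected clock are hypothetical.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

// Hypothetical single-process token bucket: refill based on elapsed time,
// then spend one token per request if available.
final class TokenBucket {
    private final double capacity;
    private final double refillPerNano;
    private final LongSupplier nanoClock; // injectable clock, e.g. System::nanoTime
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, Duration window, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = (double) capacity / window.toNanos();
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // The two tiers from the design: 100 req/min and 1,000 req/min.
    static TokenBucket anonymous(LongSupplier nanoClock) {
        return new TokenBucket(100, Duration.ofMinutes(1), nanoClock);
    }

    static TokenBucket authenticated(LongSupplier nanoClock) {
        return new TokenBucket(1_000, Duration.ofMinutes(1), nanoClock);
    }

    // Returns true if the request is allowed; false corresponds to a 429 response.
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

Refill and spend must happen as one step. If two requests could read the same token count before either wrote it back, both would pass on a single remaining token, which is the race the Lua script avoids.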
