May 10, 2026·17 min read

Claude Code with Redis: Caching, Sessions, Rate Limiting



Redis is the closest thing the backend world has to a Swiss army knife. It is a cache, a session store, a rate limiter, a message broker, a job queue, a leaderboard, a distributed lock, and a stream processor. That breadth is the reason it shows up in almost every production stack. It is also the reason AI coding tools so often produce code that compiles, runs in development, and quietly breaks in production three weeks later.

The failure modes are specific. Claude defaults to caching every read it sees because that is what most tutorials demonstrate, never thinks about invalidation, and writes keys without TTLs. It mixes ioredis and node-redis APIs in the same file because both libraries appear in its training data with similar weight. It uses KEYS * to find things, which blocks the server. It writes a rate limiter with a race condition. It calls the cache "the cache" rather than naming the namespace, so two unrelated features stomp on each other's keys.

None of these are model limitations. They are context gaps. With a project-specific CLAUDE.md that names your client library, key conventions, and TTL policy, Claude generates correct Redis code that fits your existing patterns. Without one, it generates plausible Redis code that produces incidents. If you are new to Claude Code, the Claude Code setup guide covers installation and authentication before any of this applies.

The Redis CLAUDE.md template

The CLAUDE.md at your project root is read at the start of every session. For a Redis project, it has to answer: which client library, which Redis flavour (self-hosted, Upstash, ElastiCache, Redis Cloud), what the key namespace looks like, what the default TTL is, and which operations are gated. The CLAUDE.md explained guide covers the broader anatomy. Below is a template that has held up across production codebases serving caching, sessions, and rate limits.

# Redis project rules

## Stack
- Redis: 7.4 self-hosted (production), 7.4 via Docker (local)
- Edge: Upstash Redis for serverless functions (Vercel, Cloudflare Workers)
- Client library: ioredis 5.x (server), @upstash/redis (edge only)
- Connection: single client per process, lazy-loaded module export
- Local dev: docker compose service `redis`, port 6379, no auth
- Production: TLS, AUTH password from REDIS_URL, ACL user `app`

## Keyspace conventions
- Format: `{env}:{domain}:{entity}:{id}[:{field}]`
- Examples:
  - `prod:user:session:abc123`
  - `prod:user:profile:abc123`
  - `prod:rate:login:ip:1.2.3.4`
  - `prod:cache:product:sku-42`
- ALL keys MUST set a TTL via EXPIRE, EX, or PX. No bare SET without TTL.
- Default TTL: 3600 (1 hour). Override per-namespace in db/redis/keys.ts.
- Use colon (:) as namespace separator. Never use slash, dot, or hyphen.
- Never use KEYS in production code. Use SCAN with a cursor.

## Project structure
- db/redis/index.ts: Single ioredis client, exported as `redis`
- db/redis/keys.ts: Centralised key builders and TTL constants
- db/redis/cache.ts: Generic cache helpers (get-or-set, write-through)
- db/redis/rate-limit.ts: Token bucket and sliding window limiters
- db/redis/sessions.ts: Session read/write/extend
- db/redis/streams.ts: Stream producers and consumer groups

## Caching rules
- NEVER cache without an explicit invalidation strategy. Document it inline.
- Cache keys MUST include a version suffix when the value shape can change.
- Read-through caches use `getOrSet(key, ttl, () => fetch())`.
- Cache-aside writes invalidate by key, not by SCAN pattern.
- For multi-key invalidation, maintain an explicit dependency set.

## Rate limiting rules
- All limiters use atomic Lua scripts or MULTI/EXEC pipelines, never individual commands.
- Limiter keys include the dimension being limited (ip, userId, route).
- Default policy: 100 requests / 60 seconds per dimension, override per route.

## Hard rules
- NEVER use KEYS in any code path that runs in a request handler.
- NEVER call FLUSHDB or FLUSHALL outside a dedicated reset script.
- NEVER hardcode connection strings; read from REDIS_URL only.
- All scripts that mutate keys at scale require --confirm and a dry-run output.

Three rules in this CLAUDE.md prevent the most common Redis incidents.

The TTL on every key rule prevents the slow-motion outage of an unbounded Redis instance. Tutorials demonstrate SET key value without EX because the focus is on the data structure, not the operational reality. In production, a key without a TTL stays forever. Claude generates set(key, value, "EX", ttl) consistently when this rule is in CLAUDE.md, and surfaces a question when a use case genuinely needs a permanent key (rare; usually a config flag).

The namespace format rule prevents cross-feature collisions. Without it, Claude names the cache key user:abc123 in one file and users:abc123 in another, and the bug only shows up when one feature invalidates and the other does not. Pinning {env}:{domain}:{entity}:{id} and centralising key construction in db/redis/keys.ts removes the choice from each call site.

The never use KEYS rule prevents production-blocking incidents. KEYS * is O(N) over the whole keyspace and blocks Redis until it completes. On a development instance with 100 keys, it returns instantly. On production with 50 million keys, it stalls every other operation for seconds. Claude reaches for KEYS because it is the simplest tutorial example. Banning it in CLAUDE.md and pointing at SCAN shifts the default.
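A minimal sketch of the SCAN alternative, for the maintenance scripts where iterating keys is unavoidable. The helper name scanKeys and the ScanClient type are illustrative and not part of the project above; the type is written on the assumption that an ioredis client satisfies it.

```typescript
// Minimal structural type for the one call this sketch needs.
// Assumption: ioredis exposes scan(cursor, "MATCH", pattern, "COUNT", count)
// resolving to [nextCursor, keys], so its client can be passed in directly.
type ScanClient = {
  scan(
    cursor: string,
    matchToken: "MATCH",
    pattern: string,
    countToken: "COUNT",
    count: number
  ): Promise<[string, string[]]>;
};

// Hypothetical helper: yields matching keys one batch at a time, so Redis
// only ever does a small, bounded amount of work per call.
async function* scanKeys(
  client: ScanClient,
  pattern: string,
  batchSize = 100
): AsyncGenerator<string[]> {
  let cursor = "0";
  do {
    const [next, batch] = await client.scan(cursor, "MATCH", pattern, "COUNT", batchSize);
    cursor = next;
    // A batch can be empty even when the iteration is not finished.
    if (batch.length > 0) yield batch;
  } while (cursor !== "0"); // Redis signals the end of the iteration with cursor "0"
}
```

SCAN can return the same key more than once during a full iteration, so whatever consumes the batches should be idempotent. COUNT is a hint for how much work each call does, not a guarantee of batch size.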

Key naming conventions

Key names are the schema of a Redis database. Get them right at the start and migrations are easy. Get them wrong and you spend years carrying inconsistencies. The convention worth standardising on is {env}:{domain}:{entity}:{id}[:{field}], with colons as the only separator.

// db/redis/keys.ts
export const TTL = {
  session: 60 * 60 * 24 * 7,        // 7 days
  cacheShort: 60,                    // 1 minute
  cacheMedium: 60 * 5,               // 5 minutes
  cacheLong: 60 * 60,                // 1 hour
  rateLimitWindow: 60,               // 1 minute
  idempotency: 60 * 60 * 24,         // 24 hours
} as const;

const env = process.env.NODE_ENV === "production" ? "prod" : "dev";

export const keys = {
  session: (id: string) => `${env}:user:session:${id}`,
  userProfile: (userId: string) => `${env}:user:profile:${userId}:v2`,
  productCache: (sku: string) => `${env}:cache:product:${sku}:v1`,
  rateLimitIp: (route: string, ip: string) => `${env}:rate:${route}:ip:${ip}`,
  rateLimitUser: (route: string, userId: string) => `${env}:rate:${route}:user:${userId}`,
  idempotency: (key: string) => `${env}:idem:${key}`,
  jobLock: (jobId: string) => `${env}:lock:job:${jobId}`,
};

Three things make this work in practice. First, the environment prefix means a process running with a non-production NODE_ENV cannot accidentally touch production keys, even if a connection string points the wrong way. Second, the version suffix on cache keys (:v2, :v1) is the cache invalidation strategy for shape changes. When the user profile schema changes, bump the suffix and the old keys expire naturally on their TTL. No SCAN, no DEL loop, no risk of half-migrated data sitting in the cache while half-migrated readers misinterpret it. Third, every key construction goes through this file, so a grep for redis.set(` returns zero results. Claude writes code that calls keys.userProfile(userId) rather than constructing strings inline, because that is the pattern in CLAUDE.md.

The TTL constants matter for the same reason. A function that takes 60 as an argument is opaque; a function that takes TTL.cacheShort is self-documenting and consistent across the codebase. When the team decides one-minute caches are too aggressive, the change is one constant rather than a hunt through every call site. Claude reads the TTL constants and uses the named version when generating new code, which compounds: the longer this file is the canonical source, the cleaner every new feature becomes.

The dimension naming on rate limiter keys (rate:{route}:ip:{ip} and rate:{route}:user:{userId}) means you can run two limiters on the same route, one per IP and one per user, without collision. The product cache key includes both the entity type and the SKU, so when a SKU is invalidated you know the exact key without scanning. The Claude Code TypeScript guide covers the broader pattern of moving runtime strings into typed builders.

Caching patterns: read-through, write-behind, and when to use each

The phrase "add a cache" hides a half-dozen distinct patterns, each with different consistency guarantees and operational costs. Three are worth standardising on. The fourth is a trap.

Read-through (cache-aside) is the default for ninety percent of caches. The application reads from the cache. On miss, it reads from the source of truth, writes to the cache, and returns. On write, the application updates the source of truth and invalidates the cache key.

// db/redis/cache.ts
import { redis } from "./index";

export async function getOrSet<T>(
  key: string,
  ttlSeconds: number,
  fetcher: () => Promise<T>
): Promise<T> {
  const cached = await redis.get(key);
  if (cached !== null) {
    return JSON.parse(cached) as T;
  }
  const fresh = await fetcher();
  await redis.set(key, JSON.stringify(fresh), "EX", ttlSeconds);
  return fresh;
}

export async function invalidate(key: string): Promise<void> {
  await redis.del(key);
}

This is the pattern Claude should generate for any read-heavy resource that tolerates brief staleness: product detail pages, user profiles, public configuration, search results. Cache lifetime is bounded by TTL, invalidation is a single DEL, and a cache miss costs one round trip plus the source query. The main failure mode is a thundering herd if the key is hot and expires during a traffic spike. It is addressed by setting a slightly randomised TTL, or by using SET key value NX EX ttl as a short-lived lock so that only one caller refills the key (a single-flight pattern).
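The randomised TTL can be sketched as a small pure function. The name jitteredTtl and the 10% default spread are assumptions chosen for illustration, not values taken from the project above.

```typescript
// Hypothetical helper: spreads expiry times so that hot keys written at the
// same moment do not all expire in the same instant.
// `spread` is the maximum fraction added or removed (0.1 means +/-10%).
// `random` is injectable so the function can be tested deterministically.
function jitteredTtl(
  baseSeconds: number,
  spread = 0.1,
  random: () => number = Math.random
): number {
  const offset = (random() * 2 - 1) * spread; // uniform in [-spread, +spread)
  // Never return less than 1 second: EX rejects zero and negative values.
  return Math.max(1, Math.round(baseSeconds * (1 + offset)));
}
```

Inside getOrSet, the write would then pass jitteredTtl(ttlSeconds) instead of ttlSeconds. Jitter only staggers expiry across keys and instances; a single very hot key still needs the single-flight lock.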

Write-through is for caches that must stay consistent with writes. The application writes to both the source of truth and the cache atomically (or in a transaction that retries on failure). Reads always hit the cache and never fall back to the source. This pattern is right when you cannot tolerate stale reads and the source of truth is too slow to query directly.

export async function writeThrough<T>(
  key: string,
  ttlSeconds: number,
  value: T,
  persist: (value: T) => Promise<void>
): Promise<void> {
  await persist(value);
  await redis.set(key, JSON.stringify(value), "EX", ttlSeconds);
}

The cost is that every write now depends on Redis being available. The benefit is that reads have a single failure mode (cache down) instead of two (cache down or source down). Use this for session data, user preferences, and feature flags read on every request.

Write-behind (write-back) is rarely worth it. The application writes to the cache, returns, and a background worker drains the cache to the source of truth asynchronously. The latency win is real for write-heavy workloads, but the cost is that a Redis crash before the drain is data loss. Only use it for telemetry, view counts, and other data where loss is acceptable. Document the tolerance explicitly in CLAUDE.md.

The trap pattern is caching without invalidation. Claude generates this constantly without guidance: a 30-minute cache around an expensive query, no invalidation on write, justified by "it will expire eventually". For most data that is wrong. A user updating their profile and seeing stale data for 30 minutes is a bug report. The CLAUDE.md rule that prevents this is NEVER cache without an explicit invalidation strategy. Document it inline.

// Good: explicit invalidation comment
/**
 * Cache invalidation: invalidate(keys.userProfile(userId)) is called from
 * api/user/update.ts after every profile mutation.
 */
export async function getUserProfile(userId: string) {
  return getOrSet(
    keys.userProfile(userId),
    TTL.cacheLong,
    () => db.select().from(users).where(eq(users.id, userId))
  );
}

The comment is the contract. When Claude generates a new mutation endpoint for users, it reads this comment, finds the invalidation call site, and adds the matching invalidation. Without the comment, the contract lives in someone's head and breaks the next time the code is touched. The Claude Code Postgres guide covers the source-of-truth patterns these caches sit in front of.

Rate limiting and session storage with real ioredis code

Rate limiting is the most bug-prone of the common Redis use cases. The naive implementation looks correct, passes a unit test, and fails under concurrent load. The fix is atomicity, achieved through either a Lua script or a MULTI/EXEC transaction. The pattern below is a sliding window limiter using a single Lua script.

// db/redis/rate-limit.ts
import { randomUUID } from "node:crypto";
import { redis } from "./index";
import { keys } from "./keys";

const SLIDING_WINDOW_LUA = `
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
local member = ARGV[4]

-- Drop entries that have fallen out of the window, then count the rest.
redis.call('ZREMRANGEBYSCORE', key, 0, now - window * 1000)
local count = redis.call('ZCARD', key)
if count >= limit then
  return {0, count}
end
-- The member must be unique per request: two requests in the same
-- millisecond would otherwise collapse into one sorted-set entry.
redis.call('ZADD', key, now, member)
redis.call('PEXPIRE', key, window * 1000)
return {1, count + 1}
`;

export async function checkRateLimit(
  dimension: string, // the client IP: this limiter uses the per-IP key builder
  route: string,
  windowSeconds: number,
  maxRequests: number
): Promise<{ allowed: boolean; current: number }> {
  const key = keys.rateLimitIp(route, dimension);
  const now = Date.now();
  const result = (await redis.eval(
    SLIDING_WINDOW_LUA,
    1,
    key,
    now.toString(),
    windowSeconds.toString(),
    maxRequests.toString(),
    `${now}-${randomUUID()}`
  )) as [number, number];

  return { allowed: result[0] === 1, current: result[1] };
}

The script runs atomically on the Redis server, so concurrent requests cannot race past the limit. ZREMRANGEBYSCORE evicts entries older than the window, ZCARD counts what is left, and the script either rejects (returning 0) or appends a timestamp and accepts. PEXPIRE ensures the key cleans itself up if the dimension goes idle. The choice of a sorted set over a fixed-window counter is what makes it a sliding window: a burst at the boundary of two windows cannot double the effective limit.
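The boundary-burst claim is easier to see with an in-memory model. The sketch below mirrors the steps of the script (evict, count, reject or record) next to a fixed-window counter. Both models are illustrative only and all names are hypothetical: they are neither shared across processes nor atomic, so they do not replace the Lua script.

```typescript
type Limiter = (nowMs: number) => boolean;

// Mirrors the script: evict old entries, count, then reject or record.
function slidingWindowModel(windowMs: number, limit: number): Limiter {
  let timestamps: number[] = [];
  return (nowMs) => {
    const cutoff = nowMs - windowMs;
    timestamps = timestamps.filter((t) => t > cutoff); // ZREMRANGEBYSCORE key 0 cutoff
    if (timestamps.length >= limit) return false;      // ZCARD >= limit
    timestamps.push(nowMs);                            // ZADD
    return true;
  };
}

// Fixed-window counter for comparison: the count resets at each boundary.
function fixedWindowModel(windowMs: number, limit: number): Limiter {
  let windowStart = -1;
  let count = 0;
  return (nowMs) => {
    const start = Math.floor(nowMs / windowMs) * windowMs;
    if (start !== windowStart) {
      windowStart = start;
      count = 0;
    }
    if (count >= limit) return false;
    count++;
    return true;
  };
}

// A burst straddling a boundary: three requests at 59.9 s, three at 60.1 s,
// against a limit of 3 per 60 s.
function countAllowed(limiter: Limiter): number {
  const times = [59900, 59900, 59900, 60100, 60100, 60100];
  return times.filter((t) => limiter(t)).length;
}

const slidingAllowed = countAllowed(slidingWindowModel(60000, 3)); // 3
const fixedAllowed = countAllowed(fixedWindowModel(60000, 3));     // 6
```

The fixed-window model lets all six requests through within 200 ms because the counter resets at the 60 s mark. The sliding model still sees the first three inside its window and rejects the rest.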

The Lua script lives in code rather than SCRIPT LOAD because it is short, version-controlled with the application, and deployment-safe. Long scripts go in a db/redis/lua/ directory and are loaded with SCRIPT LOAD at startup, then invoked via EVALSHA. Claude generates this pattern when the rule is in CLAUDE.md.

The mistake to avoid is the check-then-increment pattern: read the counter, decide if the request is allowed, increment if it is. Two requests arriving at the same moment both read the same value, both decide they are under the limit, both increment, and the limit is silently exceeded. This is the bug that ships in tutorials, in blog posts, and in production. It is invisible during a manual test and obvious during a load test. Atomicity in Redis means everything happens inside a single command, a Lua script, or a MULTI/EXEC block. The CLAUDE.md rule that prevents the bug: All limiters use atomic Lua scripts or MULTI/EXEC pipelines, never individual commands.
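The interleaving can be reproduced without a Redis server. This sketch uses an in-memory stand-in with asynchronous get and set, which is enough to show two concurrent callers both passing a limit of 1. The store and function names are hypothetical.

```typescript
// In-memory stand-in for Redis with async GET/SET, used only to show the
// interleaving. It is not a model of Redis itself.
function makeFakeStore() {
  const data = new Map<string, number>();
  return {
    get: async (key: string): Promise<number> => data.get(key) ?? 0,
    set: async (key: string, value: number): Promise<void> => {
      data.set(key, value);
    },
  };
}

type CounterStore = ReturnType<typeof makeFakeStore>;

// The buggy pattern: read, decide, then write as separate round trips.
async function naiveCheck(store: CounterStore, key: string, limit: number): Promise<boolean> {
  const current = await store.get(key); // both callers can read the same value here
  if (current >= limit) return false;
  await store.set(key, current + 1);
  return true;
}

// Two concurrent requests against a limit of 1.
async function raceDemo(): Promise<boolean[]> {
  const store = makeFakeStore();
  const key = "dev:rate:login:ip:1.2.3.4";
  return Promise.all([naiveCheck(store, key, 1), naiveCheck(store, key, 1)]);
}
```

raceDemo() resolves to [true, true]: both reads happen before either write, so both requests are allowed and the stored counter ends at 1 instead of 2. Running the read, the decision, and the write inside one Lua script removes the gap between them.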

Session storage is simpler but has its own failure modes. The two questions to answer before writing any session code: what is the session lifetime, and how is it extended? A session that never extends logs the user out mid-task. A session that extends on every request never expires for active users but bloats the keyspace.

// db/redis/sessions.ts
import { redis } from "./index";
import { keys, TTL } from "./keys";

export type SessionData = {
  userId: string;
  createdAt: number;
  lastSeenAt: number;
  metadata?: Record<string, unknown>;
};

export async function createSession(userId: string, sessionId: string): Promise<void> {
  const session: SessionData = {
    userId,
    createdAt: Date.now(),
    lastSeenAt: Date.now(),
  };
  await redis.set(keys.session(sessionId), JSON.stringify(session), "EX", TTL.session);
}

export async function getSession(sessionId: string): Promise<SessionData | null> {
  const raw = await redis.get(keys.session(sessionId));
  return raw ? (JSON.parse(raw) as SessionData) : null;
}

export async function extendSession(sessionId: string): Promise<void> {
  const session = await getSession(sessionId);
  if (!session) return;
  session.lastSeenAt = Date.now();
  await redis.set(keys.session(sessionId), JSON.stringify(session), "EX", TTL.session);
}

export async function destroySession(sessionId: string): Promise<void> {
  await redis.del(keys.session(sessionId));
}

The extendSession call resets the TTL on each authenticated request, so active users stay logged in indefinitely while idle sessions expire. The trade-off is one extra Redis write per request, which is cheap but worth budgeting if you serve millions of requests per minute. For very high throughput, batch the extension by writing only when the last extension was more than a minute ago. The Claude Code environment variables guide covers the conventions for switching between local Redis and production REDIS_URL.
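The batching idea reduces to one comparison against lastSeenAt. A minimal sketch, assuming the one-minute threshold mentioned above; the constant and function names are hypothetical.

```typescript
// Assumption: lastSeenAt is the epoch-millisecond value stored in SessionData.
const MIN_EXTEND_INTERVAL_MS = 60 * 1000; // "more than a minute ago"

// Hypothetical helper: true when the session has not been extended recently
// enough and a write is due.
function shouldExtend(
  lastSeenAt: number,
  now: number,
  minIntervalMs: number = MIN_EXTEND_INTERVAL_MS
): boolean {
  return now - lastSeenAt > minIntervalMs;
}
```

extendSession would return early when shouldExtend(session.lastSeenAt, Date.now()) is false, which still costs a read but skips the write. The trade-off is that the effective idle timeout can be up to one interval shorter than TTL.session, which is negligible against a seven-day lifetime.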

ioredis vs node-redis, and the recommendation

Both libraries are mature and well-maintained. The differences matter for what Claude generates. ioredis is the recommendation for any Node.js Redis project. Its API is more consistent, its TypeScript types are tighter, its support for pipelines and transactions is more ergonomic, and its handling of Cluster mode is built-in rather than bolted on. Pin it in CLAUDE.md and Claude generates idiomatic code.

// ioredis: clean pipeline API
const pipeline = redis.pipeline();
pipeline.set("a", 1);
pipeline.incr("b");
pipeline.expire("a", 60);
const results = await pipeline.exec();

// node-redis: similar but with different patterns for v4+ (multi vs pipeline)
// const multi = redis.multi();
// multi.set("a", 1);
// multi.incr("b");
// const results = await multi.exec();

The reason this matters with Claude: training data contains both libraries in roughly equal volume, so without pinning you get a file that imports ioredis and calls node-redis methods, or vice versa. Both are real bugs. Claude generates clean ioredis code when CLAUDE.md says Client library: ioredis 5.x (server), @upstash/redis (edge only) and the project structure has db/redis/index.ts as the single connection point.

For serverless edge functions (Vercel Edge, Cloudflare Workers), ioredis does not work because it depends on Node's TCP socket APIs. Use Upstash Redis with @upstash/redis instead. Upstash provides a REST-based client that runs in any JavaScript runtime, and the API surface is small enough that Claude generates correct calls without confusion.

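// edge/redis.ts (Vercel Edge or Cloudflare Workers)
import { Redis } from "@upstash/redis";

export const redis = Redis.fromEnv(); // reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN

// INCR and EXPIRE run in one script so the key can never be left without a TTL
// if the function is interrupted between the two calls.
const FIXED_WINDOW_LUA = `
local count = redis.call('INCR', KEYS[1])
if count == 1 then
  redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return count
`;

export async function checkUpstashRateLimit(ip: string, route: string) {
  const key = `prod:rate:${route}:ip:${ip}`;
  // Fixed window: 100 requests per 60 seconds, counted from the first request.
  const count = (await redis.eval(FIXED_WINDOW_LUA, [key], [60])) as number;
  return { allowed: count <= 100, current: count };
}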

The pattern here is a fixed-window counter (increment, then set the expiry on the first write) rather than the sliding window used on the server, and the API differs from ioredis. Keep edge and server Redis code in separate directories so they cannot accidentally cross-import. The Claude Code monorepo guide covers the package boundary conventions that make this clean.

Pub/sub and Redis Streams

Redis ships two messaging primitives. Pub/sub is fire-and-forget: a publisher sends to a channel, every subscriber currently connected receives, anyone not connected misses the message. Streams are durable: messages are appended to a log, consumer groups track their position, and a slow consumer catches up rather than dropping events.

The choice is not aesthetic. Use pub/sub for non-essential live updates, like cache invalidation broadcasts to other application instances. A miss is recoverable because the cache will refill on the next read. Use Streams for anything that drives a side effect, like emitting a domain event that triggers an email or a webhook. A dropped event is a missed email.

// db/redis/pubsub.ts
import { redis } from "./index";
import Redis from "ioredis";

const subscriber = new Redis(process.env.REDIS_URL!);

export async function publishCacheInvalidation(key: string): Promise<void> {
  await redis.publish("cache:invalidations", key);
}

export function subscribeToInvalidations(
  onInvalidate: (key: string) => void | Promise<void>
): () => void {
  subscriber.subscribe("cache:invalidations");
  const handler = async (channel: string, message: string) => {
    if (channel === "cache:invalidations") {
      await onInvalidate(message);
    }
  };
  subscriber.on("message", handler);
  return () => {
    subscriber.off("message", handler);
    subscriber.unsubscribe("cache:invalidations");
  };
}

Pub/sub requires a separate ioredis client because subscriber connections cannot issue normal commands. The pattern of one publisher client and one subscriber client is the convention to pin in CLAUDE.md, and Claude follows it consistently when it is documented.

Streams are richer and worth investing in for any event-driven workflow. The producer appends with XADD, the consumer reads with XREADGROUP, and the consumer group tracks acknowledged versus pending messages.

// db/redis/streams.ts
import { redis } from "./index";

const STREAM = "events:domain";
const GROUP = "workers";

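export async function emitDomainEvent(type: string, data: Record<string, unknown>): Promise<string> {
  const id = await redis.xadd(STREAM, "*", "type", type, "data", JSON.stringify(data));
  // ioredis types the reply as string | null; null is only expected with NOMKSTREAM.
  if (id === null) throw new Error(`XADD to ${STREAM} returned no entry ID`);
  return id;
}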

export async function ensureConsumerGroup(): Promise<void> {
  try {
    await redis.xgroup("CREATE", STREAM, GROUP, "$", "MKSTREAM");
  } catch (err) {
    // BUSYGROUP means the group already exists; ignore
    if (!(err as Error).message.includes("BUSYGROUP")) throw err;
  }
}

export async function consumeBatch(
  consumerName: string,
  handler: (id: string, fields: Record<string, string>) => Promise<void>
): Promise<number> {
  const result = await redis.xreadgroup(
    "GROUP", GROUP, consumerName,
    "COUNT", 32,
    "BLOCK", 5000,
    "STREAMS", STREAM, ">"
  );
  if (!result) return 0;

  let processed = 0;
  for (const [, messages] of result as Array<[string, Array<[string, string[]]>]>) {
    for (const [id, fieldList] of messages) {
      const fields: Record<string, string> = {};
      for (let i = 0; i < fieldList.length; i += 2) {
        fields[fieldList[i]] = fieldList[i + 1];
      }
      await handler(id, fields);
      await redis.xack(STREAM, GROUP, id);
      processed++;
    }
  }
  return processed;
}

Three things make this production-ready. ensureConsumerGroup is idempotent, so a fresh deployment does not need a manual setup step. The consumer takes a consumerName so multiple worker processes can share the load while the group tracks pending messages per-consumer. And xack confirms processing only after the handler completes, so a crashed worker leaves its messages in the pending list for another consumer to claim with XCLAIM.

A separate periodic worker should scan the pending list for messages held longer than a threshold and reclaim them. Without that reaper, a worker that died mid-processing leaves its messages stranded forever. Ten lines of code prevent the failure: list pending messages older than five minutes, claim them to a fresh consumer, retry. Claude generates this when the rule is documented in CLAUDE.md, but does not invent it from a generic Streams prompt.
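A sketch of such a reaper, under stated assumptions: it uses XAUTOCLAIM (Redis 6.2 and later), which combines the list-pending and claim steps into one command. The StreamClient type, the function name reclaimStale, and the batch size of 32 are illustrative. The type is written on the assumption that the project's ioredis client satisfies it.

```typescript
// Minimal structural type for the two calls the reaper needs.
type StreamClient = {
  xautoclaim(
    key: string,
    group: string,
    consumer: string,
    minIdleMs: number,
    start: string,
    countToken: "COUNT",
    count: number
  ): Promise<unknown>;
  xack(key: string, group: string, id: string): Promise<number>;
};

type StreamHandler = (id: string, fields: Record<string, string>) => Promise<void>;

// Hypothetical reaper: claims entries idle for longer than minIdleMs to
// `reaperName`, retries the handler, and acknowledges on success.
async function reclaimStale(
  client: StreamClient,
  stream: string,
  group: string,
  reaperName: string,
  handler: StreamHandler,
  minIdleMs = 5 * 60 * 1000
): Promise<number> {
  let cursor = "0-0";
  let processed = 0;
  do {
    // Reply shape: [nextStartId, entries, ...]; only the first two are used.
    const reply = (await client.xautoclaim(
      stream, group, reaperName, minIdleMs, cursor, "COUNT", 32
    )) as [string, Array<[string, string[]] | null>];
    cursor = reply[0];
    for (const entry of reply[1]) {
      if (!entry) continue; // defensive: skip nil placeholders for deleted entries
      const [id, fieldList] = entry;
      const fields: Record<string, string> = {};
      for (let i = 0; i < fieldList.length; i += 2) {
        fields[fieldList[i]] = fieldList[i + 1];
      }
      await handler(id, fields);
      await client.xack(stream, group, id);
      processed++;
    }
  } while (cursor !== "0-0"); // "0-0" means the whole pending list was scanned
  return processed;
}
```

A handler that keeps failing on the same message will be retried on every pass. A production version would likely check the delivery count reported by XPENDING and move repeat offenders to a dead-letter stream.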

The trap with Streams is unbounded growth. Every event sits in the stream forever unless trimmed. The convention is to call XADD STREAM MAXLEN ~ 10000 * type ... with the approximate maxlen flag, which trims to roughly 10,000 entries with each append. Adjust the bound based on retention needs and the size of your events. Without trimming, a stream that takes 100 events per second is several gigabytes within a week. The Claude Code MongoDB guide covers the equivalent retention patterns for document stores.
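To make the trimming convention and the growth estimate concrete, here is a small sketch. Both helper names are hypothetical, and the 100 bytes per entry used in the example is an assumption for the arithmetic only; real entry sizes depend on the payload and should be measured.

```typescript
// Builds the argument list for XADD with approximate trimming:
// XADD <stream> MAXLEN ~ <maxLen> * field value [field value ...]
function trimmedXaddArgs(
  stream: string,
  maxLen: number,
  fields: Record<string, string>
): string[] {
  const args = [stream, "MAXLEN", "~", String(maxLen), "*"];
  for (const name of Object.keys(fields)) {
    args.push(name, fields[name]);
  }
  return args;
}

// Back-of-envelope size of an untrimmed stream after `seconds` of traffic.
function untrimmedBytes(eventsPerSecond: number, seconds: number, bytesPerEntry: number): number {
  return eventsPerSecond * seconds * bytesPerEntry;
}

const ONE_WEEK_SECONDS = 7 * 24 * 60 * 60; // 604,800
// 100 events/s for a week is 60,480,000 entries; at an assumed 100 bytes
// each that is 6,048,000,000 bytes, roughly 6 GB.
const weekOfEvents = untrimmedBytes(100, ONE_WEEK_SECONDS, 100);
```

In emitDomainEvent the same arguments would be passed straight to the client, along the lines of redis.xadd(STREAM, "MAXLEN", "~", 10000, "*", "type", type, "data", JSON.stringify(data)). With a bound of 10,000 entries, the same assumed entry size keeps the stream at about 1 MB.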

Hard rules and conclusion

Five rules separate Redis projects where Claude Code is a productivity multiplier from projects where it is a liability.

First, every key gets a TTL. No exceptions outside dedicated config. Unbounded keys are how Redis instances quietly fill up over months until the OOM killer fires.

Second, never use KEYS in request paths. Use SCAN with a cursor, or maintain an explicit index set. KEYS is the first command to remove from the vocabulary.

Third, document invalidation inline with the cache. Every getOrSet call should be paired with a comment naming the call site that invalidates it. Caches without invalidation are bugs waiting to be filed.

Fourth, rate limiters are atomic. Lua scripts or MULTI/EXEC, never individual commands. Race conditions in rate limiters look like working code under light load and produce billing incidents under real traffic.

Fifth, pin the client library and key format. ioredis on the server, @upstash/redis on the edge, the namespace {env}:{domain}:{entity}:{id}, and centralised key construction in db/redis/keys.ts. Choices removed from each call site become rules Claude follows automatically.

The Redis CLAUDE.md template above produces a development environment where keys are namespaced consistently, every cache has an explicit invalidation contract, rate limiters are atomic, sessions extend predictably, and pub/sub versus Streams selection follows the durability requirement rather than the closest tutorial. The Claude Code best practices guide covers the workflow conventions that scale these patterns across teams. Claudify includes a Redis-specific CLAUDE.md template as part of the Claude Code workflow kit, pre-configured for ioredis, Upstash, and the safety conventions that keep keyspaces bounded and invalidation correct.