Claude Code Context Window Guide
What is the Claude Code context window?
The Claude Code context window is the total amount of information that fits in a single conversation. Every prompt you type, every file Claude reads, every tool result, and every response Claude generates counts toward this limit. When the window fills up, quality degrades and eventually the session auto-compacts.
Understanding how this works is the difference between sessions that stay productive for hours and sessions that fall apart after 30 minutes.
How big is the context limit?
Claude Code uses the full context window of whichever model you are running:
| Model | Raw Context | Usable Working Space |
|---|---|---|
| Claude Sonnet 4 | 200K tokens | ~150K tokens |
| Claude Opus 4 | 200K tokens | ~150K tokens |
| Claude Haiku 4.5 | 200K tokens | ~150K tokens |
The gap between raw and usable exists because the system prompt, your CLAUDE.md file, MCP tool definitions, and project configuration consume tokens before you type anything. On a project with 15 MCP servers and a detailed CLAUDE.md, your system prompt alone can consume 30-50K tokens.
That leaves roughly 150K tokens for actual work. Sounds generous until you see how fast it goes.
What counts toward the context limit
Not all operations cost the same. Here are the main consumers of context, with their typical token costs:
| Operation | Typical Token Cost | Notes |
|---|---|---|
| Large file read (1000+ lines) | 10-25K tokens | Biggest single consumer |
| Tool call with output | 2-5K tokens per call | Compounds fast across 20+ calls |
| Your prompt messages | 100-500 tokens each | Small individually, adds up over long sessions |
| Claude's responses | 500-3K tokens each | Longer reasoning means more tokens |
| CLAUDE.md (loaded at start) | 2-15K tokens | Depends on file size |
| MCP tool definitions | 5-20K tokens total | Each enabled server adds definitions |
A practical example: reading 5 medium files (50K tokens), 15 tool calls with results (45K tokens), plus conversation history (30K tokens) puts you at 125K. That is one focused task and you have already used more than 80% of the roughly 150K tokens of usable space.
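The tally in that example can be checked with a few lines of Python. This is only a sketch of the arithmetic: the per-file and per-call costs are inferred from the example's totals (50K across 5 files, 45K across 15 calls), and the 150K usable figure is the approximate value from the model table.

```python
USABLE_TOKENS = 150_000  # approximate working space from the model table
RAW_TOKENS = 200_000     # raw context window

# Figures from the practical example; per-item costs are inferred from its totals.
session = {
    "file_reads": 5 * 10_000,   # 5 medium files at ~10K tokens each
    "tool_calls": 15 * 3_000,   # 15 tool calls at ~3K tokens each
    "conversation": 30_000,     # prompts and responses so far
}

used = sum(session.values())
print(used)                               # 125000
print(round(100 * used / USABLE_TOKENS))  # 83 (percent of usable space)
print(round(100 * used / RAW_TOKENS, 1))  # 62.5 (percent of the raw window)
```

Measured against usable space rather than the raw window, the same session looks much closer to the ceiling.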
The CLAUDE.md token tradeoff
Your CLAUDE.md file loads into every session automatically. A 200-line CLAUDE.md might cost 5K tokens. A 500-line one might cost 15K. Those tokens are consumed before you do anything.
This creates a real tradeoff. More instructions mean better behavior but less working space. The best approach:
- Keep CLAUDE.md under 200 lines for the core file
- Move detailed procedures into commands and skills that load on demand
- Use a retrieval map (a table of "you need X, check Y") instead of inlining everything
The goal is a lean core that loads fast, with deeper knowledge pulled in only when relevant.
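To get a rough sense of what a CLAUDE.md costs, a small script can estimate it. This is a minimal sketch under stated assumptions: the 4-characters-per-token ratio is a common rule of thumb for English text and not the real tokenizer, and `estimate_tokens` and `claude_md_report` are hypothetical helper names.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def claude_md_report(path: str = "CLAUDE.md", max_lines: int = 200) -> str:
    """Summarize line count and estimated token cost of a CLAUDE.md file."""
    text = Path(path).read_text(encoding="utf-8")
    lines = len(text.splitlines())
    # The guide suggests keeping the core file under 200 lines.
    status = "ok" if lines < max_lines else "over budget"
    return f"{path}: {lines} lines, ~{estimate_tokens(text)} tokens ({status})"
```

Running `print(claude_md_report())` from the project root gives a quick check before deciding what to move into on-demand commands or skills.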
Signs your context is filling up
Watch for these patterns in order of severity:
1. Instruction drift. Claude stops following rules from your CLAUDE.md or earlier in the conversation.
2. Repetition. Claude suggests changes you already made, or re-reads files it already read.
3. Generic responses. Answers become less specific to your codebase and more like generic advice.
4. Missing connections. Claude fails to connect information from different parts of the session.
5. Compaction warning. The "compacting conversation" message means you have already hit the ceiling.
If you catch symptoms 1 through 3 early, you can clear proactively. If you hit 5, you have already lost context you cannot recover.
How auto-compaction works
When the context window fills, Claude Code automatically compacts the conversation:
- The full message history is summarized into a compressed form
- Detailed messages are dropped
- The summary replaces the history as context for continuing
The problem is that compaction is lossy. You do not control what survives. Specific file paths, error messages, nuanced decisions about architecture, and partially completed plans can all be lost or simplified beyond usefulness.
Auto-compaction is a safety net, not a strategy. Relying on it means losing information at unpredictable moments during critical work.
Built-in context commands
Claude Code ships with two context management commands:
/compact compresses the conversation while keeping key information. You can guide what it preserves:
/compact keep the migration plan and list of modified files
Use this when you are mid-task and need breathing room without starting over.
/clear wipes the entire conversation. CLAUDE.md reloads, but all session state is gone. No undo.
The problem with /clear is that it is destructive. The files you edited stay on disk, but Claude's knowledge of the plan you were following, which files you changed, and the decisions you made during the session vanishes.
Building a safe clear
The solution is to save state before clearing. The pattern:
- Distill the current session into a handoff (what was done, what remains, key files, next action)
- Write the handoff to a daily note or memory file
- Clear the context
- Reload the handoff and resume
This turns a destructive clear into a restorable compression. The session boundary becomes invisible because work continues from exactly where it stopped.
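The save and reload steps of that pattern can be sketched in Python. This is an illustration of the shape of a handoff, not how any particular /safe-clear command is implemented: the `handoff.json` filename, the field names, and both helper functions are hypothetical, and in practice Claude itself would usually write the distilled note.

```python
import json
from datetime import date
from pathlib import Path


def save_handoff(note_dir: str, done: list[str], remaining: list[str],
                 key_files: list[str], next_action: str) -> Path:
    """Write a handoff note capturing session state before a clear."""
    handoff = {
        "date": date.today().isoformat(),
        "done": done,
        "remaining": remaining,
        "key_files": key_files,
        "next_action": next_action,
    }
    path = Path(note_dir) / "handoff.json"  # hypothetical filename
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(handoff, indent=2), encoding="utf-8")
    return path


def load_handoff(note_dir: str) -> dict:
    """Read the handoff back after the clear so work can resume."""
    path = Path(note_dir) / "handoff.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

The four fields mirror the handoff contents listed above: what was done, what remains, key files, and the next action.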
A custom /safe-clear command automates this entire flow. Claudify ships with one that handles distillation, persistence, and auto-resume in about 5 seconds.
Proactive context strategies
Do not wait for the warning. Manage context before it becomes a problem:
Clear between unrelated tasks
Context from task A pollutes task B. Finishing a database migration and switching to frontend components? Clear first. The migration context is not helping. It is consuming tokens and potentially biasing suggestions toward backend patterns.
Read only what you need
Every file read costs tokens. Instead of "read the entire src directory," point Claude at specific files:
Read src/api/webhook.js lines 45-80
Line ranges on large files save thousands of tokens per read.
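A rough estimate shows the size of the saving. The sketch below assumes a synthetic 1,000-line file with 40 characters per line and the same 4-characters-per-token rule of thumb (not a real tokenizer); `range_savings` is a hypothetical helper.

```python
def estimate_tokens(text: str) -> int:
    """Approximate tokens at ~4 characters per token (rough heuristic)."""
    return len(text) // 4


def range_savings(lines: list[str], start: int, end: int) -> tuple[int, int]:
    """Return (full_cost, range_cost) in estimated tokens.

    start and end are 1-indexed and inclusive, like "lines 45-80".
    """
    full = estimate_tokens("\n".join(lines))
    part = estimate_tokens("\n".join(lines[start - 1:end]))
    return full, part


# Synthetic file (assumption): 1,000 lines of 40 characters each.
fake_file = ["x" * 40 for _ in range(1000)]
full, part = range_savings(fake_file, 45, 80)
print(full, part)  # 10249 368
```

Under these assumptions the targeted read costs a few hundred tokens instead of roughly ten thousand.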
Use a memory system
A persistent memory system means context clears do not lose information. Priorities, decisions, and learnings survive in memory files that reload after each clear.
Without memory, every clear is a hard reset. With memory, it is a soft refresh.
Monitor tool call count
After roughly 30 tool calls in a session, context is getting heavy. At 50 or more, quality is almost certainly degraded. Use tool call count as a rough gauge for when to clear.
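That rule of thumb is simple enough to write down as a function. A minimal sketch, with `context_health` as a hypothetical name and the 30 and 50 thresholds taken from the paragraph above:

```python
def context_health(tool_calls: int) -> str:
    """Map a session's tool call count to a rough context status."""
    if tool_calls >= 50:
        return "degraded: clear now"
    if tool_calls >= 30:
        return "heavy: plan a clear"
    return "ok"


for count in (12, 34, 61):
    print(count, context_health(count))
```

It is a coarse gauge rather than a measurement, since a few large file reads can fill the window in far fewer calls.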
Batch related work together
Structure sessions around related tasks:
/start -> Task A (related files) -> /safe-clear -> Task B (different area) -> /safe-clear -> Task C
Each task gets clean, focused context. No cross-contamination from unrelated work.
Hooks for automatic safety
Claude Code hooks can protect you automatically. Hooks are configured in a settings file such as .claude/settings.json:
{
  "hooks": {
    "PreCompact": [
      {
        "matcher": "auto",
        "hooks": [
          { "type": "command", "command": "save-state-before-compaction.sh" }
        ]
      }
    ],
    "SessionStart": [
      {
        "matcher": "compact",
        "hooks": [
          { "type": "command", "command": "restore-after-compaction.sh" }
        ]
      }
    ]
  }
}
A PreCompact hook with the auto matcher fires before auto-compaction, saving your state. A SessionStart hook with the compact matcher fires when the session resumes after compaction, restoring it. Even if you forget to manage context manually, the system catches it.
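What a state-saving hook command might do can be sketched in Python (the config above names shell scripts; Python is used here only for illustration). This assumes that command hooks receive a JSON payload on stdin and that it includes fields such as `session_id`, `transcript_path`, and `trigger`; check the hooks reference for the exact payload. The `.claude/state` directory, the output filename, and the function names are hypothetical.

```python
import json
import sys
from datetime import datetime, timezone
from pathlib import Path

STATE_DIR = Path(".claude/state")  # hypothetical location for saved state


def save_state(payload: dict, state_dir: Path = STATE_DIR) -> Path:
    """Record what is known about the session just before compaction."""
    state_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "saved_at": datetime.now(timezone.utc).isoformat(),
        # Field names below are assumptions about the hook payload.
        "session_id": payload.get("session_id"),
        "transcript_path": payload.get("transcript_path"),
        "trigger": payload.get("trigger"),
    }
    out = state_dir / "pre-compact.json"
    out.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return out


def main() -> None:
    """Entry point for use as a hook command: read the payload from stdin."""
    raw = sys.stdin.read()
    save_state(json.loads(raw) if raw.strip() else {})
```

To use it as a hook, the script would end with a call to `main()` and the `command` entry would point at it. Saving the transcript path rather than the whole conversation keeps the hook fast and leaves the distilling to a later step.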
Context-efficient project structure
How you organize your project affects context efficiency:
- Small, focused CLAUDE.md with a retrieval map pointing to detailed files
- Commands (.claude/commands/) that load instructions on demand instead of keeping everything in the system prompt
- Skills (.claude/skills/) for domain knowledge that Claude reads only when relevant
- Memory files that persist state across clears so you never need to keep everything in context simultaneously
This is the architecture Claudify uses. The core CLAUDE.md is lean. The 21 commands, 9 agents, and 1,727 skills load only when invoked. Context stays clean because knowledge is distributed, not centralized.
The right mental model
Think of context as RAM, not storage. It is a working space for the current task, not a permanent record. Your files are storage. Your memory system is your persistence layer. Context is just the active workspace.
Treat it that way:
- Do not hoard context. Clear proactively.
- Do not rely on context for memory. Write important things down.
- Do not fear the clear. With state preservation, it costs almost nothing.
The developers who fight context limits are the ones trying to do everything in one session. The developers who work with the constraint, clearing often with state preservation and focused tasks, rarely hit the limit at all.
FAQ
How many tokens does a Claude Code session actually use?
A typical productive session uses 80-120K tokens before needing a clear. Heavy sessions with multiple large file reads and 30+ tool calls can burn through 150K in under an hour. Light sessions with targeted reads and focused prompts can last several hours on 200K.
Can I increase the Claude Code context window?
The context window size is set by the model (200K tokens for Claude Sonnet 4, Opus 4, and Haiku 4.5). You cannot increase it directly. But you can maximize usable space by reducing system prompt overhead: disable unused MCP servers, keep CLAUDE.md lean, and use on-demand loading for commands and skills.
What is the difference between /compact and /clear?
/compact compresses your conversation while preserving key information. You stay in the same session and can continue working. /clear wipes everything and starts fresh. With a custom /safe-clear command, you get the clean slate of /clear with the continuity of /compact because state is saved before clearing and restored after.
Build a complete context system
Context management works best alongside CLAUDE.md configuration, persistent memory, hooks for automated protection, and commands for structured workflows. Each piece handles a different part of the problem.
Get Claudify -- Context management, memory, 21 commands, and 9 agents. Installed in one command.