
Claude Code /compact and Auto-Memory: Master Context Window Management

February 2026 • 10 min read

Every Claude Code session has a hard ceiling: 200,000 tokens of context. That sounds like a lot until you are deep in a complex refactor, reading dozens of files, running test suites, and iterating on implementations. At that point, the context window fills up fast, and when it does, the agent's quality degrades -- it forgets earlier decisions, repeats work, or contradicts itself.

Managing context is not optional. It is the difference between a productive 2-hour session and one where you restart three times because the agent lost track of what it was doing. Here is how to do it right.

Understanding the 200K Token Budget

Before optimizing, you need to understand where your tokens go. A typical Claude Code session consumes context in four categories:

Token Consumption Breakdown

  • System prompt and CLAUDE.md: 2,000-5,000 tokens (loaded once at session start)
  • File reads: 500-5,000 tokens per file (cumulative -- every file the agent reads stays in context)
  • Tool use and outputs: 200-2,000 tokens per tool call (bash commands, grep results, file edits)
  • Conversation history: Everything you type plus every response the agent generates

A session that reads 30 files, runs 20 commands, and has 15 back-and-forth exchanges can easily consume 150,000 tokens. That leaves very little room for the agent to continue working effectively.
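To make that arithmetic concrete, here is a rough sketch that turns the per-item ranges above into a low and high estimate for the example session. The function name and the average of 1,500 tokens per exchange are assumptions made for illustration, not figures reported by Claude Code.

```python
# Per-item token ranges (low, high) taken from the breakdown above.
SYSTEM_PROMPT = (2_000, 5_000)   # system prompt and CLAUDE.md, loaded once
PER_FILE_READ = (500, 5_000)     # every file read stays in context
PER_TOOL_CALL = (200, 2_000)     # bash commands, grep results, file edits

CONTEXT_LIMIT = 200_000


def estimate_context(files: int, tool_calls: int, exchanges: int,
                     tokens_per_exchange: int = 1_500) -> tuple[int, int]:
    """Return a (low, high) estimate of the tokens a session has consumed.

    tokens_per_exchange is an assumed average for one prompt plus one
    response; adjust it to match your own sessions.
    """
    conversation = exchanges * tokens_per_exchange
    low = (SYSTEM_PROMPT[0] + files * PER_FILE_READ[0]
           + tool_calls * PER_TOOL_CALL[0] + conversation)
    high = (SYSTEM_PROMPT[1] + files * PER_FILE_READ[1]
            + tool_calls * PER_TOOL_CALL[1] + conversation)
    return low, high


low, high = estimate_context(files=30, tool_calls=20, exchanges=15)
print(f"estimated usage: {low:,} to {high:,} tokens")
print(f"high estimate as a share of the window: {high / CONTEXT_LIMIT:.0%}")
```

The spread between the two numbers is wide because file size dominates: thirty large files alone can account for 150,000 tokens.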

You can check your current usage at any time by running /cost in your Claude Code session. This shows token consumption and estimated spend -- both useful signals for knowing when to intervene.

The /compact Command: What It Does and When to Use It

The /compact command is your primary tool for context management. When you run it, Claude Code summarizes the entire conversation history into a compressed representation, typically achieving a 60-70% reduction in token usage. The agent retains the key decisions, file modifications, and current state while discarding the verbose intermediate steps.

Think of it as the agent writing itself a set of notes, then starting fresh with those notes as context. The notes contain what matters -- which files were changed, what the current objective is, what patterns to follow -- without the full transcript of how it got there.
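As a quick illustration of what a 60-70% reduction means in practice, the sketch below computes how much history would remain after compaction. The helper name is hypothetical, and the actual reduction depends on the session.

```python
def tokens_after_compact(history_tokens: int, reduction_pct: int) -> int:
    """Tokens left in the conversation history after a percentage reduction."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return history_tokens * (100 - reduction_pct) // 100


history = 150_000
remaining_best = tokens_after_compact(history, 70)   # strongest typical reduction
remaining_worst = tokens_after_compact(history, 60)  # weakest typical reduction
print(f"after /compact: roughly {remaining_best:,} to {remaining_worst:,} tokens")
```

Under those assumptions, a 150,000-token history shrinks to somewhere between 45,000 and 60,000 tokens, which frees most of the window for the next phase of work.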

When to Compact

When Not to Compact

Pro tip: You can give /compact a custom focus by adding instructions after the command, for example: /compact focus on the authentication refactor. This tells the summarizer to prioritize retaining context about a specific topic, which is useful when you know what the next phase of work will be.

Auto-Memory: Persistence Between Sessions

Compaction solves the within-session problem. But what happens when you close the terminal and come back tomorrow? Without explicit memory, the next session starts from zero -- no knowledge of yesterday's decisions, no awareness of which files were changed, no recollection of the architectural direction you chose.

Claude Code's auto-memory feature addresses this through a set of CLAUDE.md files. Here is how it works:

  1. Project-level memory: A CLAUDE.md file in your project root is automatically loaded at session start. It contains project conventions, architecture decisions, and current state.
  2. User-level memory: A ~/.claude/CLAUDE.md file persists across all projects. Use this for personal preferences, common patterns, and cross-project conventions.
  3. Session memory: During a session, the agent can update the CLAUDE.md file with new decisions and state. This is the "auto-save" mechanism -- tell the agent to "save this decision to memory" and it writes to the file.
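To see which of these memory files exist for a given project, a small helper like the one below can be handy. It is only a sketch: the function name is hypothetical, it checks the two locations named above, and the user-then-project ordering is an assumption for display purposes rather than a statement about how Claude Code loads them.

```python
from pathlib import Path
from typing import Optional


def memory_files(project_root: Path, home: Optional[Path] = None) -> list[Path]:
    """Return the memory files that exist, user-level first, then project-level."""
    home = home if home is not None else Path.home()
    candidates = [
        home / ".claude" / "CLAUDE.md",   # user-level memory
        project_root / "CLAUDE.md",       # project-level memory
    ]
    return [path for path in candidates if path.is_file()]


for path in memory_files(Path.cwd()):
    print(path)
```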

Effective Memory File Structure

# Project: MyApp

## Architecture
- Next.js 14, App Router, TypeScript
- Prisma ORM with PostgreSQL
- Tailwind CSS for styling

## Conventions
- Named exports only
- API routes return { data, error } shape
- Tests colocated with source files

## Current State (updated 2026-02-28)
- Auth refactor: COMPLETE
- Dashboard redesign: IN PROGRESS
  - Header component done
  - Sidebar navigation done
  - Main content area: next task
- Known issues: #142 (race condition in WebSocket handler)

The "Current State" section is the most important part. Update it at the end of every session. When the next session starts, the agent immediately knows where you left off and what to work on next.
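If you prefer to update that section with a script rather than by hand, the following sketch replaces the "Current State" block of a memory file and stamps it with a date. It assumes the heading format used in the example above; the function name and the sample items are hypothetical.

```python
import re
from datetime import date


def replace_current_state(memory_text: str, items: list[str], today: date) -> str:
    """Replace the '## Current State' section, or append one if it is missing."""
    heading = f"## Current State (updated {today.isoformat()})"
    new_section = "\n".join([heading, *items]) + "\n"
    # Match from the heading up to the next "## " heading or the end of the text.
    pattern = re.compile(r"^## Current State.*?(?=^## |\Z)", re.MULTILINE | re.DOTALL)

    def swap(match: re.Match) -> str:
        # Keep a blank line before any section that follows.
        trailing = "\n" if match.end() < len(memory_text) else ""
        return new_section + trailing

    if pattern.search(memory_text):
        return pattern.sub(swap, memory_text, count=1)
    return memory_text.rstrip("\n") + "\n\n" + new_section


sample = (
    "# Project: MyApp\n\n"
    "## Conventions\n- Named exports only\n\n"
    "## Current State (updated 2026-02-28)\n- Dashboard redesign: IN PROGRESS\n"
)
print(replace_current_state(sample, ["- Dashboard redesign: COMPLETE"], date(2026, 3, 1)))
```

Writing the result back to CLAUDE.md is left out on purpose, so you can review the new text before overwriting the file.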

Cost Monitoring: The /cost Command

Context management is also cost management. Every token consumed costs money, and long sessions with bloated context are expensive. The /cost command gives you real-time visibility into your spending.

Develop a habit of checking /cost at regular intervals. Here is a practical monitoring schedule:

Compact vs. Start Fresh: Decision Framework

Sometimes /compact is not enough. Sometimes you need a completely fresh session. Here is how to decide:

Use /compact When:

  • You are continuing the same project and general task area
  • The agent has made decisions you want it to remember
  • You are switching sub-tasks within the same feature
  • Context usage is between 60% and 85% of capacity

Start a Fresh Session When:

  • You are switching to a completely different project or feature
  • The agent's behavior has degraded significantly despite compaction
  • You want to change the fundamental approach (compacted context carries forward old assumptions)
  • Context has exceeded 90% and even compaction will not free enough space for the next task

When starting fresh, make sure your memory file is up to date first. Run /compact one final time, ask the agent to update CLAUDE.md with the current state, then close the session and open a new one. The new session loads the updated memory file and picks up where the last one left off.
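The decision framework above can be sketched as a small function. The thresholds come from the two lists; treating usage below 60% as "keep working" and the 85-90% band as still compactable are assumptions, since the lists do not cover those ranges. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    context_used: float            # fraction of the window in use, 0.0 to 1.0
    same_task_area: bool           # still the same project and general task
    degraded_after_compact: bool   # behavior stayed poor even after /compact
    changing_approach: bool        # you want to drop the old assumptions


def next_action(state: SessionState) -> str:
    """Return 'fresh', 'compact', or 'continue' for the given session state."""
    if (not state.same_task_area
            or state.degraded_after_compact
            or state.changing_approach):
        return "fresh"
    if state.context_used > 0.90:
        return "fresh"
    if state.context_used >= 0.60:
        return "compact"   # assumption: the 85-90% band is still worth compacting
    return "continue"      # assumption: below 60% there is no need to intervene


print(next_action(SessionState(0.72, True, False, False)))
```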

Shared-Context Patterns for Multi-Agent Workflows

When running multiple agents on the same project -- one for frontend, one for backend, one for tests -- context management becomes a coordination problem. Each agent has its own 200K token window, but they need to stay aware of each other's changes.

The solution is shared memory through the file system. All agents read the same CLAUDE.md file. When one agent makes an architectural decision, it writes to the memory file. The other agents pick up the change the next time they read that file.
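One way to picture the shared-memory pattern is a helper that appends a timestamped, attributed decision to the shared file. This is a minimal sketch under stated assumptions: the function name and entry format are hypothetical, and it does no locking, so agents that write at the same moment would need extra coordination.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def record_decision(memory_path: Path, agent: str, decision: str,
                    now: Optional[datetime] = None) -> str:
    """Append one decision line to the shared memory file and return it."""
    now = now if now is not None else datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d %H:%M")
    line = f"- [{stamp}] ({agent}) {decision}\n"
    # Append mode keeps earlier entries from other agents intact.
    with memory_path.open("a", encoding="utf-8") as handle:
        handle.write(line)
    return line


if __name__ == "__main__":
    # Demo against a throwaway file rather than a real CLAUDE.md.
    with tempfile.TemporaryDirectory() as tmp:
        shared = Path(tmp) / "CLAUDE.md"
        record_decision(shared, "backend", "API routes return { data, error } shape")
        record_decision(shared, "frontend", "Named exports only")
        print(shared.read_text(encoding="utf-8"))
```

Attributing each entry to an agent makes it easier to trace where a decision came from when several sessions write to the same file.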

Beam's Install/Save Memory Workflow

Managing memory files manually works, but it adds friction. Every session start requires remembering to check the memory file. Every session end requires remembering to update it. Human memory is unreliable -- which is ironic when the goal is to manage AI memory.

Beam automates this with two toolbar buttons. Install Project Memory reads your memory file and installs it as the project's CLAUDE.md, ensuring every new session starts with full context. Save Project Memory captures the current session state and writes it back to the memory file.

The workflow becomes automatic:

  1. Open Beam, click Install Project Memory -- session starts with full project context
  2. Work with Claude Code for an hour or two, compacting as needed
  3. When you finish, click Save Project Memory -- session state is preserved
  4. Tomorrow, repeat from step 1 -- zero context loss between days

This eliminates the most common context management failure: forgetting to save state before closing a session. When saving is a single click rather than a manual file edit, it actually happens consistently.

Context window management is not glamorous. It is not the exciting part of agentic coding. But it is the part that determines whether your agent sessions are productive or frustrating, whether your costs are predictable or surprising, and whether your multi-day projects maintain coherence or restart from scratch every morning. Master /compact, maintain your memory files, and monitor your costs. Everything else in agentic coding gets easier when context is under control.
