Claude guide

Claude keeps forgetting earlier context — why it happens and how to fix it

Even with a 200,000 token context window, Claude can start behaving as though it doesn't remember things you established earlier. Here's what's actually happening and how to fix it.

Start free on Chrome

Two distinct phenomena that feel like forgetting

  • The first is a hard context limit: the conversation has exceeded 200,000 tokens and Claude tells you. This is rare in normal use.
  • The second is attention drift: more common and more subtle. Claude, like all large language models, pays differential attention across a long context. In very long conversations, content near the beginning receives less attention than content near the end — so instructions established early can have less influence on later responses.

Signs of attention drift

  • Claude ignores a writing style or format you established early on.
  • It stops following a constraint you specified in your first message.
  • Responses feel increasingly generic as the conversation progresses.
  • It produces output that contradicts an earlier decision you both agreed on.

How to fix it

  • Re-anchor key instructions mid-session — re-state critical constraints periodically rather than relying on a single mention at the start.
  • Summarise and reset: ask for a mid-session summary and start a new conversation with it as the opening.
  • Use Claude Projects for persistent instructions — these load fresh for every conversation within the Project.
  • Put critical instructions at the end of a long conversation, not just the start — models attend more strongly to recent content.

Related guides

Frequently asked questions

Is context drift worse in Claude than in ChatGPT?

Claude generally handles long contexts better than most models. But the attention drift phenomenon affects all transformer models. No current model maintains perfectly uniform attention across 200K tokens.

Does Claude Projects fix context drift?

Partially. Project instructions load fresh at the start of each conversation, so persistent instructions aren't buried under history. For truly long single conversations, re-anchoring is still needed.