Grok context window explained

Grok's context window is the amount of conversation xAI's model can hold in active memory at once. Grok 4 supports up to 256K tokens through the API, while the consumer Grok app effectively works with around 128K. Once you approach those limits, earlier parts of the thread get dropped or under-weighted, and Grok starts forgetting what you told it.

Grok's token limits in plain numbers

A token is roughly three quarters of an English word, so 256K tokens is about 192,000 words and 128K is about 96,000 words. The whole conversation counts against the window: your prompts, Grok's replies, and any pasted text or files. Via the xAI API, Grok 4 accepts up to 256K tokens. Inside the consumer app on grok.com and X, the practical working window is closer to 128K tokens, which is still very large but fills faster than people expect on long research or coding threads.
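The arithmetic above can be sketched in a few lines of Python. This is only a rough estimate built on the three-quarters-of-a-word rule of thumb, not Grok's real tokenizer; the function names are made up for illustration, and K is taken as 1,000.

```python
# Rule of thumb from this guide: one token is about 0.75 English words.
# This is an approximation, not the tokenizer Grok actually uses.
WORDS_PER_TOKEN = 0.75

# Window sizes quoted in this guide (K assumed to mean 1,000).
API_WINDOW_TOKENS = 256_000
APP_WINDOW_TOKENS = 128_000


def estimate_tokens(text: str) -> int:
    """Estimate tokens from a whitespace word count."""
    word_count = len(text.split())
    return round(word_count / WORDS_PER_TOKEN)


def words_that_fit(window_tokens: int) -> int:
    """Convert a token budget into an approximate word budget."""
    return round(window_tokens * WORDS_PER_TOKEN)


def window_usage(messages: list[str], window_tokens: int) -> float:
    """Estimated fraction of the window used by every message in the thread."""
    total = sum(estimate_tokens(message) for message in messages)
    return total / window_tokens
```

Passing every prompt, reply, and pasted file to window_usage gives a ballpark for how close a thread is to the limit. Real counts will differ, especially for code and non-English text.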

What happens when Grok runs out of room

One of two things happens. Sometimes you hit a hard ceiling and Grok refuses more input. Far more common is silent drift: the oldest messages quietly lose influence, so Grok starts ignoring instructions or decisions you set early in the thread. You are not warned. The replies just stop matching what you established.
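To see why early instructions are the first to go, here is a minimal sketch of a generic "keep the newest messages that fit" trimming strategy. It is an assumption for illustration only: how Grok actually handles overflow internally is not described here, and the token estimate is the same rough word-count heuristic, not a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 0.75 English words per token (approximation)."""
    return round(len(text.split()) / 0.75)


def fit_to_window(messages: list[str], window_tokens: int) -> list[str]:
    """Keep the newest messages whose estimated tokens fit in the window.

    Hypothetical strategy for illustration, not Grok's documented behaviour.
    """
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so recent messages are kept first.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > window_tokens:
            break  # everything older than this point is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Under a scheme like this, the instruction you gave in the very first message is the first thing to fall out, and nothing in the reply signals that it happened.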

How to keep context when Grok forgets

You cannot expand the window, but you can choose what enters it. A clean handover summary keeps the important parts small, strips out repeated back-and-forth, and gives a fresh Grok chat a focused starting point instead of a bloated transcript. thredly compresses long ChatGPT, Claude, and DeepSeek threads into structured handovers you can paste into Grok (or any other model) to continue without losing the plot.
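If you want to build a handover by hand, one possible shape is sketched below. The Handover class and its field names are hypothetical, chosen to mirror the instructions, decisions, and current state mentioned in this guide; this is not thredly's actual format.

```python
from dataclasses import dataclass, field


@dataclass
class Handover:
    """Hypothetical container for the parts of a thread worth carrying over."""
    goal: str
    instructions: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)
    current_state: str = ""

    def to_prompt(self) -> str:
        """Render a compact starting prompt for a fresh chat."""
        lines = [f"Goal: {self.goal}", "", "Standing instructions:"]
        lines += [f"- {item}" for item in self.instructions]
        lines += ["", "Decisions so far:"]
        lines += [f"- {item}" for item in self.decisions]
        lines += ["", f"Current state: {self.current_state}"]
        return "\n".join(lines)


handover = Handover(
    goal="Ship v2 of the billing service",
    instructions=["Reply in British English"],
    decisions=["Use Postgres for storage"],
    current_state="Schema drafted, migrations not yet written",
)
print(handover.to_prompt())
```

Keeping the handover to a few short lists is the point: it carries the decisions forward without spending the new window on the back-and-forth that produced them.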

Frequently asked questions

How big is Grok's context window?

Grok 4 supports up to 256K tokens via the xAI API. Inside the Grok consumer app, the effective working window is around 128K tokens.

Why does Grok forget what I said earlier in a long chat?

Once a conversation approaches the context window limit, the oldest messages get pushed out or receive less of the model's attention. Grok keeps replying, but stops giving weight to the early instructions and decisions you set up.

How do I continue a long Grok conversation without losing context?

Summarise the thread into a focused handover that captures the instructions, decisions, and current state, then paste it into a fresh Grok chat. thredly automates this for ChatGPT, Claude, and DeepSeek threads, and the resulting summary works equally well as a starting prompt in Grok.