You've been coding with AI for two hours. The first 20 messages were productive — you established an architecture, made decisions about your tech stack, and started building. But somewhere around message 40, something shifts. The AI starts suggesting approaches that contradict its earlier recommendations. It "forgets" that you chose React over Vue. It rewrites a function it already wrote, but differently. You spend 30 minutes debugging an issue the AI introduced by ignoring its own prior work.
This is context drift — and it's one of the most expensive, least understood problems in AI-assisted development.
What's actually happening inside the model
AI models don't have memory in the way humans do. They process the entire conversation from scratch on every single message. What feels like the model "remembering" your earlier discussion is actually the model re-reading the entire conversation history that gets sent along with each new message.
This creates two problems as conversations get longer. First, the conversation history starts eating into the model's context window — the maximum number of tokens it can process at once. Claude's context window is 200K tokens; GPT-4o's is 128K. When a conversation exceeds this limit, the platform silently drops the oldest messages to make room. Your carefully established architectural decisions from message #3 may literally no longer exist in the model's input by message #60.
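To make the mechanics concrete, here's a minimal sketch of a drop-oldest truncation strategy. The `Message` type, the `fitToWindow` helper, and the 4-characters-per-token estimate are all illustrative assumptions; real platforms use actual tokenizers and may summarize rather than drop.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Rough token estimate: ~4 characters per token for English text.
// Real systems would use the model's actual tokenizer here.
function estimateTokens(msg: Message): number {
  return Math.ceil(msg.content.length / 4);
}

// Walk backwards from the newest message, keeping whatever fits the
// budget. Older messages are silently dropped, including the early
// architectural decisions you most need the model to remember.
function fitToWindow(history: Message[], maxTokens: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i]);
    if (used + cost > maxTokens) break;
    kept.unshift(history[i]);
    used += cost;
  }
  return kept;
}
```

Run this against a history whose first message holds your stack decisions and a tight budget, and that first message is exactly what disappears.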
Second, even when messages aren't being dropped, research shows that models struggle to use long contexts evenly: they pay disproportionate attention to the most recent messages and to the very beginning of the input, while "losing focus" on the middle (the "lost in the middle" effect). Your critical decisions from messages #10–20 fall into this attention dead zone.
Why this kills developer productivity
Context drift doesn't announce itself. There's no error message, no warning. The AI still responds confidently and fluently — it just starts being subtly wrong in ways that compound. You might spend an hour building on top of a suggestion that contradicts your original architecture before realizing something is off. Then you spend another hour untangling the mess.
The cruel irony: Context drift hits hardest during "vibe coding" sessions — exactly the long, immersive sessions where developers feel most productive. The longer you ride the flow state, the more likely you are to be building on drifted context without realizing it.
Compounding the problem, each message in a drifted conversation costs more than it should. By message #50, you're paying for 50 messages of accumulated overhead on every single request — but the quality of the responses has degraded significantly. You're paying premium prices for increasingly unreliable output.
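Back-of-the-envelope arithmetic shows why: because each request re-sends the full history, cost per message grows linearly and total input tokens grow quadratically with conversation length. The numbers below are made-up round figures for illustration, not any provider's real rates.

```typescript
// Total input tokens paid over a whole conversation, assuming every
// request re-sends all prior messages and each message is roughly
// the same size (an illustrative simplification).
function totalInputTokens(messages: number, tokensPerMessage: number): number {
  let total = 0;
  for (let sent = 1; sent <= messages; sent++) {
    // Request #sent carries all `sent` messages of history.
    total += sent * tokensPerMessage;
  }
  return total; // = tokensPerMessage * messages * (messages + 1) / 2
}
```

At 500 tokens per message, a 50-message session pays for 637,500 input tokens in total, more than 20 times what a 10-message session would cost, while response quality is heading in the opposite direction.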
How to detect context drift early
Watch for naming inconsistencies. If the AI starts calling a function handleSubmit that it previously named onSubmitForm, that's an early drift signal. The model has lost track of its own naming choices.
Notice when the AI re-introduces something you already decided against. If you explicitly chose Tailwind over CSS modules in message #5, and the AI starts generating CSS module syntax in message #30, your context has drifted.
Check for architectural contradictions. If the model suggests restructuring something it helped you build 20 messages ago — without acknowledging the change — it's no longer tracking the full picture of your project.
Track your message count. As a rule of thumb, drift risk increases significantly after 30–40 messages on complex coding tasks. If you're past 50, assume some degree of drift is occurring and verify critical outputs more carefully.
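Some of these signals can even be checked mechanically. Here's a hypothetical heuristic, where the `driftSignals` helper and the 40-message threshold are illustrative guesses rather than measured values, that flags long conversations and the reappearance of options you explicitly rejected:

```typescript
// Combine two of the drift signals above: conversation length and
// resurfacing of choices you already decided against.
function driftSignals(opts: {
  messageCount: number;
  latestResponse: string;
  rejectedTerms: string[]; // things you explicitly chose against, e.g. "CSS modules"
}): string[] {
  const warnings: string[] = [];
  if (opts.messageCount > 40) {
    warnings.push("conversation length: high drift risk past ~40 messages");
  }
  for (const term of opts.rejectedTerms) {
    if (opts.latestResponse.toLowerCase().includes(term.toLowerCase())) {
      warnings.push(`rejected choice resurfaced: "${term}"`);
    }
  }
  return warnings;
}
```

A plain substring check is crude, but even this catches the "CSS modules in message #30" scenario described above.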
What to do about it
Summarize and restart regularly. The single most effective technique. Every 20–30 messages, ask the AI to summarize the key decisions, architecture, and current state. Copy that summary, start a fresh conversation, and paste it as the opening context. You'll get sharper responses at lower cost.
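The handoff itself can be as simple as a fixed summary prompt plus a template for the new session's opening message. This sketch assumes nothing beyond string formatting; the wording is a starting point, not a canonical template.

```typescript
// Ask the model to compress the session into a transferable summary.
const SUMMARY_PROMPT =
  "Summarize this conversation for a fresh session: list the tech stack " +
  "decisions, naming conventions, architecture choices, current progress, " +
  "and open questions. Be concise and omit conversational filler.";

// Open the new conversation with the summary as stable context,
// followed by the next concrete task.
function buildFreshOpening(summary: string, nextTask: string): string {
  return `Project context (from a previous session):\n${summary}\n\nNext task: ${nextTask}`;
}
```

Pasting the summary first means your decisions sit at the start of the new context, one of the two positions long-context models attend to best.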
Keep a "project context" document. Maintain a running document with your tech stack decisions, naming conventions, architecture choices, and current progress. Paste the relevant sections into new conversations. This acts as a stable anchor that doesn't degrade over time.
Use shorter, focused conversations. Instead of one marathon session for an entire feature, break it into focused conversations: one for architecture planning, one for the data layer, one for the UI components. Each stays within the high-alignment zone.
Monitor your overhead ratio. When your system context overhead exceeds 90–95% of input tokens, most of what you're paying for is re-sending history rather than getting useful work done. That's a strong signal to start fresh.
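As a sketch, the ratio check behind this tip looks like the following; `overheadRatio` and `shouldRestart` are invented helper names, and the 0.9 threshold mirrors the rule of thumb above.

```typescript
// Fraction of each request's input tokens spent re-sending history
// rather than on your new prompt.
function overheadRatio(historyTokens: number, newPromptTokens: number): number {
  return historyTokens / (historyTokens + newPromptTokens);
}

// Past ~90% overhead, most of your spend is repetition: a strong
// signal to summarize and start a fresh conversation.
function shouldRestart(historyTokens: number, newPromptTokens: number): boolean {
  return overheadRatio(historyTokens, newPromptTokens) > 0.9;
}
```

With 95K tokens of history and a 5K-token prompt, the ratio is 0.95 and the check fires; early in a session it stays comfortably below the line.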
Coming soon from Kontinuity: Context drift detection is our next major feature. It will automatically track alignment between your current responses and your original project goals, alerting you when drift risk is high — before you waste hours building on contradicted context.
Know when your context is drifting
Track your conversation length, overhead ratios, and token costs. Drift detection coming soon in Phase 2.
Get Kontinuity — Free →