If you've used ChatGPT or Claude for more than a few minutes, you've probably encountered the word "token" somewhere — in pricing pages, error messages, or vague warnings about hitting a limit. But what actually is a token, and why should you care?
The short answer: a token is the unit AI models use to read and write text. Think of it as a word fragment. AI models don't see your message as words or characters — they break everything into tokens first, and every token costs money.
Tokens are not words
This is the first thing most people get wrong. A token is roughly ¾ of a word, or about 4 characters of English text. Common words like "the", "hello", or "code" are usually one token. Longer or unusual words get split into multiple tokens. Numbers, punctuation, and whitespace all consume tokens too.
"Kontinuity" → ["Kont", "inu", "ity"] → 3 tokens
"GPT-4o" → ["G", "PT", "-", "4", "o"] → 5 tokens
This matters because pricing is per-token, not per-word. A prompt full of technical jargon, variable names, or non-English text will use more tokens than conversational English — and cost proportionally more.
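Exact counts require the model's own tokenizer (OpenAI publishes theirs as the tiktoken library; Anthropic's differs), but the 4-characters-per-token rule is good enough for budgeting. A minimal sketch of that heuristic, with an illustrative comparison (the example strings are made up):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule.

    Real counts depend on the model's tokenizer; this heuristic is
    only for ballpark budgeting, and it undercounts for code,
    jargon, and non-English text.
    """
    return max(1, round(len(text) / 4))

# Similar word counts, very different character (and token) counts:
plain = "Please summarize the main points of this article."
jargon = "Refactor the AbstractWidgetFactoryProvider per RFC-9110 semantics."

print(estimate_tokens(plain))
print(estimate_tokens(jargon))
```

The jargon-heavy prompt estimates higher even before a real tokenizer splits its unusual identifiers into many sub-word pieces.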
Input tokens vs output tokens
Every interaction with an AI model has two sides: what you send in (input tokens) and what the model sends back (output tokens). These are priced differently — output tokens typically cost 3–5x more than input tokens because generating text is more computationally expensive than reading it.
Key insight: When you send a message on Claude or ChatGPT, your input tokens include far more than just what you typed. The full payload includes system prompts, conversation history, tool definitions, and other context the platform injects automatically. This "overhead" is often 5–10x your visible message.
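One reason overhead compounds: each new turn typically resends the entire conversation as input. A sketch of how billed input tokens grow across a chat; the 4-chars-per-token estimate, the overhead figure, and the message sizes are all illustrative assumptions, not platform internals:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token.
    return max(1, round(len(text) / 4))

SYSTEM_OVERHEAD = 2_000  # hypothetical: system prompt + tool definitions

def input_tokens_for_turn(history: list[str], new_message: str) -> int:
    """Input tokens billed for one turn: overhead + full history + new message."""
    return (SYSTEM_OVERHEAD
            + sum(estimate_tokens(m) for m in history)
            + estimate_tokens(new_message))

history: list[str] = []
total = 0
for turn in range(5):
    msg = "Explain tokens to me in more detail, please." * 3
    total += input_tokens_for_turn(history, msg)
    history.append(msg)
    history.append("(model reply of similar length)" * 4)

print(total)  # far more than 5x one message's tokens
```

Because the history is resent every turn, total input tokens grow roughly quadratically with conversation length, which is why long chats get expensive fast.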
Why token counts matter for developers
If you're building with AI — or even just using it heavily for coding — tokens affect you in three ways.
Cost. Every token costs money. Claude Opus 4 charges $15 per million input tokens and $75 per million output tokens. A typical coding session of 30 messages might use 50,000–200,000 tokens. That's real money over the course of a month, especially on premium models.
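At those rates, a back-of-the-envelope cost check is simple arithmetic. The prices below are the Opus figures quoted above; the 150K/50K input/output split is an assumed example, not measured data:

```python
INPUT_PRICE = 15.00 / 1_000_000   # $ per input token (Claude Opus 4)
OUTPUT_PRICE = 75.00 / 1_000_000  # $ per output token

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a session given its input and output token counts."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# A heavy coding session: assume 150K input and 50K output tokens.
print(f"${session_cost(150_000, 50_000):.2f}")  # $6.00
```

Note how the smaller output count contributes more than half the bill, since output tokens cost 5x the input rate here.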
Context window limits. Every model has a maximum number of tokens it can process at once, called its "context window." Claude's is 200K tokens; GPT-4o's is 128K. Once a conversation exceeds this, something has to give: web interfaces typically drop or compress the oldest messages to make room. This is why AI starts "forgetting" your instructions in long conversations.
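The simplest overflow strategy, and the one that explains the "forgetting" behavior, is to drop the oldest messages until the conversation fits. A minimal sketch; the 4-chars-per-token estimate and the trimming policy are illustrative assumptions, not any platform's actual implementation:

```python
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))  # rough ~4 chars/token heuristic

def trim_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Drop oldest messages until the total fits in the context window."""
    kept = list(messages)
    while kept and sum(estimate_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)  # the earliest instructions are the first to go
    return kept

convo = ["You are a careful reviewer.", "msg " * 100, "latest question?"]
print(trim_to_window(convo, max_tokens=60))  # → ['latest question?']
```

Notice what got dropped first: the opening instruction. That is exactly the failure mode users experience as the model "forgetting" its system-level guidance.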
Response quality. Research on long contexts suggests that as the window fills up, models weight recent messages more heavily and neglect earlier instructions, especially material buried in the middle of the context. Understanding token counts helps you know when to start a fresh conversation before quality degrades.
How to see your actual token usage
If you're using the API directly, you get exact token counts in every response. But if you're using the web interfaces at claude.ai or chatgpt.com — like most developers — you get zero visibility. You're flying blind.
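For the API case, every response includes a usage object you can log. The field names below follow the shape of Anthropic's Messages API response (OpenAI's equivalent uses "prompt_tokens" and "completion_tokens"); the token counts are made-up example values:

```python
# Example shape of an Anthropic Messages API response (counts are invented):
response = {
    "content": [{"type": "text", "text": "..."}],
    "usage": {"input_tokens": 2_311, "output_tokens": 418},
}

usage = response["usage"]
# Price out the call at the Opus rates quoted earlier ($15/$75 per million).
cost = usage["input_tokens"] * 15 / 1e6 + usage["output_tokens"] * 75 / 1e6
print(f"in={usage['input_tokens']} out={usage['output_tokens']} cost=${cost:.4f}")
```

Logging this per call is all it takes to build cost visibility on the API side; the web interfaces expose nothing comparable.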
That's the problem Kontinuity solves. It tracks your tokens in real time as you use Claude and ChatGPT through their web interfaces, showing you exactly what each message costs, how much overhead the platform is adding, and where your spending is going across projects.
Rules of thumb
To develop an intuition for tokens: 1 token ≈ 4 characters ≈ ¾ of a word. A typical paragraph is about 100 tokens. A full page of text is roughly 500 tokens. A long coding prompt with context might run 1,000–3,000 tokens. And remember that output tokens are always more expensive than input tokens, so concise prompts that elicit focused responses save money on both sides.
Understanding tokens is the foundation of using AI cost-effectively. Once you see the numbers, you'll never look at a prompt the same way again.