Token Reuse
The compounding effect where each new AI response requires reprocessing all previous conversation tokens, increasing latency and cost with every turn.
Token reuse is why long chats slow down even when your latest question is short. The model re-reads the entire conversation history on every turn, so a follow-up question in a 50-turn session costs far more than the same question would in a fresh session: the model is hauling the full backlog just to answer one line. Tool-heavy and code-heavy sessions are the worst offenders, because tool outputs and code consume tokens much faster than ordinary prose.
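The compounding is easy to see with a little arithmetic. A minimal sketch, assuming a fixed (hypothetical) number of new tokens per turn; real turns vary widely, especially with tool output:

```python
# Sketch of how reprocessed tokens compound across a conversation.
# tokens_per_turn is a hypothetical average, not a real model constant.

def cumulative_input_tokens(turns: int, tokens_per_turn: int = 500) -> int:
    """Total tokens the model reads across all turns, since every
    turn re-reads the full history plus the newest message."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn   # new user + assistant tokens this turn
        total += history             # the model reprocesses everything so far
    return total

# One 50-turn session vs. ten fresh 5-turn sessions covering the
# same amount of conversation:
print(cumulative_input_tokens(50))      # 637500 tokens read in total
print(10 * cumulative_input_tokens(5))  # 75000 tokens across fresh sessions
```

Because every turn re-reads the whole history, total tokens read grow quadratically with conversation length, which is why splitting work into shorter sessions cuts cost so sharply.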
Learn more in our full guide: Read the article
Related Terms
AI Token
The basic unit of text that AI language models process. Roughly 0.75 words per token in English, though the ratio varies by language and content type.
Context Window
The total amount of text, code, and conversation history an AI model can hold in active memory during a single session. Measured in tokens, not words.
Context Threshold
The percentage of context window usage at which AI output quality begins to noticeably degrade, typically around 50-70% depending on the model and task complexity.