
Token Reuse

The compounding effect where each new AI response requires reprocessing all previous conversation tokens, increasing latency and cost with every turn.

Token reuse is why long chats slow down even when your latest question is short. On every turn, the model processes the entire conversation history before generating a single new token. A follow-up question in a 50-turn session therefore costs far more than the same question in a fresh session, because each request resends and re-bills the full backlog; across a session, total input-token cost grows roughly quadratically with turn count. Tool-heavy and code-heavy sessions, which accumulate tokens fastest, are the worst offenders.
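The compounding cost described above can be sketched with a few lines of arithmetic. This is an illustrative model only: the per-turn token counts are made up, and it assumes every request resends the full history with no prompt caching.

```python
# Sketch: cumulative input-token cost across a multi-turn chat,
# assuming each request resends the full history (no prompt caching).
# Token counts here are illustrative, not from any real API.

def session_input_tokens(turn_tokens):
    """Total input tokens billed across a session where every
    request includes all previous turns plus the new one."""
    total = 0
    history = 0
    for tokens in turn_tokens:
        history += tokens   # the new turn joins the history
        total += history    # the whole history is sent as input
    return total

# Ten identical 200-token turns: the tenth request alone sends
# 2,000 input tokens, and the session bills 11,000 in total --
# quadratic growth in turn count, not linear.
turns = [200] * 10
print(session_input_tokens(turns))  # 11000
```

Doubling the number of turns roughly quadruples the session's total input cost, which is why splitting long sessions into fresh ones with a short summary is often cheaper than continuing.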

