How Conversations Work
Claude doesn't remember previous conversations, and the API keeps no state between calls. On every API call, you send the entire conversation history. Claude reads it all and responds to the latest message.
```python
import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

# Multi-turn conversation
messages = [
    {"role": "user", "content": "My name is Alex."},
    {"role": "assistant", "content": "Nice to meet you, Alex!"},
    {"role": "user", "content": "What's my name?"},
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages,
)
# Claude will say "Your name is Alex" - because it's in the history
print(response.content[0].text)
```
The Cost Problem
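Every message sends the full history, so the input cost of each call grows with the conversation, and the total cost of a conversation grows faster than its length. The figures below correspond to $3 per million input tokens: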
| Message # | History Size | Input Cost (Sonnet) |
|---|---|---|
| 1 | 100 tokens | $0.0003 |
| 5 | 2,000 tokens | $0.006 |
| 20 | 15,000 tokens | $0.045 |
| 50 | 40,000 tokens | $0.12 |
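The arithmetic behind the table can be sketched in a few lines. This is a minimal illustration, not an official pricing calculator: the rate of $3 per million input tokens is inferred from the table's own figures, and `input_cost`, `conversation_input_cost`, and the 800-tokens-per-call growth rate are hypothetical names and numbers chosen for the example.

```python
# Assumption: $3 per million input tokens, the rate implied by the table above.
INPUT_PRICE_PER_MTOK = 3.00


def input_cost(history_tokens: int, price_per_mtok: float = INPUT_PRICE_PER_MTOK) -> float:
    """Dollar cost of the input tokens for a single call."""
    return history_tokens / 1_000_000 * price_per_mtok


def conversation_input_cost(calls: int, tokens_added_per_call: int) -> float:
    """Total input cost when every call resends a history that grows by a
    fixed (hypothetical) number of tokens per call."""
    total_tokens = sum(tokens_added_per_call * n for n in range(1, calls + 1))
    return input_cost(total_tokens)


# Per-call cost for the history sizes listed in the table.
for message_number, tokens in [(1, 100), (5, 2_000), (20, 15_000), (50, 40_000)]:
    print(f"message {message_number:>2}: {tokens:>6,} tokens -> ${input_cost(tokens):.4f}")

# Hypothetical conversation: 50 calls, history growing by 800 tokens each call.
print(f"total for 50 calls: ${conversation_input_cost(50, 800):.2f}")
```

Under these assumed numbers, the 50 calls send about 1.02 million input tokens in total, even though the final history is only 40,000 tokens. The resending, rather than any single call, is what drives the bill.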
Solution: Sliding window + summary
Keep the last 10 messages verbatim. Summarize everything older into a single "context" message at the start. This caps your history size while preserving important context.
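One way the sliding window plus summary could look in code is sketched below. It assumes message contents are plain strings, and the names `compact_history`, `summarize`, and `naive_summarize` are hypothetical. In practice `summarize` would probably be a separate model call; here it is a trivial stand-in so the example runs offline.

```python
from typing import Callable

Message = dict[str, str]


def compact_history(
    messages: list[Message],
    summarize: Callable[[list[Message]], str],
    keep_last: int = 10,
) -> list[Message]:
    """Return a shortened history: one summary message plus the recent turns."""
    if len(messages) <= keep_last:
        return list(messages)

    split = len(messages) - keep_last
    # Keep one extra message if needed so the verbatim window starts with an
    # assistant turn. The summary (a user turn) can then precede it without
    # putting two user turns in a row.
    while split > 0 and messages[split]["role"] != "assistant":
        split -= 1
    if split == 0:
        return list(messages)

    older, recent = messages[:split], messages[split:]
    context = {
        "role": "user",
        "content": "Summary of the earlier conversation:\n" + summarize(older),
    }
    return [context] + recent


def naive_summarize(older: list[Message]) -> str:
    """Stand-in summarizer (assumption): keeps the first 80 characters of each turn."""
    return "\n".join(f"{m['role']}: {m['content'][:80]}" for m in older)


# Example: 25 alternating turns shrink to 1 summary message + 10 verbatim turns.
history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"turn {i}"}
    for i in range(25)
]
compacted = compact_history(history, naive_summarize)
print(len(history), "->", len(compacted))
```

The compacted list can be passed as `messages` in place of the full history. The cap only holds if the summary itself stays short, so a real summarizer should be given a length limit.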
Key Takeaways
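- Claude keeps no memory between API calls, so you resend the full message history on every call
- Input cost per call grows with the history: about $0.0003 at 100 tokens versus $0.12 at 40,000 tokens
- A sliding window of the last 10 messages plus a summary of everything older caps history size while preserving important context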