Why Streaming Matters
Without streaming, users wait for the entire response before seeing anything. With streaming, text appears incrementally as it is generated, so Claude feels faster even though the total generation time is about the same.
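To make the latency claim concrete, here is a small simulation that makes no API calls. `fake_model` and `measure` are hypothetical helpers invented for this sketch, and the 10 ms per-chunk delay is an arbitrary assumption; only the relative timings matter.

```python
import time
from typing import Iterator


def fake_model(chunks: list[str], delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields one chunk every `delay` seconds."""
    for chunk in chunks:
        time.sleep(delay)  # pretend each chunk takes this long to generate
        yield chunk


def measure(chunks: list[str], stream: bool) -> tuple[float, float]:
    """Return (seconds until the first text is visible, total seconds)."""
    start = time.perf_counter()
    first_visible = None
    if stream:
        for _ in fake_model(chunks):
            if first_visible is None:
                # With streaming, the first chunk can be shown as soon as it arrives.
                first_visible = time.perf_counter() - start
    else:
        # Without streaming, nothing is shown until every chunk has been produced.
        "".join(fake_model(chunks))
        first_visible = time.perf_counter() - start
    total = time.perf_counter() - start
    return first_visible, total


words = ["Streaming ", "shows ", "text ", "as ", "it ", "arrives."]
streamed_first, streamed_total = measure(words, stream=True)
blocking_first, blocking_total = measure(words, stream=False)
print(f"streaming: first text after {streamed_first:.3f}s, done after {streamed_total:.3f}s")
print(f"blocking:  first text after {blocking_first:.3f}s, done after {blocking_total:.3f}s")
```

Both runs should finish in roughly the same total time, but the streaming run shows its first text after a single chunk delay instead of after all of them.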

import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

# Streaming response
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain quantum computing"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

When to Stream vs Not
| Stream | Don't Stream |
|---|---|
| Chatbots / real-time UI | Background processing |
| Long responses | Short, structured responses |
| User-facing apps | Batch jobs / pipelines |
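The choice in the table can be wrapped in a single helper, sketched below. `ask` is a hypothetical function, not part of the SDK. It takes an already-constructed client such as `anthropic.Anthropic()` and assumes that, on the non-streaming path, the first content block of the reply is text.

```python
def ask(client, prompt: str, stream: bool, model: str = "claude-sonnet-4-20250514") -> str:
    """Hypothetical helper: return the full reply, printing it live when `stream` is true."""
    request = dict(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    if stream:
        # User-facing path: show text as it arrives and collect it for the caller.
        parts = []
        with client.messages.stream(**request) as response_stream:
            for text in response_stream.text_stream:
                print(text, end="", flush=True)
                parts.append(text)
        print()
        return "".join(parts)
    # Background or batch path: wait for the complete message.
    message = client.messages.create(**request)
    return message.content[0].text  # assumption: the first block is a text block


# Example usage (needs the anthropic package and an API key):
# import anthropic
# client = anthropic.Anthropic()
# reply = ask(client, "Explain quantum computing", stream=True)
```

Returning the full text in both modes lets a chatbot and a batch pipeline share the same calling code and differ only in the `stream` flag.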
Key Takeaways
- Streaming shows output in real time as it is generated, which improves perceived speed without changing total response time
- Apply these concepts in your own projects before moving on
- Refer back to this lesson when you encounter related challenges