Building a simple AI-powered app
Advanced AI Workflows & the Claude API
The first four chapters covered the Claude API in pieces — authentication, system prompts, tool use, streaming. This chapter puts them together. You'll build a complete AI-powered chat application: a FastAPI backend that manages conversation history and calls Claude, a streaming endpoint, and a minimal frontend that reads the stream. The goal is a working mental model of how all the parts connect, not a production-ready UI framework.
What We're Building
The app has two routes: GET / serves the HTML page, and POST /chat accepts a user message, appends it to the conversation history, calls Claude with streaming, and returns the response as a stream. The browser appends tokens as they arrive.
Project Structure
main.py             FastAPI app: routes, session store, Claude calls
.env                ANTHROPIC_API_KEY=sk-ant-...
static/
    index.html      frontend: chat UI + streaming fetch
requirements.txt    fastapi, uvicorn, anthropic, python-dotenv

requirements.txt contains:
fastapi
uvicorn[standard]
anthropic
python-dotenv
Step 1 — Conversation History
Claude has no memory between API calls. Every request must include the full conversation so far. You manage this by keeping a list of {"role": "...", "content": "..."} dicts and appending each new turn. For a real app you'd store this in a database; for this example, a server-side dict keyed by session ID is enough:
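A minimal sketch of such a store follows. The helper name add_turn, the session ID, and the hard-coded replies are illustrative assumptions; in the real app the assistant text would come from Claude.

```python
# Sketch of an in-memory conversation store (illustrative names throughout).
sessions: dict[str, list[dict[str, str]]] = {}


def add_turn(session_id: str, role: str, content: str) -> list[dict[str, str]]:
    """Append one turn to a session's history and return the full history."""
    history = sessions.setdefault(session_id, [])
    history.append({"role": role, "content": content})
    return history


sid = "demo-session"  # hypothetical ID; the app would use a UUID
add_turn(sid, "user", "What is the capital of France?")
add_turn(sid, "assistant", "Paris.")  # stand-in for Claude's reply
add_turn(sid, "user", "What is its population?")
history = add_turn(sid, "assistant", "Roughly two million in the city proper.")

print(len(history))  # 4
```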
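After two exchanges the list holds four entries (user, assistant, user, assistant), and the next API call sends all of them. Because each request carries the earlier turns, Claude can use the first question when answering a later one; that is how conversation continuity works.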
Step 2 — The FastAPI Backend
import uuid

import anthropic
from dotenv import load_dotenv
from fastapi import Cookie, FastAPI
from fastapi.responses import FileResponse, StreamingResponse
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel

load_dotenv()  # load ANTHROPIC_API_KEY from .env into the environment

app = FastAPI()
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
app.mount("/static", StaticFiles(directory="static"), name="static")

# In-memory session store: session_id → list of message dicts
sessions: dict[str, list] = {}

SYSTEM_PROMPT = """You are a helpful assistant. \
Answer concisely and accurately. \
If you don't know something, say so."""


class ChatRequest(BaseModel):
    message: str


@app.get("/")
def index(session_id: str | None = Cookie(default=None)):
    # Set the cookie on the response we actually return. Cookies set on an
    # injected Response parameter are dropped when a Response is returned directly.
    response = FileResponse("static/index.html")
    if not session_id or session_id not in sessions:
        session_id = str(uuid.uuid4())
        sessions[session_id] = []
        response.set_cookie("session_id", session_id)
    return response


def stream_claude(history: list, collected: list):
    with client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        system=SYSTEM_PROMPT,
        messages=history,
    ) as stream:
        for text in stream.text_stream:
            collected.append(text)  # accumulate for history
            yield text              # stream to client


@app.post("/chat")
def chat(req: ChatRequest, session_id: str = Cookie()):
    # setdefault avoids a KeyError if the server restarted and lost the session
    history = sessions.setdefault(session_id, [])
    history.append({"role": "user", "content": req.message})
    collected = []  # will hold assistant reply chunks

    def after_stream():
        # Save assistant reply to history once stream ends
        history.append({"role": "assistant", "content": "".join(collected)})

    def generate():
        yield from stream_claude(history, collected)
        after_stream()

    return StreamingResponse(generate(), media_type="text/plain")
Step 3 — The Frontend
<!DOCTYPE html>
<html lang="en"><head>
  <meta charset="UTF-8">
  <title>AI Chat</title>
  <style>
    body { font-family: system-ui; background: #0d1117; color: #c9d1d9;
           display:flex; flex-direction:column; height:100vh; margin:0; padding:1rem; }
    #log { flex:1; overflow-y:auto; padding:1rem; background:#161b22;
           border-radius:8px; margin-bottom:1rem; white-space:pre-wrap; }
    .user-msg { color:#7ee787; margin:0.5rem 0; }
    .bot-msg { color:#c9d1d9; margin:0.5rem 0; }
    #form { display:flex; gap:0.5rem; }
    input { flex:1; padding:0.6rem; background:#161b22; border:1px solid #30363d;
            border-radius:6px; color:#c9d1d9; }
    button { padding:0.6rem 1.2rem; background:#a78bfa; border:none;
             border-radius:6px; color:#0d1117; font-weight:700; cursor:pointer; }
  </style></head><body>
  <div id="log"></div>
  <form id="form">
    <input id="msg" placeholder="Type a message…" autocomplete="off">
    <button>Send</button>
  </form>
  <script>
    const log = document.getElementById('log');

    // Create an empty message element with the given class and add it to the log
    function addMsg(cls) {
      const el = document.createElement('div');
      el.className = cls;
      log.appendChild(el);
      return el;
    }

    document.getElementById('form').addEventListener('submit', async (e) => {
      e.preventDefault();
      const input = document.getElementById('msg');
      const text = input.value.trim();
      if (!text) return;
      addMsg('user-msg').textContent = 'You: ' + text;
      input.value = '';
      const botEl = addMsg('bot-msg');
      botEl.textContent = 'Claude: ';
      const res = await fetch('/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message: text })
      });
      const reader = res.body.getReader();
      const dec = new TextDecoder();
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // stream: true keeps multi-byte characters intact when split across chunks
        botEl.textContent += dec.decode(value, { stream: true });
        log.scrollTop = log.scrollHeight;
      }
    });
  </script></body></html>
Running the App
pip install -r requirements.txt
# Run the server (--reload for auto-restart on file changes)
uvicorn main:app --reload
# Open in browser
http://localhost:8000
How the Pieces Connect
1. Browser loads the page → receives a session cookie. FastAPI's GET / creates a UUID session and sets it as a cookie. That cookie travels with every subsequent request, identifying whose history to use.
2. User sends a message → appended to history. POST /chat receives the JSON body, looks up the session's history list, and appends {"role": "user", "content": "..."}.
3. Claude is called with the full history. The messages parameter contains every turn so far, so Claude sees the full conversation every time, which is how it maintains context across turns.
4. Response streams to the browser. The generator yields text chunks as they arrive from Claude. FastAPI's StreamingResponse forwards them immediately. The browser appends each chunk to the message element.
5. After stream ends → assistant turn saved to history. The accumulated chunks are joined and appended as {"role": "assistant", "content": "..."}. The next user message will include this reply, keeping the conversation coherent.
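The last two steps hinge on one generator doing two jobs: yielding chunks to the client and saving the joined reply once the stream is exhausted. The sketch below isolates that pattern with a hypothetical fake_stream standing in for Claude's text stream, so it runs without an API key; relay_and_save is likewise an illustrative name, not part of the app.

```python
from collections.abc import Iterable, Iterator


def fake_stream() -> Iterator[str]:
    """Stand-in for the text chunks Claude would stream back (assumption)."""
    yield from ["Hel", "lo ", "there"]


def relay_and_save(chunks: Iterable[str], history: list[dict[str, str]]) -> Iterator[str]:
    """Yield each chunk to the caller, then record the full reply in history."""
    collected: list[str] = []
    for chunk in chunks:
        collected.append(chunk)
        yield chunk
    # Runs only after the consumer has read every chunk.
    history.append({"role": "assistant", "content": "".join(collected)})


history = [{"role": "user", "content": "Say hello"}]
received = list(relay_and_save(fake_stream(), history))

print(received)                 # ['Hel', 'lo ', 'there']
print(history[-1]["content"])   # Hello there
```

In this sketch, if the consumer stops reading early, the line after the loop never runs, so a partial reply is not saved. Wrapping the loop in try/finally would be one way to change that.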
What This App Has and What It's Missing
The API key lives in .env and is read server-side only, so it never reaches the browser. Never put your API key in frontend code, where it would be visible to anyone who opens DevTools.
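One way to make the server-side-only rule explicit is to fail fast at startup when the key is missing. The sketch below uses only the standard library; require_api_key is a hypothetical helper, and the placeholder value set near the bottom exists only so the example runs on its own.

```python
import os


def require_api_key(name: str = "ANTHROPIC_API_KEY") -> str:
    """Return the key from the server's environment or raise a clear error."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to .env on the server")
    return key


# Placeholder for demonstration only; a real key comes from .env, never from source code.
os.environ.setdefault("ANTHROPIC_API_KEY", "sk-ant-placeholder")
api_key = require_api_key()
print("API key loaded:", bool(api_key))
```

Calling a check like this right after load_dotenv() would surface a missing key when the server starts rather than on the first chat request.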