Advanced AI Workflows & the Claude API
Everything you've done in previous courses — using Claude Code, building projects, pair programming in VS Code — runs on top of the Claude API. This course goes one layer deeper: you'll call the API directly, build Claude-powered applications, chain requests together, and design workflows that go beyond what any chat interface can do. This first chapter covers authentication, the current model family, and the anatomy of a request and response.
Getting an API Key
The API key is the credential that identifies your application to Anthropic. Every request must include it. To get one:
- Sign in at console.anthropic.com
- Navigate to API Keys in the left sidebar
- Click Create Key, give it a name (e.g. "meal-planner-dev"), and copy it immediately — it won't be shown again
- Store it in an environment variable:
ANTHROPIC_API_KEY=sk-ant-...
If you keep the key in a .env file, add .env to your .gitignore immediately.
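A minimal sketch of reading the key at startup and failing fast when it is missing. The helper name `load_api_key` is hypothetical, not part of the SDK. Note that a .env file is not loaded automatically: either export the variable in your shell or use a loader such as python-dotenv.

```python
import os


def load_api_key(env=None):
    """Return the API key from the environment, or raise with an actionable message.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it or load it from your .env file"
        )
    return key
```

Calling this once when the application starts surfaces a missing key immediately, rather than as an authentication error on the first request.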
Installing the SDK
Anthropic provides official SDKs for Python and TypeScript. For Python:
pip install anthropic
# Or with uv (faster)
uv add anthropic
The SDK handles authentication, request serialisation, error parsing, and retry logic for you. You can also call the API directly over HTTP — the SDK is a thin convenience wrapper, not a black box. Understanding the raw request format (covered below) makes debugging SDK issues much easier.
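To make the raw format concrete, here is a sketch that builds, but does not send, a roughly equivalent HTTP request using only the standard library. The helper name `build_request` is hypothetical. The endpoint and the `x-api-key`, `anthropic-version`, and `content-type` headers are what the Messages API expects at the time of writing, so check the API reference if a request is rejected.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"


def build_request(api_key, prompt, model="claude-sonnet-4-6", max_tokens=1024):
    """Build a POST request for a single user turn without sending it."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,               # the credential from the console
            "anthropic-version": "2023-06-01",  # API version header
            "content-type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the request, and the JSON body of the reply carries the same fields the SDK exposes as attributes.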
The Current Model Family
Start development with claude-sonnet-4-6. Once the feature works correctly, test whether Haiku produces acceptable quality at lower cost. Switch to Opus only if Sonnet's output quality isn't sufficient. This order saves money during development and avoids premature optimisation.
Your First API Request
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env automatically

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain what an API key is in two sentences."}
    ],
)
print(message.content[0].text)
That's the minimum viable request — model, max_tokens, and a messages list with at least one user turn. Everything else is optional. Run it and you'll see Claude's response printed to the terminal.
Request Parameters — What Each Field Does
- model: the model ID to use, e.g. "claude-sonnet-4-6".
- messages: a list of {"role": "user" | "assistant", "content": "..."} objects. Must start with a user turn. Alternate roles for multi-turn conversations. For a fresh request, send one user message. For conversation history, send alternating user/assistant pairs followed by the new user message.
- stop_sequences: a list of strings that end generation when Claude produces one of them, e.g. ["", "###END###"].
- stream: set to True to receive the response as a stream of events rather than waiting for the full response. Essential for any UI that shows Claude's response as it's generated. Covered in depth in Chapter 4.
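The shape of a multi-turn messages list can be sketched with a small helper. `build_messages` is a hypothetical name used for illustration, and the sketch assumes earlier turns are stored as (user, assistant) text pairs.

```python
def build_messages(history, new_user_message):
    """Turn earlier (user_text, assistant_text) pairs plus a new question
    into the alternating list the messages parameter expects."""
    messages = []
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The list always ends with the new user turn that Claude should answer.
    messages.append({"role": "user", "content": new_user_message})
    return messages
```

With an empty history this produces the single user message of a fresh request. Each stored pair adds one user turn and one assistant turn ahead of the new question.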
Reading the Response Object
message.id                     unique message ID
# e.g. "msg_01XFDUDYJgAACzvnptvVoYEL"; log this for debugging
message.content[0].text        the actual response
# The text Claude generated. content is a list; always index [0] for simple requests
message.usage.input_tokens     tokens consumed
message.usage.output_tokens    tokens generated
# Input + output tokens = what you're billed for
message.stop_reason            why Claude stopped
# "end_turn"      → Claude finished naturally ✓
# "max_tokens"    → hit the limit; increase max_tokens if output was cut off
# "stop_sequence" → hit one of your stop_sequences
# "tool_use"      → Claude wants to call a tool (Chapter 3)
If stop_reason is "max_tokens", Claude's response was cut off and the output is incomplete. Either increase max_tokens or redesign the prompt to produce shorter output. A truncated response that looks complete is a subtle bug that's easy to miss in testing.
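One way to keep truncated output from slipping through is to check stop_reason before using the text. The sketch below is illustrative only: `complete_text` and `TruncatedResponseError` are hypothetical names, and a `SimpleNamespace` stand-in with the same attribute names replaces a real response so the example runs without an API call.

```python
from types import SimpleNamespace


class TruncatedResponseError(RuntimeError):
    """Raised when the response stopped because it hit max_tokens (hypothetical helper)."""


def complete_text(message):
    """Return the response text, refusing to pass along output cut off by max_tokens."""
    if message.stop_reason == "max_tokens":
        raise TruncatedResponseError(
            f"response {message.id} was truncated after "
            f"{message.usage.output_tokens} output tokens"
        )
    return message.content[0].text


# Stand-in object mirroring the response fields described above.
fake = SimpleNamespace(
    id="msg_example",
    stop_reason="end_turn",
    content=[SimpleNamespace(text="An API key identifies your application.")],
    usage=SimpleNamespace(input_tokens=18, output_tokens=9),
)
print(complete_text(fake))
```

Raising instead of returning partial text forces the caller to decide explicitly whether to retry with a larger max_tokens or to shorten the prompt.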
A Request with a System Prompt
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    # Adjacent string literals are concatenated into one system prompt
    system="You are a concise technical writer. "
           "Always respond in plain text with no markdown. "
           "Keep answers under 3 sentences.",
    messages=[
        {"role": "user", "content": "What is a webhook?"}
    ],
)
Common API Errors and What They Mean
| Status | Error type | Cause and fix |
|---|---|---|
| 401 | authentication_error | API key missing, wrong, or revoked. Check your environment variable is set and the key is valid in the console. |
| 400 | invalid_request_error | Malformed request — wrong field name, wrong type, messages list starts with assistant turn, or max_tokens exceeds model limit. Read the error message; it's usually specific. |
| 429 | rate_limit_error | You've exceeded your requests-per-minute or tokens-per-minute limit. Implement exponential backoff and retry. Chapter 10 covers rate limit strategy in depth. |
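| 413 | request_too_large | The request body exceeds the maximum allowed size in bytes. Reduce the size of your messages: chunk long documents or truncate conversation history. Input that exceeds the model's context window is usually reported as a 400 invalid_request_error instead. |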
| 500 | api_error | Anthropic server error. Retry with backoff — these are transient. If persistent, check status.anthropic.com. |
| 529 | overloaded_error | Anthropic is temporarily overloaded. Retry with backoff. More common during peak hours — design your application to handle this gracefully. |
Handling Errors in Code
import anthropic

client = anthropic.Anthropic()

try:
    message = client.messages.create(...)
except anthropic.AuthenticationError:
    # 401: retrying will not help, so fail with a clear message
    raise RuntimeError("Invalid API key: check ANTHROPIC_API_KEY")
except anthropic.RateLimitError:
    raise  # caller handles retry with backoff
except anthropic.APIStatusError as e:
    # Any other non-2xx response; log the status and re-raise
    print(f"API error {e.status_code}: {e.message}")
    raise
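The handler above leaves retries to the caller. A minimal sketch of exponential backoff for the retryable statuses in the error table (429, 500, 529) might look like the following. `call_with_backoff` and `FakeAPIError` are hypothetical names. The stand-in exception keeps the sketch runnable without the SDK, and it only assumes the error object exposes a `status_code` attribute, as `anthropic.APIStatusError` does.

```python
import random
import time

RETRYABLE_STATUS = {429, 500, 529}  # rate limit, server error, overloaded


class FakeAPIError(Exception):
    """Stand-in for an SDK error that carries an HTTP status code (hypothetical)."""

    def __init__(self, status_code):
        super().__init__(f"API error {status_code}")
        self.status_code = status_code


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            last_attempt = attempt == max_attempts - 1
            if status not in RETRYABLE_STATUS or last_attempt:
                raise  # not retryable, or out of attempts
            delay = base_delay * (2 ** attempt)
            # Up to 10% jitter so many clients do not retry in lockstep
            sleep(delay + random.uniform(0, delay * 0.1))
```

In real code, `fn` would be a function or lambda wrapping the `client.messages.create` call. The SDK also performs a small number of automatic retries on its own, so an outer loop like this mostly matters for longer outages or stricter rate limits.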