Advanced AI Workflows & the Claude API

Chapter 1  ·  The Claude API — Authentication, Models, and Your First Request

Everything you've done in previous courses — using Claude Code, building projects, pair programming in VS Code — runs on top of the Claude API. This course goes one layer deeper: you'll call the API directly, build Claude-powered applications, chain requests together, and design workflows that go beyond what any chat interface can do. This first chapter covers authentication, the current model family, and the anatomy of a request and response.

Getting an API Key

The API key is the credential that identifies your application to Anthropic. Every request must include it. To get one:

  1. Sign in at console.anthropic.com
  2. Navigate to API Keys in the left sidebar
  3. Click Create Key, give it a name (e.g. "meal-planner-dev"), and copy it immediately — it won't be shown again
  4. Store it in an environment variable: ANTHROPIC_API_KEY=sk-ant-...
Never hardcode API keys
Do not put your API key directly in source code. Once committed, it stays in git history, and a key pushed to a public GitHub repository can be found by automated scrapers very quickly. Always read it from an environment variable or a secrets manager. If you keep the key in a .env file, add .env to your .gitignore before your first commit.
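
As a minimal sketch of the advice above, the key can be read from the environment and checked before any request is made. The helper name load_api_key and its error message are illustrative assumptions, not part of the SDK.

```python
import os


def load_api_key(env=None):
    """Return the API key from the environment, or fail with a clear message.

    load_api_key is a hypothetical helper for this chapter, not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it or load it from a secrets manager"
        )
    return key
```

Failing early like this turns a missing key into a readable startup error instead of a 401 on the first request.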

Installing the SDK

Anthropic provides official SDKs for Python and TypeScript. For Python:

Terminal (shell)
pip install anthropic

# Or with uv (faster)
uv add anthropic

The SDK handles authentication, request serialisation, error parsing, and retry logic for you. You can also call the API directly over HTTP — the SDK is a thin convenience wrapper, not a black box. Understanding the raw request format (covered below) makes debugging SDK issues much easier.
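
To make the raw format concrete, here is a rough sketch that builds the same kind of request with only the Python standard library. The endpoint URL, the x-api-key and anthropic-version headers, and the version value shown are my understanding of the public HTTP API and should be confirmed against the API reference. The function names build_request and send are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"


def build_request(api_key, prompt, model="claude-sonnet-4-6", max_tokens=1024):
    """Build (but do not send) the raw HTTP request for a single user message."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        "x-api-key": api_key,               # the credential from the console
        "anthropic-version": "2023-06-01",  # API version header (assumed value)
        "content-type": "application/json",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling send(build_request(key, "Hello")) would perform the actual network call. Keeping the two steps separate makes it easy to print the JSON body and headers when an SDK call behaves unexpectedly.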

The Current Model Family

claude-opus-4-8
Opus · Most capable
Anthropic's most intelligent model. Best reasoning, analysis, and complex multi-step tasks. Highest cost.
Best for: research synthesis, complex code generation, nuanced writing, difficult reasoning chains
claude-sonnet-4-6
Sonnet · Balanced
High capability at moderate cost. The default choice for most production applications — strong performance, reasonable price.
Best for: most production use cases, API integrations, coding assistants, content generation
claude-haiku-4-5-20251001
Haiku · Fast & cheap
Fastest response time, lowest cost. Excellent for high-volume, latency-sensitive tasks where cost per request matters.
Best for: classification, extraction, summarisation at scale, real-time features, chatbots
claude-fable-5
Fable · Creative
Optimised for creative and narrative tasks. Strong at fiction, dialogue, and expressive writing.
Best for: creative writing, storytelling, dialogue generation, narrative content
Start with Sonnet, optimise later
When building a new feature, start with claude-sonnet-4-6. Once the feature works correctly, test whether Haiku produces acceptable quality at lower cost. Switch to Opus only if Sonnet's output quality isn't sufficient. This order saves money during development and avoids premature optimisation.
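
Once a feature works on Sonnet, the comparison described above can be expressed as a small search for the cheapest model that still passes your own quality check. This is only a sketch: is_good_enough is a hypothetical callback that would run your evaluation against a given model ID, and the list uses the general-purpose model IDs from this chapter in the cost order described above.

```python
# Cheapest first, following the cost ordering described in this chapter.
MODELS_BY_COST = [
    "claude-haiku-4-5-20251001",
    "claude-sonnet-4-6",
    "claude-opus-4-8",
]


def cheapest_acceptable(models, is_good_enough):
    """Return the first (cheapest) model that passes the quality check.

    is_good_enough is a hypothetical callback: it takes a model ID and
    returns True when that model's output meets your quality bar.
    """
    for model in models:
        if is_good_enough(model):
            return model
    return None  # no model in the list met the bar
```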

Your First API Request

basic_request.py (python)
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env automatically

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain what an API key is in two sentences."}
    ]
)

print(message.content[0].text)

That's the minimum viable request — model, max_tokens, and a messages list with at least one user turn. Everything else is optional. Run it and you'll see Claude's response printed to the terminal.

Request Parameters — What Each Field Does

model
required
The model ID string. Use the exact ID from the model table above — abbreviated names like "sonnet" are not accepted. Example: "claude-sonnet-4-6"
max_tokens
required
Maximum tokens Claude will generate in the response. Claude stops at this limit even mid-sentence. Set it high enough for your expected output — it's a ceiling, not a target. Typical values: 256 (short answers), 1024 (paragraphs), 4096 (long documents), 8192+ (very long outputs)
messages
required
List of conversation turns. Each turn is {"role": "user" | "assistant", "content": "..."}. Must start with a user turn. Alternate roles for multi-turn conversations. For a fresh request: one user message. For conversation history: alternate user/assistant pairs followed by the new user message.
system
optional
The system prompt — instructions that shape Claude's behaviour for the entire conversation. Equivalent to CLAUDE.md for an API application. Set the role, constraints, output format, and persona here. Not part of the messages list — it's a separate top-level field.
temperature
optional
Controls randomness. Range 0–1; default is 1. Lower values give more focused, consistent output; higher values give more varied and creative output. Use 0 for extraction and classification tasks (output can still vary slightly between runs) and 0.7–1 for creative generation.
stop_sequences
optional
List of strings that cause Claude to stop generating when encountered. Useful for structured output where you want Claude to stop at a delimiter. Example: ["###END###"]
stream
optional
Set to True to receive the response as a stream of events rather than waiting for the full response. Covered in depth in Chapter 4. Essential for any UI that shows Claude's response as it's generated.
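
The fields above can be assembled into a single parameters dictionary. The sketch below does that and applies the rules as this chapter states them (start with a user turn, alternate roles, temperature between 0 and 1). The helpers validate_messages and build_params are hypothetical names, and the real API may be more permissive than these checks.

```python
def validate_messages(messages):
    """Check that turns start with "user" and strictly alternate roles."""
    if not messages:
        raise ValueError("messages must contain at least one user turn")
    expected = "user"
    for i, turn in enumerate(messages):
        if turn["role"] != expected:
            raise ValueError(
                f"turn {i}: expected role {expected!r}, got {turn['role']!r}"
            )
        expected = "assistant" if expected == "user" else "user"


def build_params(messages, system=None, temperature=None, stop_sequences=None):
    """Return keyword arguments for a request, adding only the optional fields given."""
    validate_messages(messages)
    params = {"model": "claude-sonnet-4-6", "max_tokens": 1024, "messages": messages}
    if system is not None:
        params["system"] = system  # top-level field, not a message
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        params["temperature"] = temperature
    if stop_sequences:
        params["stop_sequences"] = list(stop_sequences)
    return params
```

The result is meant to be passed as client.messages.create(**params), so optional fields you did not set are simply absent from the request.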

Reading the Response Object

Response object — key fields
message.id                     unique message ID
  # e.g. "msg_01XFDUDYJgAACzvnptvVoYEL"; log this for debugging

message.content[0].text        the actual response
  # The text Claude generated. content is a list of blocks; for a simple text request the text is in the first block, so index [0]

message.usage.input_tokens     tokens consumed
message.usage.output_tokens    tokens generated
  # Input + output tokens = what you're billed for

message.stop_reason            why Claude stopped
  # "end_turn" → Claude finished naturally ✓
  # "max_tokens" → hit the limit; increase max_tokens if output was cut off
  # "stop_sequence" → hit one of your stop_sequences
  # "tool_use" → Claude wants to call a tool (Chapter 3)
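
Because billing follows the usage fields, a rough per-request cost estimate is a short calculation. Input and output tokens are typically priced at different rates, so the sketch below takes both prices as parameters. The helper estimate_cost is hypothetical, the usage object is a stand-in for message.usage, and the prices are placeholders rather than real pricing.

```python
from types import SimpleNamespace


def estimate_cost(usage, input_price_per_mtok, output_price_per_mtok):
    """Estimate request cost from a usage object and per-million-token prices."""
    return (
        usage.input_tokens * input_price_per_mtok
        + usage.output_tokens * output_price_per_mtok
    ) / 1_000_000


# Stand-in for message.usage so the sketch runs without calling the API.
usage = SimpleNamespace(input_tokens=1200, output_tokens=300)

# Placeholder prices, NOT real pricing: substitute the current published rates.
cost = estimate_cost(usage, input_price_per_mtok=2.0, output_price_per_mtok=10.0)
print(f"{cost:.4f}")  # (1200 * 2.0 + 300 * 10.0) / 1_000_000 = 0.0054
```
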
Always check stop_reason
If stop_reason is "max_tokens", Claude's response was cut off — the output is incomplete. Either increase max_tokens or redesign the prompt to produce shorter output. A truncated response that looks complete is a subtle bug that's easy to miss in testing.
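
One way to make this check hard to forget is to route every response through a small helper that refuses truncated output. This is a sketch under assumptions: complete_text and TruncatedResponseError are hypothetical names, and the SimpleNamespace object stands in for a real response so the example runs without an API call.

```python
from types import SimpleNamespace


class TruncatedResponseError(RuntimeError):
    """Raised when a response stopped because it hit max_tokens."""


def complete_text(message):
    """Return the response text, refusing output that was cut off."""
    if message.stop_reason == "max_tokens":
        raise TruncatedResponseError(
            f"response {message.id} hit max_tokens; raise the limit or shorten the task"
        )
    # Join every text block rather than assuming content[0] is the only one.
    return "".join(block.text for block in message.content if block.type == "text")


# Stand-in shaped like the response fields described above.
fake_message = SimpleNamespace(
    id="msg_example",
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="Done.")],
)
print(complete_text(fake_message))
```

Raising an exception, rather than returning partial text, means a truncated response cannot slip into downstream code unnoticed.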

A Request with a System Prompt

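with_system_prompt.py (python)
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    # system is a top-level field, separate from the messages list
    system="You are a concise technical writer. "
           "Always respond in plain text with no markdown. "
           "Keep answers under 3 sentences.",
    messages=[
        {"role": "user", "content": "What is a webhook?"}
    ]
)

print(message.content[0].text)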

Common API Errors and What They Mean

Status | Error type | Cause and fix
401 | authentication_error | API key missing, wrong, or revoked. Check that your environment variable is set and that the key is valid in the console.
400 | invalid_request_error | Malformed request: wrong field name, wrong type, a messages list that starts with an assistant turn, or max_tokens above the model's limit. Read the error message; it's usually specific.
429 | rate_limit_error | You've exceeded your requests-per-minute or tokens-per-minute limit. Implement exponential backoff and retry. Chapter 10 covers rate limit strategy in depth.
413 | request_too_large | The request body exceeds the maximum size the API accepts. Reduce the size of your messages: chunk long documents or truncate conversation history. (Input that is too long for the model's context window is typically rejected as a 400 invalid_request_error instead.)
500 | api_error | Anthropic server error. Retry with backoff, since these are usually transient. If it persists, check status.anthropic.com.
529 | overloaded_error | Anthropic is temporarily overloaded. Retry with backoff. More common during peak hours, so design your application to handle this gracefully.
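
The "retry with backoff" advice in the table can be sketched as a small wrapper. This is an illustration, not the SDK's built-in retry logic: FakeAPIError is a hypothetical stand-in for an SDK error carrying a status code, and call_with_backoff, the attempt count, and the delay values are assumptions to tune for your application.

```python
import random
import time

# Status codes from the table above that are worth retrying.
RETRYABLE_STATUSES = {429, 500, 529}


class FakeAPIError(Exception):
    """Hypothetical stand-in for an SDK error that carries an HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"status {status_code}")
        self.status_code = status_code


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying retryable errors with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except FakeAPIError as err:
            last_attempt = attempt == max_attempts - 1
            if err.status_code not in RETRYABLE_STATUSES or last_attempt:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) plus up to 25% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.25))
```

With the real SDK, the same shape would catch anthropic.APIStatusError and inspect its status_code. The jitter spreads retries out so that many clients hitting a limit at once do not all retry at the same moment.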

Handling Errors in Code

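error_handling.py (python)
import anthropic

client = anthropic.Anthropic()

try:
    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": "Explain what an API key is in two sentences."}
        ],
    )
except anthropic.AuthenticationError as e:
    # 401: the key is missing, wrong, or revoked
    raise RuntimeError("Invalid API key: check ANTHROPIC_API_KEY") from e
except anthropic.RateLimitError:
    raise  # 429: caller handles retry with backoff
except anthropic.APIStatusError as e:
    # Any other non-2xx response; the more specific subclasses are caught above
    print(f"API error {e.status_code}: {e.message}")
    raise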
Next — Chapter 2: System Prompts
Shaping Claude's behaviour at the API level — writing effective system prompts, controlling persona and constraints, the difference between system and user instructions, and designing system prompts for production applications.