Few-shot prompting

Prompt Engineering

Chapter 4  ·  Few-Shot Prompting — Giving Examples to Shape Output Style and Format

Few-shot prompting is the technique of including one or more examples of exactly what you want — input and desired output — before making your actual request. It's one of the most underused techniques in everyday prompting, and one of the most reliably effective for tasks where describing what you want is harder than just showing it.

Zero-Shot vs. Few-Shot — The Core Difference

Almost every prompt you send is zero-shot — you state the task and Claude figures out the format, style, and level of detail on its own. That works well for straightforward tasks, but for anything where the exact shape of the output matters, you're leaving a lot to chance.

A few-shot prompt shows Claude what "correct" looks like before asking for the real thing:

Zero-shot (Claude guesses the format)
You
Classify this support ticket by urgency: "The payment page is throwing a 500 error and customers can't check out."
Claude
This ticket appears to be high urgency, as it directly affects the checkout process and is likely causing revenue loss. I would classify it as Critical or P1...
Few-shot (Claude matches your format exactly)
You
Classify support tickets by urgency. Use this format: TICKET: <text> URGENCY: <Critical | High | Medium | Low> REASON: <one sentence> TICKET: "Login is broken for all users since the deploy." URGENCY: Critical REASON: Complete authentication failure blocking all users. TICKET: "The dashboard chart colours look off on dark mode." URGENCY: Low REASON: Cosmetic issue affecting appearance but not functionality. TICKET: "The payment page is throwing a 500 error and customers can't check out."
Claude
URGENCY: Critical REASON: Payment failures directly block revenue and indicate a broken core user flow.

The second version produces exactly the output you need — consistent format, right level of detail, ready to paste into a ticketing system. The examples did the specification work that would otherwise take several sentences of instruction.

How Many Examples Do You Need?

The name "few-shot" covers a wide range:

  • One-shot — a single example. Often enough to establish format and tone. Good starting point.
  • Two or three shots — the sweet spot for most tasks. Enough to show consistency and handle variation.
  • Five or more — useful when there are many edge cases, or when you're trying to teach a subtle pattern (e.g. a specific writing voice).
Rule of thumb
Start with two examples. If Claude's output drifts from what you want, add a third example that specifically covers the case where it went wrong. More than five examples rarely helps and costs context space.

What Few-Shot Prompting Is Best At

Structured output formats

JSON, CSV, custom report templates, fixed-width tables — anything where the exact shape matters more than prose quality.

Classification tasks

Tagging, categorising, labelling. Examples define what each category actually means in your context, not Claude's general understanding of the word.

Consistent writing style

Matching an existing tone — a company's blog voice, a technical documentation style, a specific author's register. Hard to describe; easy to demonstrate.

Data transformation

Show Claude an input/output pair and it can apply the same transformation reliably across many items — reformatting addresses, normalising product names, extracting fields.

Extraction with specific rules

Pulling specific fields from unstructured text where the rule is easier to demonstrate than to explain (e.g. "extract the candidate name" from varied email formats).

Commit / PR message style

Show two or three examples of your team's commit message convention and Claude will match it exactly — tense, length, prefix format and all.

Designing Good Examples

Cover the variation you expect

If your real inputs vary in some important way, make sure your examples cover that variation. Two examples that look nearly identical teach Claude very little about edge cases.

Poor — examples too similar
Input: "Fix login bug" Output: fix: resolve login authentication error Input: "Fix signup bug" Output: fix: resolve signup validation error
Both examples are the same pattern. Claude learns nothing about how to handle different commit types, long descriptions, or scope prefixes.
Better — examples show variation
Input: "Fix login bug" Output: fix(auth): resolve password comparison returning false positive Input: "Add dark mode toggle to settings" Output: feat(ui): add dark mode toggle to user settings page Input: "Remove old migration files" Output: chore: delete deprecated migration scripts from 2023
Now Claude understands scope prefixes, the difference between fix/feat/chore, and appropriate description length.

Make examples realistic

Use examples that look like your real data, not simplified toy versions. Claude will pattern-match the style, length, and content of your examples — toy examples produce toy-level output.

Include edge cases deliberately

If there's an input type that's tricky or different from the norm, include it as an example explicitly. Don't hope Claude infers the rule — show it.

A Practical Pattern: Batch Processing

Few-shot prompting pairs especially well with batch tasks — give examples, then provide a list of real items for Claude to process all at once:

Batch extraction with few-shot examples
You
Extract the candidate name and job title from each email subject line. Format: NAME | TITLE Subject: New Candidate Application - Smith, John (Senior DevOps Engineer) Result: John Smith | Senior DevOps Engineer Subject: Application Received: Jane O'Brien - Marketing Manager Role Result: Jane O'Brien | Marketing Manager Now process these: - New Candidate Application - Cooper, Tianilya - Application: Raj Patel for Data Analyst Position - CV Submission — Nguyen, Mai (Junior Frontend Developer)
Claude
Tianilya Cooper | (title not specified) Raj Patel | Data Analyst Mai Nguyen | Junior Frontend Developer

Common Pitfalls

PitfallWhat goes wrongFix
Examples that contradict each other Claude averages them or picks one arbitrarily, producing inconsistent output Check that your examples follow the same rules before using them
Examples in a different style than you want Claude mimics the example style even if it's not ideal Examples are more powerful than instructions — make sure they show exactly what you want
Too many examples crowding out the actual task The real request gets lost or Claude keeps generating more examples Put a clear separator between examples and the actual input (e.g. "Now process:")
Examples that only cover one type of input Claude handles that type well but fails on variation Deliberately include varied examples — short/long, simple/complex, edge cases
Using few-shot when zero-shot would do Prompt bloat, wasted context, no meaningful improvement Few-shot shines for format and style tasks. For factual Q&A, just ask.

Few-Shot vs. Fine-Tuning — Knowing the Limit

Few-shot prompting teaches Claude within a single conversation — the examples live in the context window and influence that session only. It's powerful for style and format, but it has a ceiling:

  • It can't teach Claude facts it doesn't have (use retrieval-augmented generation for that)
  • It can't permanently change Claude's behaviour across sessions
  • For very specialised domains, hundreds of examples start to strain the context window

For most day-to-day tasks, these limits don't matter. For production pipelines where you need consistent structured output at scale, combining few-shot with an explicit system prompt (or persistent memory) is the practical answer.

Next — Chapter 5: Chain-of-Thought Prompting
Getting Claude to reason step by step before arriving at an answer — when it helps, when it doesn't, and the specific phrasing that triggers reliable step-by-step reasoning rather than just a longer response.