Writing tests
Claude in VS Code: Pair Programming
Writing tests is one of the most mechanical parts of software development — and one of the best fits for Claude. Given a function to test, Claude can generate a suite of cases covering the happy path, error paths, and boundary conditions faster than you can type them. The skill is briefing Claude correctly on your testing framework and style, and knowing how to prompt for the edge cases you haven't thought of yet.
Brief Claude on Your Testing Setup First
Before asking Claude to write any tests, give it a one-time brief on your testing environment. Without this, Claude will make reasonable guesses — and those guesses may not match your project. Do this once at the start of a testing session:
StyleTests live in tests/ mirroring the app structure. Test files are named test_[module].py. Test functions are named test_[what]_[condition] (e.g. test_create_order_missing_user_returns_404).
CoverageWe aim to cover: happy path, each distinct error case, and at least one boundary value per numeric input.
ExceptionsWe don't mock the database — tests run against a real test DB. We do mock external HTTP calls using respx.
Once Claude has this brief, all test requests in the session will use the right framework, naming conventions, and mocking approach without you repeating it.
Types of Tests Claude Generates Well
Anatomy of a Well-Generated Test
# Arrange
due_date = date(2026, 1, 10)
paid_date = date(2026, 1, 11)
daily_rate = Decimal("2.50")
# Actact
result = calculate_late_fee(due_date, paid_date, daily_rate)
# Assertassert
assert result == Decimal("2.50")
A well-structured test has three clear sections: Arrange (set up inputs), Act (call the thing being tested), Assert (verify the result). If you ask Claude to follow this structure explicitly, the output is easier to read and easier to modify. Add Follow the Arrange-Act-Assert pattern to your brief.
Using Claude to Find Edge Cases You Haven't Thought Of
This is one of the highest-value ways to use Claude for testing. Rather than asking Claude to write tests for cases you already know about, ask it to find the ones you missed:
What edge cases and error conditions should I test for `calculate_late_fee`? List them before writing any code.
// Security-focused
What inputs to this function could cause unexpected behaviour from a security perspective? I want to make sure we're testing those.
// Boundary focus
What are the boundary values for this function? I want parametrised tests that cover exactly at, just below, and just above each boundary.
// Combination cases
What combinations of inputs are most likely to cause problems? I'm thinking about cases where multiple conditions interact.
Ask Claude to list the cases first — before writing any code. Review the list, add or remove cases, then ask Claude to write tests for the approved list. This gives you control over coverage without having to think of every case yourself.
Edge Case Categories to Always Cover
Common Pitfalls in Claude-Generated Tests
| Pitfall | What it looks like | How to avoid it |
|---|---|---|
| Tests that always pass | An assertion like assert result is not None — technically passes but verifies almost nothing |
Ask Claude: "Assert the exact expected value, not just that a result exists." |
| Over-mocking | Claude mocks the function being tested, or mocks so much that the test doesn't exercise real code | Specify what to mock and what not to. "Mock the HTTP call only — use the real service layer." |
| Wrong framework syntax | Claude uses unittest.mock when you use pytest-mock, or uses self.assert in a non-class test |
Include framework and library specifics in your brief. Open an existing test file before asking. |
| Tests coupled to implementation | Tests that break when you refactor internal details, even if the behaviour didn't change | Ask Claude: "Test the observable behaviour, not the internal implementation." |
| Missing cleanup | Tests that leave state (DB rows, files, env vars) that affect other tests | Ask Claude to add fixtures or teardown. "Make sure each test is independent and cleans up after itself." |
Tests for Code That Already Has Bugs
An underused pattern: ask Claude to write a test that reproduces a bug you're about to fix. Write the test first (it should fail), then fix the code (the test passes), then commit both. This is lightweight TDD with Claude doing the test-writing:
calculate_late_fee is called with a paid_date before the due_date, it returns a negative fee instead of zero. The test should fail against the current code and pass once the bug is fixed."