Regular Expressions
Chapter 7 of Fundamentals covered str_contains(), strpos(), and str_replace() — all built around exact, literal text. Regular expressions ("regex") describe patterns instead — "a sequence of digits," "an optional plus sign followed by numbers," "anything that looks like a UK postcode" — and PHP's preg_* functions test and act on those patterns.
preg_match — Does a Pattern Appear?
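The pattern is written between two / delimiters. preg_match() returns 1 if the pattern is found anywhere in the string, 0 if not, and false if the pattern itself is invalid. Because 1 is truthy and 0 is falsy, the result can go straight into an if; this is unlike strpos() in Chapter 7, where a match at position 0 needed a strict !== false comparison.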
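As a minimal sketch of the idea, the snippet below checks whether a string contains a number. The sample text and variable names are made up for illustration, and \d+ ("one or more digits") is explained in the syntax table that follows.

```php
<?php
// Hypothetical sample input, chosen only for illustration.
$text = "Order 66 shipped on Tuesday";

// \d+ means "one or more digits"; the two / characters are the delimiters.
$result = preg_match('/\d+/', $text);

if ($result === 1) {
    echo "Found a number\n";
} elseif ($result === 0) {
    echo "No number found\n";
} else {
    // preg_match() returns false when the pattern itself is invalid.
    echo "Pattern error\n";
}
```

A plain if (preg_match(...)) works too; the three-way check is only there to show the possible return values.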
Core Pattern Syntax
| Pattern | Matches |
|---|---|
| \d | Any single digit (0-9) |
| \w | Any "word" character (letters, digits, underscore) |
| \s | Any whitespace character (space, tab, newline) |
| . | Any single character at all (except newline by default) |
| + | One or more of the preceding item |
| * | Zero or more of the preceding item |
| ? | Zero or one of the preceding item (makes it optional) |
| {5} | Exactly 5 of the preceding item |
| [abc] | Any one of the characters listed inside the brackets |
| ^ / $ | Start / end of the string ($ also matches just before a final newline) |
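To see the building blocks working together, here is a small sketch that runs a handful of patterns against sample strings. Every pattern and string is an illustrative assumption, not part of the chapter's original examples.

```php
<?php
// Each entry pairs a pattern with a sample string; all values are illustrative.
$checks = [
    ['/^\d{5}$/',    '90210'],       // exactly five digits, nothing else
    ['/^\d{5}$/',    '9021'],        // only four digits, so no match
    ['/^\+?\d+$/',   '+447700'],     // optional plus sign, then one or more digits
    ['/^[abc]\w*$/', 'banana_1'],    // starts with a, b, or c, then word characters
    ['/^\w+\s\w+$/', 'hello world'], // two words separated by one whitespace character
    ['/^h.t$/',      'hot'],         // . stands for any single character
];

$results = [];
foreach ($checks as [$pattern, $subject]) {
    $matched = preg_match($pattern, $subject) === 1;
    $results[] = $matched;
    $verdict = $matched ? 'match' : 'no match';
    echo "$pattern against \"$subject\": $verdict\n";
}
```

Anchoring with ^ and $ makes each pattern describe the whole string; without them, '/\d{5}/' would also accept a longer string that merely contains five digits in a row.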
Capturing Groups — Extracting Parts of a Match
Parentheses ( ) create a "capturing group". Beyond confirming that a match exists, preg_match() fills its third argument ($matches, passed by reference) with the whole match at index 0 and the text captured by each group at index 1, 2, and so on.
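A short sketch of capturing groups in action, assuming a made-up string that contains a date in year-month-day form:

```php
<?php
// Hypothetical input: a date written as year-month-day.
$text = "Invoice dated 2024-03-15";

// Three groups: (\d{4}) year, (\d{2}) month, (\d{2}) day.
if (preg_match('/(\d{4})-(\d{2})-(\d{2})/', $text, $matches) === 1) {
    echo "Whole match: {$matches[0]}\n"; // 2024-03-15
    echo "Year: {$matches[1]}\n";        // 2024
    echo "Month: {$matches[2]}\n";       // 03
    echo "Day: {$matches[3]}\n";         // 15
}
```

Groups are numbered by the position of their opening parenthesis, counting from left to right.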
preg_match_all — Every Match, Not Just the First
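preg_match() stops after the first match; preg_match_all() finds every occurrence. It returns the number of matches found and fills its third argument ($matches, passed by reference) with them: by default, $matches[0] holds every whole match and $matches[1] holds every capture from the first group.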
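The following sketch collects every hashtag from a string. The input text is invented for illustration.

```php
<?php
// Hypothetical input containing several hashtags.
$text = "Loving #php and #regex, less keen on #deadlines";

// The return value is the number of matches; the matches themselves land in $matches.
$count = preg_match_all('/#(\w+)/', $text, $matches);

echo "Found $count tags\n";
print_r($matches[0]); // whole matches: #php, #regex, #deadlines
print_r($matches[1]); // first group only: php, regex, deadlines
```

Wrapping \w+ in a group is what lets $matches[1] hold the tag names without the leading # sign.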
preg_replace — Pattern-Based Find and Replace
preg_replace() mirrors str_replace() (Chapter 7 of Fundamentals), but matches by pattern rather than exact text and replaces every match by default. In the replacement string, $1, $2, and $3 refer back to the groups captured by the pattern, which makes it possible to reformat matched text rather than only delete or replace it.
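As an illustration of group references in the replacement string, this sketch reorders dates from year-month-day to day/month/year. The input string is an assumption made for the example.

```php
<?php
// Hypothetical input: dates in year-month-day order.
$text = "Start: 2024-03-15, end: 2024-04-01";

// Reorder each date to day/month/year using the three captured groups.
// The replacement is single-quoted so PHP does not treat $3 as a variable.
$reformatted = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', $text);

echo $reformatted . "\n"; // Start: 15/03/2024, end: 01/04/2024
```

Both dates change in one call, since preg_replace() replaces every match unless a limit is passed as the fourth argument.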
The characters ., +, ?, (, and ) have special meaning in a pattern, as do the other symbols in the syntax table and the / delimiter itself. To match a literal period in, say, a domain name, write \. instead; an unescaped . matches "any character at all," which is usually not the intent and can cause subtly wrong matches that are easy to miss in testing.
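The difference is easiest to see side by side. In this sketch, the domain strings and variable names are invented for illustration:

```php
<?php
// Hypothetical inputs: one real-looking domain and one near miss.
$good = "example.com";
$bad  = "exampleXcom";

// Unescaped: the . matches any character, so the near miss also passes.
$loose = '/^example.com$/';
// Escaped: \. matches only a literal period.
$strict = '/^example\.com$/';

var_dump(preg_match($loose, $bad) === 1);   // bool(true), probably not intended
var_dump(preg_match($strict, $bad) === 1);  // bool(false)
var_dump(preg_match($strict, $good) === 1); // bool(true)
```

When the literal text comes from a variable rather than being typed into the pattern, PHP's preg_quote() can add the backslashes for you; for example, preg_quote('example.com', '/') produces example\.com.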
Validating an Email Format (Compared to filter_var)
filter_var() with FILTER_VALIDATE_EMAIL already handles this thoroughly. This pattern is shown purely as a realistic regex example; in real code, prefer the built-in filter for anything regex could get subtly wrong, and reach for regex when no equivalent built-in filter exists.
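A sketch of the comparison, assuming a deliberately simple pattern (not necessarily the one this chapter originally used) and three made-up sample addresses:

```php
<?php
// A deliberately simple pattern, assumed for illustration:
// word characters, an @, word characters, a literal dot, word characters.
$pattern = '/^\w+@\w+\.\w+$/';

$samples = ['user@example.com', 'first.last@example.co.uk', 'not-an-email'];

$results = [];
foreach ($samples as $email) {
    $byRegex  = preg_match($pattern, $email) === 1;
    $byFilter = filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
    $results[$email] = [$byRegex, $byFilter];
    echo $email . ': regex=' . var_export($byRegex, true)
        . ', filter_var=' . var_export($byFilter, true) . "\n";
}
```

The second address is valid, yet the simple pattern rejects it because \w does not cover the dot in "first.last". That is the kind of subtle gap the built-in filter avoids.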
Coding Challenges
Write a pattern that matches a UK postcode in the simplified format "AA1 1AA" (two letters, one digit, a space, one digit, two letters). Test it with preg_match() against both a matching and a non-matching string, echoing the result of each.
Given a string containing several prices like "Items: £12.50, £3.99, and £100.00", use preg_match_all() with a capturing group to extract just the numeric amounts (without the £ sign) into an array, then use array_sum() to total them.
Write a function maskCardNumber($number) that takes a 16-digit card number string and uses preg_replace() with capturing groups to keep only the last 4 digits visible, replacing the rest with asterisks (e.g. "1234567812345678" becomes "************5678").
Chapter 9 Quick Reference
- preg_match($pattern, $str) — returns 1 if the pattern is found, 0 if not
- \d \w \s . + * ? {n} [abc] ^ $ — core pattern building blocks
- ( ) capturing groups — extract matched parts into $matches[1], [2], etc.
- preg_match_all() — finds every match in the string, not just the first
- preg_replace($pattern, $replacement, $str) — pattern-based find/replace; $1, $2 reference captured groups
- Escape special characters (., +, ?, etc.) with a backslash to match them literally
- Prefer filter_var() over regex for things like email validation where a reliable built-in filter exists
- Next chapter: course capstone — a simple blog with login, CRUD posts, and database storage