Chapter 4

Chapter 4 — Data Types and NULL

Every column in a MySQL table has a data type — and the type you choose matters more than it might seem. It determines how much storage a value uses, what operations you can perform on it, how it sorts, and what values are valid to insert. This chapter covers the most important types, how to choose between them, and how to handle NULL — the absence of a value — gracefully in queries.

1. Numeric Types

TINYINT
−128 to 127  ·  1 byte

Boolean flags, small counters, ratings out of 10. TINYINT(1) is MySQL's internal representation of a boolean (0/1).

SMALLINT
−32,768 to 32,767  ·  2 bytes

Port numbers, year values (though YEAR type is better), small quantities. Rarely used — INT is usually fine.

INT
−2.1 billion to 2.1 billion  ·  4 bytes

The default choice for whole numbers. Primary keys, foreign keys, counts, stock levels, user IDs. INT UNSIGNED doubles the positive range to ~4.3 billion.

BIGINT
±9.2 × 10¹⁸  ·  8 bytes

When INT isn't big enough — social media post IDs, financial transaction counters, Unix timestamps in milliseconds. Uses twice the storage of INT.

DECIMAL(p, s)
Exact precision  ·  variable

Always use for money. DECIMAL(10, 2) stores up to 10 digits total with exactly 2 after the decimal point. Never rounds silently.

FLOAT / DOUBLE
Approximate  ·  4 / 8 bytes

Scientific measurements, coordinates, ML model weights — where approximate is fine. Never use for money — floating-point arithmetic introduces rounding errors: 0.1 + 0.2 = 0.30000000000000004.

Never store money in FLOAT or DOUBLE. Floating-point types cannot represent most decimal fractions exactly. A price of £8.99 stored as FLOAT may be retrieved as £8.98999977111816. Over thousands of transactions this adds up. Always use DECIMAL(10, 2) for currency.
-- DECIMAL in action: p=6 means 6 digits total, s=2 means 2 after the point -- Maximum value: 9999.99 price DECIMAL(6, 2) -- £9999.99 max, good for book prices salary DECIMAL(10, 2) -- £99,999,999.99 max, good for payroll lat DECIMAL(9, 6) -- GPS latitude: e.g. 51.507351

2. String Types

CHAR(n)
Fixed length, 0–255 chars  ·  always n bytes

Always stores exactly n characters, padding with spaces if shorter. Use only when every value is the same length — country codes (CHAR(2)), fixed codes, hashed passwords. Slightly faster to read than VARCHAR for fixed-width data.

VARCHAR(n)
Variable length, up to n chars  ·  actual length + 1–2 bytes

The default choice for text with a known maximum — names, email addresses, titles, URLs. VARCHAR(255) is a safe default; choose a realistic maximum rather than always using 255.

TEXT
Up to 65,535 bytes

Long freeform text — blog post body, product descriptions, comments. Cannot have a DEFAULT value. Can't be fully indexed (only prefix indexes). Use when content might exceed VARCHAR's practical limit.

MEDIUMTEXT / LONGTEXT
Up to 16 MB / 4 GB

Very large documents — book contents, HTML pages, logs. Rarely needed; TEXT handles most use cases. Stored off-page, so accessing them is slower.

ENUM('a','b','c')
One of a fixed list  ·  1–2 bytes

Columns with a small, fixed set of valid values — status, gender, priority. MySQL enforces the list on insert. Downside: adding a new value requires an ALTER TABLE.

JSON
Variable  ·  validated JSON

Semi-structured data with varying fields — settings, metadata, API responses. MySQL validates and stores JSON efficiently, and provides functions to query inside it. Avoid if the data could be normalised into columns.

VARCHAR(255) vs VARCHAR(1000) — does size matter? VARCHAR only uses as much storage as the actual content — declaring VARCHAR(1000) doesn't waste space if you only store 10-character values. The limit is enforced on insert, not storage. However, MySQL uses the declared length when creating temporary tables during query processing, so unnecessarily large VARCHARs can slow complex queries. Set a realistic maximum.

3. Date and Time Types

TypeStorageRangeUse forExample value
DATE 3 bytes 1000-01-01 to 9999-12-31 Birth dates, publication dates, order dates without time '2024-06-15'
TIME 3 bytes −838:59:59 to 838:59:59 Duration, opening hours. Rarely used for actual times of day. '14:30:00'
DATETIME 5 bytes 1000-01-01 to 9999-12-31 Order timestamps, log entries. Stores exactly what you put in — no timezone conversion. '2024-06-15 14:30:00'
TIMESTAMP 4 bytes 1970-01-01 to 2038-01-19 Record creation/update times. Converts to UTC on store, back to local time on retrieve. Auto-updates with ON UPDATE CURRENT_TIMESTAMP. '2024-06-15 14:30:00'
YEAR 1 byte 1901 to 2155 Publication year, graduation year. Only stores the year — no month or day. 2024
-- How our bookshop tables use date types published_year YEAR -- just the year: 1965, 2020 joined_date DATE -- no time needed: '2023-03-15' order_date DATETIME -- exact moment of purchase: '2024-06-15 09:42:17' -- Useful date functions (preview — covered in Chapter 9) SELECT NOW() -- current date and time: 2024-06-15 14:30:00 SELECT CURDATE() -- current date only: 2024-06-15 SELECT YEAR(order_date) -- extract the year part from a DATETIME column
TIMESTAMP has a 2038 problem. TIMESTAMP stores dates as a 32-bit integer counting seconds since 1970 — it overflows on 19 January 2038. For any date column that might hold values beyond 2038 (birth years, future bookings, expiry dates), use DATETIME instead. The 2038 deadline is closer than it sounds for long-running systems.

4. Other Useful Types

TypeUse forExample
BOOLEAN True/false flags. MySQL stores this as TINYINT(1) — 0 is false, 1 is true. TRUE and FALSE keywords work as aliases. is_active BOOLEAN DEFAULT TRUE
BLOB / MEDIUMBLOB Binary data — images, PDFs, files. Generally discouraged: store files on disk/object storage and keep only the path in the database. thumbnail BLOB
BINARY(n) / VARBINARY(n) Fixed or variable binary strings — hashed values, UUIDs stored as raw bytes. More efficient than storing a hex string. password_hash BINARY(32)

5. Choosing the Right Type

Storing…Use this typeWhy
A whole number (ID, count, quantity)INTStandard choice. Use BIGINT only if you genuinely need >2.1 billion values.
Money / currencyDECIMAL(10, 2)Exact arithmetic. Never FLOAT or DOUBLE.
A short text value with a capVARCHAR(n)Email, name, title, URL. Set n to the realistic maximum.
A long freeform text (description, body)TEXTNo length limit to worry about. Can't be defaulted or fully indexed.
A fixed-length code (country, status)CHAR(2) or ENUMCHAR is efficient when every value is the same width. ENUM enforces a valid value list.
A date onlyDATEStores year/month/day with no time component.
A date and time (event, order, log)DATETIMENo timezone surprises. Use TIMESTAMP only if you need auto-update on change.
A year onlyYEAR1 byte. Perfect for publication year, graduation year.
True / falseBOOLEAN (TINYINT(1))0 or 1. Works with TRUE/FALSE keywords.

6. NULL in Depth

We introduced NULL in Chapter 2 as "the absence of a value". This section goes deeper — how NULL behaves in expressions and comparisons, and how to handle it cleanly in queries.

NULL propagates through expressions

Any arithmetic or string operation involving NULL produces NULL. Think of NULL as "unknown" — if one input is unknown, the result is unknown too:

NULL + 5 = NULL You can't add something unknown to 5 — the result is unknown
NULL * 0 = NULL Even multiplying by zero — the result is still unknown
CONCAT('Hello, ', NULL) = NULL String concatenation with NULL returns NULL
NULL = NULL = NULL Not TRUE — two unknowns compared give an unknown result
NULL != NULL = NULL Also not TRUE — still unknown
NULL IS NULL = TRUE IS NULL is the correct way to test for NULL — always returns TRUE or FALSE
-- NULL in a calculation silently ruins results SELECT title, price, stock, price * stock AS stock_value -- NULL if stock is NULL FROM books;
titlepricestockstock_value
The Midnight Library8.9942377.58
Dune9.99NULLNULL
Sapiens10.9931340.69

Dune has no stock value recorded — the calculation silently returns NULL instead of an error.

NULL in WHERE clauses

Because NULL = NULL is NULL (not TRUE), rows with NULL values are silently excluded from results that use = comparisons:

-- If nationality is NULL for some authors, they won't appear here SELECT first_name, last_name, nationality FROM authors WHERE nationality = 'British'; -- NULL rows excluded silently -- To find authors with NO nationality recorded SELECT first_name, last_name FROM authors WHERE nationality IS NULL; -- To find authors WITH a nationality recorded SELECT first_name, last_name, nationality FROM authors WHERE nationality IS NOT NULL;

7. Handling NULL — IFNULL and COALESCE

Rather than letting NULL propagate through your query, you can replace it with a sensible default using IFNULL() or COALESCE().

IFNULL(expr, fallback)
Returns expr if it is not NULL; returns fallback if it is. Takes exactly two arguments. MySQL-specific (though widely supported).

IFNULL(stock, 0)
→ returns stock if not NULL, otherwise 0

IFNULL(nationality, 'Unknown')
→ returns nationality if not NULL, otherwise 'Unknown'
COALESCE(a, b, c, …)
Returns the first non-NULL value from a list of arguments. Standard SQL — works across all databases. More flexible than IFNULL when you have multiple fallback options.

COALESCE(mobile, home_phone, 'No phone')
→ tries mobile first, then home_phone, then the string

COALESCE(stock, 0)
→ same as IFNULL when only two arguments

IFNULL in practice

-- Replace NULL stock with 0 so the calculation works SELECT title, price, IFNULL(stock, 0) AS stock, price * IFNULL(stock, 0) AS stock_value FROM books;
titlepricestockstock_value
The Midnight Library8.9942377.58
Dune9.9900.00
Sapiens10.9931340.69

NULL stock is treated as 0 — the calculation now produces a result for every row.

COALESCE with multiple fallbacks

-- Show the best available contact for each customer -- Try email first, then phone, then a placeholder SELECT first_name, last_name, COALESCE(email, phone, 'No contact info') AS contact FROM customers;

NULLIF — the reverse

NULLIF(a, b) returns NULL if a = b, otherwise returns a. Useful for turning sentinel values (like 0 or an empty string) back into NULL so they don't interfere with calculations:

-- Avoid division by zero: if quantity is 0, treat it as NULL SELECT order_id, total_price, quantity, total_price / NULLIF(quantity, 0) AS unit_price FROM orders; -- If quantity = 0, NULLIF returns NULL, making the division return NULL -- rather than causing a division-by-zero error

8. NOT NULL and DEFAULT Constraints

The best way to deal with NULL is often to prevent it from being stored in the first place. NOT NULL makes a column mandatory — MySQL will reject any INSERT that doesn't provide a value. DEFAULT provides a fallback so the column is never empty even if the user doesn't supply a value:

CREATE TABLE books ( book_id INT NOT NULL AUTO_INCREMENT, title VARCHAR(255) NOT NULL, -- must be provided genre VARCHAR(60), -- optional (nullable) price DECIMAL(6,2) NOT NULL, -- must be provided stock INT NOT NULL DEFAULT 0, -- defaults to 0 published_year YEAR, -- unknown year is valid PRIMARY KEY (book_id) );
Use NOT NULL by default; allow NULL only when absence is meaningful. A NULL nationality on an author record means "we don't know" — a valid state. A NULL price on a book would be a data error — the column should be NOT NULL. When in doubt, ask: "is the absence of this value meaningful, or is it always a mistake?" If it's always a mistake, add NOT NULL.

Chapter Summary

ConceptKey points
INTWhole numbers. Primary keys, counts, quantities. BIGINT if you need >2.1 billion.
DECIMAL(p,s)Exact decimal numbers. Always use for money. Never FLOAT/DOUBLE for currency.
VARCHAR(n)Variable-length text up to n characters. Default choice for names, emails, titles.
TEXTLong freeform text. No DEFAULT value; only prefix-indexed.
DATE / DATETIMEDATE for dates only. DATETIME for dates with time. Avoid TIMESTAMP for dates past 2038.
BOOLEANStored as TINYINT(1). Use TRUE/FALSE keywords; 0 = false, 1 = true.
NULLAbsence of a value. Propagates through arithmetic and string operations. Never use = NULL — always IS NULL / IS NOT NULL.
IFNULL(x, y)Returns x if not NULL, otherwise y. Two arguments. MySQL-specific.
COALESCE(a,b,c…)Returns the first non-NULL argument. Standard SQL. Prefer over IFNULL for portability and multiple fallbacks.
NULLIF(a, b)Returns NULL if a = b, otherwise a. Useful to suppress sentinel values like 0 to prevent division by zero.
NOT NULL / DEFAULTNOT NULL makes a column mandatory. DEFAULT provides a fallback. Prefer NOT NULL wherever absence of a value is an error.
Next: Chapter 5 — Modifying Data. The bookshop database has a schema but no data. Chapter 5 covers INSERT (adding rows), UPDATE (changing existing values), DELETE (removing rows), and the difference between TRUNCATE and DELETE.