indexinginternals

B+ Trees Explained: The Data Structure Powering Almost Every Database Index

June 12, 2026·5 min read·Watch the video ↗

If you've ever run CREATE INDEX on a table and watched a slow query become instant, you've witnessed a B+ tree doing its job. This data structure is the quiet workhorse behind nearly every relational database index — and understanding it changes the way you think about query performance.

What problem does it solve?

Without an index, a database has to scan every single row in a table to find the ones you're looking for. This is called a full table scan, and it's exactly as slow as it sounds for large tables.

An index is a separate data structure that maintains a sorted, searchable map from column values to the rows that contain them. The question is: what data structure should that index use?

Why B+ trees specifically?

You might wonder: why not a binary search tree, a hash table, or a skip list? The answer comes down to how disks work.

Databases store data on disk in fixed-size pages (typically 4KB or 8KB). Reading a single byte from disk costs the same as reading an entire page — the disk head has to seek to the right position either way. So the goal is to minimize the number of pages you need to read.

The structure

A B+ tree has two types of nodes:

Internal nodes hold keys and pointers to child nodes. They act as signposts directing the search.
Leaf nodes hold keys and pointers to the actual data rows. They also link to each other in a doubly-linked list.

B+ tree structure showing internal nodes and linked leaf nodes — A B+ tree with branching factor 4. Internal nodes guide the search; leaf nodes hold the data pointers and link to each other.

This linked-list property at the leaf level is what makes B+ trees exceptional for range queries. Once you find the starting point, you simply walk the linked list — no need to go back up the tree.

Lookup: finding a single value

Searching a B+ tree is straightforward. Start at the root and, at each internal node, compare your search key against the node's keys to decide which child pointer to follow. When you reach a leaf node, scan it for your key.

SQLThis query uses a B+ tree index on the id column

SELECT * FROM users WHERE id = 42;

With a branching factor of 500 (typical for 8KB pages with integer keys), the tree depth stays remarkably small:

Rows	Tree depth	Page reads
500	1	1
250,000	2	2
125,000,000	3	3
62,500,000,000	4	4

Four page reads to search 62 billion rows. That's the power of high fan-out.

Range queries: where B+ trees dominate

The linked leaf nodes make range queries efficient:

SQLRange scan — walks the leaf linked list

SELECT * FROM orders WHERE created_at BETWEEN '2026-01-01' AND '2026-06-01' ORDER BY created_at;

The database finds the first leaf node containing 2026-01-01, then follows the forward pointers through the linked list until it passes 2026-06-01. It never touches any data outside the range.

This is also why column order in composite indexes matters. An index on (country, city) can efficiently answer queries filtering on country alone (or country + city), but not city alone — because the B+ tree is sorted by country first.

Insertions and splits

When you insert a new key and the target leaf node is full, the node splits into two, and the middle key is promoted to the parent. If the parent is also full, it splits too — this can cascade up to the root, which is the only way the tree grows taller.

This self-balancing property guarantees that the tree never becomes lopsided. Every leaf is always the same distance from the root.

Deletions and merging

Deleting a key from a leaf node can cause it to become less than half full. When this happens, the node either borrows a key from a sibling or merges with one. Most databases handle this lazily — they mark space as reusable and only restructure when needed, which is why you sometimes see index bloat and need to REINDEX.

What you should take away

Three things to remember:

B+ trees are optimized for disk access. Their wide, shallow shape minimizes the number of pages read per query. This is why they beat binary trees for on-disk data.

Leaf nodes form a linked list. This makes range scans and ordered iteration fast — follow the pointers instead of re-traversing the tree.

Index column order matters. The B+ tree is sorted by columns in the order you define them. The leftmost prefix rule determines which queries the index can serve.

This article accompanies the video B+ Trees Explained on I Know Database. If you found it useful, consider subscribing to the channel for more deep dives.