What Is Database Indexing? The Science of Instantly Speeding Up Slow Queries

You run a WHERE email = "ali@example.com" query on a table with 10 million rows. Without an index, the database checks every row one by one — a full table scan. With an index, it jumps directly to the result.

The difference? Seconds versus milliseconds.

Why Are Indexes Necessary?

Think of database tables: data is stored unsorted on disk. Finding a record requires the database to scan every row — O(n) complexity. 10 million comparisons for 10 million rows.

An index reduces this search problem to O(log n). Like the index at the back of a book: it keeps the values in the "email" column sorted and points to the physical location of each value.

sql
-- Without index: 10 million rows scanned
SELECT * FROM users WHERE email = 'ali@example.com';

-- Create index
CREATE INDEX idx_users_email ON users(email);

-- With index: B-tree traversal, ~20 steps
SELECT * FROM users WHERE email = 'ali@example.com';

B-Tree: The Structure Behind Indexes

The index structure used by most databases is the B-tree (Balanced Tree). It stores sorted data in a balanced tree structure.

code
                    [M]
                   /   \
              [D, H]   [R, V]
             /  |  \   /  |  \
           [A] [E] [J] [N] [S] [X]

Search always starts at the root. At each node, a decision is made: "smaller, equal, or larger?" and the relevant branch is followed. Even with 1 billion records, maximum ~30 steps.

B-tree is effective not only for equality queries but also for range queries:

sql
-- B-tree handles these queries efficiently
SELECT * FROM orders WHERE amount > 1000;
SELECT * FROM users WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';
SELECT * FROM products WHERE name LIKE 'iPhone%'; -- Prefix search

Index Types

Single Column Index:

sql
CREATE INDEX idx_email ON users(email);
CREATE INDEX idx_created ON orders(created_at);

Composite (Multi-Column) Index:

sql
-- Column order matters: with (last_name, first_name)
-- WHERE last_name = ?              → uses index
-- WHERE first_name = ?             → does NOT use index
-- WHERE last_name = ? AND first_name = ?  → uses index
CREATE INDEX idx_name ON users(last_name, first_name);

Column order in composite indexes is critical. The index is used left to right. Without the first column, the index won't kick in for subsequent columns.

Unique Index:

sql
CREATE UNIQUE INDEX idx_unique_email ON users(email);
-- PRIMARY KEY automatically creates a unique index

Partial Index — only on a subset of the table:

sql
-- Only index active users — smaller, faster
CREATE INDEX idx_active_users ON users(email) WHERE active = true;

-- Only pending orders
CREATE INDEX idx_pending ON orders(created_at) WHERE status = 'pending';

Covering Index — contains all columns the query needs, never touches the table:

sql
-- Covering index for this query
SELECT email, name FROM users WHERE department = 'engineering';

CREATE INDEX idx_dept_covering ON users(department, email, name);
-- Database reads only the index, never goes to the table

EXPLAIN: Reading the Query Plan

The way to understand whether an index is being used is the EXPLAIN command:

sql
EXPLAIN SELECT * FROM users WHERE email = 'ali@example.com';

code
id | select_type | table | type  | key             | rows | Extra
1  | SIMPLE      | users | ref   | idx_users_email | 1    | Using index

The type column is critical:

const or eq_ref — perfect, single row
ref — good, index is being used
range — acceptable, range scan
index — full index scan, be careful
ALL — full table scan, no index or index not being used

Situations Where Indexes Don't Get Used

Index created but query is still slow? Probably one of these mistakes:

Using a function on a column:

sql
-- Does NOT use index: column was transformed
SELECT * FROM users WHERE LOWER(email) = 'ali@example.com';
SELECT * FROM orders WHERE YEAR(created_at) = 2024;

-- Uses index: column left as is
SELECT * FROM users WHERE email = 'ali@example.com';
SELECT * FROM orders WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';

LIKE with non-prefix search:

sql
-- Uses index: prefix search
SELECT * FROM products WHERE name LIKE 'iPhone%';

-- Does NOT use index: no prefix
SELECT * FROM products WHERE name LIKE '%phone';
SELECT * FROM products WHERE name LIKE '%phone%';

Implicit type conversion:

sql
-- user_id is INTEGER but string is given → does NOT use index
SELECT * FROM users WHERE user_id = '123';

-- Correct usage
SELECT * FROM users WHERE user_id = 123;

The Cost of Indexes

Indexes aren't magic. Every index carries a cost:

Write cost: Every INSERT, UPDATE, DELETE operation also updates the index. Write operations slow down on tables with many indexes.

Disk space: Every index uses additional storage. On large tables, indexes can double the total data size.

Index bloat: On tables with constant deletes/updates, indexes fragment over time. Periodic REINDEX or VACUUM may be needed.

The rule: index the columns you filter, sort, and join on — not the columns you select. Instead of adding indexes to every column, identify slow queries and create indexes for them.

The N+1 Problem and Indexes

The N+1 problem frequently encountered when using ORMs isn't solved by indexes — but indexes reduce its impact.

typescript
// N+1: Separate query per user — 1001 queries
const users = await User.findAll(); // 1 query
for (const user of users) {
    const orders = await Order.findAll({ where: { userId: user.id } }); // N queries
}

// Correct: JOIN in a single query
const users = await User.findAll({
    include: [{ model: Order }] // 1 or 2 queries
});

If there's an index on orders.user_id, the JOIN query runs much faster. Foreign key columns should almost always be indexed.

When indexes are set up correctly, the database stops being the bottleneck. When set up incorrectly, both reads and writes slow down. The difference is knowing which queries you run and designing indexes that support them.