Don't overuse indexes!
The Problem
The advice “add an index when a query is slow” is so common it has become instinct. A table scan shows up in an execution plan, performance suffers, an index is added, the query speeds up — done. Repeat for every slow query, and eventually every column that appears in a WHERE clause has an index sitting on it.
This approach works right up until it silently stops working.
The overlooked reality is that every index you add is a structure the database must maintain on every write. An INSERT, UPDATE, or DELETE on a table does not just change the row — it must also update every index that covers the modified columns. The more indexes a table carries, the more work each write does. At modest data volumes the cost is invisible. Under real write load it accumulates into measurable latency, lock contention, and wasted I/O.
Other consequences of over-indexing:
- Bloated storage — each index is a copy of the indexed data in sorted order, plus overflow pages from fragmentation
- Query planner confusion — the optimiser may choose a poor index from a large candidate set, particularly when statistics are stale
- Longer maintenance windows — index rebuilds and reorganisations take longer the more indexes exist
- Slower bulk loads — BULK INSERT and ETL pipelines are disproportionately affected because every inserted row must update all indexes
The problem is not that indexes are bad. It is that an index added without understanding the full cost profile of a table is a liability as much as an asset.
The Solution
Treat every index as a deliberate design decision with an explicit cost/benefit trade-off, not as a free performance knob. Concretely:
- Understand what you already have before adding anything
- Identify genuinely missing indexes using query execution plans and DMVs, not intuition
- Remove unused and duplicate indexes — they pay the write cost with no read benefit
- Prefer fewer, wider covering indexes over many narrow single-column indexes
- Validate the impact of any change under realistic load, not just against the slow query in isolation
Understanding What You Already Have
Before adding anything, audit the existing indexes. In SQL Server:
```sql
SELECT
    i.name AS index_name,
    i.type_desc AS index_type,
    STRING_AGG(c.name, ', ')
        WITHIN GROUP (ORDER BY ic.key_ordinal) AS key_columns,
    i.is_unique,
    i.is_primary_key
FROM sys.indexes i
JOIN sys.index_columns ic
    ON ic.object_id = i.object_id
    AND ic.index_id = i.index_id
JOIN sys.columns c
    ON c.object_id = ic.object_id
    AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.Orders')
    AND i.type > 0 -- exclude heap
GROUP BY i.name, i.type_desc, i.is_unique, i.is_primary_key
ORDER BY i.name;
```
In PostgreSQL:
```sql
SELECT
    i.relname AS index_name,
    am.amname AS index_type,
    STRING_AGG(a.attname, ', ' ORDER BY ix.ord) AS key_columns
FROM pg_index x
JOIN pg_class t ON t.oid = x.indrelid
JOIN pg_class i ON i.oid = x.indexrelid
JOIN pg_am am ON am.oid = i.relam
CROSS JOIN LATERAL unnest(x.indkey) WITH ORDINALITY AS ix(col, ord)
JOIN pg_attribute a
    ON a.attrelid = t.oid
    AND a.attnum = ix.col
WHERE t.relname = 'orders'
GROUP BY i.relname, am.amname
ORDER BY i.relname;
```
Finding Unused Indexes
The most actionable quick win is removing indexes the database never uses. SQL Server tracks usage in sys.dm_db_index_usage_stats, accumulated since the last service restart:
```sql
SELECT
    OBJECT_NAME(i.object_id) AS table_name,
    i.name AS index_name,
    s.user_seeks,
    s.user_scans,
    s.user_lookups,
    s.user_updates AS write_overhead
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats s
    ON s.object_id = i.object_id
    AND s.index_id = i.index_id
    AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID('dbo.Orders')
    AND i.type > 0
    AND i.is_primary_key = 0
    AND i.is_unique = 0
ORDER BY (s.user_seeks + s.user_scans + s.user_lookups) ASC;
```
An index where user_seeks + user_scans + user_lookups is zero or very low but user_updates is high is a pure liability — it costs write overhead and returns nothing. These are the first candidates for removal.
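One way to turn those findings into action is to generate the corresponding DROP statements for review. This is a sketch: inspect every generated statement before running it, and remember the counters reset at each service restart, so a freshly restarted instance proves nothing.

```sql
-- Generate DROP INDEX statements for non-unique indexes with no recorded reads.
-- Review the output manually before executing any of it.
SELECT 'DROP INDEX ' + QUOTENAME(i.name)
     + ' ON ' + QUOTENAME(OBJECT_SCHEMA_NAME(i.object_id))
     + '.' + QUOTENAME(OBJECT_NAME(i.object_id)) + ';' AS drop_statement
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats s
    ON s.object_id = i.object_id
    AND s.index_id = i.index_id
    AND s.database_id = DB_ID()
WHERE i.type > 0
    AND i.is_primary_key = 0
    AND i.is_unique = 0
    AND ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0) + ISNULL(s.user_lookups, 0) = 0;
```

The ISNULL wrapping matters: an index the DMV has never recorded at all appears as NULL, not zero, and is just as much a removal candidate.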
In PostgreSQL, the equivalent lives in pg_stat_user_indexes:
```sql
SELECT
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan AS times_used,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'orders'
ORDER BY idx_scan ASC;
```
An index with idx_scan = 0 that has existed since the last pg_stat_reset() is worth examining for removal.
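Since idx_scan only accumulates since the last reset, it is worth checking how long that observation window actually is before acting on a zero:

```sql
-- When were statistics last reset for this database?
-- A recent reset means idx_scan = 0 proves very little.
SELECT stats_reset
FROM pg_stat_database
WHERE datname = current_database();
```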
Finding Duplicate and Redundant Indexes
Duplicate indexes — two indexes with the same leading columns — are common in tables that have grown organically. They offer no additional query benefit but double the write cost for those columns.
In SQL Server, compare the first key column across indexes on the same table:
```sql
SELECT
    OBJECT_NAME(i1.object_id) AS table_name,
    i1.name AS index_1,
    i2.name AS index_2,
    COL_NAME(ic1.object_id, ic1.column_id) AS shared_leading_column
FROM sys.index_columns ic1
JOIN sys.index_columns ic2
    ON ic2.object_id = ic1.object_id
    AND ic2.column_id = ic1.column_id
    AND ic2.key_ordinal = 1
    AND ic2.index_id > ic1.index_id -- avoid reporting each pair twice
JOIN sys.indexes i1 ON i1.object_id = ic1.object_id AND i1.index_id = ic1.index_id
JOIN sys.indexes i2 ON i2.object_id = ic2.object_id AND i2.index_id = ic2.index_id
WHERE ic1.key_ordinal = 1
    AND i1.object_id = OBJECT_ID('dbo.Orders')
ORDER BY table_name, shared_leading_column;
```
If IX_Orders_CustomerId and IX_Orders_CustomerId_Status both exist, the narrower index (IX_Orders_CustomerId) is redundant for queries that already use the wider one. The query planner will generally prefer the wider index if it covers more columns, meaning the narrower index does write work but earns no reads.
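Once a pair like that is confirmed, removing the narrower one is a one-line change (the index name here matches the illustrative example above, not any real schema):

```sql
-- The wider IX_Orders_CustomerId_Status already serves seeks on CustomerId alone
DROP INDEX IX_Orders_CustomerId ON dbo.Orders;
```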
Understanding the Write Cost
The cost of an extra index is easiest to see by measuring it directly. Create a test table with and without an additional index and compare write throughput:
```sql
-- Baseline table: clustered index only
CREATE TABLE dbo.OrdersBaseline
(
    OrderId     INT NOT NULL PRIMARY KEY,
    CustomerId  INT NOT NULL,
    Status      VARCHAR(20) NOT NULL,
    TotalAmount DECIMAL(18,2) NOT NULL,
    CreatedAt   DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);

-- Over-indexed table: same schema plus four additional indexes
CREATE TABLE dbo.OrdersOverIndexed
(
    OrderId     INT NOT NULL PRIMARY KEY,
    CustomerId  INT NOT NULL,
    Status      VARCHAR(20) NOT NULL,
    TotalAmount DECIMAL(18,2) NOT NULL,
    CreatedAt   DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);

CREATE INDEX IX_OOI_CustomerId ON dbo.OrdersOverIndexed (CustomerId);
CREATE INDEX IX_OOI_Status     ON dbo.OrdersOverIndexed (Status);
CREATE INDEX IX_OOI_CreatedAt  ON dbo.OrdersOverIndexed (CreatedAt);
CREATE INDEX IX_OOI_Amount     ON dbo.OrdersOverIndexed (TotalAmount);
```
Measure insert time for both:
```sql
DECLARE @start DATETIME2 = SYSUTCDATETIME();

INSERT INTO dbo.OrdersBaseline (OrderId, CustomerId, Status, TotalAmount)
SELECT TOP (100000)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
    ABS(CHECKSUM(NEWID())) % 10000,
    CASE ABS(CHECKSUM(NEWID())) % 3
        WHEN 0 THEN 'Pending'
        WHEN 1 THEN 'Shipped'
        ELSE 'Delivered'
    END,
    ROUND(RAND(CHECKSUM(NEWID())) * 500, 2)
FROM sys.all_objects a CROSS JOIN sys.all_objects b;

SELECT DATEDIFF(MILLISECOND, @start, SYSUTCDATETIME()) AS baseline_ms;
```
Run the same block against dbo.OrdersOverIndexed. On a table with four extra non-clustered indexes, insert time is typically 2–4x higher. The gap widens further for UPDATE statements on indexed columns.
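The UPDATE side of that gap can be measured the same way. A sketch against the two tables created above, deliberately touching an indexed column so the non-clustered indexes must be maintained:

```sql
-- Update an indexed column and time it. On the over-indexed table every
-- Status change must also be written into IX_OOI_Status.
DECLARE @start DATETIME2 = SYSUTCDATETIME();

UPDATE dbo.OrdersBaseline
SET Status = 'Shipped'
WHERE Status = 'Pending';

SELECT DATEDIFF(MILLISECOND, @start, SYSUTCDATETIME()) AS baseline_update_ms;

-- Repeat against dbo.OrdersOverIndexed; the delta is index maintenance cost.
```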
Prefer Covering Indexes Over Multiple Narrow Ones
When a query needs several columns from a table, the planner may execute a key lookup after finding matching rows in a non-clustered index — an extra I/O per row. A single covering index that includes all the needed columns eliminates the lookup entirely.
Consider a query that retrieves all open orders for a customer:
```sql
SELECT OrderId, Status, TotalAmount, CreatedAt
FROM dbo.Orders
WHERE CustomerId = @customerId
    AND Status = 'Pending'
ORDER BY CreatedAt DESC;
```
Three narrow indexes (CustomerId, Status, CreatedAt) each require a key lookup to retrieve the other columns. One covering index does the whole job in a single seek:
```sql
-- A single covering index replaces three narrow ones
CREATE INDEX IX_Orders_Customer_Pending
    ON dbo.Orders (CustomerId, Status)
    INCLUDE (TotalAmount, CreatedAt);
```
The INCLUDE clause adds non-key columns to the leaf level of the index so SQL Server can return them without touching the clustered index. This removes the key lookup and typically halves the logical reads for the query.
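The logical-reads claim is easy to verify against your own data with SET STATISTICS IO (the customer id below is an arbitrary test value):

```sql
-- Compare logical reads with the narrow indexes vs. the covering index
SET STATISTICS IO ON;

SELECT OrderId, Status, TotalAmount, CreatedAt
FROM dbo.Orders
WHERE CustomerId = 42 -- any test customer id
    AND Status = 'Pending'
ORDER BY CreatedAt DESC;

SET STATISTICS IO OFF;
```

Run it once with the three narrow indexes in place and once with only the covering index; the logical reads reported in the Messages output make the difference concrete.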
Dropping an Index Safely
Before dropping an index in production, disable it first:
```sql
-- SQL Server: disable without dropping (easy to re-enable if needed)
ALTER INDEX IX_Orders_OldColumn ON dbo.Orders DISABLE;
```
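If anything regresses while the index is disabled, rebuilding it brings it back with its original definition intact, with no need to recreate it from a script:

```sql
-- Rebuilding a disabled index re-enables it
ALTER INDEX IX_Orders_OldColumn ON dbo.Orders REBUILD;
```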
Monitor for a few days. If nothing regresses, drop it:
```sql
DROP INDEX IX_Orders_OldColumn ON dbo.Orders;
```
In PostgreSQL there is no disable command, but indexes can be created and dropped with the CONCURRENTLY option, which avoids holding a long exclusive lock on a production table:
```sql
-- Drop without locking the table (cannot run inside a transaction block)
DROP INDEX CONCURRENTLY idx_orders_old_column;
```
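If the drop turns out to be a mistake, the index can be rebuilt the same non-blocking way (the column name here is illustrative, matching the hypothetical index above):

```sql
-- Recreate in the background without locking out writes
CREATE INDEX CONCURRENTLY idx_orders_old_column
    ON orders (old_column);
```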
Summary
Indexes are not free. Every index you add trades write performance for read performance. Getting that trade-off right consistently means:
- Audit before you add — understand what exists and whether it is being used
- Remove the unused — a zero-read, high-write index is a pure tax on every insert and update
- Eliminate duplicates — redundant leading columns do write work twice for no extra benefit
- Build covering indexes — one wider index beats three narrow ones that force key lookups
- Measure the write cost — benchmark inserts and updates before and after any index change
The goal is not the minimum number of indexes or the maximum. It is the right set, chosen deliberately, with evidence.