
Is over-indexing bad for SELECTs?

· 8 min read

It is well known that having an extra index leads to certain types of overhead:

  1. so-called write amplification, slowing down INSERTs and UPDATEs (DELETEs are not affected):
    1. not only are writes slowed down, but index changes are also written to WAL, increasing its volume,
    2. thus, backups are affected as well,
    3. and so is physical replication – more data needs to be replicated;
  2. (a very obvious one!) extra disk space is used (see the query sketch right after this list);
  3. memory overhead – space in the buffer pool and page cache occupied by index pages.
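
For point 2 in the list above, a quick way to see how much disk space each index of a table occupies, and whether it is used at all, is a query along these lines (a minimal sketch against the standard statistics views; the table name test is used only for illustration):

-- Per-index size and usage; large but rarely scanned indexes are the
-- first candidates to reconsider.
select
  indexrelid::regclass as index_name,
  pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
  idx_scan
from pg_stat_user_indexes
where relid = 'test'::regclass
order by pg_relation_size(indexrelid) desc;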

But it is much less known that extra indexes slow down SELECTs, and might even cause performance cliffs. Let's explore it.

Unexpected: the planner locks all indexes

Let's create a simple table and 20 indexes on it:

test=# create table test(i int8);
CREATE TABLE
test=# do $$
begin
for i in 1..20 loop
execute 'create index on test(i)';
end loop;
end;
$$;
DO

Now let's remember the PID of this session and run EXPLAIN for a simple query inside a transaction, without closing it:

test=# select pg_backend_pid(); begin; explain select from test where i = 0;
 pg_backend_pid
----------------
        2961907
(1 row)

BEGIN
                                 QUERY PLAN
----------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=1.34..9.56 rows=11 width=0)
   Recheck Cond: (i = 0)
   ->  Bitmap Index Scan on test_i_idx19  (cost=0.00..1.34 rows=11 width=0)
         Index Cond: (i = 0)
(4 rows)

test=*#


In another session, let's look at the locks acquired by the first session:

test=# select
  locktype,
  relation::regclass,
  pid,
  mode,
  granted,
  fastpath
from pg_locks
where
  pid = 2961907
  and locktype <> 'virtualxid'
;
 locktype  |   relation   |   pid   |      mode       | granted | fastpath
-----------+--------------+---------+-----------------+---------+----------
 relation  | test_i_idx14 | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx13 | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx12 | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx11 | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx10 | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx9  | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx8  | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx7  | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx6  | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx5  | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx4  | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx3  | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx2  | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx1  | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx   | 2961907 | AccessShareLock | t       | t
 relation  | test         | 2961907 | AccessShareLock | t       | t
 relation  | test_i_idx16 | 2961907 | AccessShareLock | t       | f
 relation  | test_i_idx17 | 2961907 | AccessShareLock | t       | f
 relation  | test_i_idx19 | 2961907 | AccessShareLock | t       | f
 relation  | test_i_idx15 | 2961907 | AccessShareLock | t       | f
 relation  | test_i_idx18 | 2961907 | AccessShareLock | t       | f
(21 rows)

As we can see, the first session locked the table with AccessShareLock, but it also locked all its indexes. This might look unexpected – 21 heavy locks in total.

Additionally, notice that only the first 16 locks have fastpath set to true; the remaining five have it set to false.

What is happening here is related to fast-path locking. Each backend has a small, fixed number of fast-path lock slots – FP_LOCK_SLOTS_PER_BACKEND, hard-coded to 16 (see proc.h: https://github.com/postgres/postgres/blob/f21e51ea73973f87715581fc68a6caf97f56505e/src/include/storage/proc.h#L85). Weak relation locks such as AccessShareLock can be taken via this fast path, bypassing the shared-memory lock manager; once the 16 slots are used up, every additional lock has to go through the main lock table, whose partitions are protected by LWLocks – under concurrency this shows up as LWLock:LockManager contention, exactly what we will see in the benchmark below. In Postgres 18, the limit is no longer a hard-coded constant: the number of fast-path slots can be raised, being derived from max_locks_per_transaction (https://github.com/postgres/postgres/blob/4e1fad37872e49a711adad5d9870516e5c71a375/src/include/storage/proc.h#L91C1-L91C92).
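
To quickly check whether a particular backend has overflowed its fast-path slots, one can count its granted relation locks by the fastpath flag – a minimal sketch:

-- Granted relation-level locks per backend, split by fast-path vs regular;
-- locks that did not fit into (or were not eligible for) the fast path
-- show up with fastpath = false.
select
  pid,
  count(*) filter (where fastpath) as fastpath_locks,
  count(*) filter (where not fastpath) as regular_locks
from pg_locks
where locktype = 'relation'
  and granted
group by pid
order by regular_locks desc;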

Benchmark

Using the Postgres.AI assistant, we conducted a series of traditional pgbench benchmarks to analyze the impact of over-indexing on SELECT queries: https://postgres.ai/chats/01921029-781b-789c-9076-31c1f87ab649.

The key idea was to run pgbench with -S ("SELECTs only") multiple times, adding one more index at each step, up to 20 additional indexes on pgbench_accounts. In this research we used Postgres 16 running on small machines in GCP, n2-standard-8 (8 vCPUs, 32 GiB RAM).

Here is the full config for the first series of experiments, defined in the format understood by the Postgres.AI assistant:

{
  "POSTGRES_VERSION": "16",
  "SERVER_TYPE": "n2-standard-8",
  "VOLUME_SIZE": "100",
  "TEST_RUNS": {
    "index_01": {
      "pre_sql": "create index on pgbench_accounts (aid);",
      "workload_pgbench": "pgbench -c 8 -j 8 -S -P 10 -T 300 -R 5000 -r"
    },
    "index_02": {
      "pre_sql": "create index on pgbench_accounts (aid);",
      "workload_pgbench": "pgbench -c 8 -j 8 -S -P 10 -T 300 -R 5000 -r"
    },
    "index_03": {
      "pre_sql": "create index on pgbench_accounts (aid);",
      "workload_pgbench": "pgbench -c 8 -j 8 -S -P 10 -T 300 -R 5000 -r"
    },
    ... // omitted: steps to add 4th..19th extra indexes
    "index_20": {
      "pre_sql": "create index on pgbench_accounts (aid);",
      "workload_pgbench": "pgbench -c 8 -j 8 -S -P 10 -T 300 -R 5000 -r"
    }
  }
}

With the option -r, pgbench reports per-statement latencies at the end, which we can double-check in the artifact set collected for the benchmarks. For example, in the first series of experiments, where we limited TPS to 5000 (-R 5000), the last iteration (20 extra indexes) reported:

[index_20|workload_pgbench] 2024-09-20 18:06:25 number of transactions actually processed: 1500352
[index_20|workload_pgbench] 2024-09-20 18:06:25 number of failed transactions: 0 (0.000%)
[index_20|workload_pgbench] 2024-09-20 18:06:25 latency average = 0.283 ms
[index_20|workload_pgbench] 2024-09-20 18:06:25 latency stddev = 0.251 ms
[index_20|workload_pgbench] 2024-09-20 18:06:25 rate limit schedule lag: avg 0.091 (max 9.729) ms
[index_20|workload_pgbench] 2024-09-20 18:06:25 initial connection time = 45.322 ms
[index_20|workload_pgbench] 2024-09-20 18:06:25 tps = 5001.918903 (without initial connection time)
[index_20|workload_pgbench] 2024-09-20 18:06:25 statement latencies in milliseconds and failures:
[index_20|workload_pgbench] 2024-09-20 18:06:25 0.001 0 \set aid random(1, 100000 * :scale)
[index_20|workload_pgbench] 2024-09-20 18:06:25 0.191 0 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

– indeed, we had only a single SELECT.

In the first series of pgbench runs, latency was growing slowly, with a noticeable step at the moment when the 15th extra index was created:

[Chart: Overindexing affects SELECT latency]

Why the step at the 15th extra index? With 15 extra indexes, plus the table itself and its primary key, the overall number of relations locked is 17, which exceeds FP_LOCK_SLOTS_PER_BACKEND (16).
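
The arithmetic is easy to check directly: a query on pgbench_accounts locks the table plus each of its indexes, so at step N the number of relations locked is 2 + N. A quick counting sketch (assuming the pgbench schema):

-- 1 for the table itself + one per index (the PK and the extra ones)
select 1 + count(*) as relations_locked
from pg_index
where indrelid = 'pgbench_accounts'::regclass;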

However, if we repeat the experiment on a server with 96 vCPUs, the picture changes drastically (re-test details are available here):

[Chart: Overindexing affects SELECT latency – LWLock:LockManager performance cliff]

As a result, we see a significant drawdown of TPS:

[Chart: Overindexing affects SELECT latency – LWLock:LockManager performance cliff]

Below is a summary of the per-run wait event profiles collected during this series (in the interactive version of this chart there is one tab per run, index_01 through index_20, each showing wait_type, wait_event, and the number of sampled events). For the first 14 runs the profiles are dominated by CPU and ClientRead, with LWLock:LockManager appearing only sporadically (at most a few dozen events). Starting from index_15, LWLock:LockManager jumps to the top by a huge margin. The two runs around the transition look like this:

index_14:

 wait_type | wait_event            |  events
-----------+-----------------------+--------
 CPU*      | CPU or unspecified    | 1860699
 Client    | ClientRead            | 1058952
 Timeout   | SpinDelay             |     496
 Timeout   | CheckpointWriteDelay  |     196
 IO        | DataFileRead          |      61
 LWLock    | LockManager           |      60
 LWLock    | pg_stat_statements    |      58
 LWLock    | ProcArray             |      29
 LWLock    | BufferMapping         |       6
 LWLock    | LockFastPath          |       2
 IO        | SLRUFlushSync         |       1
 IO        | DataFileWrite         |       1
 IO        | ControlFileSyncUpdate |       1

index_15:

 wait_type | wait_event            |  events
-----------+-----------------------+--------
 LWLock    | LockManager           | 1905264
 CPU*      | CPU or unspecified    |  693754
 Client    | ClientRead            |  330996
 Timeout   | SpinDelay             |     109
 IO        | DataFileRead          |      35
 LWLock    | ProcArray             |       7
 IO        | RelationMapRead       |       2

In the remaining runs (index_16 through index_20), LWLock:LockManager stays at the top with roughly 1.9M–2.25M events per run.
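
If you want to reproduce such a profile on your own system, a minimal sketch is to sample wait events from pg_stat_activity while the workload is running (a single snapshot shown here; for real analysis, sample it in a loop or use a wait event sampling extension such as pg_wait_sampling):

-- One snapshot of wait events across active backends; under the
-- over-indexed SELECT workload, LWLock / LockManager rises to the top.
select
  coalesce(wait_event_type, 'CPU') as wait_type,
  coalesce(wait_event, 'CPU or unspecified') as wait_event,
  count(*) as events
from pg_stat_activity
where state = 'active'
  and pid <> pg_backend_pid()
group by 1, 2
order by events desc;

When an active backend is on CPU rather than waiting, wait_event_type and wait_event are NULL, hence the COALESCE into the "CPU*" category used in the tables above.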

Explore it further

[Interactive Postgres.AI Assistant chat widget]


We encourage you to experiment with the interactive AI assistant provided in this post, explore how these findings apply to your specific use cases, and share your experiences with the wider Postgres community.