Index advisor
Get expert PostgreSQL index recommendations tailored to your queries and workload
For people who have slow queries or a new schema and want to know which indexes to create, what index type to use, and how to avoid over-indexing.
About this tool
Choosing the right indexes is one of the highest-leverage decisions you can make for PostgreSQL performance — and one of the easiest to get wrong. Too few indexes and your queries crawl through sequential scans on million-row tables. Too many and your writes slow down, your storage balloons, and autovacuum struggles to keep up. The right answer depends on your query patterns, data distribution, table size, and write-to-read ratio — context that generic rules of thumb cannot capture.
This tool analyzes your queries, schema, and workload characteristics to recommend the specific indexes that will have the greatest impact. It goes far beyond simple "add an index on the WHERE column" advice. It considers composite index column ordering (high-selectivity columns first for equality, range columns last), covering indexes with INCLUDE to enable index-only scans, partial indexes that target only the rows your queries care about, and expression indexes for computed filter conditions like lower(email) or date_trunc('day', created_at).
PostgreSQL offers a rich set of index types, each optimized for different access patterns. B-tree indexes handle equality and range queries on scalar types and are the right default in most cases. GIN indexes excel at full-text search, JSONB containment queries, and array overlap operations. GiST indexes support geometric types, range types, and nearest-neighbor searches. BRIN indexes provide compact, low-maintenance indexing for physically ordered data like timestamps in append-only tables. Hash indexes (fully WAL-logged since PostgreSQL 10) serve single-column equality lookups with a smaller footprint than B-tree when range queries are not needed.
Column ordering within a composite index matters enormously and is one of the most commonly misunderstood aspects of PostgreSQL indexing. The general principle is: equality-tested columns first (in order of descending selectivity), followed by range-tested columns, and finally columns used only for sorting. An index on (tenant_id, status, created_at) can efficiently serve WHERE tenant_id = $1 AND status = 'active' ORDER BY created_at, but reversing the column order to (created_at, status, tenant_id) makes the index almost useless for that same query because the leading column is not constrained to a single value. This tool applies these ordering rules automatically and explains the reasoning behind each recommendation.
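As a minimal sketch of that principle (the table and column names here are illustrative, not from a real schema):

-- equality columns first, range/sort column last
create index idx_tickets_tenant_status_created
    on tickets (tenant_id, status, created_at);

-- served efficiently: both equality predicates pin the leading
-- columns, and rows come back already sorted by created_at
select id, subject
from tickets
where tenant_id = $1
and status = 'active'
order by created_at
limit 50;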
Beyond choosing the index type and column order, this tool helps you design indexes that work together as a coherent strategy. It identifies redundant indexes that waste space and slow down writes — for instance, an index on (a) is redundant when (a, b) already exists, since the composite index handles prefix lookups. It suggests consolidating overlapping indexes into broader composite indexes, and warns when a proposed index is unlikely to be used because of low selectivity or planner cost estimates. It also considers operational concerns: whether to use CREATE INDEX CONCURRENTLY to avoid blocking writes, how large the index will be relative to the table, and whether your maintenance_work_mem is sized appropriately for the index build.
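For instance, a sketch of how that consolidation might be applied, assuming a redundant single-column index idx_orders_a alongside illustrative columns a and b:

-- size the build; session-local setting
set maintenance_work_mem = '1GB';

-- build the broader composite first, without blocking writes;
-- CREATE INDEX CONCURRENTLY cannot run inside a transaction block
create index concurrently idx_orders_a_b on orders (a, b);

-- then drop the now-redundant prefix index
drop index concurrently idx_orders_a;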
The tool also addresses expression indexes, which are essential when your queries filter on the result of a function applied to a column. For example, if your application performs case-insensitive email lookups with WHERE lower(email) = lower($1), a standard B-tree index on email will not be used — you need CREATE INDEX ON users (lower(email)). Similarly, queries that filter on date ranges extracted from timestamps often benefit from an expression index on date_trunc or a cast to date.
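Both cases expressed as statements (users and events are illustrative table names; note the time zone qualification, which keeps the expression immutable as index expressions require):

-- matches WHERE lower(email) = lower($1)
create index idx_users_email_lower on users (lower(email));

-- a bare created_at::date cast on a timestamptz column is not
-- immutable (it depends on the session time zone), so pin the zone;
-- queries must then filter on this same expression
create index idx_events_created_day
    on events (date_trunc('day', created_at at time zone 'UTC'));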
For best results, provide the queries you want to optimize, the relevant CREATE TABLE statements (including existing indexes and constraints), approximate table sizes, and your PostgreSQL version. If you have EXPLAIN ANALYZE output showing the current plan, include that too — it gives the advisor concrete data about selectivity and row counts to base its recommendations on.
Examples
-- Table schema
create table events (
    id bigint generated always as identity primary key,
    tenant_id uuid not null,
    event_type text not null,
    created_at timestamptz not null default now(),
    payload jsonb not null,
    processed boolean not null default false
);
-- Query 1: fetch unprocessed events for a tenant
select id, event_type, payload
from events
where tenant_id = $1
and processed = false
order by created_at
limit 100;
-- Query 2: search events by payload field
select id, event_type, created_at
from events
where tenant_id = $1
and payload @> '{"source": "webhook"}'
and created_at >= now() - interval '7 days';
-- Query 3: count events by type per tenant
select event_type, count(*)
from events
where tenant_id = $1
and created_at between $2 and $3
group by event_type;
A multi-tenant events table with three distinct query patterns. The advisor will recommend a partial composite index for the unprocessed events query, a GIN index for the JSONB containment query, and a composite B-tree index for the aggregation query — each tailored to the specific access pattern.
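For illustration, the recommendations for this workload might look something like the following sketch (exact choices depend on data distribution and row counts):

-- Query 1: partial composite index over only the unprocessed rows
create index concurrently idx_events_tenant_unprocessed
    on events (tenant_id, created_at)
    where processed = false;

-- Query 2: GIN index for the @> containment predicate;
-- jsonb_path_ops is smaller and faster for @> than the default
-- jsonb_ops but supports fewer operators; the planner can combine
-- it with a B-tree index via a bitmap AND
create index concurrently idx_events_payload
    on events using gin (payload jsonb_path_ops);

-- Query 3: composite B-tree; INCLUDE (event_type) allows an
-- index-only scan for the aggregation
create index concurrently idx_events_tenant_created
    on events (tenant_id, created_at)
    include (event_type);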
-- Existing indexes on a table
create table orders (
    id serial primary key,
    customer_id integer not null,
    status text not null,
    total numeric(10,2),
    created_at timestamptz not null default now()
);
create index idx_orders_customer on orders (customer_id);
create index idx_orders_status on orders (status);
create index idx_orders_customer_status on orders (customer_id, status);
create index idx_orders_created on orders (created_at);
create index idx_orders_customer_created on orders (customer_id, created_at);
-- The query I need to optimize:
select id, status, total, created_at
from orders
where customer_id = $1
and status = 'shipped'
and created_at >= now() - interval '30 days'
order by created_at desc
limit 20;
A table with five existing indexes and a query that touches multiple columns. The advisor will identify that idx_orders_customer is redundant (it is a prefix of idx_orders_customer_status and idx_orders_customer_created), and recommend a single targeted index like (customer_id, status, created_at DESC) with INCLUDE (id, total) to serve this query as an index-only scan — id must be included as well, since the query selects it.
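Expressed as statements, that consolidation might look like this sketch (verify against your own plans before dropping anything):

-- one targeted index that can serve the query as an index-only scan
create index concurrently idx_orders_cust_status_created
    on orders (customer_id, status, created_at desc)
    include (id, total);

-- the single-column prefix index is now redundant
drop index concurrently idx_orders_customer;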
Inputs and outputs
What you provide
- SQL queries to optimize
- CREATE TABLE statements with existing indexes and constraints
- Approximate table row counts and write-to-read ratio
- PostgreSQL version
- EXPLAIN ANALYZE output (optional, for selectivity data)
What you get
- CREATE INDEX CONCURRENTLY statements with exact column ordering and type selection
- Explanation of index type choice and column ordering rationale
- Redundant index identification with safe removal guidance
- Summary table of recommended indexes with estimated size impact
Use cases
- Determining which indexes to create for a set of application queries before deploying to production
- Choosing the correct index type (B-tree, GIN, GiST, BRIN, hash) for a specific query pattern and data type
- Designing composite indexes with optimal column ordering for multi-column WHERE and ORDER BY clauses
- Identifying redundant or unused indexes that waste storage and slow down writes
- Creating partial indexes to accelerate queries that filter on a common condition like status = 'active'
- Evaluating whether a covering index with INCLUDE can enable index-only scans and eliminate heap fetches
Features
- Recommends specific index types based on operator usage, data types, and access patterns
- Generates ready-to-use CREATE INDEX statements with optimal column ordering
- Suggests partial indexes when queries consistently filter on a selective condition
- Identifies opportunities for covering indexes (INCLUDE) to enable index-only scans
- Detects redundant indexes that are subsets of existing composite indexes
- Considers write overhead and storage impact when recommending indexes
- Advises on CONCURRENTLY usage and operational concerns for production deployments
Frequently asked questions
How do I choose between a B-tree, GIN, GiST, and BRIN index in PostgreSQL?
The choice depends on your data type and query operators.
- **B-tree** is the default and handles equality (=) and range (<, >, BETWEEN, IS NULL) predicates on scalar types like integers, text, and timestamps. It supports ordered output and is the most versatile choice.
- **GIN** (Generalized Inverted Index) is designed for values that contain multiple elements: full-text search (@@), JSONB containment (@>), array overlap (&&), and trigram similarity (%). GIN indexes are larger and slower to update than B-tree but provide fast lookups into composite values.
- **GiST** (Generalized Search Tree) supports geometric operators, range type containment and overlap, nearest-neighbor (<->) ordering, and exclusion constraints. It is also used for full-text search as a smaller, lossy alternative to GIN.
- **BRIN** (Block Range Index) stores summary information per range of physical table blocks, making it extremely compact. It works well only when the indexed column is naturally correlated with the physical row order; timestamps in append-only tables are the classic case. For unordered data, BRIN provides almost no selectivity.
- **Hash** indexes handle only equality (=) on a single column. Since PostgreSQL 10 they are WAL-logged and crash-safe. They can be smaller than B-tree for wide keys but do not support range queries or sorting.
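One example of each type, on illustrative tables and columns:

-- B-tree (default): equality and range on scalar columns
create index on orders (created_at);

-- GIN: full-text search, jsonb containment, arrays
create index on documents using gin (to_tsvector('english', body));

-- GiST: range types, geometry, nearest-neighbor ordering
create index on reservations using gist (during);  -- during is a tsrange

-- BRIN: compact summaries for physically ordered data
create index on events using brin (created_at);

-- Hash: single-column equality only
create index on sessions using hash (token);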
What is a covering index in PostgreSQL and when should I use INCLUDE?
A covering index contains all columns that a query needs, enabling PostgreSQL to answer the query entirely from the index without visiting the heap (table) pages. Since PostgreSQL 11, you can add non-key columns to a B-tree index using the INCLUDE clause. For example, CREATE INDEX ON orders (customer_id, status) INCLUDE (total, created_at) stores total and created_at in the index leaf pages without making them part of the sort key. The planner can then use an **index-only scan**, which is significantly faster for queries that would otherwise require a heap fetch for each matching row — especially when the table is large and poorly cached. You should consider INCLUDE when: (1) a query selects a small number of additional columns beyond the indexed key columns, (2) the table is large enough that heap fetches are expensive, and (3) the included columns are not too wide, since they increase the index size. Do not include columns in the index key just to get covering behavior — putting them in the key changes the sort order and can make the index less useful for other queries. Also note that INCLUDE columns are not used for filtering or sorting; they are only stored in leaf pages for retrieval. For GiST indexes, INCLUDE support was added in PostgreSQL 12, and for SP-GiST in PostgreSQL 14; GIN indexes do not support INCLUDE.
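A quick way to confirm covering behavior, using the orders example from this answer:

create index idx_orders_cust_status_covering
    on orders (customer_id, status)
    include (total, created_at);

explain (analyze, buffers)
select total, created_at
from orders
where customer_id = 42
and status = 'shipped';
-- look for "Index Only Scan" with a low "Heap Fetches" count;
-- if Heap Fetches stays high, VACUUM the table so the visibility
-- map lets the scan skip the heap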
When should I use a partial index and how do I design one?
A partial index includes only the rows that satisfy a WHERE predicate in the index definition. This is useful when your queries consistently filter on a condition that selects a small fraction of the table. The classic example is a status column: if 95% of your rows have status = 'completed' and your queries only look for status = 'pending', a partial index like CREATE INDEX ON orders (created_at) WHERE status = 'pending' is much smaller and faster than indexing the entire table. PostgreSQL will use the partial index when the query's WHERE clause logically implies the index predicate. The match does not need to be syntactically identical — the planner can prove simple implications, such as a query predicate of created_at > '2024-06-01' implying an index predicate of created_at > '2024-01-01'. However, if you use a parameterized query like WHERE status = $1, the planner cannot prove at plan time that $1 will always be 'pending', so the partial index will not be used. In that case, you need the literal value in the query or a separate code path. Partial indexes are also valuable for unique constraints on subsets of data — for example, CREATE UNIQUE INDEX ON users (email) WHERE deleted_at IS NULL enforces uniqueness only among active users. When designing a partial index, check pg_stat_user_indexes after deployment to verify it is actually being scanned; if idx_scan stays at zero, the planner is not matching it.
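Both patterns from this answer as statements:

-- small, hot-subset index; only 'pending' rows are stored
create index idx_orders_pending_created
    on orders (created_at)
    where status = 'pending';

-- matched: the literal predicate implies the index predicate
select id from orders where status = 'pending' order by created_at;

-- not matched: with WHERE status = $1 the planner cannot prove
-- at plan time that $1 is always 'pending'

-- uniqueness enforced among active users only
create unique index users_active_email_uq
    on users (email)
    where deleted_at is null;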
How many indexes are too many on a PostgreSQL table?
There is no fixed maximum, but every index carries ongoing costs that must be justified by query benefits. Each index must be updated on every INSERT, UPDATE (of indexed columns), and DELETE, adding write latency and WAL volume. Indexes consume storage — often 20-50% of the table size per index for B-tree — and they require vacuum maintenance to clean up dead index entries. A table with 10+ indexes can see INSERT throughput drop by 3-5x compared to an unindexed table. To determine if an index is pulling its weight, query pg_stat_user_indexes for idx_scan — if an index has zero or near-zero scans over weeks of production traffic, it is likely a candidate for removal. Also watch for redundant indexes: an index on (a, b) makes a separate index on (a) redundant for most queries, since the composite index supports prefix lookups. Use the pgstatindex() function from the pgstattuple extension to check index bloat — a bloated index can be several times larger than necessary and should be rebuilt with REINDEX CONCURRENTLY. A pragmatic guideline: aim for the minimum set of indexes that covers your actual query patterns, review unused indexes quarterly, and always measure write impact when adding indexes to write-heavy tables.
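A starting point for that quarterly review (standard catalog views; interpret idx_scan over a representative traffic window, since the counters accumulate from the last statistics reset):

select schemaname,
       relname as table_name,
       indexrelname as index_name,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) as index_size
from pg_stat_user_indexes
order by idx_scan asc,
         pg_relation_size(indexrelid) desc;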
Ready to try it?
Use this tool for free — powered by PostgresAI.