Index advisor
Get expert PostgreSQL index recommendations tailored to your queries and workload
For people who have slow queries or a new schema and want to know which indexes to create, what index type to use, and how to avoid over-indexing.
About this tool
Choosing the right indexes is one of the highest-leverage decisions you can make for PostgreSQL performance — and one of the easiest to get wrong. Too few indexes and your queries crawl through sequential scans on million-row tables. Too many and your writes slow down, your storage balloons, and autovacuum struggles to keep up. The right answer depends on your query patterns, data distribution, table size, and write-to-read ratio — context that generic rules of thumb cannot capture.
This tool analyzes your queries, schema, and workload characteristics to recommend the specific indexes that will have the greatest impact. It goes far beyond simple "add an index on the WHERE column" advice. It considers composite index column ordering (high-selectivity columns first for equality, range columns last), covering indexes with INCLUDE to enable index-only scans, partial indexes that target only the rows your queries care about, and expression indexes for computed filter conditions like lower(email) or date_trunc('day', created_at).
PostgreSQL offers a rich set of index types, each optimized for different access patterns. B-tree indexes handle equality and range queries on scalar types and are the right default in most cases. GIN indexes excel at full-text search, JSONB containment queries, and array overlap operations. GiST indexes support geometric types, range types, and nearest-neighbor searches. BRIN indexes provide compact, low-maintenance indexing for physically ordered data like timestamps in append-only tables. Hash indexes (fully WAL-logged since PostgreSQL 10) serve single-column equality lookups with a smaller footprint than B-tree when range queries are not needed.
Column ordering within a composite index matters enormously and is one of the most commonly misunderstood aspects of PostgreSQL indexing. The general principle is: equality-tested columns first (in order of descending selectivity), followed by range-tested columns, and finally columns used only for sorting. An index on (tenant_id, status, created_at) can efficiently serve WHERE tenant_id = $1 AND status = 'active' ORDER BY created_at, but reversing the column order to (created_at, status, tenant_id) makes the index almost useless for that same query because the leading column is not constrained to a single value. This tool applies these ordering rules automatically and explains the reasoning behind each recommendation.
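As a minimal sketch of that principle (the table and column names here are illustrative, not from a real schema):

-- equality columns first, range/sort column last
create index idx_tickets_tenant_status_created
    on tickets (tenant_id, status, created_at);

-- served efficiently: both equality predicates pin the leading
-- columns, and rows come back already sorted by created_at
select id, subject
from tickets
where tenant_id = $1
and status = 'active'
order by created_at
limit 50;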
Beyond choosing the index type and column order, this tool helps you design indexes that work together as a coherent strategy. It identifies redundant indexes that waste space and slow down writes — for instance, an index on (a) is redundant when (a, b) already exists, since the composite index handles prefix lookups. It suggests consolidating overlapping indexes into broader composite indexes, and warns when a proposed index is unlikely to be used because of low selectivity or planner cost estimates. It also considers operational concerns: whether to use CREATE INDEX CONCURRENTLY to avoid blocking writes, how large the index will be relative to the table, and whether your maintenance_work_mem is sized appropriately for the index build.
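For instance, a sketch of how that consolidation might be applied, assuming a redundant single-column index idx_orders_a alongside illustrative columns a and b:

-- size the build; session-local setting
set maintenance_work_mem = '1GB';

-- build the broader composite first, without blocking writes;
-- CREATE INDEX CONCURRENTLY cannot run inside a transaction block
create index concurrently idx_orders_a_b on orders (a, b);

-- then drop the now-redundant prefix index
drop index concurrently idx_orders_a;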
The tool also addresses expression indexes, which are essential when your queries filter on the result of a function applied to a column. For example, if your application performs case-insensitive email lookups with WHERE lower(email) = lower($1), a standard B-tree index on email will not be used — you need CREATE INDEX ON users (lower(email)). Similarly, queries that filter on date ranges extracted from timestamps often benefit from an expression index on date_trunc or a cast to date.
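Both cases expressed as statements (users and events are illustrative table names; note the time zone qualification, which keeps the expression immutable as index expressions require):

-- matches WHERE lower(email) = lower($1)
create index idx_users_email_lower on users (lower(email));

-- a bare created_at::date cast on a timestamptz column is not
-- immutable (it depends on the session time zone), so pin the zone;
-- queries must then filter on this same expression
create index idx_events_created_day
    on events (date_trunc('day', created_at at time zone 'UTC'));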
For best results, provide the queries you want to optimize, the relevant CREATE TABLE statements (including existing indexes and constraints), approximate table sizes, and your PostgreSQL version. If you have EXPLAIN ANALYZE output showing the current plan, include that too — it gives the advisor concrete data about selectivity and row counts to base its recommendations on.
Examples
-- Table schema
create table events (
    id bigint generated always as identity primary key,
    tenant_id uuid not null,
    event_type text not null,
    created_at timestamptz not null default now(),
    payload jsonb not null,
    processed boolean not null default false
);
-- Query 1: fetch unprocessed events for a tenant
select id, event_type, payload
from events
where tenant_id = $1
and processed = false
order by created_at
limit 100;
-- Query 2: search events by payload field
select id, event_type, created_at
from events
where tenant_id = $1
and payload @> '{"source": "webhook"}'
and created_at >= now() - interval '7 days';
-- Query 3: count events by type per tenant
select event_type, count(*)
from events
where tenant_id = $1
and created_at between $2 and $3
group by event_type;
A multi-tenant events table with three distinct query patterns. The advisor will recommend a partial composite index for the unprocessed events query, a GIN index for the JSONB containment query, and a composite B-tree index for the aggregation query — each tailored to the specific access pattern.
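For illustration, the recommendations for this workload might look something like the following sketch (exact choices depend on data distribution and row counts):

-- Query 1: partial composite index over only the unprocessed rows
create index concurrently idx_events_tenant_unprocessed
    on events (tenant_id, created_at)
    where processed = false;

-- Query 2: GIN index for the @> containment predicate;
-- jsonb_path_ops is smaller and faster for @> than the default
-- jsonb_ops but supports fewer operators; the planner can combine
-- it with a B-tree index via a bitmap AND
create index concurrently idx_events_payload
    on events using gin (payload jsonb_path_ops);

-- Query 3: composite B-tree; INCLUDE (event_type) allows an
-- index-only scan for the aggregation
create index concurrently idx_events_tenant_created
    on events (tenant_id, created_at)
    include (event_type);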
-- Existing indexes on a table
create table orders (
    id serial primary key,
    customer_id integer not null,
    status text not null,
    total numeric(10,2),
    created_at timestamptz not null default now()
);
create index idx_orders_customer on orders (customer_id);
create index idx_orders_status on orders (status);
create index idx_orders_customer_status on orders (customer_id, status);
create index idx_orders_created on orders (created_at);
create index idx_orders_customer_created on orders (customer_id, created_at);
-- The query I need to optimize:
select id, status, total, created_at
from orders
where customer_id = $1
and status = 'shipped'
and created_at >= now() - interval '30 days'
order by created_at desc
limit 20;
A table with five existing indexes and a query that touches multiple columns. The advisor will identify that idx_orders_customer is redundant (it is a prefix of idx_orders_customer_status and idx_orders_customer_created), and recommend a single targeted index like (customer_id, status, created_at DESC) with INCLUDE (id, total) to serve this query as an index-only scan — id must be included as well, since the query selects it.
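Expressed as statements, that consolidation might look like this sketch (verify against your own plans before dropping anything):

-- one targeted index that can serve the query as an index-only scan
create index concurrently idx_orders_cust_status_created
    on orders (customer_id, status, created_at desc)
    include (id, total);

-- the single-column prefix index is now redundant
drop index concurrently idx_orders_customer;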
Inputs and outputs
What you provide
- SQL queries to optimize
- CREATE TABLE statements with existing indexes and constraints
- Approximate table row counts and write-to-read ratio
- PostgreSQL version
- EXPLAIN ANALYZE output (optional, for selectivity data)
What you get
- CREATE INDEX CONCURRENTLY statements with exact column ordering and type selection
- Explanation of index type choice and column ordering rationale
- Redundant index identification with safe removal guidance
- Summary table of recommended indexes with estimated size impact
Use cases
- Determining which indexes to create for a set of application queries before deploying to production
- Choosing the correct index type (B-tree, GIN, GiST, BRIN, hash) for a specific query pattern and data type
- Designing composite indexes with optimal column ordering for multi-column WHERE and ORDER BY clauses
- Identifying redundant or unused indexes that waste storage and slow down writes
- Creating partial indexes to accelerate queries that filter on a common condition like status = 'active'
- Evaluating whether a covering index with INCLUDE can enable index-only scans and eliminate heap fetches
Features
- Recommends specific index types based on operator usage, data types, and access patterns
- Generates ready-to-use CREATE INDEX statements with optimal column ordering
- Suggests partial indexes when queries consistently filter on a selective condition
- Identifies opportunities for covering indexes (INCLUDE) to enable index-only scans
- Detects redundant indexes that are subsets of existing composite indexes
- Considers write overhead and storage impact when recommending indexes
- Advises on CONCURRENTLY usage and operational concerns for production deployments
Frequently asked questions
How do I choose between a B-tree, GIN, GiST, and BRIN index in PostgreSQL?
The choice depends on your data type and query operators.
- **B-tree** is the default and handles equality (=) and range (<, >, BETWEEN, IS NULL) predicates on scalar types like integers, text, and timestamps. It supports ordered output and is the most versatile choice.
- **GIN** (Generalized Inverted Index) is designed for values that contain multiple elements: full-text search (@@), JSONB containment (@>), array overlap (&&), and trigram similarity (%). GIN indexes are larger and slower to update than B-tree but provide fast lookups into composite values.
- **GiST** (Generalized Search Tree) supports geometric operators, range type containment and overlap, nearest-neighbor (<->) ordering, and exclusion constraints. It is also used for full-text search as a smaller, lossy alternative to GIN.
- **BRIN** (Block Range Index) stores summary information per range of physical table blocks, making it extremely compact. It works well only when the indexed column is naturally correlated with the physical row order; timestamps in append-only tables are the classic case. For unordered data, BRIN provides almost no selectivity.
- **Hash** indexes handle only equality (=) on a single column. Since PostgreSQL 10 they are WAL-logged and crash-safe. They can be smaller than B-tree for wide keys but do not support range queries or sorting.
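One example of each type, on illustrative tables and columns:

-- B-tree (default): equality and range on scalar columns
create index on orders (created_at);

-- GIN: full-text search, jsonb containment, arrays
create index on documents using gin (to_tsvector('english', body));

-- GiST: range types, geometry, nearest-neighbor ordering
create index on reservations using gist (during);  -- during is a tsrange

-- BRIN: compact summaries for physically ordered data
create index on events using brin (created_at);

-- Hash: single-column equality only
create index on sessions using hash (token);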
What is a covering index in PostgreSQL and when should I use INCLUDE?
A covering index contains all columns that a query needs, enabling PostgreSQL to answer the query entirely from the index without visiting the heap (table) pages. Since PostgreSQL 11, you can add non-key columns to a B-tree index using the INCLUDE clause. For example, CREATE INDEX ON orders (customer_id, status) INCLUDE (total, created_at) stores total and created_at in the index leaf pages without making them part of the sort key. The planner can then use an **index-only scan**, which is significantly faster for queries that would otherwise require a heap fetch for each matching row — especially when the table is large and poorly cached. You should consider INCLUDE when: (1) a query selects a small number of additional columns beyond the indexed key columns, (2) the table is large enough that heap fetches are expensive, and (3) the included columns are not too wide, since they increase the index size. Do not include columns in the index key just to get covering behavior — putting them in the key changes the sort order and can make the index less useful for other queries. Also note that INCLUDE columns are not used for filtering or sorting; they are only stored in leaf pages for retrieval. For GiST indexes, INCLUDE support was added in PostgreSQL 12, and for SP-GiST in PostgreSQL 14; GIN indexes do not support INCLUDE.
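A quick way to confirm covering behavior, using the orders example from this answer:

create index idx_orders_cust_status_covering
    on orders (customer_id, status)
    include (total, created_at);

explain (analyze, buffers)
select total, created_at
from orders
where customer_id = 42
and status = 'shipped';
-- look for "Index Only Scan" with a low "Heap Fetches" count;
-- if Heap Fetches stays high, VACUUM the table so the visibility
-- map lets the scan skip the heap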
When should I use a partial index and how do I design one?
A partial index includes only the rows that satisfy a WHERE predicate in the index definition. This is useful when your queries consistently filter on a condition that selects a small fraction of the table. The classic example is a status column: if 95% of your rows have status = 'completed' and your queries only look for status = 'pending', a partial index like CREATE INDEX ON orders (created_at) WHERE status = 'pending' is much smaller and faster than indexing the entire table. PostgreSQL will use the partial index when the query's WHERE clause logically implies the index predicate. The match does not need to be syntactically identical — the planner can prove simple implications, such as a query predicate of created_at > '2024-06-01' implying an index predicate of created_at > '2024-01-01'. However, if you use a parameterized query like WHERE status = $1, the planner cannot prove at plan time that $1 will always be 'pending', so the partial index will not be used. In that case, you need the literal value in the query or a separate code path. Partial indexes are also valuable for unique constraints on subsets of data — for example, CREATE UNIQUE INDEX ON users (email) WHERE deleted_at IS NULL enforces uniqueness only among active users. When designing a partial index, check pg_stat_user_indexes after deployment to verify it is actually being scanned; if idx_scan stays at zero, the planner is not matching it.
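Both patterns from this answer as statements:

-- small, hot-subset index; only 'pending' rows are stored
create index idx_orders_pending_created
    on orders (created_at)
    where status = 'pending';

-- matched: the literal predicate implies the index predicate
select id from orders where status = 'pending' order by created_at;

-- not matched: with WHERE status = $1 the planner cannot prove
-- at plan time that $1 is always 'pending'

-- uniqueness enforced among active users only
create unique index users_active_email_uq
    on users (email)
    where deleted_at is null;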
How many indexes are too many on a PostgreSQL table?
There is no fixed maximum, but every index carries ongoing costs that must be justified by query benefits. Each index must be updated on every INSERT, UPDATE (of indexed columns), and DELETE, adding write latency and WAL volume. Indexes consume storage — often 20-50% of the table size per index for B-tree — and they require vacuum maintenance to clean up dead index entries. A table with 10+ indexes can see INSERT throughput drop by 3-5x compared to an unindexed table. To determine if an index is pulling its weight, query pg_stat_user_indexes for idx_scan — if an index has zero or near-zero scans over weeks of production traffic, it is likely a candidate for removal. Also watch for redundant indexes: an index on (a, b) makes a separate index on (a) redundant for most queries, since the composite index supports prefix lookups. Use the pgstatindex() function from the pgstattuple extension to check index bloat — a bloated index can be several times larger than necessary and should be rebuilt with REINDEX CONCURRENTLY. A pragmatic guideline: aim for the minimum set of indexes that covers your actual query patterns, review unused indexes quarterly, and always measure write impact when adding indexes to write-heavy tables.
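A starting point for that quarterly review (standard catalog views; interpret idx_scan over a representative traffic window, since the counters accumulate from the last statistics reset):

select schemaname,
       relname as table_name,
       indexrelname as index_name,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) as index_size
from pg_stat_user_indexes
order by idx_scan asc,
         pg_relation_size(indexrelid) desc;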
Ready to try it?
Use this tool for free — powered by PostgresAI.