
Too many connections troubleshooter

Diagnose connection exhaustion and get expert fixes for PostgreSQL connection limits

For people who are getting FATAL: too many connections errors in PostgreSQL and need to diagnose the cause, fix it immediately, and prevent it from recurring.

About this tool

The PostgreSQL error FATAL: too many connections for role "myuser" or FATAL: sorry, too many clients already is one of the most common production emergencies. It means every available connection slot is occupied and new clients are being refused. Existing sessions continue to work, but nothing that needs a new connection can be served, which for most applications amounts to an outage until slots are freed. The first step in any "too many connections" incident is to connect as a superuser (PostgreSQL reserves superuser_reserved_connections slots, default 3, for exactly this purpose) and query pg_stat_activity to understand what is consuming the slots.
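
A minimal first check from that reserved superuser session, reading the ceiling via current_setting, shows how close you are to the limit:

-- How many client slots are in use vs. the configured ceiling
select count(*) as client_backends,
current_setting('max_connections')::int as max_connections,
current_setting('superuser_reserved_connections')::int as reserved_for_superusers
from pg_stat_activity
where backend_type = 'client backend';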

The root cause is rarely that max_connections is simply too low. More often, connections are being consumed by idle sessions, leaked connections from application code that fails to close them, long-running transactions holding slots open, or the absence of a connection pooler between the application tier and PostgreSQL. Raising max_connections without understanding the underlying cause creates a worse problem: each PostgreSQL backend process consumes memory (typically 5-10 MB per idle connection for backend-local catalog and relation caches, plus work_mem and temp_buffers once it runs queries), and hundreds of active backends create lock contention, cache pressure, and process scheduling overhead that degrades performance for everyone. PostgreSQL's process-per-connection architecture means the kernel must context-switch between all active backends, and the shared buffer pool suffers increased contention on buffer mapping locks with more concurrent processes.
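
To ground the memory math, a quick look at the settings involved (illustrative only; the real per-backend cost depends on the workload):

-- Settings that drive per-backend memory consumption
select name, setting, unit
from pg_settings
where name in ('max_connections', 'work_mem', 'temp_buffers', 'shared_buffers');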

This tool helps you diagnose exactly what is consuming your connections by analyzing pg_stat_activity, breaking down connections by state (active, idle, idle in transaction, idle in transaction (aborted)), by user, by application, and by client address. It then provides targeted fixes: configuring PgBouncer in transaction mode for connection multiplexing, setting idle_in_transaction_session_timeout to kill abandoned transactions, tuning superuser_reserved_connections so you always have emergency access, applying per-role and per-database connection limits to prevent a single application from monopolizing slots, and configuring your ORM's connection pool to match your actual concurrency needs rather than defaulting to an excessively large pool size.

For production environments, the recommended architecture is to never let application servers connect directly to PostgreSQL. Instead, place PgBouncer or Pgpool-II between the application and the database. PgBouncer in transaction pooling mode can serve thousands of application connections with as few as 20-50 actual PostgreSQL backend connections, because most application connections are idle at any given moment. This dramatically reduces memory usage and eliminates the "too many connections" problem at its source. When configuring PgBouncer, critical parameters include default_pool_size (server connections per user/database pair), reserve_pool_size (overflow during spikes), and max_client_conn (total client connections PgBouncer accepts). Transaction mode is preferred for web workloads, but session-level features like LISTEN/NOTIFY, advisory locks, and prepared statements require session mode or direct connections.
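
As a sketch (assuming the listen_port and database entries from the example configuration further down, and a user listed in PgBouncer's admin_users), PgBouncer's admin console shows how well the multiplexing is working:

-- Connect to the PgBouncer admin console, e.g.:
--   psql -h 127.0.0.1 -p 6432 -U pgbouncer pgbouncer
show pools;    -- cl_active/cl_waiting (clients) vs sv_active/sv_idle (server connections)
show clients;  -- every application connection PgBouncer has accepted
show servers;  -- the actual backend connections held open to PostgreSQL
show stats;    -- per-database totals for transactions, queries, and wait time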

Connection leaks are the most insidious cause because they accumulate slowly. A connection that is opened but never returned to the pool (due to an exception path that skips the finally block, or a missing context manager in Python) will sit idle until the process exits or the TCP keepalive timeout fires. This tool helps you identify leaked connections by examining session age, state duration, and query patterns in pg_stat_activity, and provides guidance for configuring TCP keepalives (tcp_keepalives_idle, tcp_keepalives_interval, tcp_keepalives_count) and connection lifetime limits to automatically reclaim them. On the application side, every major ORM and connection pool library provides leak-detection or connection-lifetime settings — SQLAlchemy's pool_recycle and pool_pre_ping, HikariCP's leakDetectionThreshold, Django's CONN_MAX_AGE — that should be configured as a standard part of production deployment.
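
A sketch of how leaked connections can be attributed to their source, grouping on the same pg_stat_activity columns mentioned above:

-- Attribute long-idle connections to the service and host that opened them
select application_name, client_addr,
count(*) as idle_connections,
max(now() - state_change) as oldest_idle
from pg_stat_activity
where state = 'idle'
and backend_type = 'client backend'
group by application_name, client_addr
order by oldest_idle desc;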

Examples

-- Connection state breakdown
select state, count(*) as cnt
from pg_stat_activity
where backend_type = 'client backend'
group by state
order by cnt desc;

-- Connections by user and application
select usename, application_name, state, count(*)
from pg_stat_activity
where backend_type = 'client backend'
group by usename, application_name, state
order by count(*) desc;

-- Longest-running idle connections (potential leaks)
select pid, usename, application_name, client_addr,
state, state_change,
now() - state_change as idle_duration,
left(query, 80) as last_query
from pg_stat_activity
where state = 'idle'
and backend_type = 'client backend'
order by state_change asc
limit 20;

Diagnostic queries to run immediately when hitting connection limits. The first query shows how many connections are active vs. idle vs. stuck in transaction. The second identifies which applications and users are consuming the most connections. The third finds the oldest idle connections, which are likely leaked.
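
A complementary query (not part of the set above) finds the transactions that have been holding slots open the longest, which catches idle-in-transaction sessions as well as genuinely long-running work:

-- Longest-open transactions (slots held until commit/rollback)
select pid, usename, application_name, state,
now() - xact_start as transaction_age,
left(query, 80) as current_or_last_query
from pg_stat_activity
where xact_start is not null
and backend_type = 'client backend'
order by xact_start asc
limit 20;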

; PgBouncer config — /etc/pgbouncer/pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

; Transaction pooling — connections returned after each transaction
pool_mode = transaction

; Max connections PgBouncer opens to PostgreSQL (per database/user pair)
default_pool_size = 25

; Extra connections allowed during traffic spikes
reserve_pool_size = 5
reserve_pool_timeout = 3

; Max client connections PgBouncer accepts
max_client_conn = 2000

; Close server connections that have been idle this long
server_idle_timeout = 300

; Force-close server connections after this lifetime
server_lifetime = 3600

; Log connections/disconnections for leak debugging
log_connections = 1
log_disconnections = 1

A production PgBouncer configuration using transaction pooling mode. This setup allows up to 2000 application connections while using only 25-30 actual PostgreSQL connections per database/user pair. Transaction mode returns connections to the pool after each transaction completes, providing maximum multiplexing efficiency.

-- Server-level settings (max_connections and superuser_reserved_connections
-- require a restart; the timeouts and keepalives take effect on reload)
alter system set max_connections = '200';
alter system set superuser_reserved_connections = '5';
alter system set idle_in_transaction_session_timeout = '300s';
alter system set tcp_keepalives_idle = '60';
alter system set tcp_keepalives_interval = '10';
alter system set tcp_keepalives_count = '6';
-- Requires PostgreSQL 14+ (platform support varies on older releases)
alter system set client_connection_check_interval = '30s';
select pg_reload_conf();

-- Per-role connection limits
alter role app_backend connection limit 50;
alter role reporting_user connection limit 10;
alter role admin_user connection limit 5;

-- Per-database connection limits
alter database production connection limit 150;
alter database analytics connection limit 30;

Server configuration and connection limit statements. TCP keepalive settings detect dead connections faster (within ~120 seconds instead of the OS default of ~2 hours). Per-role and per-database limits ensure no single consumer can exhaust all connection slots.
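
To confirm which of these settings are live and which still need a restart, pg_settings exposes a pending_restart flag:

-- Check applied values and flag settings still waiting on a restart
select name, setting, pending_restart
from pg_settings
where name in ('max_connections', 'superuser_reserved_connections',
'idle_in_transaction_session_timeout', 'tcp_keepalives_idle',
'tcp_keepalives_interval', 'tcp_keepalives_count',
'client_connection_check_interval');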

Inputs and outputs

What you provide

  • pg_stat_activity output showing connection states, users, and applications
  • Current max_connections and superuser_reserved_connections settings
  • Connection pooler configuration (PgBouncer, Pgpool-II) if any
  • Application stack details (language, ORM, framework)

What you get

  • Root cause diagnosis with evidence from pg_stat_activity analysis
  • Emergency triage steps to restore service during an outage
  • PgBouncer or connection pooler configuration recommendations
  • SQL statements for connection limits, timeouts, and TCP keepalive settings
  • ORM-specific connection pool configuration guidance

Use cases

  • Diagnosing a production outage caused by "FATAL: sorry, too many clients already" by analyzing pg_stat_activity connection states
  • Setting up PgBouncer in transaction pooling mode to multiplex thousands of application connections onto a small backend pool
  • Identifying and fixing connection leaks in application code by examining idle connection age and client information
  • Configuring per-role and per-database connection limits to prevent a single application from exhausting all available slots
  • Tuning idle_in_transaction_session_timeout and statement_timeout to automatically reclaim stuck connections
  • Planning max_connections and related memory settings for a new production deployment based on expected concurrency

Features

  • Analyzes pg_stat_activity output to break down connections by state, user, application_name, and client_addr
  • Identifies connection leaks by examining session duration, state change timestamps, and query patterns
  • Provides PgBouncer configuration recommendations tailored to your workload (transaction vs. session mode)
  • Calculates safe max_connections values based on available memory and expected backend overhead
  • Generates ALTER ROLE and ALTER DATABASE statements for per-role and per-database connection limits
  • Recommends idle_in_transaction_session_timeout, tcp_keepalives, and client_connection_check_interval settings
  • Suggests ORM-specific connection pool configuration (SQLAlchemy, Django, Rails ActiveRecord, JDBC HikariCP)

Frequently asked questions

What causes "FATAL: sorry, too many clients already" in PostgreSQL and how do I fix it immediately?

This error occurs when the number of connected backends reaches max_connections (minus superuser_reserved_connections). For an immediate fix during an outage, connect as a superuser (the reserved connections are kept for this purpose) and identify what is consuming slots: run select state, count(*) from pg_stat_activity where backend_type = 'client backend' group by state;. The most common findings are: (1) hundreds of idle connections from an application without connection pooling or with a leaked pool — terminate them with select pg_terminate_backend(pid) from pg_stat_activity where state = 'idle' and state_change < now() - interval '10 minutes';, (2) many idle in transaction connections from abandoned transactions — terminate them similarly and set idle_in_transaction_session_timeout to prevent recurrence, or (3) a sudden spike in active connections from a traffic surge — this requires connection pooling (PgBouncer) as the long-term fix. Do not simply raise max_connections as a permanent solution. Each backend consumes memory and OS resources; PostgreSQL performs poorly with hundreds of truly active connections due to lock contention, context switching, and buffer pool pressure. The sustainable fix is to place a connection pooler in front of PostgreSQL and keep max_connections at a reasonable value (100-300 for most workloads).
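
A sketch of that triage step, with a guard so you do not terminate your own emergency session (adjust the interval and states to your situation):

-- Free slots held by sessions idle or idle in transaction for 10+ minutes
select pid, usename, application_name,
pg_terminate_backend(pid) as terminated
from pg_stat_activity
where backend_type = 'client backend'
and pid <> pg_backend_pid()
and state in ('idle', 'idle in transaction')
and state_change < now() - interval '10 minutes';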

Should I use PgBouncer in transaction mode or session mode, and what are the tradeoffs?

Transaction pooling mode (pool_mode = transaction) provides the best connection multiplexing because the server connection is returned to the pool after each transaction completes. This means 2000 application connections can share 25 server connections, as long as they are not all executing transactions simultaneously. However, transaction mode has restrictions: you cannot use session-level features like SET commands (use SET LOCAL inside transactions instead), PREPARE/EXECUTE for prepared statements (protocol-level prepared statements are supported in transaction mode via PgBouncer's max_prepared_statements setting, added in version 1.21), LISTEN/NOTIFY, temporary tables that persist across transactions, or advisory locks held across transactions. Session pooling mode (pool_mode = session) assigns a server connection for the entire client session, which supports all PostgreSQL features but provides much less multiplexing — a connection is only reused after the client disconnects. For most web applications and microservices, transaction mode is the correct choice because each HTTP request runs one or a few transactions and does not need session-level state. For long-lived connections that use session features (like background workers using LISTEN/NOTIFY), either use session mode for those specific pools or connect directly to PostgreSQL, bypassing PgBouncer.
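
A minimal illustration of the transaction-scoped alternative to session-level SET, which is safe under transaction pooling:

-- SET LOCAL lasts only until commit/rollback, so it cannot leak onto the
-- next client that reuses this pooled server connection
begin;
set local statement_timeout = '5s';
select count(*) from pg_stat_activity;  -- runs under the 5-second timeout
commit;  -- the setting is discarded with the transaction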

How do I detect and fix connection leaks in my application?

Connection leaks happen when application code obtains a connection from the pool but fails to return it — typically because an exception is thrown before the close() or release() call, or because a code path simply forgets to release. The leaked connection sits idle in PostgreSQL until the process exits or a timeout fires. To detect leaks, query pg_stat_activity for connections that have been idle for an unusually long time: select pid, usename, application_name, client_addr, now() - state_change as idle_time from pg_stat_activity where state = 'idle' and now() - state_change > interval '30 minutes' order by idle_time desc;. If you see connections idle for hours from an application that should only hold connections briefly, you have a leak. To identify the leaking code, use application_name (set it in your connection string to include the service name or version) and client_addr to narrow down which instance is leaking. In your application code, always use try-with-resources (Java), context managers (Python with), or ensure blocks (Ruby) to guarantee connections are returned. Configure your ORM pool with aggressive leak detection: SQLAlchemy pool_recycle=1800 and pool_pre_ping=True, HikariCP leakDetectionThreshold=60000, Django CONN_MAX_AGE=300. On the PostgreSQL side, set tcp_keepalives_idle=60, tcp_keepalives_interval=10, and tcp_keepalives_count=6 to detect dead TCP connections within about two minutes instead of the OS default of two hours.
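
One way to make that attribution easier, reusing the app_backend role from the earlier examples (the service name here is purely illustrative): tag every connection with the service that opened it.

-- Per-role default; alternatively set application_name in each service's
-- connection string, e.g. postgresql://app_backend@db:5432/mydb?application_name=orders-service
alter role app_backend set application_name = 'orders-service';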

How should I set max_connections and what is the memory impact per connection?

Each PostgreSQL backend process allocates memory for work_mem (used per sort/hash operation, can be multiple per query), temp_buffers (for temporary tables), catalog caches, and various per-session state. A completely idle connection uses roughly 5-10 MB, but an active connection running a complex query with multiple sort operations can easily use 50-200 MB or more (since work_mem is allocated per operation node, not per connection). The formula for a safe ceiling is: max_connections * estimated_memory_per_backend + shared_buffers + OS_needs < total_RAM. For a server with 16 GB RAM, 4 GB shared_buffers, and assuming 10 MB average per backend, you could support around 400 connections before memory pressure becomes a concern — but having 400 *active* connections will cause severe CPU and lock contention. A better approach: set max_connections to 200-300 and use PgBouncer to handle the application-side connection count. The superuser_reserved_connections setting (default 3, recommend 5) keeps slots available for emergency access. For cloud-managed PostgreSQL (RDS, Cloud SQL), the provider sets max_connections based on instance memory, but connection pooling is still essential. PostgreSQL 14 brought significant snapshot scalability improvements that make large numbers of mostly idle connections cheaper, but the fundamental advice remains: fewer active connections with pooling outperforms many direct connections.
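
A worked instance of that ceiling formula for the 16 GB example, with the OS allowance as an assumption:

-- total_RAM (16 GB) - shared_buffers (4 GB) - OS and page cache (~4 GB, assumed)
--   leaves roughly 8 GB of headroom for backends
-- idle backends cost ~5-10 MB each, but active ones can spike to 50-200 MB,
-- so a 200-300 connection ceiling behind PgBouncer is the safer target
alter system set max_connections = '250';              -- requires restart
alter system set superuser_reserved_connections = '5'; -- requires restart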

Ready to try it?

Use this tool for free — powered by PostgresAI.