06. Replication and HA
Monitor replication lag, slot activity, and the overall high availability posture of your PostgreSQL cluster.
Note: This dashboard is currently under development. In the live demo, it shows a placeholder. The metrics are being collected and the dashboard will be populated in a future release.
When to use
- Diagnosing replication lag between primary and replicas
- Monitoring replication slot growth to prevent WAL retention issues
- Validating HA posture after failover or configuration changes
Key panels (planned)
- Replication lag (seconds and bytes) — delay between primary and each replica
- Slot activity and retention — active vs inactive slots, WAL retained per slot
- Replica state and sync status — streaming, applying, or disconnected
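Until the panels ship, both lag measures can be read from `pg_stat_replication` on the primary. The following is a minimal sketch, assuming PostgreSQL 10 or later column names (`replay_lsn`, `replay_lag`). The query text and the helper names `lsn_to_int` and `lag_bytes` are illustrative assumptions, not necessarily what the dashboard will use.

```python
# Illustrative query, to be run on the primary. This is an assumption about
# how the lag could be measured, not the dashboard's actual query.
LAG_QUERY = """
SELECT application_name,
       state,
       sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes,
       EXTRACT(EPOCH FROM replay_lag) AS lag_seconds
FROM pg_stat_replication;
"""


def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers: the high 32 bits and
    the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL the replica has not replayed yet.

    Same idea as pg_wal_lsn_diff(primary_lsn, replay_lsn), computed
    client-side.
    """
    return lsn_to_int(primary_lsn) - lsn_to_int(replay_lsn)
```

Byte lag shows how much WAL is outstanding even when the primary is idle, while `replay_lag` can be NULL when there has been no recent activity, so it helps to look at both.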
What good looks like
- Replication lag is near zero or within your SLA
- No inactive replication slots growing unbounded (these retain WAL and can fill disk)
- All replicas are in `streaming` state
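The replica-side criteria above can be expressed as a small check over rows read from `pg_stat_replication`. This is a sketch under stated assumptions: the row keys (`application_name`, `state`, `lag_seconds`), the function name `replica_problems`, and the 5-second default threshold are all hypothetical, so substitute your own SLA.

```python
from typing import Any, Iterable, List, Mapping, Sequence


def replica_problems(
    replicas: Iterable[Mapping[str, Any]],
    expected_names: Sequence[str] = (),
    max_lag_seconds: float = 5.0,  # hypothetical SLA; choose your own
) -> List[str]:
    """Return human-readable findings; an empty list means 'looks good'."""
    rows = list(replicas)
    seen = {row["application_name"] for row in rows}

    # A disconnected replica has no row in pg_stat_replication at all,
    # so it can only be detected by comparing against an expected list.
    problems = [f"{name}: not connected" for name in expected_names if name not in seen]

    for row in rows:
        name = row["application_name"]
        if row["state"] != "streaming":
            problems.append(f"{name}: state is {row['state']}, expected streaming")
        lag = row.get("lag_seconds")  # may be None when the lag is unknown
        if lag is not None and float(lag) > max_lag_seconds:
            problems.append(
                f"{name}: lag {float(lag):.1f}s exceeds {max_lag_seconds:.1f}s"
            )
    return problems
```

For example, calling `replica_problems(rows, expected_names=["replica1", "replica2"])` after a failover gives a quick list of replicas that have not reconnected or are still catching up.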
What to investigate
| Signal | Next step |
|---|---|
| Growing replication lag | Check replica load, network bandwidth, and `max_wal_senders` |
| Inactive slot retaining WAL | Drop unused slots or investigate why the consumer disconnected |
| Replica not streaming | Check `pg_stat_replication` on the primary and review the replica's logs |
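For the inactive-slot signal, the WAL retained per slot can be estimated from `pg_replication_slots`. The sketch below assumes PostgreSQL 10 or later function names and is meant to run on the primary. The query, the helper name `slots_to_review`, and the 1 GiB default threshold are illustrative assumptions rather than the dashboard's definition.

```python
from typing import Any, Iterable, List, Mapping

# Illustrative query, to be run on the primary. restart_lsn can be NULL,
# in which case retained_bytes is NULL as well.
SLOT_QUERY = """
SELECT slot_name,
       slot_type,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots;
"""


def slots_to_review(
    slots: Iterable[Mapping[str, Any]],
    max_retained_bytes: int = 1024**3,  # hypothetical threshold: 1 GiB
) -> List[Mapping[str, Any]]:
    """Return inactive slots retaining more WAL than the threshold,
    largest first."""
    flagged = [
        slot
        for slot in slots
        if not slot["active"] and (slot["retained_bytes"] or 0) > max_retained_bytes
    ]
    return sorted(flagged, key=lambda slot: slot["retained_bytes"], reverse=True)
```

Once a flagged slot is confirmed to be unused, it can be removed with `SELECT pg_drop_replication_slot('slot_name');`. If its consumer is expected to return, investigate the disconnect first, because dropping the slot discards the consumer's position.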
Related Checkup checks
- A004 — cluster information