Backup strategy planner
Design a PostgreSQL backup and recovery plan matched to your RPO, RTO, and infrastructure
For DBAs and engineers who need to design or improve a PostgreSQL backup strategy, choose between backup tools, define retention policies, and ensure they can meet their recovery objectives.
About this tool
Choosing the right PostgreSQL backup strategy is a decision that balances data safety, recovery speed, operational complexity, and storage cost. The landscape of options is broad — logical backups with pg_dump, physical base backups with pg_basebackup, continuous WAL archiving for point-in-time recovery, and dedicated backup managers like pgBackRest, WAL-G, and Barman — and the right combination depends on factors specific to your environment: database size, change rate, acceptable data loss (RPO), required recovery time (RTO), compliance requirements, and team expertise.
This tool helps you design a backup strategy from first principles. Describe your PostgreSQL deployment — database size, number of instances, hosting environment, current backup approach (if any), and your recovery objectives — and get a concrete plan that specifies which tools to use, how to configure them, what schedule to follow, and how to verify that backups actually work.
A common mistake is treating backup and recovery as the same problem. Taking backups is the easy part; restoring under pressure at 3 AM when production is down is where strategies fail. This tool emphasizes the recovery side: how to test restores regularly, how to measure actual RTO, how to set up monitoring and alerting for backup failures, and how to document runbooks so any team member can execute a recovery. It also helps you avoid a second common failure mode — discovering during an emergency that your backup is corrupt, incomplete, or incompatible with the target PostgreSQL version.
The tool covers the full spectrum of PostgreSQL backup approaches. Logical backups with pg_dump are portable and flexible — they work across major versions and let you restore individual tables or databases — but they are slow for large databases and provide no point-in-time recovery. The custom format (-Fc) supports compression and parallel restore with pg_restore -j, while the directory format (-Fd) enables both parallel dump and parallel restore, making it the fastest option for logical backups on multi-core systems. Physical backups with pg_basebackup are faster and capture the entire cluster, but they require the same PostgreSQL major version for restore. Starting with PostgreSQL 17, pg_basebackup --incremental supports incremental physical backups natively, reducing both backup time and storage.
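As an illustration of the PostgreSQL 17 feature, an incremental chain might look like the following sketch; the paths are placeholders and the server must have WAL summarization enabled first:
# postgresql.conf: required before incremental backups can be taken (PostgreSQL 17+)
summarize_wal = on
# Full base backup, then an incremental backup referencing its manifest
pg_basebackup -D /backups/full -c fast
pg_basebackup -D /backups/incr1 --incremental=/backups/full/backup_manifest
# Reconstruct a restorable data directory from the chain
pg_combinebackup /backups/full /backups/incr1 -o /restore/combined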
Continuous WAL archiving enables PITR — the ability to recover to any point in time, not just the moment a backup was taken — which is essential for meeting tight RPO requirements. By shipping WAL segments to a backup repository continuously, you can recover to the second before an accidental DROP TABLE or a bad deployment. The archive_timeout parameter caps how long a partially filled WAL segment can sit before being switched and archived; setting it to 60 seconds keeps the archive at most about a minute behind the primary, provided the archive command itself keeps up.
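A minimal archiving setup in postgresql.conf looks roughly like the following sketch; the pgBackRest stanza name is an assumption and would match your repository configuration:
# postgresql.conf: continuous WAL archiving (sketch; stanza name "main" is a placeholder)
wal_level = replica
archive_mode = on
archive_command = 'pgbackrest --stanza=main archive-push %p'
archive_timeout = 60   # force a WAL segment switch at least every 60 seconds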
Tools like pgBackRest build on this foundation and add incremental and differential backups, parallel backup and restore, backup verification with checksums, AES-256 encryption, and cloud storage integration with S3, GCS, and Azure Blob Storage, making them the standard choice for production environments. WAL-G offers a lighter-weight alternative with strong cloud-native integration, while Barman excels at centralized management of many PostgreSQL instances from a single backup server.
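For comparison with the pgBackRest examples further down, a minimal WAL-G setup follows this general shape; the bucket, paths, and credentials handling are placeholders rather than a complete configuration:
# Environment for WAL-G (must be visible to the server process and to backup jobs)
export WALG_S3_PREFIX=s3://pg-backups-prod/walg
export AWS_REGION=us-east-1
# postgresql.conf: ship WAL through WAL-G
archive_mode = on
archive_command = 'wal-g wal-push %p'
# Base backup, repository listing, and restore of the latest backup
wal-g backup-push /var/lib/postgresql/16/main
wal-g backup-list
wal-g backup-fetch /var/lib/postgresql/16/main LATEST
# During recovery, set restore_command = 'wal-g wal-fetch %f %p'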
Beyond tool selection, the planner addresses retention policies (how many full, differential, and incremental backups to keep, and for how long), compression trade-offs (zstd vs. lz4 vs. gzip — balancing CPU usage against compression ratio), parallel backup and restore configuration for large databases, backup catalog management, and integration with monitoring systems to alert on missed or failed backups. For teams running PostgreSQL on Kubernetes, it also covers operator-specific backup patterns with CloudNativePG, Zalando Postgres Operator, and similar projects that integrate pgBackRest or WAL-G into their lifecycle management.
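For the Kubernetes case, one common pattern with CloudNativePG declares backups as Kubernetes resources. The following is a hypothetical sketch: cluster name, bucket, secret names, and retention are placeholders.
# Cluster resource: point backups at object storage and set retention
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-prod
spec:
  instances: 3
  storage:
    size: 100Gi
  backup:
    retentionPolicy: "30d"
    barmanObjectStore:
      destinationPath: s3://pg-backups-prod/cnpg
      s3Credentials:
        accessKeyId:
          name: backup-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-creds
          key: SECRET_ACCESS_KEY
---
# ScheduledBackup resource: nightly backup at 02:00 (six-field cron, seconds first)
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: pg-prod-nightly
spec:
  schedule: "0 0 2 * * *"
  backupOwnerReference: self
  cluster:
    name: pg-prod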
- Always test restore procedures on a non-production instance before relying on them.
- Store encryption passphrases and backup credentials separately from the backups — losing the passphrase means losing the backup.
- Never restore a physical backup directly onto a running production data directory without stopping PostgreSQL first.
- Keep at least one backup copy in a separate location, region, or cloud account to protect against site-level failures.
- Verify backup integrity regularly — an unverified backup may be corrupt and unusable when you need it most.
Examples
# pgBackRest — full backup with verification
pgbackrest --stanza=main --type=full backup
# pgBackRest — incremental backup
pgbackrest --stanza=main --type=incr backup
# Verify the most recent backup
pgbackrest --stanza=main --set=latest verify
# Restore to a specific point in time
pgbackrest --stanza=main --type=time \
--target="2026-02-13 14:30:00+00" \
--target-action=promote restore
# Check backup info / catalog
pgbackrest --stanza=main info
Core pgBackRest commands showing a typical workflow: scheduled full and incremental backups, verification to confirm backup integrity, and PITR restore to a specific timestamp. pgBackRest is the most widely recommended backup tool for production PostgreSQL.
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-type=s3
repo1-s3-bucket=pg-backups-prod
repo1-s3-region=us-east-1
repo1-s3-endpoint=s3.amazonaws.com
repo1-path=/pgbackrest
repo1-retention-full=4
repo1-retention-diff=14
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=<encryption-passphrase>
compress-type=zst
compress-level=3
process-max=4
log-level-console=info
log-level-file=detail
[main]
pg1-path=/var/lib/postgresql/16/main
pg1-port=5432
A production pgBackRest configuration using S3 storage with encryption, zstd compression, parallel processing (4 jobs), and a retention policy keeping 4 full and 14 differential backups. This is a solid starting template for most production deployments.
# pg_dump — custom format (compressed, parallel-restorable)
pg_dump -Fc -Z4 -f /backups/mydb.dump mydb
# pg_dump — directory format (parallel dump and restore)
pg_dump -Fd -j8 -f /backups/mydb_dir mydb
# Restore with parallelism
pg_restore -j8 -d mydb_restored /backups/mydb_dir
# pg_basebackup — streaming base backup
pg_basebackup -D /backups/base -Ft -z -Xs -P -c fast
# Verify pg_dump backup is readable
pg_restore -l /backups/mydb.dump > /dev/null && echo "OK"
Built-in PostgreSQL backup commands: pg_dump in custom format for single-database logical backups with compression and parallelism, directory format for maximum parallel restore speed, and pg_basebackup for physical cluster-level backups. Each approach has different trade-offs for speed, flexibility, and recovery capabilities.
Inputs and outputs
What you provide
- Database size and number of instances
- PostgreSQL version and hosting environment
- RPO (acceptable data loss) and RTO (acceptable recovery time)
- Current backup approach (if any)
- Compliance or retention requirements
What you get
- Recommended backup tool stack with configuration examples
- Backup schedule (full, incremental, differential, WAL archiving)
- Retention policy with estimated storage requirements
- Backup verification and restore testing procedures
- Monitoring and alerting checklist
Use cases
- Designing a backup strategy for a new production PostgreSQL deployment with defined RPO and RTO targets
- Migrating from a basic pg_dump cron job to continuous WAL archiving with PITR capability
- Evaluating pgBackRest vs. WAL-G vs. Barman for a specific environment and choosing the right tool
- Defining retention policies that meet compliance requirements while controlling storage costs
- Setting up backup verification and automated restore testing to ensure recoverability
- Planning parallel backup and restore for large databases (500 GB+) to minimize backup windows and recovery time
Features
- Recommends specific backup tools and configurations based on database size, RPO/RTO, and infrastructure
- Generates backup schedules combining full, incremental, and differential backups with WAL archiving
- Provides pgBackRest, WAL-G, and Barman configuration examples tailored to the described environment
- Calculates estimated backup sizes, durations, and storage requirements based on database characteristics
- Designs retention policies balancing recovery flexibility, compliance, and storage cost
- Includes backup monitoring and alerting recommendations with specific metrics to track
- Covers PITR setup and testing procedures for continuous WAL archiving
Frequently asked questions
What is the difference between pg_dump, pg_basebackup, and continuous WAL archiving?
pg_dump creates logical backups — a SQL-level snapshot of a single database. It is portable across PostgreSQL major versions, allows restoring individual tables, and supports custom format with compression and parallel restore. However, it is slow on large databases (hundreds of gigabytes can take hours), causes I/O pressure, and can only restore to the exact moment the dump started — there is no point-in-time recovery.
pg_basebackup creates physical backups — a binary copy of the entire PostgreSQL data directory (all databases in the cluster). It is faster than pg_dump for large databases and includes everything needed to start a new PostgreSQL instance, but the backup can only be restored to the same major version and the same architecture.
Continuous WAL archiving works alongside physical base backups: you take periodic base backups and continuously ship WAL (Write-Ahead Log) segments to a backup location. During recovery, PostgreSQL replays WAL on top of a base backup, allowing you to stop at any point in time (PITR). This is the foundation of production backup strategies because it lets you recover to the moment just before an accidental DROP TABLE or data corruption, not just the last scheduled backup.
Most production setups combine all three: continuous WAL archiving with periodic base backups for PITR, plus pg_dump of critical databases for portability and logical-level safety.
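As a rough sketch of what PITR looks like at the PostgreSQL level (version 12 or newer, with pgBackRest serving the archive; the stanza, data directory, and timestamp are placeholders): restore a base backup into the data directory, then configure and start targeted recovery:
# postgresql.conf (or postgresql.auto.conf) in the restored data directory
restore_command = 'pgbackrest --stanza=main archive-get %f "%p"'
recovery_target_time = '2026-02-13 14:30:00+00'
recovery_target_action = 'promote'
# Signal file that puts the server into recovery, then start it
touch /var/lib/postgresql/16/main/recovery.signal
pg_ctl -D /var/lib/postgresql/16/main start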
How do I choose between pgBackRest, WAL-G, and Barman for backup management?
All three are mature, production-proven tools, but they target slightly different use cases. pgBackRest is the most feature-complete: it supports full, differential, and incremental backups, parallel backup and restore, multiple repositories (local, S3, GCS, Azure), encryption, backup verification, and a detailed backup catalog. It is the default recommendation for most production environments and is included in many PostgreSQL distributions.
WAL-G (originally developed at Citus/Microsoft) focuses on cloud-native deployments with a strong emphasis on S3-compatible storage and tight integration with cloud object stores. It supports delta backups (page-level incrementals) and built-in compression, and it is popular in containerized and Kubernetes environments. Its configuration is simpler than pgBackRest's, which can be an advantage for smaller teams.
Barman (by EDB) provides centralized backup management — it runs on a dedicated backup server and can manage backups for multiple PostgreSQL instances. It supports both streaming and rsync-based backup modes, PITR, and has strong retention management. It is well-suited for enterprises managing many PostgreSQL clusters from a central location.
The practical decision often comes down to: pgBackRest if you want the widest feature set and community adoption; WAL-G if you are cloud-native with simple requirements; Barman if you need centralized management of many instances from one server.
How should I define RPO and RTO, and how do they affect my backup strategy?
RPO (Recovery Point Objective) is the maximum amount of data loss you can tolerate, measured in time. An RPO of 1 hour means you accept losing up to 1 hour of transactions. An RPO of zero means no data loss is acceptable. RTO (Recovery Time Objective) is the maximum time from the start of a failure to the moment the database is operational again. An RTO of 30 minutes means the application must be back online within half an hour. These two numbers fundamentally shape your backup architecture.
For RPO: if you only run nightly pg_dump jobs, your RPO is up to 24 hours. Continuous WAL archiving reduces RPO to minutes (the interval between WAL segment shipments, often 1-5 minutes with archive_timeout). Synchronous streaming replication to a standby achieves RPO of zero.
For RTO: restoring a large logical backup with pg_restore can take hours, while promoting a streaming replica takes seconds. Physical backup restore with pgBackRest using parallel restore and delta restore (only restoring changed files) falls in between.
A common production strategy for moderate requirements (RPO < 5 minutes, RTO < 1 hour) is: pgBackRest with continuous WAL archiving, daily incremental backups, weekly full backups, and a warm standby replica. For near-zero RPO/RTO, add synchronous streaming replication with automatic failover via Patroni or pg_auto_failover.
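For the near-zero-RPO end of that spectrum, synchronous replication is enabled on the primary roughly as follows; the standby name is an assumption:
# postgresql.conf on the primary: synchronous replication sketch
synchronous_commit = on
synchronous_standby_names = 'FIRST 1 (standby1)'   # commits wait for standby1 to confirm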
How do I verify that my PostgreSQL backups actually work?
Untested backups are not backups — they are hopes. Verification should be automated and regular, not a manual process you run once a year. There are several levels of verification.
Level 1 is catalog verification: tools like pgBackRest have a verify command that checks backup file integrity against stored checksums without performing a full restore. Run this after every backup. Level 2 is automated restore testing: schedule a periodic job (weekly or daily for critical databases) that restores the latest backup to a temporary instance, runs basic sanity checks (table counts, critical query execution, pg_catalog consistency), and tears down the instance. This proves the backup is actually restorable.
Level 3 is PITR testing: restore to a specific timestamp, verify that a known transaction from that time is present, and verify that a transaction from after that time is absent. This confirms that WAL replay is working correctly. Level 4 is RTO measurement: time the full restore process end-to-end and compare it to your RTO target. If restoration takes 4 hours but your RTO is 1 hour, your strategy needs to change.
Monitoring should cover: backup completion (did today's backup finish?), backup duration (is it taking longer than usual, suggesting growth or I/O problems?), backup size (sudden drops may indicate missing data), WAL archiving lag (are segments being shipped on time?), and verification results. Alert immediately on any failure — a missed backup alert that sits unnoticed for a week defeats the purpose.
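A hypothetical weekly restore-test job covering levels 2 and 4 could follow this sketch; the stanza, scratch directory, port, database, and sanity query are all placeholders:
#!/bin/bash
# Restore the latest backup into a scratch directory, start a throwaway instance,
# run a sanity query, and report how long the restore took (an approximate RTO).
set -euo pipefail

RESTORE_DIR=/var/lib/postgresql/restore-test
PORT=5433
START=$(date +%s)

rm -rf "$RESTORE_DIR"
pgbackrest --stanza=main --pg1-path="$RESTORE_DIR" restore

pg_ctl -D "$RESTORE_DIR" -o "-p $PORT" -w -t 3600 start
psql -p "$PORT" -d mydb -c "SELECT count(*) FROM critical_table;"   # basic sanity check
pg_ctl -D "$RESTORE_DIR" -m fast stop

echo "Restore test completed in $(( $(date +%s) - START )) seconds"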
Ready to try it?
Use this tool for free — powered by PostgresAI.