pg_archivecleanup: Must Specify Oldest Kept WAL File
A cron entry that passes only the archive location, such as

    pg_archivecleanup /var/lib/postgresql/archive/

fails on every run, and the failure goes unnoticed unless the job implements error handling (checking the exit status or capturing stderr). Consequently, the archive grows without bound until the disk fills.

| Scenario | Result |
|----------|--------|
| pg_archivecleanup called with only one argument | Cleanup not performed; archive accumulates forever |
| Script assumes it succeeds | Disk space exhaustion; archive_command starts failing, WAL piles up in pg_wal, and PostgreSQL can eventually stop |
| Using -n (dry run) without the required argument | Dry run also fails, providing a false sense of testing |
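One way to keep a cron job from failing unnoticed is to wrap the call so that a missing argument or a non-zero exit status is always reported. The sketch below is a minimal illustration rather than a production script: the function name run_cleanup and the CLEANUP_BIN override are hypothetical names introduced here, and the only real command it relies on is pg_archivecleanup with its two positional arguments.

```bash
#!/bin/bash
# Hypothetical cron wrapper; function and variable names are illustrative.

# CLEANUP_BIN can be overridden (for example in tests); defaults to the real utility.
CLEANUP_BIN="${CLEANUP_BIN:-pg_archivecleanup}"

# run_cleanup ARCHIVE_DIR OLDEST_KEPT_WAL
# Returns 0 on success, 2 if an argument is missing, 1 if the utility fails.
run_cleanup() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "run_cleanup: archive location and oldest kept WAL file are both required" >&2
        return 2
    fi
    # Pass both arguments explicitly and surface any failure to the caller.
    if ! "$CLEANUP_BIN" "$1" "$2"; then
        echo "run_cleanup: $CLEANUP_BIN failed for $1 (oldest kept: $2)" >&2
        return 1
    fi
    return 0
}
```

A cron script would call run_cleanup with both arguments and exit with its status, so that cron's mail or whatever monitoring wraps the job sees the failure instead of a silent no-op.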
Ensuring Safety in Physical Replication: Why pg_archivecleanup Must Specify the Oldest Kept WAL File

Abstract

pg_archivecleanup is a critical PostgreSQL utility for managing Write-Ahead Log (WAL) archives in streaming replication and log-shipping setups. Misuse of this tool, specifically omitting the oldest kept WAL file argument, means that no cleanup is performed; the unbounded archive growth and manual deletions that follow can lead to catastrophic data loss, replica failure, or broken recovery chains. This paper explains the internal design of pg_archivecleanup, demonstrates the consequences of improper invocation, and establishes a formal requirement: the oldest kept WAL file argument is not optional but a safety necessity. We provide usage patterns, error analysis, and a recommendation for wrapper scripts or monitoring.

1. Introduction

PostgreSQL's physical replication relies on continuously archived WAL files. The utility pg_archivecleanup is designed to clean up WAL files from the archive directory after they are no longer needed for recovery or replica catch-up. Its signature is:

    pg_archivecleanup [option...] archivelocation oldestkeptwalfile

If oldestkeptwalfile is omitted, the utility refuses to run and exits with the error "must specify oldest kept WAL file".
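For example, a manual invocation that omits the second argument:

    pg_archivecleanup /archive/    # OOPS - no oldestkeptwal

Because the command fails, the archive still contains all 100 files. The replica recovers correctly. The hidden danger emerges when the DBA, frustrated, forces a deletion using rm and incorrectly guesses the oldest needed file. The requirement forces the DBA to be explicit.

6.1 Correct Usage in recovery_end_command

    recovery_end_command = 'pg_archivecleanup /mnt/archive %r'

%r is replaced by PostgreSQL with the name of the file containing the last valid restart point, that is, the oldest WAL file still required by the standby. Note that recovery_end_command runs only once, when recovery ends. On a running standby, archive_cleanup_command accepts the same %r placeholder and is executed at every restartpoint, which makes it the usual place for this call.

6.2 Manual Cron Script with Safety Check

    #!/bin/bash
    ARCHIVE="/var/lib/pgsql/archive"
    # Simplistic: takes the lexically first file in the archive; use pg_controldata instead.
    OLDEST_REQUIRED=$(ls -1 "$ARCHIVE" | head -1)
    if [ -z "$OLDEST_REQUIRED" ]; then
        echo "No WAL files found" >&2
        exit 1
    fi
    # Pass both arguments explicitly; a failed cleanup must fail the job.
    if ! pg_archivecleanup "$ARCHIVE" "$OLDEST_REQUIRED"; then
        echo "Cleanup failed" >&2
        exit 1
    fi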
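The script's own comment points to pg_controldata as a better source for the oldest required file than ls piped to head. A minimal sketch of that idea follows, assuming the English-language pg_controldata output, which includes a line labelled "Latest checkpoint's REDO WAL file". The function name redo_wal_from_controldata is hypothetical. Treat the result as an approximation: it reflects only the server whose data directory is inspected, and when several standbys share one archive, the oldest value across all of them is the one that matters.

```bash
# redo_wal_from_controldata: read pg_controldata output on stdin and print
# the WAL file named on the "Latest checkpoint's REDO WAL file" line.
# Hypothetical helper; assumes untranslated (LC_ALL=C) pg_controldata output.
redo_wal_from_controldata() {
    sed -n "s/^Latest checkpoint's REDO WAL file:[[:space:]]*//p"
}
```

It could be used as `LC_ALL=C pg_controldata "$PGDATA" | redo_wal_from_controldata`, with the caller still checking that the result is non-empty before handing it to pg_archivecleanup.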
Always include the oldest kept argument, even in dry runs.
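A real dry run takes the same two arguments as a real cleanup, for instance `pg_archivecleanup -n /mnt/archive 000000010000000000000010` (the path and WAL file name here are illustrative). To show what "older than the oldest kept file" means without touching a live archive, the sketch below approximates the selection in plain shell. It rests on assumptions that should be checked against the real tool: it considers only names made of exactly 24 uppercase hex digits, it compares names as strings while skipping the 8-character timeline prefix (which is how the utility is understood to behave), and it ignores .partial and .backup files entirely. The function name would_remove is hypothetical.

```bash
# would_remove ARCHIVE_DIR OLDEST_KEPT_WAL
# Prints, one per line, the plain WAL segment names in ARCHIVE_DIR that sort
# before OLDEST_KEPT_WAL. Approximation only; it never deletes anything.
would_remove() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "would_remove: must specify oldest kept WAL file" >&2
        return 2
    fi
    ls -1 "$1" | LC_ALL=C awk -v oldest="$2" '
        # Keep only names made of exactly 24 uppercase hex digits.
        length($0) == 24 && $0 ~ /^[0-9A-F]+$/ {
            # Force string comparison and skip the 8-character timeline prefix.
            if ((substr($0, 9) "") < (substr(oldest, 9) "")) print
        }' | LC_ALL=C sort
}
```

Comparing this list with the output of the real `-n` run is a cheap sanity check before the first destructive invocation; where the two disagree, the real tool is authoritative.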
No scenario leads to incorrect deletion, but the operator may believe cleanup occurred. The safety design is correct; it is the user's expectation that fails. Consider a replica that has fallen behind by 100 WAL files, where the DBA manually runs pg_archivecleanup with only the archive location and no oldest kept WAL file.