PostgreSQL Deep Dive

PostgreSQL Deep Dive: Transaction ID Wraparound — The Day Your Database Stops Taking Writes

Imagine waking up to find your production database is accepting reads but refusing every INSERT, UPDATE, and DELETE. No error in your app code, no failed deployment — the database itself has gone into a self-imposed lockdown. Your users can see their data but can’t change anything. If this sounds like a nightmare scenario, welcome to transaction ID wraparound.

Yesterday we covered how MVCC uses xmin and xmax to determine row visibility. Today we are looking at the existential threat that emerges directly from that design: the fact that transaction IDs are finite, and what happens when they run out.

The 32-Bit Clock

PostgreSQL transaction IDs (XIDs) are 32-bit unsigned integers. That gives you roughly 4.3 billion unique IDs. Every transaction that modifies data consumes one XID, and the counter wraps around to zero after hitting the maximum. That wraparound is the problem.

Why? Because Postgres compares XIDs using modular arithmetic to decide which transactions are “in the past” and which are “in the future.” For any given XID, there are approximately two billion XIDs considered older and two billion considered newer. The comparison window is a circle, not a line.

Here is what that means in practice:

Current XIDA row with xmin = 100Visible?
1,000100 is 900 behind → past✅ Yes
2,147,483,748100 is ~2.1B behind → past✅ Yes
2,147,483,848100 is ~2.1B behind → just past✅ Yes (barely)
2,147,483,948100 is now ~2.1B ahead → futureData disappears

Once more than two billion transactions have elapsed since a row was inserted, that row’s xmin flips from “past” to “future.” The row still exists on disk, but it becomes invisible to every query. This is catastrophic, silent data loss.

Freezing: The Defense Mechanism

PostgreSQL’s solution is freezing — marking old rows as permanently visible, regardless of XID arithmetic.

When VACUUM processes a table, it can mark tuples as frozen. A frozen tuple is treated as if its xmin is FrozenTransactionId (a special XID value of 2), which is defined to be older than every normal XID. Once frozen, a tuple is visible to all transactions forever, completely sidestepping the wraparound problem.

In PostgreSQL 9.4 and later, freezing does not actually overwrite the xmin column. Instead, it sets a combo flag bit in the tuple header (HEAP_XMIN_FROZEN). This preserves the original xmin for forensic purposes while still getting the freeze benefit. If you query xmin on a frozen row, you will still see the original transaction ID — but the system knows to treat it as frozen.

You can observe frozen tuples using the pg_visibility extension:

CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- Check how many pages are all-frozen in a table
SELECT
  relblocknumber AS page,
  all_visible,
  all_frozen
FROM pg_visibility_map('users')
LIMIT 20;

Pages that are all_frozen in the visibility map are exempt from freeze scans — they have already been fully processed.

The Three Ages: Understanding the Freeze Lifecycle

Three configuration parameters control the freeze lifecycle, and understanding their relationship is critical:

1. vacuum_freeze_min_age (default: 50 million)

How old a tuple’s XID must be before VACUUM will consider freezing it. This avoids wasting effort freezing rows that might be updated again soon. A row with XID 100 won’t be frozen until the current XID is at least 50,000,100.

2. vacuum_freeze_table_age (default: 150 million)

When the oldest unfrozen XID in a table exceeds this age relative to the current XID, VACUUM switches to aggressive mode. In aggressive mode, VACUUM scans every page that is not already all-frozen (not just pages with dead tuples). This ensures all eligible tuples get frozen and the table’s relfrozenxid advances.

The effective maximum for this setting is 0.95 × autovacuum_freeze_max_age. Setting it higher gets capped.

3. autovacuum_freeze_max_age (default: 200 million)

The doomsday trigger. If any table’s oldest unfrozen XID exceeds this age, autovacuum forces an aggressive vacuum on that table — even if autovacuum is disabled. This is a non-negotiable safety mechanism.

The maximum time a table can go without being vacuumed is roughly 2 billion - vacuum_freeze_min_age transactions. Beyond that, data loss becomes possible.

Monitoring: Your Early Warning System

The key metric is the age of the oldest unfrozen XID in each table and database. The age() function computes the distance from a given XID to the current XID.

Per-table age

SELECT
  c.oid::regclass AS table_name,
  greatest(age(c.relfrozenxid), age(t.relfrozenxid)) AS xid_age,
  pg_size_pretty(pg_total_relation_size(c.oid)) AS size
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm')
ORDER BY greatest(age(c.relfrozenxid), age(t.relfrozenxid)) DESC
LIMIT 20;

Watch for tables where xid_age is approaching autovacuum_freeze_max_age. That is your early warning.

Per-database age

SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY age(datfrozenxid) DESC;

datfrozenxid is the minimum of all per-table relfrozenxid values in that database. If this number is high, you have at least one table that is not being frozen fast enough.

A monitoring query for your dashboard

SELECT
  c.oid::regclass AS table_name,
  age(c.relfrozenxid) AS xid_age,
  round(100.0 * age(c.relfrozenxid) /
    current_setting('autovacuum_freeze_max_age')::bigint, 2) AS pct_to_danger,
  pg_size_pretty(pg_total_relation_size(c.oid)) AS size
FROM pg_class c
WHERE c.relkind IN ('r', 'm')
  AND age(c.relfrozenxid) > 100000000  -- only show tables above 100M
ORDER BY pct_to_danger DESC;

Any table above 80% should be investigated immediately.

The Warning Signs

PostgreSQL does not go silently into the night. It gives you two escalating warnings:

Warning 1: ~40 million transactions remaining

WARNING: database "mydb" must be vacuumed within 39985967 transactions
HINT: To avoid XID assignment failures, execute a database-wide VACUUM in that database.

This appears in the server logs. If you are not monitoring your logs, you will miss it. Set up alerts on this pattern immediately.

Error: ~3 million transactions remaining

ERROR: database is not accepting commands that assign new XIDs to avoid wraparound data loss
HINT: Execute a database-wide VACUUM in that database.

At this point, the database is read-only. No writes are accepted. Existing transactions can continue to completion, but no new write transactions can start.

Recovery from this state

If you hit the error (not just the warning), follow these steps:

  1. Resolve old prepared transactions:

    SELECT * FROM pg_prepared_xacts WHERE age(transactionid) > 1000000;
    -- COMMIT PREPARED or ROLLBACK PREPARED for each
  2. Kill long-running transactions:

    SELECT pid, now() - xact_start AS duration, query, state
    FROM pg_stat_activity
    WHERE state IN ('idle in transaction', 'active')
      AND xact_start < now() - interval '5 minutes'
    ORDER BY duration DESC;
    -- pg_terminate_backend(pid) for the oldest ones
  3. Drop stale replication slots:

    SELECT slot_name, plugin, slot_type, active, age(xmin) AS xmin_age
    FROM pg_replication_slots;
    -- DROP_REPLICATION_SLOT for old inactive slots
  4. Run VACUUM (not VACUUM FULL, not VACUUM FREEZE):

    -- As a superuser
    VACUUM;

    Do NOT use VACUUM FULL — it requires an XID and will fail (or make things worse). Do NOT use VACUUM FREEZE — it does more work than necessary.

  5. Fix your autovacuum config so this does not happen again.

Multixact Wraparound: The Other Ticking Clock

Transaction IDs are not the only 32-bit counter that can wrap around. Multixact IDs (MXIDs) are used when multiple transactions lock the same row simultaneously. They are stored in the xmax field as a single ID that maps to a list of member transactions in pg_multixact.

MXIDs have the same 32-bit limitation and the same wraparound risk. The same freezing mechanism handles both. The relevant parameters are vacuum_multixact_freeze_min_age, vacuum_multixact_freeze_table_age, and autovacuum_multixact_freeze_max_age.

MXID wraparound is rarer but can happen on workloads with heavy SELECT ... FOR UPDATE or SELECT ... FOR SHARE usage. Monitor with:

SELECT c.oid::regclass, mxid_age(c.relminmxid) AS mxid_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm')
ORDER BY mxid_age DESC
LIMIT 10;

Tuning for Large Tables

For very large, mostly-static tables (think: append-only event logs, dimension tables), aggressive vacuums can be expensive. Here are some tuning strategies:

Increase autovacuum_freeze_max_age: The default is 200 million. For databases with lots of static data, increasing this to 1 billion or higher reduces how often aggressive vacuums run. The trade-off is that pg_xact grows (roughly 50MB per 200M transactions).

ALTER SYSTEM SET autovacuum_freeze_max_age = 1000000000;
-- Requires restart

Per-table tuning: You can set freeze parameters per table:

-- For a large, static archive table
ALTER TABLE events SET (autovacuum_freeze_max_age = 1500000000);
ALTER TABLE events SET (vacuum_freeze_table_age = 1400000000);

Insert-only tables and autovacuum_freeze_min_age: For tables that receive INSERTs but no UPDATEs or DELETEs, lowering autovacuum_freeze_min_age lets earlier vacuums freeze rows sooner, reducing the work needed for aggressive vacuums:

ALTER TABLE events SET (autovacuum_freeze_min_age = 10000000);

Key Takeaways

  1. XID wraparound is real and will happen if vacuuming stops — it is not theoretical. Every write transaction consumes one of 4.3 billion XIDs.
  2. Autovacuum handles this automatically in almost all cases. Do not disable it. The forced anti-wraparound vacuum runs even when autovacuum is “off.”
  3. Monitor age(relfrozenxid) for your largest and most static tables. Set up alerts at 75% of autovacuum_freeze_max_age.
  4. Long-running transactions, stale replication slots, and abandoned prepared transactions are the most common causes — they prevent vacuum from freezing old tuples.
  5. MXID wraparound is the quieter cousin — monitor it too if you use row-level locking (SELECT FOR UPDATE/SHARE).
  6. Recovery is possible without downtime — do not panic and do not reach for VACUUM FULL. Follow the recovery steps above.

What’s Next

Tomorrow we will look at autovacuum internals and tuning — now that you understand why vacuum must run, we will dig into how autovacuum decides when to run, how to read its behavior in pg_stat_user_tables, and how to tune it for tables that matter most.


This post is part of the PostgreSQL Deep Dive series, published weekdays at 9am AEST. Catch up on the full series or browse by PostgreSQL tag.