PostgreSQL Deep Dive: Foreign Key Locking — Why Missing FK Indexes Kill Throughput

You have a straightforward parent-child relationship: orders references customers. Inserts into orders are fast, deletes from customers are rare, and everything has been running fine for months. Then traffic ramps up. You start seeing sporadic lock timeouts. pg_stat_activity shows sessions blocked on orders rows they never explicitly touched. Deadlock reports start landing in your alerts. The FK constraint you added months ago is the culprit, and the fix takes about thirty seconds.

This is the story of foreign key locking in PostgreSQL: a mechanism most developers treat as a simple declarative check, but one that silently acquires locks across two tables simultaneously, and that can cascade into severe contention if one critical index is missing.

How FK Locking Actually Works

Foreign key enforcement in PostgreSQL is not a passive constraint. Every write to either side of the relationship triggers lock acquisition on the other side.

Writes to the Child Table

When you INSERT or UPDATE a row in the child table (the table with the FK reference), PostgreSQL must verify the referenced row exists in the parent table. The mechanism:

INSERT INTO orders (customer_id, amount)
VALUES (42, 100.00);

PostgreSQL acquires:

A FOR KEY SHARE row-level lock on the referenced row in customers (the row where id = 42)
This lock prevents the referenced row from being DELETEd or having its primary key UPDATEd by another transaction for the duration of your transaction

The FOR KEY SHARE mode is the weakest row-level lock that still blocks FOR UPDATE and DELETE. It allows concurrent FOR KEY SHARE and FOR SHARE locks, so multiple child table inserts referencing the same parent row do not conflict with each other. But any transaction trying to DELETE or update the PK of that parent row will block until all referencing transactions complete.

Writes to the Parent Table

This is where the contention explodes. When you DELETE a row or UPDATE its primary key in the parent table, PostgreSQL must verify no child rows reference it. The mechanism:

DELETE FROM customers WHERE id = 42;

PostgreSQL needs to scan the orders table to find any rows where customer_id = 42. For each matching child row found, it acquires a FOR NO KEY UPDATE lock. Only after confirming no child rows exist (or after dealing with them via CASCADE/SET NULL) can the parent row be deleted.

Here is the critical difference from child-side locking:

Child side: You know exactly which parent row to lock (the one you’re referencing). One lock, always fast.
Parent side: You have to find the child rows. If there is an index on orders.customer_id, this is a quick index scan. If there is no index, this is a sequential scan of the entire orders table.

The Missing FK Index Problem

PostgreSQL does not automatically create an index on the FK column. You must do it yourself. And if you don’t, every DELETE or primary-key UPDATE on the parent table triggers a full sequential scan of the child table.

Consider what happens under concurrent load:

Transaction A: DELETE FROM customers WHERE id = 42
  → Seq scan on orders (no index on customer_id)
  → Acquires FOR NO KEY UPDATE on every matching row (say 500 rows)
  → Holds those locks for the duration of the transaction

Transaction B: INSERT INTO orders (customer_id, ...) VALUES (99, ...)
  → Acquires FOR KEY SHARE on customers row 99
  → Also acquires row-level locks on the new orders row
  → No conflict with Transaction A, proceeds normally

Transaction C: UPDATE orders SET amount = 200 WHERE id = 1000
  → customer_id for order 1000 happens to be 42
  → Transaction A already holds FOR NO KEY UPDATE on this row
  → Transaction C BLOCKS

Transaction D: DELETE FROM customers WHERE id = 99
  → Seq scan on orders (no index)
  → Tries to lock order rows... some held by other transactions
  → BLOCKS

The problem compounds with scale. A sequential scan of a 10 million row child table to verify one FK reference touches every page, acquiring buffer pins and potentially row-level locks on thousands of tuples. Under concurrent modifications, these scans overlap, and you get:

Lock table bloat in shared memory
Increased deadlock detection overhead (the deadlock detector runs after deadlock_timeout, default 1s, and scans the entire lock table)
Wakeup storms when a long-running transaction finally commits and releases hundreds of locks simultaneously
WAL amplification from the tuple header writes (setting xmax and lock flags)

The Cascade Pattern

The worst case is cascading deletes across multiple FK levels:

-- order_items references orders references customers
DELETE FROM customers WHERE id = 42;
-- Triggers:
--   1. Scan orders for customer_id = 42 (seq scan if no index)
--   2. For each matching order, DELETE it (if ON DELETE CASCADE)
--   3. Each order delete scans order_items (seq scan if no index)
--   4. For each matching order_item, DELETE it

Each level of the cascade triggers another scan. Without indexes, a single parent DELETE can degrade into scanning multiple tables, each potentially millions of rows.

Detecting Missing FK Indexes

This query finds FK columns that lack a supporting index:

SELECT
    con.conrelid::regclass AS child_table,
    att.attname      AS fk_column,
    con.confrelid::regclass AS parent_table,
    'missing index'  AS status
FROM pg_constraint  con
JOIN pg_attribute   att
  ON att.attrelid = con.conrelid
 AND att.attnum   = con.conkey[1]
WHERE con.contype = 'f'
  AND NOT EXISTS (
      SELECT 1
      FROM pg_index idx
      WHERE idx.indrelid = con.conrelid
        AND con.conkey[1] = ANY(idx.indkey)
  )
ORDER BY con.conrelid::regclass::text, att.attname;

Run this on every database. Every row returned is a potential contention time bomb.

For the general case (multi-column FKs), expand the conkey[1] to check all array elements against the index columns, but single-column FKs cover the overwhelming majority of cases in practice.

Beyond FKs: Sequential Scan Contention Patterns

FK checks are the most common source of hidden contention from sequential scans, but the same pattern appears elsewhere.

Row-Level Lock Contention from Unbounded Scans

Consider a job queue implementation using SELECT FOR UPDATE SKIP LOCKED:

-- Worker picks up jobs
SELECT id, payload
FROM job_queue
WHERE status = 'pending'
ORDER BY created_at
LIMIT 10
FOR UPDATE SKIP LOCKED;

If status = 'pending' matches millions of rows, the query must scan through them all (up to the LIMIT) and attempt to lock each one. Even with SKIP LOCKED, each locked row requires a tuple header write and a WAL record. With dozens of concurrent workers, they all contend on the same pages of pending rows.

The fix is a partial index that exactly matches the query predicate:

CREATE INDEX CONCURRENTLY idx_job_queue_pending
    ON job_queue (created_at)
    WHERE status = 'pending';

Now the index only contains pending rows, and the workers can quickly find and lock available jobs without scanning the entire table.

FK Check Locks and Long-Running Transactions

Here is a subtler pattern. A long-running transaction holds open a snapshot:

-- Session 1: starts at T1
BEGIN;
SELECT * FROM orders WHERE customer_id = 42;

-- Session 2: at T2, tries to delete customer 42
DELETE FROM customers WHERE id = 42;
-- This scans orders for customer_id = 42
-- If no index: full seq scan, acquires FOR NO KEY UPDATE on matches
-- If Session 1's snapshot can see those rows, it holds visibility pins
-- Session 2 may block waiting for Session 1 to release its snapshot

The FK check scan interacts with MVCC visibility rules. A concurrent scan cannot acquire row locks on tuples that are not visible to it due to another transaction’s snapshot. The lock acquisition can block behind the snapshot horizon, creating apparent lock waits that have nothing to do with the FK rows themselves and everything to do with an unrelated long-running query.

This is why idle_in_transaction_session_timeout matters. A developer opens a transaction, runs a SELECT, then sits idle. Their session holds open a snapshot horizon that prevents FK check scans from completing, which blocks parent table modifications, which cascades into lock timeouts across the application.

pg_locks Diagnostic Queries

To identify FK-related lock contention:

-- Sessions blocked on tuple locks (likely FK checks)
SELECT
    blocked.pid          AS blocked_pid,
    blocked.query        AS blocked_query,
    blocking.pid         AS blocking_pid,
    blocking.query       AS blocking_query,
    blocked.waitstart
FROM pg_locks bl
JOIN pg_stat_activity blocked  ON bl.pid = blocked.pid
JOIN pg_locks       kl         ON bl.locktype = kl.locktype
                               AND bl.database IS NOT DISTINCT FROM kl.database
                               AND bl.relation IS NOT DISTINCT FROM kl.relation
                               AND bl.page IS NOT DISTINCT FROM kl.page
                               AND bl.tuple IS NOT DISTINCT FROM kl.tuple
                               AND bl.pid != kl.pid
                               AND bl.granted = false
                               AND kl.granted = true
JOIN pg_stat_activity blocking ON kl.pid = blocking.pid
WHERE bl.locktype = 'tuple';

For broader contention analysis:

-- Tables with the most current lock activity
SELECT
    relation::regclass AS table_name,
    mode,
    COUNT(*)           AS lock_count,
    COUNT(*) FILTER (WHERE granted = false) AS waiting
FROM pg_locks
WHERE relation IS NOT NULL
GROUP BY relation::regclass, mode
ORDER BY lock_count DESC
LIMIT 20;

The Shared Lock Table Under Pressure

Every lock acquisition, whether fast-path or through the main lock table, consumes shared memory. The lock manager allocates space for max_locks_per_transaction (default 64) lockable objects per backend process. Row-level locks do not count against this limit (they use the tuple header mechanism), but table-level locks, transaction locks, and advisory locks do.

Under heavy FK contention, each session scanning the child table for FK verification holds:

A ROW SHARE table-level lock on the child table (the lock mode acquired by SELECT FOR KEY SHARE / SELECT FOR NO KEY UPDATE)
Potentially thousands of row-level tuple locks in the child table

The row-level tuple locks themselves are stored in tuple headers and don’t fill the shared lock table. But the table-level locks do. If a single transaction touches many tables (common in queries joining across FK relationships), you can hit the shared lock table limit:

ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction

This error terminates the transaction. The default of 64 is usually sufficient, but workloads with many-partitioned tables (each partition is a separate relation) can exhaust it quickly.

The ON DELETE/UPDATE Action Interaction

The FK action clause determines what happens to child rows when a parent row is modified, and each has different locking implications:

Action	Parent DELETE behavior	Lock implications
`NO ACTION` (default)	Scans child, blocks if rows exist	`FOR NO KEY UPDATE` on child rows, then ERROR or proceed
`RESTRICT`	Same as NO ACTION	Identical locking
`CASCADE`	Scans child, DELETEs matching child rows	`FOR NO KEY UPDATE` then DELETE, each child DELETE triggers further FK checks
`SET NULL`	Scans child, UPDATEs child FK column to NULL	`FOR NO KEY UPDATE` then UPDATE, the UPDATE releases the FK lock on the parent row
`SET DEFAULT`	Scans child, UPDATEs child FK column to default	`FOR NO KEY UPDATE` then UPDATE

NO ACTION and RESTRICT are functionally identical in PostgreSQL (the check is always deferred to the end of the statement, never truly deferred to end of transaction unless DEFERRABLE is specified). The difference is semantic only: RESTRICT prevents the delete immediately, while NO ACTION would allow it if intervening actions removed the referencing rows.

For CASCADE, the locking chain grows with each level. The parent DELETE acquires locks on child rows, then the cascade DELETEs acquire locks on grandchild rows, and so on. Under concurrent load, this multi-level lock acquisition dramatically increases the window for deadlocks.

Practical Configuration

The most impactful configuration for FK-heavy workloads:

# Faster deadlock detection when contention is expected
deadlock_timeout = 500ms

# Log long lock waits for diagnosis
log_lock_waits = on

# Ensure the shared lock table is large enough
# (requires restart)
max_locks_per_transaction = 128

For applications that frequently delete or update parent table rows, the most effective fix is always the same: index the FK columns.

-- Standard FK index
CREATE INDEX CONCURRENTLY idx_orders_customer_id
    ON orders (customer_id);

-- For composite FKs, index all referenced columns in order
CREATE INDEX CONCURRENTLY idx_order_items_order_product
    ON order_items (order_id, product_id);

Use CONCURRENTLY to avoid acquiring an ACCESS EXCLUSIVE lock on the child table during index creation. This is especially important on production tables where blocking all writes is unacceptable.

Key Takeaways

PostgreSQL does not auto-create indexes on FK columns. Every parent table DELETE or PK UPDATE without an FK index triggers a sequential scan of the child table.
Child-side FK checks acquire FOR KEY SHARE on the referenced parent row. Parent-side FK checks acquire FOR NO KEY UPDATE on matching child rows. These are real locks with real contention consequences.
A sequential scan for FK verification under concurrent load causes lock table bloat, increased deadlock detection overhead, and WAL amplification from tuple header writes.
Cascading FK relationships multiply the scan cost: a single parent DELETE can trigger scans across every level of the FK chain.
Long-running transactions holding open snapshots can block FK check scans, creating lock waits that appear unrelated to the FK rows themselves.
Run the missing FK index detection query on every database. The fix is straightforward and the performance impact of adding the index is immediate.

What’s Next

Foreign key locking is one major source of contention, but another common culprit is vacuum behaviour when dead tuples accumulate under long-running transactions. Next time: how vacuum full, regular vacuum, and vacuum analyze differ, and when each one is the right tool for the job.

Previous in the series: Lock Queues and Fast-Path Locks: Why Some Lock Acquisition Is Nearly Free