PostgreSQL Deep Dive

PostgreSQL Deep dive: Incremental sort — how PostgreSQL 13 made multi-column ORDER BY cheaper

You have an index on created_at. Your query orders by created_at DESC, status. The index covers the first sort key. PostgreSQL 12 sorts the entire result set anyway.

PostgreSQL 13 does not. It sorts within each group of rows that share the same created_at value and stitches the results together. Same output, less work.

This is incremental sort. It is a practical performance improvement in PostgreSQL 13, and it activates automatically when the planner detects a presorted prefix in your data. Here’s how it works, when it fires, and when it deliberately chooses not to.

The Problem Incremental Sort Solves

Consider a table with an index on (country):

CREATE INDEX idx_orders_country ON orders (country);

Now run this query:

SELECT * FROM orders ORDER BY country, order_date;

The index guarantees that rows come back grouped by country. Within each country group, the rows are in some arbitrary order. Before PostgreSQL 13, the planner had two options:

  1. Use the index and apply a full sort on top, sorting the entire result set by both country and order_date.
  2. Ignore the index, do a sequential scan, and sort the entire result set.

Both options sort everything. The index is providing useful ordering information, and the planner in PostgreSQL 12 throws it away. This is wasteful. The index has already done half the work. You just need to sort within each country group.

Incremental sort fixes this.

How Incremental Sort Works

The planner sees that the data is already sorted by a prefix of the required sort order. It breaks the input into groups that share the same prefix value and sorts each group independently.

For ORDER BY country, order_date with an index on (country):

  1. Read all rows where country = 'US'. Sort them by order_date.
  2. Read all rows where country = 'DE'. Sort them by order_date.
  3. Continue for each distinct country value.

The result is identical to a full sort. The difference is that each group is small enough to sort in memory even when the total result set would overflow work_mem.

The EXPLAIN output

EXPLAIN SELECT * FROM orders ORDER BY country, order_date;

                          QUERY PLAN
---------------------------------------------------------------
 Incremental Sort
   Sort Key: country, order_date
   Presorted Key: country
   -> Index Scan using idx_orders_country on orders

Three lines tell you everything:

  • Sort Key: the full sort specification the query requested.
  • Presorted Key: the prefix that the child node already provides.
  • The child node is an index scan that returns rows in country order.

Compare this to a full sort:

EXPLAIN SELECT * FROM orders ORDER BY order_date;

                 QUERY PLAN
-------------------------------------------
 Sort
   Sort Key: order_date
   -> Seq Scan on orders

A Sort node has no Presorted Key line. It sorts everything from scratch.

Cost estimates in the plan

The cost model reflects the savings:

 Incremental Sort (cost=50.00..5234.39 rows=100000 width=100)
   Sort Key: country, order_date
   Presorted Key: country
   -> Index Scan using idx_orders_country on orders
         (cost=0.29..3342.20 rows=100000 width=100)

The startup cost of incremental sort is lower than a full sort because it can begin returning rows before reading the entire input. This matters enormously for queries with LIMIT.

The LIMIT Interaction

Incremental sort shines with LIMIT queries. A full sort must consume the entire input before it can produce the first output row. Incremental sort can begin returning rows as soon as it finishes sorting the first presorted group.

EXPLAIN SELECT * FROM orders ORDER BY country, order_date LIMIT 100;

                      QUERY PLAN
-----------------------------------------------------------
 Limit (cost=19.35..39.49 rows=100 width=100)
   -> Incremental Sort (cost=19.35..2033.39 rows=100000 width=100)
         Sort Key: country, order_date
         Presorted Key: country
         -> Index Scan using idx_orders_country on orders
               (cost=0.29..1574.20 rows=100000 width=100)

With LIMIT 100, the planner estimates it may only need to sort the first one or two groups to find the 100 earliest rows. The startup cost is 19.35, compared to the full cost of sorting all 100,000 rows. If the first country group contains enough rows, the query finishes after sorting just that group.

A full sort would have to sort all 100,000 rows even to return the first one.

Where the Presorted Prefix Comes From

Incremental sort is not limited to index scans. The presorted prefix can come from any plan node that provides ordering guarantees:

Index scans. The most common source. An index on (country) or (country, order_date) provides the presorted prefix directly.

Merge joins. A merge join produces output sorted by the join key. If your query orders by the join key plus additional columns, incremental sort can handle the remainder.

Group aggregation. GROUP BY country produces groups in country order when using hash or sort aggregation. If you then order by country, total_revenue, the grouping already provides the first sort key.

Window functions. Some window function implementations produce output in a specific order. Incremental sort can build on that.

CTEs with ORDER BY. A materialized CTE that includes ORDER BY preserves the sort order. The outer query can use incremental sort for any additional sort keys.

Previous incremental sort nodes. A query can have multiple incremental sort stages stacked on top of each other, each handling one additional sort key.

When Incremental Sort Does Not Activate

The planner weighs incremental sort against a full sort and sometimes picks the full sort. This is not a bug. There is a real overhead to splitting the input into groups.

Too many distinct prefix values

If the presorted key has high cardinality relative to the row count, the groups become very small. The overhead of managing many tiny sorts exceeds the benefit of sorting smaller batches. The planner’s cost model accounts for this and falls back to a full sort.

Example: an index on (user_id) where user_id is unique. Each group has one row. There is nothing to incrementally sort. The planner will choose a full sort or a different plan entirely.

The cost of group detection

Incremental sort must detect when the presorted key changes to know where one group ends and the next begins. This comparison overhead is proportional to the number of rows. For small result sets or when the presorted prefix covers most of the sort keys, the overhead is not worth it.

Parallel query interaction

Incremental sort does not parallelize internally. A full sort can use parallel workers for the external merge sort. In some cases, a parallel full sort is cheaper than a serial incremental sort. The planner evaluates both paths.

Work_mem and group sizes

If each presorted group is small enough to fit in work_mem, incremental sort avoids disk spills entirely. If individual groups are still larger than work_mem, those groups spill to disk just like a full sort would. The win comes from smaller spill sizes, not from eliminating spills.

Disabling Incremental Sort

The GUC enable_incremental_sort (default on) controls the feature:

SET enable_incremental_sort = off;

EXPLAIN SELECT * FROM orders ORDER BY country, order_date;

                 QUERY PLAN
-------------------------------------------
 Sort
   Sort Key: country, order_date
   -> Index Scan using idx_orders_country on orders

This is useful for A/B testing. If you suspect incremental sort is causing a regression on a specific query, disable it and compare execution times with EXPLAIN ANALYZE.

Note that setting enable_incremental_sort = off does not force a sequential scan. The planner may still choose the index scan and apply a full sort on top of it. If you want to see the sequential scan plan, disable both:

SET enable_incremental_sort = off;
SET enable_indexscan = off;

Practical Tuning Scenarios

Multi-column sort with a single-column index

This is the textbook case. You have an index on the leading column and need to sort by additional columns:

CREATE INDEX idx_events_user ON events (user_id);

SELECT * FROM events
WHERE user_id IN (SELECT user_id FROM premium_users)
ORDER BY user_id, created_at DESC
LIMIT 50;

The IN subquery returns a set of user IDs. For each user, the index scan fetches their events in user_id order. Incremental sort handles created_at DESC within each user group. The LIMIT 50 means the planner may only need to process a handful of users.

Sorting after a partition-wise join

Partition-wise joins produce output ordered by the partition key. If your ORDER BY starts with the partition key and adds columns, incremental sort fills the gap without a full sort.

ORDER BY with expressions

Incremental sort works with sort expressions, not just bare columns:

SELECT * FROM products
ORDER BY category, LOWER(name);

If an index on (category) provides the presorted prefix, incremental sort handles the LOWER(name) part within each category.

Multi-Level Incremental Sort

PostgreSQL can stack incremental sort nodes when multiple levels of presorting are available. Consider:

CREATE INDEX idx_a ON t (a);
CREATE INDEX idx_ab ON t (a, b);

SELECT * FROM t ORDER BY a, b, c;

If the planner chooses the index on (a), you get one incremental sort handling (b, c). But if another part of the plan provides ordering on (a, b), the incremental sort only needs to handle c. The planner picks whichever is cheaper.

In practice, multi-level stacking is uncommon because the planner typically finds a more direct plan. But it can appear in complex queries with multiple join stages.

Monitoring Incremental Sort

There is no dedicated view for incremental sort statistics. You monitor it through the standard channels:

EXPLAIN ANALYZE. Look for Incremental Sort nodes. The actual timing tells you whether the incremental approach paid off compared to the estimated cost.

Sort method. EXPLAIN ANALYZE shows the sort method:

Sort Method: quicksort  Average Memory: 42kB  Peak Memory: 42kB

Each incremental sort group gets its own sort method line. If you see external merge Disk: 1234kB, that group spilled to disk.

Full Sort Groups. In EXPLAIN ANALYZE, the Incremental Sort node reports Full-sort Groups and Pre-sorted Groups:

Incremental Sort (actual time=0.123..45.678 rows=100000 loops=1)
  Sort Key: country, order_date
  Presorted Key: country
  Full-sort Groups: 47  Sort Method: quicksort  Average Memory: 256kB
  Pre-sorted Groups: 3  Sort Method: quicksort  Average Memory: 24kB
  -> Index Scan using idx_orders_country on orders
        (actual time=0.012..12.345 rows=100000 loops=1)

Full-sort Groups are groups that contained more than one row and required actual sorting. Pre-sorted Groups are single-row groups where no sort was needed. High pre-sorted group counts indicate that your presorted key has very high cardinality.

Incremental Sort vs Full Sort: Decision Framework

Use this mental model when reading query plans:

ConditionPlanner Choice
Presorted prefix covers most sort keysIncremental sort (few additional keys to sort)
Presorted prefix has low cardinality (few distinct values)Incremental sort (large groups, worth splitting)
Presorted prefix has high cardinality (many distinct values)Full sort or different plan (groups too small)
Query has LIMITIncremental sort (can skip later groups)
Total result fits in work_memMarginal difference (either approach avoids spills)
Total result exceeds work_mem, groups fitIncremental sort wins (no spills)
Individual groups exceed work_memIncremental sort still helps (smaller spills)

The Common Misconception

“Incremental sort means PostgreSQL can partially sort data.”

No. Incremental sort produces a fully correct, totally sorted output. It is not a “best effort” sort. The rows come out in exactly the same order as a full sort. The difference is entirely internal: how the sort is partitioned into smaller operations.

Think of it as sorting a deck of cards that is already sorted by suit. You do not reshuffle the whole deck. You sort the clubs among themselves, then the diamonds, then the hearts, then the spades. Same result, less work.

When to Create Indexes for Incremental Sort

The planner will not use incremental sort if no part of the sort order is presorted. Adding an index on the leading sort key columns is what enables it.

Before reaching for a multi-column index, check whether a single-column index on the first sort key gives you incremental sort for free. A full multi-column index on (country, order_date) would eliminate the sort entirely, but a single-column index on (country) gives you incremental sort at lower index maintenance cost.

The tradeoff depends on your write patterns. If you write frequently and the incremental sort is fast enough, the lighter index is the better choice. If the query runs often and latency matters, the full covering index wins.

The Practical Takeaway

Incremental sort, introduced in PostgreSQL 13, lets the planner exploit partial ordering in the data. When a child node provides a prefix of the required sort order, incremental sort handles the remaining keys within each group rather than sorting the entire result set from scratch.

The practical impact:

  • Smaller sort batches. Each group sorts independently, reducing memory pressure and disk spills.
  • Faster LIMIT queries. The planner can return rows before processing the full input.
  • Better multi-column ORDER BY with single-column indexes. You no longer need a full composite index to avoid sorting the entire table.
  • Automatic activation. No query changes needed. The planner picks it up when the conditions are right.

The GUC enable_incremental_sort lets you disable it for testing. The EXPLAIN output shows Presorted Key to confirm which prefix the planner detected. And EXPLAIN ANALYZE breaks down full-sort vs pre-sorted group counts so you can see exactly how the work was distributed.

Before PostgreSQL 13, presorted data was a hint the planner could not fully use. Now it is a first-class optimization path.