PostgreSQL Deep dive: BRIN Indexes — how a 1mb index can replace a 1gb B-Tree
A B-tree index on a 100 million row table might be 2GB. A GIN index on the same table might be 4GB. A BRIN index on that same table might be 2MB. That is not a typo.
BRIN (Block Range Index) takes a different approach to indexing. Instead of storing one entry per row (B-tree) or one posting list per key (GIN), BRIN stores a single summary record per block range, a group of physically adjacent pages. The entire index is a small table of summaries, each saying “pages 1000-1031 contain values between 142 and 891.” When a query asks for values between 200 and 300, BRIN checks each summary and skips the block ranges that cannot possibly contain matches.
The result is an index that is orders of magnitude smaller than any other index type, with near-zero maintenance overhead. The catch: it only works well when the data has a natural physical correlation with the column you are indexing.
How BRIN Works
Block Ranges
A BRIN index divides the table into block ranges. Each block range covers pages_per_range consecutive 8KB pages. The default is 128 pages, meaning each block range covers 1MB of table data (128 × 8KB). At roughly 290 rows per 8KB page, a default block range contains about 37,000 rows.
For each block range, the index stores a single summary tuple. For a minmax operator class (the most common), this tuple contains the minimum and maximum values of the indexed column across all rows in that block range.
The Index Scan
A query like WHERE created_at BETWEEN '2026-01-01' AND '2026-03-01' against a BRIN index proceeds as follows:
- Scan the BRIN index. For each block range summary, check whether
[min, max]overlaps with the query range. - If the summary’s range overlaps with the query, include all pages in that block range. If it does not overlap, skip the entire block range.
- For included block ranges, read the actual heap pages and recheck each row against the query condition.
This is a bitmap index scan. The BRIN index produces a bitmap of potentially matching pages, and the executor reads those pages from the heap and filters out non-matching rows. The recheck is mandatory because BRIN is lossy: a block range whose summary says “values 100-500” might contain a query match at value 200, or it might not. The summary says it is possible, not guaranteed.
Why Physical Correlation Matters
Consider a table of orders where created_at is monotonically increasing and rows are appended in chronological order:
Block Range 1 (pages 0-127): created_at 2024-01-01 to 2024-02-15
Block Range 2 (pages 128-255): created_at 2024-02-15 to 2024-04-01
Block Range 3 (pages 256-383): created_at 2024-04-01 to 2024-05-20
...
A query for orders in January 2024 checks each summary. Only Block Range 1 overlaps. The other 99 block ranges are skipped. Out of 100 million rows, PostgreSQL reads roughly 37,000. That is 0.037% of the table, with a 2MB index.
Now consider the same table after months of random UPDATE statements that moved rows to different physical locations. The correlation between created_at and physical position is now gone:
Block Range 1 (pages 0-127): created_at 2024-01-01 to 2026-04-24
Block Range 2 (pages 128-255): created_at 2023-11-15 to 2026-04-23
Block Range 3 (pages 256-383): created_at 2024-03-01 to 2026-04-24
...
Every block range summary now spans the entire date range. Every query overlaps with every summary. The BRIN index cannot skip anything, and the query degrades to a full sequential scan. The index is still only 2MB, but it is completely useless.
This is the fundamental BRIN tradeoff: tiny index, but only effective when physical ordering matches query ordering.
Operator Classes
BRIN ships with four families of operator classes, each storing a different type of summary.
minmax (the default)
Stores the minimum and maximum value in each block range. Supports all comparison operators (=, <, >, <=, >=). Available for nearly every scalar type: int2, int4, int8, float4, float8, numeric, text, date, timestamp, timestamptz, uuid, inet, bytea, and more.
This is the workhorse. Use it for any column with a linear sort order.
-- Simple minmax BRIN index
CREATE INDEX idx_orders_created ON orders USING brin(created_at);
-- With custom pages_per_range
CREATE INDEX idx_orders_created ON orders USING brin(created_at)
WITH (pages_per_range = 32);
minmax-multi
Stores multiple minimum and maximum value intervals per block range, not just one. This handles data that is mostly correlated but has some outliers or gaps.
For example, if a block range contains timestamps from 2024-01-01 to 2024-06-30, but with a gap from February to April (maybe no orders during that period), minmax would store [2024-01-01, 2024-06-30]. minmax-multi would store [2024-01-01, 2024-01-31] and [2024-04-01, 2024-06-30], allowing more precise skipping.
Controlled by the values_per_range parameter (default 32, range 8-256). More values means more precision but a larger index.
CREATE INDEX idx_logs_ts ON server_logs USING brin(timestamp)
WITH (pages_per_range = 64);
-- minmax-multi with more stored intervals
CREATE INDEX idx_logs_ts_multi ON server_logs USING brin(timestamp minmax_multi_ops)
WITH (pages_per_range = 64, values_per_range = 64);
bloom
Builds a Bloom filter for each block range instead of storing min/max ranges. A Bloom filter can answer “does this value exist in the block range?” with a configurable false positive rate (default 1%). This makes it effective for equality queries even when the data is not physically correlated.
The tradeoff: bloom BRIN only supports equality (=), not range queries. And the false positive rate means some non-matching block ranges will be included in the scan.
-- Bloom BRIN for equality lookups on a non-correlated column
CREATE INDEX idx_events_type ON events USING brin(event_type bloom_ops)
WITH (false_positive_rate = 0.01);
The two bloom parameters:
false_positive_rate: 0.0001 to 0.25, default 0.01. Lower values mean fewer false positives but larger index entries.n_distinct_per_range: estimated number of distinct values per block range. Default -0.1 (scales linearly with block size). Set positive for columns with a known number of distinct values.
inclusion
Stores a bounding value that includes all values in the block range. Used for geometric types (box, inet, range types) where “contains” semantics make more sense than min/max.
-- Inclusion BRIN for range types
CREATE INDEX idx_reservations ON reservations USING brin(during range_inclusion_ops);
-- Inclusion BRIN for IP addresses
CREATE INDEX idx_access_ip ON access_logs USING brin(ip_address inet_inclusion_ops);
pages_per_range: The Most Important Tuning Parameter
The pages_per_range storage parameter controls the granularity of the index. It is set at index creation time and cannot be changed without rebuilding the index.
pages_per_range = 128 (default)
→ 1MB per block range
→ ~37,000 rows per block range
→ Smallest index, least precise
pages_per_range = 32
→ 256KB per block range
→ ~9,000 rows per block range
→ 4× larger index, 4× more precise
pages_per_range = 4
→ 32KB per block range
→ ~1,100 rows per block range
→ 32× larger index, 32× more precise
Lower values mean more index entries, more precision, and a larger index. Higher values mean fewer entries, less precision, and a smaller index. The right value depends on your query pattern and table size.
For a 1TB table with default pages_per_range of 128, the BRIN index would be roughly 8,000 pages (64MB). With pages_per_range of 16, it would be 64,000 pages (512MB). Still tiny compared to a B-tree, but noticeably larger.
The rule: start with the default (128) and decrease only if queries are scanning too many block ranges. The goal is the smallest pages_per_range that still allows the planner to skip most of the table for your typical queries.
Index Maintenance: Summarization
BRIN indexes have a unique maintenance model. Unlike B-tree (updated on every write) or GIN (fastupdate pending list), BRIN summaries are updated in specific circumstances.
When Summaries Are Created
At index creation time, all existing block ranges are summarized. As new data is appended, existing summaries are updated in-place. But new block ranges that fall beyond the last summarized range are not automatically summarized. They remain unsummarized until one of three things happens:
- VACUUM runs on the table (manual or autovacuum)
- autosummarize is enabled (off by default, opt-in)
- Manual summarization via
brin_summarize_new_values()
This means that after a bulk insert, newly created block ranges might not have summaries yet. Queries against those ranges will scan all pages in the unsummarized block ranges (treating them as if every block could match).
Enabling autosummarize
CREATE INDEX idx_orders_created ON orders USING brin(created_at)
WITH (pages_per_range = 128, autosummarize = on);
With autosummarize = on, autovacuum will summarize new block ranges even if the table itself does not need vacuuming. This is usually what you want for append-only tables.
Manual Summarization
-- Summarize all unsummarized block ranges
SELECT brin_summarize_new_values('idx_orders_created');
-- Summarize a specific block range
SELECT brin_summarize_range('idx_orders_created', 500);
Desummarization
If a block range’s summary becomes inaccurate (because updates moved values around), you can remove the summary and let it be rebuilt:
SELECT brin_desummarize_range('idx_orders_created', 500);
BRIN vs B-tree: When BRIN Wins
BRIN is not a general-purpose replacement for B-tree. It wins in specific, well-understood scenarios.
BRIN wins when:
- The table is large (tens of millions of rows or more)
- The indexed column has strong physical correlation with query patterns
- The table is mostly append-only (time-series, logs, events, IoT data)
- Index maintenance overhead is a concern
- Storage budget for indexes is limited
- Queries use range predicates on the correlated column
B-tree wins when:
- The table is small to medium (under 10 million rows)
- The indexed column is not physically correlated
- Queries need ordering (ORDER BY, LIMIT)
- Queries need exact lookups by primary key
- The index must be unique or support foreign keys
A useful rule of thumb: if CLUSTER on the BRIN column would make sense for your workload, BRIN is appropriate. Clustering physically reorders the table by the indexed column, which is exactly the correlation BRIN needs. If clustering makes no sense (random access patterns, frequent updates, non-correlated data), BRIN will not help either.
Checking Physical Correlation
Before creating a BRIN index, check whether the data is physically correlated:
-- Check correlation between column values and physical row order
-- 1.0 = perfect correlation (monotonically increasing with CTID)
-- 0.0 = no correlation (random)
-- -1.0 = perfect inverse correlation
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'orders'
AND attname IN ('created_at', 'customer_id', 'amount')
ORDER BY correlation DESC;
A correlation above 0.9 on the indexed column is a strong signal that BRIN will be effective. Below 0.7, BRIN will likely degrade to a sequential scan for most queries.
Diagnostic Queries
-- Check BRIN index size
SELECT indexrelname,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
pg_size_pretty(pg_relation_size(indrelid)) AS table_size,
round(100.0 * pg_relation_size(indexrelid) /
pg_relation_size(indrelid), 2) AS size_ratio_pct
FROM pg_index i
JOIN pg_stat_user_indexes s ON i.indexrelid = s.indexrelid
WHERE s.amname = 'brin';
-- Check for unsummarized ranges
SELECT pg_size_pretty(brin_summarize_new_values('idx_orders_created')) AS newly_summarized;
-- Compare index usage
SELECT schemaname, relname, indexrelname,
idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE amname = 'brin'
ORDER BY idx_scan;
-- Verify the planner is using the BRIN index
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE created_at BETWEEN '2026-01-01' AND '2026-03-01';
BRIN and Parallel Query
BRIN indexes work well with parallel sequential scans. When the planner decides to use a BRIN index, it produces a bitmap that eliminates non-matching pages. The remaining pages are then scanned in parallel if the table is large enough.
This means even when BRIN cannot skip most of the table (low correlation), the overhead of checking the small index is negligible. The planner can fall back to a parallel sequential scan with almost no penalty for having tried the BRIN index first.
The Gotcha: Silent Degradation
The most dangerous thing about BRIN is how quietly it becomes useless. Unlike a B-tree that is always correct (just potentially slower than optimal), a BRIN index that has lost physical correlation does not error, does not warn, and does not cause wrong results. It just stops being useful. Every query that would use the BRIN index falls back to a sequential scan, and the planner makes this decision silently.
If you have a BRIN index on a column that was correlated when the table was created but that correlation has degraded over time due to updates, HOT chains, page fills, and VACUUM, you might have an index that does nothing and not know it.
Monitor this by checking idx_scan in pg_stat_user_indexes. If the BRIN index’s scan count is zero or near-zero for queries that should be hitting it, the planner has decided the index is not useful.
Key Takeaways
- BRIN (Block Range Index) stores summary statistics (min/max, bloom filters, inclusion bounds) for groups of physically adjacent pages, producing an index that is typically 100-1000× smaller than B-tree on the same data.
- BRIN is lossy: it tells the planner which block ranges might contain matches. The executor must recheck actual rows. This is always a bitmap heap scan, never an index-only scan.
- BRIN effectiveness depends entirely on physical correlation between the indexed column and the CTID (row location). Check with
pg_stats.correlation. Above 0.9 is good. Below 0.7 is usually not worth it. pages_per_range(default 128) controls the tradeoff between index size and precision. Lower values mean a larger but more selective index.- Four operator class families:
minmax(range queries),minmax-multi(multiple intervals for gaps/outliers),bloom(equality queries without correlation), andinclusion(contains/overlaps for geometric and range types). - New block ranges are not automatically summarized. Enable
autosummarize = onor runbrin_summarize_new_values()after bulk inserts. - BRIN cannot enforce uniqueness, cannot support index-only scans, and cannot provide ordering. Use B-tree for those requirements.
- BRIN’s silent degradation is the main risk. Monitor
idx_scanand periodically check physical correlation on columns with BRIN indexes.
What’s Next
BRIN completes the specialised index trilogy (GIN, GiST, BRIN). The next post in this series shifts focus to a topic that affects every index type: index bloat. Over time, B-tree pages fragment, GIN posting lists accumulate dead entries, and tables grow even when row counts stay flat. I’ll cover how to detect bloat, what causes it, and the tools for reclaiming space without locking your tables (pg_repack, pg_squeeze, and the long-overdue bottom-up deletion improvements in PostgreSQL 14+).
Previous in the series: GiST Indexes — The Extensible Infrastructure Behind Spatial Data, Ranges, and Full-Text Search