When running:
REINDEX DATABASE CONCURRENTLY mydb;
which could take several hours, or even days, depending on the size of the database, is there any way to get a rough estimate of its progress?
I've seen some forum posts claiming you can query index creation status with a query like:
SELECT
now()::TIME(0),
a.query,
p.phase,
p.blocks_total,
p.blocks_done,
p.tuples_total,
p.tuples_done
FROM pg_stat_progress_create_index p
JOIN pg_stat_activity a ON p.pid = a.pid;
The _done/_total columns, in combination with phase, do provide a rough progress percentage. However, this only shows the progress of the index currently being rebuilt. It doesn't tell you how many other indexes are still pending, much less how much work remains for each.
Edit: I've tried combining the catalog pg_index, which lists the *_ccnew temporary indexes used by the concurrent process, with the view pg_stat_progress_create_index, like:
SELECT relname,
CASE WHEN blocks_total > 0 THEN (ci.blocks_done/ci.blocks_total::numeric*100)::int ELSE NULL END as blocks_percent,
i.*
FROM pg_class as pgc
inner join pg_index as i on i.indexrelid = pgc.oid
left outer join pg_stat_progress_create_index as ci on ci.index_relid = i.indexrelid
WHERE i.indisvalid = false;
but this shows strange results. For my database, it lists ~300 temporary indexes in pg_index that are waiting to be updated. However, the one index cross-referenced by pg_stat_progress_create_index, the one actually being updated, is never marked valid. It gets to 100% of blocks processed and then disappears from pg_stat_progress_create_index, but its indisvalid stays false. Why is this?
3 Answers
Postgres 12 or later has the system view pg_stat_progress_create_index.
It reports ...
One row for each backend running CREATE INDEX or REINDEX, showing current progress.
Find details in the chapter CREATE INDEX Progress Reporting of the manual.
This is potentially very expensive on a busy server!
REINDEX DATABASE CONCURRENTLY mydb;
The manual has a chapter CREATE INDEX Phases describing the states of the column phase. In particular, consider:
waiting for old snapshots
CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY is waiting for transactions that can potentially see the table to release their snapshots. This phase is skipped when not in concurrent mode. Columns lockers_total, lockers_done and current_locker_pid contain the progress information for this phase.
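To see whether a build is held up in one of the waiting phases, the locker columns can be joined back to pg_stat_activity. This is a minimal sketch using only the documented columns of both views; the aliases blocker, locker_state, locker_xact_start and locker_query are my own names.

```sql
-- Sketch: lock-wait progress for each running CREATE INDEX / REINDEX,
-- together with the session it is currently waiting on (if any).
SELECT p.pid,
       p.phase,
       p.lockers_total,
       p.lockers_done,
       p.current_locker_pid,
       blocker.state      AS locker_state,
       blocker.xact_start AS locker_xact_start,
       blocker.query      AS locker_query
FROM   pg_stat_progress_create_index AS p
LEFT   JOIN pg_stat_activity AS blocker
       ON blocker.pid = p.current_locker_pid;
```

If current_locker_pid stays on the same session for a long time while lockers_done does not advance, that session is the likely cause of the stall.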
Long-running transactions can stall the progress. Consider running REINDEX CONCURRENTLY on selected indexes, and REINDEX DATABASE at off-hours when you can afford to lock tables exclusively.
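One way to work on selected indexes is to generate one statement per index and run them individually, which also makes it easy to count how many are left. The following is only a sketch: it assumes the indexes of interest live in a schema named public (adjust the filter to your setup), and some generated statements may be rejected by the server, for example for indexes that do not support concurrent rebuilds.

```sql
-- Sketch: generate one REINDEX INDEX CONCURRENTLY statement per index.
-- Assumption: only the schema "public" is of interest.
SELECT format('REINDEX INDEX CONCURRENTLY %I.%I;', schemaname, indexname) AS stmt
FROM   pg_indexes
WHERE  schemaname = 'public'
ORDER  BY indexname;
```

In psql, ending the query with \gexec instead of the semicolon should execute each generated statement in turn; review the generated list before running it.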
If your server is not actually busy, check for long-running transactions with:
SELECT * FROM pg_stat_activity;
The ones with state = 'idle in transaction' are the prime troublemakers (they typically hint at a programming error, where transactions are not committed or rolled back). Those would likely show up in pg_stat_progress_create_index.current_locker_pid of stalled indexes.
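A narrower version of that check, as a sketch, lists only sessions that are idle inside an open transaction, oldest first. The column alias xact_age is my own name.

```sql
-- Sketch: sessions sitting idle inside an open transaction, oldest first.
SELECT pid,
       usename,
       state,
       xact_start,
       now() - xact_start AS xact_age,
       query               -- last statement the session ran
FROM   pg_stat_activity
WHERE  state IN ('idle in transaction', 'idle in transaction (aborted)')
ORDER  BY xact_start;
```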
After reviewing the columns in pg_index, it looks like Postgres uses indisvalid = false to denote all the indexes being rebuilt, and the subset of those with indisready = true denotes the indexes that pg_stat_progress_create_index has processed.
Using that, I think I can calculate a total progress percent with a query like:
SELECT
m.complete_steps,
m.total_steps,
m.step_percent,
(m.complete_steps/NULLIF(m.total_steps, 0)::numeric*100)::int AS total_percent -- NULLIF avoids division by zero when no invalid indexes exist
FROM (
SELECT
MAX(CASE WHEN blocks_total > 0 THEN (ci.blocks_done/ci.blocks_total::numeric*100)::int ELSE NULL END) AS step_percent,
COUNT(*) AS total_steps,
COUNT(CASE WHEN i.indisready THEN 1 ELSE NULL END) AS complete_steps
FROM pg_class AS pgc
INNER JOIN pg_index AS i ON i.indexrelid = pgc.oid
LEFT OUTER JOIN pg_stat_progress_create_index AS ci ON ci.index_relid = i.indexrelid
WHERE i.indisvalid = false
) AS m;
The REINDEX DATABASE command has no visible progress reporting in terms of overall work. Here is the relevant code in the PostgreSQL source tree. REINDEX DATABASE iterates over the list of tables and calls (literally) reindex table for each individual relation in a separate transaction. There is no progress report on how many relations have already been processed or how big the remaining list is.
The statuses in pg_stat_progress_create_index are reported by reindex_relation, which is called repeatedly, once per relation. So the view is useful for following the processing of a single index, but it has no information about the total amount of work across the entire database.
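Since nothing reports the overall total, one rough workaround is to take your own baseline before starting: count the indexes and sum their on-disk size, then compare against what has been rebuilt so far. This is only a sketch under my own assumptions; it treats every index outside pg_catalog and the pg_toast schemas as part of the work, which only approximates what REINDEX DATABASE CONCURRENTLY actually touches.

```sql
-- Sketch: baseline of how many indexes exist and how large they are in total.
-- Assumption: catalog and TOAST indexes are left out of the estimate.
SELECT count(*)                                            AS index_count,
       pg_size_pretty(sum(pg_relation_size(i.indexrelid))) AS total_index_size
FROM   pg_index     AS i
JOIN   pg_class     AS c ON c.oid = i.indexrelid
JOIN   pg_namespace AS n ON n.oid = c.relnamespace
WHERE  n.nspname <> 'pg_catalog'
AND    n.nspname !~ '^pg_toast';
```

Index size is only a proxy for rebuild time, so treat any percentage derived from it as an estimate.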