The purpose of pg_index_pilot is to provide all the tools needed to manage indexes in Postgres in the most automated fashion.

This project is in a very early stage. We start with the most boring yet extremely important task: automatic reindexing ("AR") to mitigate index bloat, supporting all index types, and will then broaden to other aspects of index health. From there, the plan is to expand into two other big areas: automated index removal ("AIR") and, finally, automated index creation and optimization ("AIC&O"). It is part of Self-driving Postgres, but can be used independently as a standalone tool.
Docs: Installation | Runbook | FAQ | Function reference | Architecture
- Automated index lifecycle management for PostgreSQL, starting with automatic reindexing to keep index bloat under control without manual work.
- Simplicity and full embedding: implemented entirely inside PostgreSQL (PL/pgSQL), with no external services required to rebuild indexes.
- Works everywhere PostgreSQL runs (including managed platforms): all logic lives in the database.
- No superuser requirement for day-to-day operations; designed to run under owner/privileged roles in the control DB and target DBs.
- Scheduling inside the database via `pg_cron`: no EC2/Lambda or other external orchestrators needed.
- Supports reindexing of all common index types (btree, hash, gin, gist, spgist); brin is currently excluded.
- The control DB orchestrates multiple target databases via `postgres_fdw`/`dblink`; reindexing is executed with `REINDEX CONCURRENTLY` to minimize locking.
See Architecture for detailed design decisions and requirements.
- Roadmap
- Automated reindexing
- Requirements
- Recommendations
- Installation
- Initial launch
- Scheduling automated maintenance
- Uninstalling pg_index_pilot
- Updating pg_index_pilot
- Monitoring and Analysis
## Roadmap

The roadmap covers three big areas:
- "AR": Automated Reindexing
- Maxim Boguk's bloat estimation formula β works with any type of index, not only btree
- original implementation (
pg_index_pilot
) β requires initial full reindex - non-superuser mode for cloud databases (AWS RDS, Google Cloud SQL, Azure)
- flexible connection management for dblink
- API for stats obtained on a clone (to avoid full reindex on prod primary)
- original implementation (
- Traditional bloat estimation (ioguix; btree only)
- Exact bloat analysis (pgstattuple; analysis on clones)
- Tested on managed services
- RDS and Aurora (see AWS specifics in Installation: docs/installation.md#aws-rds--aurora-specifics)
- CloudSQL
- Supabase
- Crunchy Bridge
- Azure
- Integration with postgres_ai monitoring
- Resource-aware scheduling, predictive maintenance windows (when will load be lowest?)
- Coordination with other ops (backups, vacuums, upgrades)
- Parallelization and throttling (adaptive)
- Predictive bloat modeling
- Learning & Feedback Loops: learning from past actions, A/B testing and "what-if" simulation (DBLab)
- Impact estimation before scheduling
- RCA of fast degraded index health (why it gets bloated fast?) and mitigation (tune autovacuum, avoid xmin horizon getting stuck)
- Self-adjusting thresholds
- Maxim Boguk's bloat estimation formula β works with any type of index, not only btree
- "AIR": Automated Index Removal
- Unused indexes
- Redundant indexes
- Invalid indexes (or, per configuration, rebuilding them)
- Advanced scoring; suboptimal / rarely used indexes cleanup; self-adjusting thresholds
- Forecasting of index usage; seasonal pattern recognition
- Impact estimation before removal; "what-if" simulation (DBLab)
- "AIC&O": Automated Index Creation & Optimization
- Index recommendations (including multi-column, expression, partial, hybrid, and covering indexes)
- Index optimization according to configured goals (latency, size, WAL, write/HOT overhead, read overhead)
- Experimentation (hypothetical with HypoPG, real with DBLab)
- Query pattern classification
- Advanced scoring; cost/benefit analysis
- Impact estimation before operations; "what-if" simulation (DBLab)
## Automated reindexing

The reindexing framework is implemented entirely inside Postgres, using:

- PL/pgSQL functions and stored procedures with transaction control
- `dblink` to execute `REINDEX CONCURRENTLY`, because it cannot run inside a transaction block
- `pg_cron` for scheduling
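To illustrate why `dblink` is needed here: a `dblink` session runs the remote statement in autocommit mode, so `REINDEX CONCURRENTLY` stays outside the calling procedure's transaction. Below is a minimal sketch of the pattern, not the project's actual code; the server name `target_your_database`, the connection name, and the procedure name are placeholders:

```sql
-- Minimal sketch: run REINDEX CONCURRENTLY from PL/pgSQL via dblink.
-- Assumes an FDW server "target_your_database" with a user mapping for current_user.
create or replace procedure demo_reindex_concurrently(index_fqn text)
language plpgsql
as $$
begin
  -- dblink opens a separate session; the remote statement is not part of
  -- this procedure's transaction, which REINDEX CONCURRENTLY requires.
  perform dblink_connect('demo_conn', 'target_your_database');
  perform dblink_exec('demo_conn', format('reindex index concurrently %s', index_fqn));
  perform dblink_disconnect('demo_conn');
end;
$$;

-- Usage (placeholder index name):
-- call demo_reindex_concurrently('public.my_table_pkey');
```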
## Requirements

- PostgreSQL version 13.0 or higher
- IMPORTANT: requires the ability to create a database (not supported on TigerData, formerly Timescale Cloud)
- A separate control database (`index_pilot_control`) to manage target databases
- `dblink` and `postgres_fdw` extensions installed in the control database
- Database owner or a user with appropriate permissions
- Works with AWS RDS, Google Cloud SQL, Azure Database for PostgreSQL (where database creation is allowed)
- Manages multiple target databases from a single control database
- Uses `REINDEX CONCURRENTLY` from the control database (avoids deadlocks)
## Recommendations

- If server resources allow, set a non-zero `max_parallel_maintenance_workers` (the exact value depends on server parameters).
- Set `wal_keep_size` to at least 80GB (the equivalent of 5000 16MB WAL segments; before PostgreSQL 13 this was `wal_keep_segments = 5000`), unless a WAL archive is used to support streaming replication.
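For example, these settings can be applied with `ALTER SYSTEM` (the values below are illustrative starting points, not benchmarked recommendations; on managed services use the provider's parameter group instead):

```sql
-- Illustrative values; tune to your hardware and replication setup.
alter system set max_parallel_maintenance_workers = 4;  -- non-zero if CPU/IO headroom allows
alter system set wal_keep_size = '80GB';                -- PostgreSQL 13+; ~5000 x 16MB segments
select pg_reload_conf();                                -- both settings are reloadable
```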
## Installation

Quick installation using the bundled `index_pilot.sh` helper:

```bash
# Clone the repository
git clone https://gitlab.com/postgres-ai/pg_index_pilot
cd pg_index_pilot

# 1) Install into control database (auto-creates DB, installs extensions/objects)
PGPASSWORD='your_password' \
./index_pilot.sh install-control \
  -H your_host -U your_user -C your_control_db_name

# 2) Register a target database via FDW (secure user mapping)
PGPASSWORD='your_password' \
./index_pilot.sh register-target \
  -H your_host -U your_user -C your_control_db_name \
  -T your_database --fdw-host your_host

# 3) Verify installation and environment
PGPASSWORD='your_password' \
./index_pilot.sh verify \
  -H your_host -U your_user -C your_control_db_name

# (Optional) Uninstall
PGPASSWORD='your_password' \
./index_pilot.sh uninstall \
  -H your_host -U your_user -C your_control_db_name --drop-servers
```
Notes:
- Use `PGPASSWORD` to avoid echoing secrets; the script won't print passwords.
- `--fdw-host` should be reachable from the database server itself (in Docker/CI it might be `postgres`, `127.0.0.1`, or the container IP).
- For self-hosted, replace the host with `127.0.0.1`. For managed services, ensure the admin user can `create database` and `create extension`.
Before a manual installation, check the prerequisites:

- PostgreSQL ≥ 13 and the ability to create databases/extensions (control DB).
- Decide: CONTROL_DB name, TARGET_DB name, TARGET_HOST (reachable from the Postgres server, not only from the client).
- If you plan to use pg_cron: ensure it is in `shared_preload_libraries` (RDS: parameter group + reboot), and `create extension pg_cron` in `cron.database_name`.
- The FDW user mapping is looked up for the `current_user` in the control DB session. Create the mapping for that user.
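A quick pre-flight check can be run in psql (illustrative; `show cron.database_name` errors out if pg_cron is not loaded):

```sql
-- Pre-flight checks
show shared_preload_libraries;  -- should include 'pg_cron' if you plan to use it
show cron.database_name;        -- database where pg_cron jobs must be created
select current_user;            -- the user the FDW user mapping must exist for
```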
Naming conventions used below:

- CONTROL_DB, TARGET_DB, TARGET_HOST, SERVER_NAME (e.g. `target_<target_db>`)
- CONTROL_USER/PASS (user running commands in the control DB)
- TARGET_USER/PASS (user in the target DB; typically an owner or a role with owner rights)
- `target_<db>`: FDW server that points to the target database. This name goes into `index_pilot.target_databases.fdw_server_name`.
- A user mapping must exist for `current_user` (in the control DB) for each `target_<db>` server you intend to use.
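To confirm the mappings are in place, you can query the standard `pg_user_mappings` view in the control DB (illustrative):

```sql
-- List FDW servers and the users they are mapped for
select srvname, usename
from pg_user_mappings
where srvname like 'target_%';
```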
CRITICAL: Never use hardcoded passwords in production. The `setup_01_user.psql` script requires a secure password to be provided via a psql variable:
```bash
# Generate a secure random password
RANDOM_PWD=$(openssl rand -base64 32)

# Use the secure setup script (recommended)
./setup_user_secure.sh

# Or run manually with the secure password
psql -f setup_01_user.psql -v index_pilot_password="$RANDOM_PWD"
echo "Generated password: $RANDOM_PWD"
```
Manual installation on AWS RDS (and similar managed services):

```bash
# Clone the repository
git clone https://gitlab.com/postgres-ai/pg_index_pilot
cd pg_index_pilot

# 1. Create control database (as admin user)
psql -h your-instance.region.rds.amazonaws.com -U postgres \
  -c "create database index_pilot_control;"

# 2. Install required extensions in control database
psql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control \
  -c "create extension if not exists postgres_fdw;"
psql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control \
  -c "create extension if not exists dblink;"

# 3. Install schema and functions in control database
psql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -f index_pilot_tables.sql
psql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -f index_pilot_functions.sql
psql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -f index_pilot_fdw.sql

# 4. Create FDW server and user mapping for the TARGET database
psql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control <<'SQL'
create server if not exists target_your_database
  foreign data wrapper postgres_fdw
  options (host 'your-instance.region.rds.amazonaws.com', port '5432', dbname 'your_database');
-- dblink_connect(server_name) uses the current_user user mapping; create the mapping
-- for the user running commands in the control DB (often postgres or index_pilot)
create user mapping if not exists for current_user
  server target_your_database
  options (user 'remote_owner_or_role', password 'remote_password');
SQL

# 5. Register the TARGET database (links index_pilot.target_databases to your FDW server)
psql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control <<'SQL'
insert into index_pilot.target_databases(database_name, host, port, fdw_server_name, enabled)
values ('your_database', 'your-instance.region.rds.amazonaws.com', 5432, 'target_your_database', true)
on conflict (database_name) do update
  set host = excluded.host,
      port = excluded.port,
      fdw_server_name = excluded.fdw_server_name,
      enabled = true;
SQL

# 6. Verify FDW and environment
psql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control \
  -c "select * from index_pilot.check_fdw_security_status();"
psql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control \
  -c "select * from index_pilot.check_environment();"
```
Manual installation on a self-hosted instance (as superuser):

```bash
# Clone the repository
git clone https://gitlab.com/postgres-ai/pg_index_pilot
cd pg_index_pilot

# 1. Create control database (as superuser)
psql -U postgres -c "create database index_pilot_control;"

# 2. Install required extensions in control database (as superuser)
psql -U postgres -d index_pilot_control -c "create extension if not exists postgres_fdw;"
psql -U postgres -d index_pilot_control -c "create extension if not exists dblink;"

# 3. Install schema and functions in control database (as superuser)
psql -U postgres -d index_pilot_control -f index_pilot_tables.sql
psql -U postgres -d index_pilot_control -f index_pilot_functions.sql
psql -U postgres -d index_pilot_control -f index_pilot_fdw.sql

# 4. Create FDW server and user mapping for the TARGET database
psql -U postgres -d index_pilot_control <<'SQL'
create server if not exists target_your_database
  foreign data wrapper postgres_fdw
  options (host '127.0.0.1', port '5432', dbname 'your_database');
create user mapping if not exists for current_user
  server target_your_database
  options (user 'remote_owner_or_role', password 'remote_password');
SQL

# 5. Register the TARGET database
psql -U postgres -d index_pilot_control <<'SQL'
insert into index_pilot.target_databases(database_name, host, port, fdw_server_name, enabled)
values ('your_database', '127.0.0.1', 5432, 'target_your_database', true)
on conflict (database_name) do update
  set host = excluded.host,
      port = excluded.port,
      fdw_server_name = excluded.fdw_server_name,
      enabled = true;
SQL

# 6. Verify
psql -U postgres -d index_pilot_control -c "select * from index_pilot.check_fdw_security_status();"
psql -U postgres -d index_pilot_control -c "select * from index_pilot.check_environment();"
```
## Initial launch

For a manual initial run:
```bash
# Set credentials
export PGSSLMODE=require
export PGPASSWORD='your_index_pilot_password'

# Run initial analysis and reindexing in the background
nohup psql -h your_host -U index_pilot -d your_database \
  -qXt -c "call index_pilot.periodic(true)" >> index_pilot.log 2>&1 &
```
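While the initial run is active, index rebuilds can be watched through the standard `pg_stat_progress_create_index` view (core PostgreSQL, not part of pg_index_pilot); connect to the target database and run:

```sql
-- Progress of CREATE INDEX / REINDEX CONCURRENTLY commands
select pid, datname,
       relid::regclass as table_name,
       index_relid::regclass as index_name,
       command, phase,
       round(100.0 * blocks_done / nullif(blocks_total, 0), 1) as blocks_done_pct
from pg_stat_progress_create_index;
```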
## Scheduling automated maintenance

The optimal maintenance schedule depends on your database characteristics:
Daily maintenance (recommended for):
- High-traffic databases with frequent updates
- Databases where index bloat accumulates quickly
- Systems with sufficient maintenance windows each night
- When you want to catch and fix bloat early
Weekly maintenance (recommended for):
- Stable databases with predictable workloads
- Systems where index bloat accumulates slowly
- Production systems where daily maintenance might be disruptive
- Databases with limited maintenance windows
Step 1: Check where pg_cron is installed
```sql
-- Find which database has pg_cron
show cron.database_name;
```
Step 2: Schedule jobs from the pg_cron database
```sql
-- Connect to the database shown in step 1
\c postgres_ai  -- or whatever cron.database_name shows

-- Daily maintenance at 2 AM
select cron.schedule_in_database(
  'pg_index_pilot_daily',
  '0 2 * * *',
  'call index_pilot.periodic(real_run := true);',
  'index_pilot_control'  -- run in control database
);

-- Monitoring every 6 hours (no actual reindex)
select cron.schedule_in_database(
  'pg_index_pilot_monitor',
  '0 */6 * * *',
  'call index_pilot.periodic(real_run := false);',
  'index_pilot_control'
);

-- OR weekly maintenance on Sunday at 2 AM
select cron.schedule_in_database(
  'pg_index_pilot_weekly',
  '0 2 * * 0',
  'call index_pilot.periodic(real_run := true);',
  'index_pilot_control'
);
```
Step 3: Verify and manage schedules
```sql
-- View scheduled jobs
select jobname, schedule, command, database, active
from cron.job
where jobname like 'pg_index_pilot%';

-- Disable a schedule
select cron.unschedule('pg_index_pilot_daily');

-- Change schedule time
select cron.unschedule('pg_index_pilot_daily');
select cron.schedule_in_database(
  'pg_index_pilot_daily',
  '0 3 * * *',  -- new time: 3 AM
  'call index_pilot.periodic(real_run := true);',
  'index_pilot_control'
);
```
Alternatively, without pg_cron, schedule via system cron. Create a maintenance script (e.g. `/usr/local/bin/index_maintenance.sh`):
```bash
#!/bin/bash
# Runs reindexing only on the primary (all databases)
psql -d postgres -AtqXc "select not pg_is_in_recovery()" | grep -qx t || exit
psql -d postgres -qt -c "call index_pilot.periodic(true);"
```
Add to crontab:
```bash
# Runs reindexing daily at 2 AM (only on primary)
0 2 * * * /usr/local/bin/index_maintenance.sh
```
💡 Best practices:
- Schedule during low-traffic periods
- Avoid overlapping with backup or other IO-intensive operations
- Consider hourly runs for high-write workloads
- Monitor resource usage during initial runs (especially disk IO and CPU)
## Uninstalling pg_index_pilot

To completely remove pg_index_pilot from your database:
```bash
# Uninstall the tool (this will delete all collected statistics!)
psql -h your-instance.region.rds.amazonaws.com -U postgres -d your_database -f uninstall.sql

# Check for any leftover invalid indexes from failed reindexes
psql -h your-instance.region.rds.amazonaws.com -U postgres -d your_database -c "
  select format('drop index concurrently if exists %I.%I;', n.nspname, i.relname)
  from pg_index idx
  join pg_class i on i.oid = idx.indexrelid
  join pg_namespace n on n.oid = i.relnamespace
  where i.relname ~ '_ccnew[0-9]*$'
    and not idx.indisvalid;"

# Run any drop index commands from the previous query manually
```
Note: The uninstall script will:

- Remove the `index_pilot` schema and all its objects
- Remove the FDW server configuration
- List any invalid `_ccnew*` indexes that need manual cleanup
- Preserve the `postgres_fdw` extension (it may be used by other tools)
## Updating pg_index_pilot

To update to the latest version:
```bash
cd pg_index_pilot
git pull

# Reload the updated functions (or reinstall completely)
psql -1 -d your_database -f index_pilot_functions.sql
psql -1 -d your_database -f index_pilot_fdw.sql
```
## Monitoring and Analysis

```sql
-- Show recent reindexing operations with status
select
  schemaname,
  relname,
  indexrelname,
  pg_size_pretty(indexsize_before::bigint) as size_before,
  pg_size_pretty(indexsize_after::bigint) as size_after,
  reindex_duration,
  status,
  case when error_message is not null then left(error_message, 50) else null end as error,
  entry_timestamp
from index_pilot.reindex_history
order by entry_timestamp desc
limit 20;

-- Show only failed reindexes for debugging
select
  schemaname,
  relname,
  indexrelname,
  pg_size_pretty(indexsize_before::bigint) as size_before,
  reindex_duration,
  error_message,
  entry_timestamp
from index_pilot.reindex_history
where status = 'failed'
order by entry_timestamp desc;
```
💡 Tip: Use the convenient `index_pilot.history` view for formatted output:
```sql
-- View recent operations with formatted sizes and status
select * from index_pilot.history limit 20;

-- View only failed operations
select * from index_pilot.history where status = 'failed';
```
```sql
-- Check bloat estimates for the current database
select
  indexrelname,
  pg_size_pretty(indexsize::bigint) as current_size,
  round(estimated_bloat::numeric, 1) || 'x' as bloat_now
from index_pilot.get_index_bloat_estimates(current_database())
order by estimated_bloat desc nulls last
limit 40;
```
```sql
-- Initialize baseline without reindex (sets best_ratio for large indexes)
select index_pilot.do_force_populate_index_stats('<TARGET_DB>', null, null, null);

-- List what periodic(true) would take under current thresholds
select
  schemaname,
  relname,
  indexrelname,
  pg_size_pretty(indexsize) as size,
  round(estimated_bloat::numeric, 2) as bloat_x
from index_pilot.get_index_bloat_estimates('<TARGET_DB>')
where indexsize >= pg_size_bytes(index_pilot.get_setting(datname, schemaname, relname, indexrelname, 'index_size_threshold'))
  and coalesce(index_pilot.get_setting(datname, schemaname, relname, indexrelname, 'skip')::boolean, false) = false
  and (
    estimated_bloat is null
    or estimated_bloat >= index_pilot.get_setting(datname, schemaname, relname, indexrelname, 'index_rebuild_scale_factor')::float
  )
order by estimated_bloat desc nulls first
limit 50;

-- Exclude service schemas if desired
select index_pilot.set_or_replace_setting('<TARGET_DB>', 'pg_toast', null, null, 'skip', 'true', null);
select index_pilot.set_or_replace_setting('<TARGET_DB>', '_timescaledb_internal', null, null, 'skip', 'true', null);
```
Notes:
- The baseline sets `best_ratio` to the current size/tuples ratio; immediately after, `bloat_x` ≈ 1.0 and will grow as indexes bloat.
- Small indexes (< `minimum_reliable_index_size`, default 128kB) skip `best_ratio` to avoid noise; candidates are still gated by `index_size_threshold` (default 10MB).
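For example, to inspect the effective threshold for a single index, `index_pilot.get_setting` (used in the query above) can be called directly; the database, schema, table, and index names below are placeholders:

```sql
-- Effective rebuild threshold for one index (placeholder names)
select index_pilot.get_setting(
  '<TARGET_DB>', 'public', 'my_table', 'my_table_pkey',
  'index_rebuild_scale_factor'
);
```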