Correct way to index a boolean column when used with other columns

Question 1

When implementing soft delete on a table that can be searched by other columns, which is the correct way to index it?

Let's say the table has an id field, a couple of text fields, and finally an isDeleted boolean field.

All queries will include WHERE isDeleted=FALSE AND ...

Should I add one index per column (id, each text column, and finally one for isDeleted)? Or composite indexes that include isDeleted (e.g. INDEX x ON "Table"("id","isDeleted"), etc.)? Or something else?

I'm tempted to leave isDeleted alone, since it will only be checked on rows I might return based on the other indexes, and I don't expect most of the data to be deleted.

goodfella · Answer 1 · 2025-08-28 04:16:00Z

If the index only needs to serve queries on rows that are not deleted, you can create a partial (filtered) index:

CREATE INDEX "IX_NAME" ON "table" (id, text1, ..) WHERE isDeleted = FALSE;

This way the index will be compact and efficient. You may choose to create a compound index if multiple columns are referenced in the same query. If the "ID" alone provides high selectivity you could also remove other fields from the index.
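As a rough sketch of how this could look end to end, assuming a hypothetical table named items with the columns described in the question (the table name, index names, and the NOT NULL DEFAULT FALSE choice are made up for illustration, not taken from the question):

```sql
-- Hypothetical table; column names follow the question's description.
CREATE TABLE items (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    text1     text,
    text2     text,
    isDeleted boolean NOT NULL DEFAULT FALSE
);

-- One partial index per searched column; only live rows are indexed.
CREATE INDEX items_text1_live ON items (text1) WHERE isDeleted = FALSE;
CREATE INDEX items_text2_live ON items (text2) WHERE isDeleted = FALSE;

-- The query repeats the index predicate, so the planner can consider
-- items_text1_live for this lookup.
EXPLAIN
SELECT id, text1, text2
FROM items
WHERE isDeleted = FALSE
  AND text1 = 'foo';
```

Whether one compound index or several single-column partial indexes works better depends on which columns the queries actually filter on, so it is worth checking the EXPLAIN output against realistic data.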

Following the comment from Laurenz: although the "IS" and "=" operators return the same rows in this query, their semantics differ. When isDeleted is NULL, "isDeleted IS FALSE" evaluates to FALSE, while "isDeleted = FALSE" evaluates to NULL. Rows for which the condition is NULL or FALSE are both excluded from the result, but the condition in the query has to match the predicate in the index definition. Because of this, an index defined with "=" does not support a query written with "IS", and vice versa. An index that supports both forms is:

CREATE INDEX "IX_NAME" ON "table" (id, text1, ..) WHERE (isDeleted = FALSE OR isDeleted IS FALSE);
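To see the difference in NULL handling directly, a small query like the one below can be run in psql. It only illustrates the three-valued logic and does not touch any table; the alias names are arbitrary:

```sql
-- Compare "=" and "IS" for each possible boolean state, including NULL.
SELECT v            AS isdeleted,
       (v = FALSE)  AS eq_false,   -- NULL when v is NULL
       (v IS FALSE) AS is_false    -- false when v is NULL
FROM (VALUES (TRUE), (FALSE), (NULL::boolean)) AS t(v);
```

If the column is declared NOT NULL, the two forms always select the same rows, so picking one spelling and using it consistently in both the index and the queries avoids the mismatch described above.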

@LaurenzAlbe that was a really interesting piece of information. At first I wrote it with "=", but then I saw an example on the site with "IS", so I changed it. I am curious: should the application use "IS" or "=" for booleans? "IS" for nullable columns?

Rohit Gupta · Answer 2 · 2025-08-28 03:20:54Z

I think that sometimes, single field indexes are wasted.

Assuming you want to filter out the deleted rows, I would create these composite indexes:

Id, isDeleted
Text1, isDeleted
Text2, isDeleted
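Written out as DDL, that suggestion might look like the sketch below. The table name notes and the index names are assumptions for illustration; the columns follow the question:

```sql
-- Hypothetical table matching the question's description.
CREATE TABLE notes (
    id        bigint GENERATED ALWAYS AS IDENTITY,
    text1     text,
    text2     text,
    isDeleted boolean NOT NULL DEFAULT FALSE
);

-- One composite index per searched column, each with isDeleted as
-- the trailing column.
CREATE INDEX notes_id_isdeleted    ON notes (id, isDeleted);
CREATE INDEX notes_text1_isdeleted ON notes (text1, isDeleted);
CREATE INDEX notes_text2_isdeleted ON notes (text2, isDeleted);

-- Example lookup that can use notes_text1_isdeleted.
SELECT id, text1, text2
FROM notes
WHERE text1 = 'foo'
  AND isDeleted = FALSE;
```

Unlike a partial index, these indexes also contain the deleted rows, so they grow with the whole table rather than with the live rows only.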

Frank Heikens · Answer 3 · 2025-08-28 16:37:13Z

I'd like to "move" archived (aka soft deleted) records to a different table partition:

CREATE TABLE t1 (id int GENERATED ALWAYS AS IDENTITY , f1 text, f2 text, del_stamp timestamptz) PARTITION BY LIST ( (del_stamp IS NULL ));
CREATE TABLE t1_active PARTITION OF t1 FOR VALUES IN (TRUE);
CREATE TABLE t1_archive PARTITION OF t1 FOR VALUES IN (FALSE);
ALTER TABLE t1_active -- pkey for foreign key usage
 ADD CONSTRAINT pkey_t1_active PRIMARY KEY (id);
ALTER TABLE t1_archive -- pkey for foreign key usage
 ADD CONSTRAINT pkey_t1_archive PRIMARY KEY (id);
INSERT INTO t1(f1, f2)
VALUES ('foo', 'bar')
 ,('another foo', 'some more bar');
ANALYSE t1;
EXPLAIN(ANALYSE , VERBOSE , BUFFERS )
SELECT *
FROM t1
WHERE del_stamp IS NULL;
EXPLAIN(ANALYSE , VERBOSE , BUFFERS )
SELECT *
FROM t1
WHERE del_stamp IS NOT NULL; -- deleted records
UPDATE t1
SET del_stamp = now() -- soft delete
WHERE id = 1
AND del_stamp IS NULL; -- not "deleted" yet
EXPLAIN(ANALYSE , VERBOSE , BUFFERS )
SELECT *
FROM t1
WHERE del_stamp IS NULL;
EXPLAIN(ANALYSE , VERBOSE , BUFFERS )
SELECT *
FROM t1
WHERE del_stamp IS NOT NULL; -- deleted records

And one of the query plans, only reading from public.t1_active:

Seq Scan on public.t1_active t1  (cost=0.00..1.02 rows=2 width=29) (actual time=0.043..0.044 rows=1 loops=1)
  Output: t1.id, t1.f1, t1.f2, t1.del_stamp
  Filter: (t1.del_stamp IS NULL)
  Buffers: shared hit=1
Query Identifier: -8475205591691465029
Planning Time: 0.131 ms
Execution Time: 0.066 ms

As you can see in the query plan, the planner has already selected the partition you need. You don't need to worry about an index on the soft-delete column. Another benefit is that deleted records do not pollute your active partition. And when you implement the soft-delete feature with a timestamp, data retention will be much easier to handle as well. You can even create sub-partitions per month and discard entire partitions after a specified number of months.
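The monthly sub-partitioning idea could be sketched roughly as follows. This is only an outline under assumptions: the t1_demo names and the date ranges are invented for illustration, and primary keys are left out because a unique constraint on a partitioned table has to include its partition key columns:

```sql
-- Hypothetical variant of t1; all names here are illustrative.
CREATE TABLE t1_demo (
    id        int GENERATED ALWAYS AS IDENTITY,
    f1        text,
    f2        text,
    del_stamp timestamptz
) PARTITION BY LIST ((del_stamp IS NULL));

CREATE TABLE t1_demo_active PARTITION OF t1_demo FOR VALUES IN (TRUE);

-- The archive partition is itself partitioned by deletion time.
CREATE TABLE t1_demo_archive PARTITION OF t1_demo FOR VALUES IN (FALSE)
    PARTITION BY RANGE (del_stamp);

CREATE TABLE t1_demo_archive_2025_08 PARTITION OF t1_demo_archive
    FOR VALUES FROM ('2025-08-01') TO ('2025-09-01');
CREATE TABLE t1_demo_archive_2025_09 PARTITION OF t1_demo_archive
    FOR VALUES FROM ('2025-09-01') TO ('2025-10-01');

-- Catch-all so a soft delete outside the defined ranges does not fail.
CREATE TABLE t1_demo_archive_default PARTITION OF t1_demo_archive DEFAULT;

-- Retention: discard a whole month of archived rows in one statement.
DROP TABLE t1_demo_archive_2025_08;
```

New monthly partitions would have to be created ahead of time (by a scheduled job, for example); otherwise soft-deleted rows simply land in the default partition.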
