How to efficiently get the absolute value of a time interval in PostgreSQL?

Question 1

I have a huge table in PostgreSQL 11 like the following:

CREATE TABLE my_huge_table(
 tick_time timestamp(6) with time zone NOT NULL,
 brok_time timestamp(6) with time zone,
 trade_day date NOT NULL,
 --other fields ...
 ...
 CONSTRAINT my_huge_table_pkey PRIMARY KEY (tick_time)
);
CREATE INDEX idx_my_huge_table_td_time ON my_huge_table USING brin
 ( trade_day, abs(tick_time - brok_time) );

I then run a query that I want to take advantage of the index idx_my_huge_table_td_time, like this:

SELECT * FROM my_huge_table
WHERE trade_day BETWEEN TO_DATE('20220104', 'YYYYMMDD') AND TO_DATE('20220104', 'YYYYMMDD') 
 AND ABS(tick_time - brok_time) < INTERVAL '10 s';

But PostgreSQL refuses to execute it and reports:

ERROR: function abs(interval) does not exist

LINE 3: AND ABS(tick_time - brok_time) < INTERVAL '10 s'
 ^ 
HINT: No function matches the given name and argument types. You might need to add explicit type casts.

SQL state: 42883 Character: 525

It looks like the function abs() cannot accept an interval value as an argument.

Then, I changed my query:

SELECT * FROM my_huge_table
WHERE trade_day BETWEEN TO_DATE('20220104', 'YYYYMMDD') AND TO_DATE('20220104', 'YYYYMMDD') 
 AND GREATEST(tick_time - brok_time, brok_time - tick_time) < INTERVAL '10 s';

This time it executes, but it doesn't take advantage of the index.

My questions:

1. How should I write the index expression? I want it to record the distance (the absolute value of the interval) between two timestamp columns.

2. How should I write the query so that it can use that index?

3. GREATEST(tick_time - brok_time, brok_time - tick_time) does not seem like a good idea, since it computes the difference twice. Is that right?

4. After creating the index, I noticed that the DDL PostgreSQL reports for it is:

CREATE INDEX idx_my_huge_table_td_time ON public.my_huge_table USING brin
 (trade_day, abs(date_part('epoch'::text, tick_time - brok_time)));

Has the value of the expression been cast to the text type? That is clearly NOT what I expected!

Question 2

The answer is to create a generated column, as shown below (all of the code is also available on a fiddle).

I had an original answer (shown at the end of this answer), but I've revised it to use a stored generated column (aka "computed" column) instead of an expression index (aka "functional index"). Note that generated columns require PostgreSQL 12 or later.

This has the following advantages:

a) the value is calculated once, when the row is inserted or updated, and does not have to be recomputed every time it is read, and
b) it makes the SQL much clearer - see the original answer below.

The one disadvantage is that a stored column uses more space, but I've found that this is not normally a critical issue (I've never seen it become one myself). Unfortunately, at the time of writing, PostgreSQL does not have virtual generated columns.

Your table definition should be as follows:

CREATE TABLE t 
(
 ticktime TIMESTAMPTZ, 
 broktime TIMESTAMPTZ,
 trade_day DATE,
 -- 
 -- other fields
 --
 abs_b_minus_t INTERVAL GENERATED ALWAYS AS (GREATEST(broktime, ticktime) - LEAST(broktime, ticktime)) STORED
);
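
If the table already exists, as in the question, the column can be added in place rather than by recreating the table. A minimal sketch, assuming PostgreSQL 12 or later (generated columns are not available in the question's PostgreSQL 11) and the question's table and column names. Expect the statement to rewrite the table while holding an exclusive lock, which can take a while on a huge table:

```sql
-- Assumes the question's my_huge_table with tick_time and brok_time columns.
-- Adding a STORED generated column computes the value for every existing row.
ALTER TABLE my_huge_table
 ADD COLUMN abs_b_minus_t interval
 GENERATED ALWAYS AS (GREATEST(brok_time, tick_time) - LEAST(brok_time, tick_time)) STORED;
```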

Then create an index on abs_b_minus_t:

CREATE INDEX t_ix ON t 
USING BRIN (trade_day, abs_b_minus_t );

Populate:

INSERT INTO t VALUES
('2022-02-14 14:43:55'::TIMESTAMPTZ, '2022-02-14 12:43:55'::TIMESTAMPTZ, '2022-02-14'::DATE),
('2022-03-14 14:43:55'::TIMESTAMPTZ, '2022-02-14 12:43:55'::TIMESTAMPTZ, '2022-03-14'::DATE),
('2022-02-14 14:43:55'::TIMESTAMPTZ, '2022-05-14 12:43:55'::TIMESTAMPTZ, '2022-02-14'::DATE);

Then we run:

SELECT 
 ticktime - broktime AS t_minus_b,
 abs_b_minus_t
FROM t;

Result:

 t_minus_b          | abs_b_minus_t
 02:00:00           | 02:00:00
 28 days 02:00:00   | 28 days 02:00:00
 -88 days -21:00:00 | 88 days 21:00:00

So, we see that it's working - we are obtaining absolute values of the difference between broktime and ticktime.
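
One caveat worth checking: in PostgreSQL, GREATEST and LEAST ignore NULL arguments, so if one of the timestamps is NULL (brok_time is nullable in the question's table) the expression above yields a zero interval rather than NULL, and such rows would pass a filter like < INTERVAL '10 s'. A small sketch contrasting it with the question's GREATEST(a - b, b - a) form, which stays NULL. The names ts and missing are illustrative only:

```sql
SELECT
 GREATEST(missing, ts) - LEAST(missing, ts) AS ignores_null, -- 00:00:00
 GREATEST(ts - missing, missing - ts) AS stays_null -- NULL
FROM (VALUES (TIMESTAMPTZ '2022-02-14 14:43:55+00', NULL::timestamptz)) AS v(ts, missing);
```

If rows with a missing timestamp should not match, the second form (or a CASE guard around the first) may be the safer expression for the generated column.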

Now, we can check index usage - we run SET enable_seqscan = OFF; and then:

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT 
 broktime - ticktime
FROM t
WHERE abs_b_minus_t < INTERVAL '30 DAYS';

Result:

QUERY PLAN
Bitmap Heap Scan on public.t (cost=12.14..39.07 rows=423 width=16) (actual time=0.022..0.025 rows=2 loops=1)
 Output: (broktime - ticktime)
 Recheck Cond: (t.abs_b_minus_t < '30 days'::interval)
 Rows Removed by Index Recheck: 1
 Heap Blocks: lossy=1
 Buffers: shared hit=3
 -> Bitmap Index Scan on t_ix (cost=0.00..12.03 rows=1270 width=0) (actual time=0.017..0.017 rows=10 loops=1)
    Index Cond: (t.abs_b_minus_t < '30 days'::interval)
    Buffers: shared hit=2
Planning:
 Buffers: shared hit=1
Planning Time: 0.042 ms
Execution Time: 0.052 ms

So, the query uses t_ix, the BRIN index that includes our generated column.
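
Two follow-ups, sketched against the same sample table t. First, enable_seqscan = OFF is a diagnostic setting, so scoping it to a transaction with SET LOCAL avoids leaving it switched off for the rest of the session. Second, the question's query also filters on trade_day, which t_ix covers as well. The date and the 10-second threshold below mirror the question's style of query and are not measured results:

```sql
BEGIN;
-- SET LOCAL lasts only until the end of this transaction.
SET LOCAL enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM t
WHERE trade_day = DATE '2022-02-14'
 AND abs_b_minus_t < INTERVAL '10 s';

ROLLBACK;
```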

Original Answer:

CREATE TABLE t 
(
 ticktime TIMESTAMPTZ, 
 broktime TIMESTAMPTZ,
 trade_day DATE
 -- 
 -- other fields
 --
);

Now, we create our functional index as follows:

CREATE INDEX t_ix ON t 
USING BRIN (trade_day, (GREATEST(broktime, ticktime) - LEAST(broktime, ticktime)));
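
Mapped onto the question's table, which runs on PostgreSQL 11 where generated columns are not available, the same idea might look like the sketch below. The index name is made up for illustration, and the query has to repeat the indexed expression exactly for the planner to consider the index:

```sql
-- Hypothetical index name; table and column names are from the question.
CREATE INDEX idx_my_huge_table_td_absdiff ON my_huge_table
USING brin (trade_day, (GREATEST(brok_time, tick_time) - LEAST(brok_time, tick_time)));

SELECT *
FROM my_huge_table
WHERE trade_day = DATE '2022-01-04'
 AND GREATEST(brok_time, tick_time) - LEAST(brok_time, tick_time) < INTERVAL '10 s';
```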

Populate the table:

INSERT INTO t VALUES
('2022-02-14 14:43:55'::TIMESTAMPTZ, '2022-02-14 12:43:55'::TIMESTAMPTZ, '2022-02-14'::DATE),
('2022-03-14 14:43:55'::TIMESTAMPTZ, '2022-02-14 12:43:55'::TIMESTAMPTZ, '2022-03-14'::DATE),
('2022-02-14 14:43:55'::TIMESTAMPTZ, '2022-05-14 12:43:55'::TIMESTAMPTZ, '2022-02-14'::DATE);

Now we test:

SELECT 
 ticktime - broktime AS t_minus_b,
 GREATEST(broktime, ticktime) - LEAST(broktime, ticktime) AS abs_b_minus_t
FROM t;

Result:

 t_minus_b          | abs_b_minus_t
 02:00:00           | 02:00:00
 28 days 02:00:00   | 28 days 02:00:00
 -88 days -21:00:00 | 88 days 21:00:00

So, we have the values and their absolutes.

SELECT 
 broktime - ticktime
FROM t
WHERE GREATEST(broktime, ticktime) - LEAST(broktime, ticktime) < INTERVAL '30 DAYS';

Result:

?column?
-02:00:00
-28 days -02:00:00

To check index usage, we disable sequential scans with SET enable_seqscan = OFF; and then run:

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT 
 broktime - ticktime
FROM t
WHERE GREATEST(broktime, ticktime) - LEAST(broktime, ticktime) < INTERVAL '30 DAYS';

Result:

QUERY PLAN
Bitmap Heap Scan on public.t (cost=12.17..57.59 rows=567 width=16) (actual time=0.041..0.044 rows=2 loops=1)
 Output: (broktime - ticktime)
 Recheck Cond: ((GREATEST(t.broktime, t.ticktime) - LEAST(t.broktime, t.ticktime)) < '30 days'::interval)
 Rows Removed by Index Recheck: 1
 Heap Blocks: lossy=1
 Buffers: shared hit=3
 -> Bitmap Index Scan on t_ix (cost=0.00..12.03 rows=1700 width=0) (actual time=0.027..0.027 rows=10 loops=1)
    Index Cond: ((GREATEST(t.broktime, t.ticktime) - LEAST(t.broktime, t.ticktime)) < '30 days'::interval)
    Buffers: shared hit=2
Planning:
 Buffers: shared hit=1
Planning Time: 0.044 ms
Execution Time: 0.096 ms

So, we see that t_ix is used, via a relatively efficient Bitmap Index Scan.
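
On point 4 of the question: the 'epoch'::text in the reported DDL is only the first argument of date_part (the field name) displayed with its type. The result of the expression is not text. A minimal sketch to confirm this, assuming the sample table t above; the column aliases are illustrative:

```sql
-- date_part(text, interval) returns double precision, which abs() accepts.
SELECT
 abs(date_part('epoch', ticktime - broktime)) AS abs_seconds,
 pg_typeof(abs(date_part('epoch', ticktime - broktime))) AS result_type
FROM t;
```

Here result_type should come back as double precision, i.e. a number of seconds. An index defined on that expression would only be considered for a predicate that repeats it, such as abs(date_part('epoch', tick_time - brok_time)) < 10, rather than a comparison against an INTERVAL.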

Question 3

Excellent! You have given me a very good starting point; from it I can build a generated column better suited to my work, and I can change its expression to fit any future requirements. Thanks a lot!

Question 4

@Leon - glad to be of help!

Vérace · Accepted Answer · 2023-04-23 17:09:30Z

