0

What approaches I could take to optimize the performance of the following joinless query on the following PostgreSQL/TimeScaleDB table? So far, I managed to create the right index which is obviously being used by the query planner. But the query is still not fast enough.

The layout of the table and its indices is:

enter image description here

The query is:

SELECT
 entity_id,
 event_type,
 payload_type,
 encode(last(payload, timestamp), 'escape')::json AS aggregated_value
FROM event_data
WHERE
 entity_id IN ('AA','AB','AC','AD','AE','AF','AG','AH','AI','AJ','AK','AL','AM','AN','AO','AP','AQ','AR','AS','AT','AU','AV','AW','AX','AY','AZ','BA','BB','BC','BD','BE','BF','BG','BH','BI','BJ','BK','BL','BM','BN','BO','BP','BQ','BR','BS','BT','BU','BV','BW','BX')
 AND payload_type IN ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r')
 AND timestamp BETWEEN '2020-06-02T04:52:48.00Z' AND '2020-06-02T07:52:48.00Z'
GROUP BY 1,2,3
ORDER BY 1,2,3
OFFSET 0 LIMIT 2001;

The output of the EXPLAIN ANALYZE is: https://explain.depesz.com/s/okww

Limit (cost=91617.29..92079.57 rows=2001 width=89) (actual time=2026.624..2457.510 rows=800 loops=1)
 -> Finalize GroupAggregate (cost=91617.29..103685.10 rows=52235 width=89) (actual time=2026.622..2457.349 rows=800 loops=1)
 Group Key: _hyper_2_88_chunk.entity_id, _hyper_2_88_chunk.event_type, _hyper_2_88_chunk.payload_type
 -> Gather Merge (cost=91617.29..101987.46 rows=52235 width=89) (actual time=2026.421..2462.836 rows=1600 loops=1)
 Workers Planned: 1
 Workers Launched: 1
 -> Partial GroupAggregate (cost=90617.28..95111.01 rows=52235 width=89) (actual time=2017.985..2427.205 rows=800 loops=2)
 Group Key: _hyper_2_88_chunk.entity_id, _hyper_2_88_chunk.event_type, _hyper_2_88_chunk.payload_type
 -> Sort (cost=90617.28..91385.44 rows=307264 width=121) (actual time=2017.471..2199.512 rows=260608 loops=2)
 Sort Key: _hyper_2_88_chunk.entity_id, _hyper_2_88_chunk.event_type, _hyper_2_88_chunk.payload_type
 Sort Method: external merge Disk: 35000kB
 -> Append (cost=0.42..50922.41 rows=307264 width=121) (actual time=0.145..818.428 rows=260608 loops=2)
 -> Parallel Index Scan using _hyper_2_88_chunk_idx_event_timestamp on _hyper_2_88_chunk (cost=0.42..3866.03 rows=11135 width=121) (actual time=0.145..92.745 rows=10440 loops=2)
 Index Cond: (("timestamp" >= '2020-06-02 04:52:48'::timestamp without time zone) AND ("timestamp" <= '2020-06-02 07:52:48'::timestamp without time zone))
 Filter: ((payload_type = ANY ('{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r}'::text[])) AND (entity_id = ANY ('{AA,AB,AC,AD,AE,AF,AG,AH,AI,AJ,AK,AL,AM,AN,AO,AP,AQ,AR,AS,AT,AU,AV,AW,AX,AY,AZ,BA,BB,BC,BD,BE,BF,BG,BH,BI,BJ,BK,BL,BM,BN,BO,BP,BQ,BR,BS,BT,BU,BV,BW,BX}'::text[])))
 Rows Removed by Filter: 11494
 -> Parallel Index Scan using _hyper_2_90_chunk_idx_entity_id_payload_type_timestamp on _hyper_2_90_chunk (cost=0.42..15838.58 rows=102301 width=121) (actual time=0.045..207.567 rows=86832 loops=2)
 Index Cond: ((entity_id = ANY ('{AA,AB,AC,AD,AE,AF,AG,AH,AI,AJ,AK,AL,AM,AN,AO,AP,AQ,AR,AS,AT,AU,AV,AW,AX,AY,AZ,BA,BB,BC,BD,BE,BF,BG,BH,BI,BJ,BK,BL,BM,BN,BO,BP,BQ,BR,BS,BT,BU,BV,BW,BX}'::text[])) AND (payload_type = ANY ('{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r}'::text[])) AND ("timestamp" >= '2020-06-02 04:52:48'::timestamp without time zone) AND ("timestamp" <= '2020-06-02 07:52:48'::timestamp without time zone))
 -> Parallel Index Scan using _hyper_2_89_chunk_idx_entity_id_payload_type_timestamp on _hyper_2_89_chunk (cost=0.42..15811.99 rows=102444 width=121) (actual time=0.031..274.653 rows=86816 loops=2)
 Index Cond: ((entity_id = ANY ('{AA,AB,AC,AD,AE,AF,AG,AH,AI,AJ,AK,AL,AM,AN,AO,AP,AQ,AR,AS,AT,AU,AV,AW,AX,AY,AZ,BA,BB,BC,BD,BE,BF,BG,BH,BI,BJ,BK,BL,BM,BN,BO,BP,BQ,BR,BS,BT,BU,BV,BW,BX}'::text[])) AND (payload_type = ANY ('{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r}'::text[])) AND ("timestamp" >= '2020-06-02 04:52:48'::timestamp without time zone) AND ("timestamp" <= '2020-06-02 07:52:48'::timestamp without time zone))
 -> Parallel Index Scan using _hyper_2_91_chunk_idx_entity_id_payload_type_timestamp on _hyper_2_91_chunk (cost=0.42..15405.82 rows=91384 width=121) (actual time=0.051..180.641 rows=76520 loops=2)
 Index Cond: ((entity_id = ANY ('{AA,AB,AC,AD,AE,AF,AG,AH,AI,AJ,AK,AL,AM,AN,AO,AP,AQ,AR,AS,AT,AU,AV,AW,AX,AY,AZ,BA,BB,BC,BD,BE,BF,BG,BH,BI,BJ,BK,BL,BM,BN,BO,BP,BQ,BR,BS,BT,BU,BV,BW,BX}'::text[])) AND (payload_type = ANY ('{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r}'::text[])) AND ("timestamp" >= '2020-06-02 04:52:48'::timestamp without time zone) AND ("timestamp" <= '2020-06-02 07:52:48'::timestamp without time zone))
Planning time: 3.622 ms
Execution time: 2478.976 ms

Work mem:

eventdb=# show work_mem;
 work_mem
 ----------
 5242kB
(1 row)
Laurenz Albe
62k4 gold badges57 silver badges93 bronze badges
asked Jun 8, 2020 at 7:10
1
  • @Min-SooPipefeet Thank you for clarifying about timescale-tune. I see that k8s doesn't change work_mem value, while timescale-tune should do it. I don't know if some other parameters were not tuned in k8s, but it might be less important to run, since you are already tuned work_mem. Commented Jun 8, 2020 at 12:37

1 Answer 1

1

Increase work_mem until you get a hash aggregate, and the query will be much faster.

answered Jun 8, 2020 at 11:31

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.