
I recently started working with a legacy Rails application and have been going through some of the queries used to fetch analytics-related data. One of the tables contains around 13M records; running a plain COUNT against it already takes more than a minute, and the query below, which I am trying to EXPLAIN/ANALYZE, takes more than 30 minutes.

Query

 
 SELECT
     portfolio_id,
     MIN(order_trigger_date) AS order_trigger_date
 FROM
     `user_orders`
     INNER JOIN `user_portfolios`
         ON `user_portfolios`.`id` = `user_orders`.`portfolio_id`
 WHERE
     `user_orders`.`state` IN (
         'executed_success',
         'executed_failure',
         'placed_failure'
     )
     AND `user_orders`.`type` = 'buy'
     AND `user_orders`.`order_trigger_date` BETWEEN '1969-12-31 18:30:00' AND '2023-12-15 18:29:59'
     AND `user_portfolios`.`state` = 'active'
 GROUP BY
     `user_orders`.`portfolio_id`

We have proper indexes on both tables (rough definitions follow the list):

user_orders
 - state
 - type
 - order_trigger_date
 - portfolio_id
user_portfolios 
 - state
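
All of these are single-column indexes; going by the index names in the EXPLAIN output below, the definitions are roughly equivalent to this (a sketch, not the actual migrations):

 -- Assumed single-column definitions, reconstructed from the index names in the EXPLAIN output
 CREATE INDEX index_user_orders_on_state ON user_orders (state);
 CREATE INDEX index_user_orders_on_type ON user_orders (type);
 CREATE INDEX index_order_trigger_date_on_admin_user_orders ON user_orders (order_trigger_date);
 CREATE INDEX index_user_orders_on_portfolio_id ON user_orders (portfolio_id);
 CREATE INDEX index_state_on_user_portfolios ON user_portfolios (state);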

Below is the EXPLAIN result:

+----+-------------+-----------------+------------+-------------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------+---------+---------------------------------+---------+----------+----------------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------+------------+-------------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------+---------+---------------------------------+---------+----------+----------------------------------------------------------------------------------------------------------------------------------------+
| 1 | SIMPLE | user_orders | NULL | index_merge | index_user_orders_on_state,index_user_orders_on_type,index_user_orders_on_portfolio_id,index_order_trigger_date_on_admin_user_orders | index_user_orders_on_type,index_order_trigger_date_on_admin_user_orders | 513,6 | NULL | 3100629 | 66.09 | Using intersect(index_user_orders_on_type,index_order_trigger_date_on_admin_user_orders); Using where; Using temporary; Using filesort |
| 1 | SIMPLE | user_portfolios | NULL | eq_ref | PRIMARY,index_state_on_user_portfolios | PRIMARY | 8 | mydb.user_orders.portfolio_id | 1 | 50.0 | Using where |
+----+-------------+-----------------+------------+-------------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------+---------+---------------------------------+---------+----------+----------------------------------------------------------------------------------------------------------------------------------------+

Can anyone suggest ways to optimize the above query? What alternative methods can I use to fetch this data faster?

asked Jan 27, 2024 at 6:22
  • What does the EXPLAIN ANALYZE say? Commented Jan 27, 2024 at 15:34

2 Answers


We have proper indexes on both tables

In my experience, MySQL rarely uses more than one index per table to good effect (the index_merge intersect in your plan is what it falls back to), so you need a multicolumn index in this case.


Add the following indexes, run the query again, and check the execution plan.

user_orders

alter table user_orders 
 add index pid_tp_ortridt_st (portfolio_id,type,order_trigger_date,state) ;

user_portfolios

alter table user_portfolios 
 add index id_st (id,state) ;

See the MySQL manual page on Multiple-Column Indexes.
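
To confirm the optimizer actually switches to the new index, re-run EXPLAIN on the same statement after adding it; ideally the key column should show the composite index instead of the index_merge intersect (sketch below, using the query from the question):

 -- Re-check the plan after adding the composite indexes; ideally `key` now shows
 -- pid_tp_ortridt_st (and PRIMARY or id_st on user_portfolios) rather than
 -- intersect(index_user_orders_on_type, index_order_trigger_date_on_admin_user_orders).
 EXPLAIN
 SELECT portfolio_id, MIN(order_trigger_date) AS order_trigger_date
 FROM `user_orders`
 INNER JOIN `user_portfolios`
     ON `user_portfolios`.`id` = `user_orders`.`portfolio_id`
 WHERE `user_orders`.`state` IN ('executed_success', 'executed_failure', 'placed_failure')
   AND `user_orders`.`type` = 'buy'
   AND `user_orders`.`order_trigger_date` BETWEEN '1969-12-31 18:30:00' AND '2023-12-15 18:29:59'
   AND `user_portfolios`.`state` = 'active'
 GROUP BY `user_orders`.`portfolio_id`;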

answered Jan 27, 2024 at 16:36

It's a nasty one. You are filtering on two tables, and the main table needs a composite ("two-dimensional") index, which does not exist.

I suggest these as possibly helping (an ALTER TABLE sketch follows the list):

user_orders: INDEX(type, state, order_trigger_date, portfolio_id)
user_orders: INDEX(type, order_trigger_date, portfolio_id, state)
user_portfolios: INDEX(state, id)
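
Spelled out as DDL, those suggestions would look roughly like this (a sketch; the index names are made up, so pick your own convention):

 -- Hypothetical index names; try one or both user_orders indexes and compare plans.
 ALTER TABLE user_orders
     ADD INDEX idx_uo_type_state_date_pid (type, state, order_trigger_date, portfolio_id),
     ADD INDEX idx_uo_type_date_pid_state (type, order_trigger_date, portfolio_id, state);

 ALTER TABLE user_portfolios
     ADD INDEX idx_up_state_id (state, id);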

You could try this reformulation, too. Since no columns from `user_portfolios` are selected, the JOIN can be replaced by an EXISTS:

SELECT uo.portfolio_id, MIN(uo.order_trigger_date) AS order_trigger_date
 FROM `user_orders` AS uo
 WHERE uo.`state` IN ( 'executed_success',
                       'executed_failure',
                       'placed_failure' )
   AND uo.`type` = 'buy'
   AND uo.`order_trigger_date` BETWEEN '1969-12-31 18:30:00' AND '2023-12-15 18:29:59'
   AND EXISTS ( SELECT 1
                  FROM `user_portfolios` AS up
                  WHERE up.`id` = uo.`portfolio_id`
                    AND up.`state` = 'active' )
 GROUP BY uo.`portfolio_id`

The following won't change speed, but makes more sense:

 AND `order_trigger_date` BETWEEN '1969-12-31 18:30:00'
 AND '2023-12-15 18:29:59'

-->

 AND `order_trigger_date` < '2023-12-15 18:29:59'

That looks like it is effectively "all dates", in which case remove that part of the WHERE clause entirely to make any SELECT variant run faster (see the sketch below). If the date range comes from user input, build the WHERE clause on the fly so you can avoid kludges like that BETWEEN.
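
For illustration, the "all dates" variant of the reformulated query above would be roughly this (a sketch, assuming the date filter really can be dropped):

 SELECT uo.portfolio_id, MIN(uo.order_trigger_date) AS order_trigger_date
 FROM `user_orders` AS uo
 WHERE uo.`state` IN ('executed_success', 'executed_failure', 'placed_failure')
   AND uo.`type` = 'buy'
   -- order_trigger_date predicate removed: the BETWEEN spanned effectively all dates
   AND EXISTS ( SELECT 1
                  FROM `user_portfolios` AS up
                  WHERE up.`id` = uo.`portfolio_id`
                    AND up.`state` = 'active' )
 GROUP BY uo.`portfolio_id`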

answered Jan 28, 2024 at 1:33
