Oracle query optimization

Question 1

We have a query that runs on a view and its execution time is about 6 min.

SELECT COUNT(*) 
FROM someschema.item_total_violations 
WHERE channel_id = 122507833
AND item_date >= TIMESTAMP'2015-02-01 00:00:00';

Its not a materialized view so there is no index on the item_date.

The base query of the view is below:

SELECT i.id
 ,i.title
 ,i.body
 ,i.description
 ,i.item_date
 ,i.created_date
 ,i.channel_id
 ,i.community_id
 ,i.person_id
 ,i.message_type
 ,ch.client_id
 ,upper(p.NAME) AS person_name_upper
 ,upper(ch.NAME) AS channel_name_upper
 ,p.NAME AS person_name
 ,ch.NAME AS channel_name
 ,ch.connector_type AS channel_type
 ,cm.NAME AS community_name
 ,cm.network_id
 ,nvl(ct2.cnt, 0) + nvl(ct4.cnt, 0) AS total_violations_count
 ,nvl(ct1.cnt, 0) + nvl(ct3.cnt, 0) AS quarantined_violations_count
 ,nvl(ct5.cnt, 0) AS quarantined_attributes_count
 ,nvl(lh_info.cnt_lh, 0) AS legal_holds_count
 ,nvl(lh_info.cnt_lh_w_no_date, 0) AS legal_holds_count_w_no_date
 ,nvl(lh_info.max_hold_until, to_date('1970-01-01', 'yyyy-mm-dd')) AS legal_hold_max_hold_until
FROM item i
 ,channel ch
 ,person p
 ,community cm
 ,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 AND pv.quarantine_status = 'QUARANTINED'
 GROUP BY pv.item_id
 ) ct1
 ,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 GROUP BY pv.item_id
 ) ct2
 ,(
 SELECT /*+ NO_MERGE */ aa.item_id AS item_id
 ,COUNT(*) AS cnt
 FROM item_attachment aa
 ,policy_violation pv
 WHERE aa.id = pv.item_attachment_id
 AND pv.quarantine_status = 'QUARANTINED'
 GROUP BY aa.item_id
 ) ct3
 ,(
 SELECT /*+ NO_MERGE */ aa.item_id AS item_id
 ,COUNT(*) AS cnt
 FROM item_attachment aa
 ,policy_violation pv
 WHERE aa.id = pv.item_attachment_id
 AND pv.item_id IS NULL
 GROUP BY aa.item_id
 ) ct4
 ,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(DISTINCT (pv.item_attribute_id)) AS cnt
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 AND pv.item_attribute_id IS NOT NULL
 AND pv.quarantine_status = 'QUARANTINED'
 GROUP BY pv.item_id
 ) ct5
 ,(
 SELECT lh_rel.item_id AS item_id
 ,count(DISTINCT (lh.id)) AS cnt_lh
 ,max(lh.HOLD_UNTIL) AS max_hold_until
 ,sum(CASE 
 WHEN lh.hold_until IS NULL
 THEN 1
 ELSE 0
 END) AS cnt_lh_w_no_date
 FROM LEGAL_HOLD_RELATION lh_rel
 ,LEGAL_HOLD lh
 WHERE lh.id = lh_rel.legal_hold_id
 AND lh.STATUS = 'A'
 GROUP BY lh_rel.item_id
 ) lh_info
WHERE ch.STATUS = 'A'
 AND i.channel_id = ch.id
 AND i.person_id = p.id
 AND i.community_id = cm.id(+)
 AND i.id = ct1.item_id(+)
 AND i.id = ct2.item_id(+)
 AND i.id = ct3.item_id(+)
 AND i.id = ct4.item_id(+)
 AND i.id = ct5.item_id(+)
 AND i.id = lh_info.item_id(+)

The item table is the biggest and we have tried partitioning the table in chaannel_id but that didn't help our query. I understand that there is no index on item_date in original query and it has effect on performance. Apart from trying out materialized view, is there something that I can improve on the base query of the view?

Question 2

Is there an index on (channel_id) or a composite index on (channel_id, item_date)?

Question 3

Why do you use this view at all for this query?

The view collects violations and other stuff, but your query does not care about those at all, just the number of items. The view lists all items regardless of these because of the outer joins, so you basically perform a lot of unnecessary extra work to collect violations and other stuff (the NO_MERGE hint makes this even worse), then you throw it away and count the items based on a totally different criteria.

The query should be something like this:

select
 count(*) 
from
 item i
 join channel ch on (i.channel_id = ch.id)
 join person p on (i.person_id = p.id)
where
 ch.status = 'A'
 and i.channel_id = 122507833
 and i.item_date >= timestamp'2015-02-01 00:00:00';

Additionally, you don't need materialized views for indexing a column. Just create the index on the column(s) of the base table(s).

On top of the must-have indexes on primary and foreign keys, another possible candidate is item_date column depending on your data distribution.

Question 4

Nice! If there is a foreign key from item (person_id) that references person(id), then the join to person can be removed as well (no idea if the query optimizer can do that.)

Question 5

You can improve efficiency of your query by (for example) replacing:

 ,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 AND pv.quarantine_status = 'QUARANTINED'
 GROUP BY pv.item_id
 ) ct1
 ,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 GROUP BY pv.item_id
 ) ct2
 ,

with:

 ,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 GROUP BY pv.item_id
 ) ct2
 ,

This rewrite can be applied for other sub-selects in your query as well. That should give us:

 [...]
 ,nvl(ct2.cnt_quarantined, 0) + nvl(ct4.cnt_quarantined, 0) AS quarantined_violations_count
 [...]
,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined 
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 GROUP BY pv.item_id
 ) ct2
,(
 SELECT /*+ NO_MERGE */ aa.item_id AS item_id
 ,COUNT(*) AS cnt
 ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined 
 FROM item_attachment aa
 ,policy_violation pv
 WHERE aa.id = pv.item_attachment_id
 AND pv.item_id IS NULL
 GROUP BY aa.item_id
 ) ct4

You might be able to further merge these two rewrite with the use of a left join, but it is difficult to say without knowing more details of you model. Anyhow, this rewrite should reduce the amount of work that your query has to do.

As an advise I would suggest that you for such large queries use ansi joins since it makes it easier to read and understand the query

Balazs Papp Balazs Papp 41.5k2 gold badges29 silver badges47 bronze badges · Answer 1 · 2015-04-06 09:19:49Z

Why do you use this view at all for this query?

The view collects violations and other stuff, but your query does not care about those at all, just the number of items. The view lists all items regardless of these because of the outer joins, so you basically perform a lot of unnecessary extra work to collect violations and other stuff (the NO_MERGE hint makes this even worse), then you throw it away and count the items based on a totally different criteria.

The query should be something like this:

select
 count(*) 
from
 item i
 join channel ch on (i.channel_id = ch.id)
 join person p on (i.person_id = p.id)
where
 ch.status = 'A'
 and i.channel_id = 122507833
 and i.item_date >= timestamp'2015-02-01 00:00:00';

Additionally, you don't need materialized views for indexing a column. Just create the index on the column(s) of the base table(s).

On top of the must-have indexes on primary and foreign keys, another possible candidate is item_date column depending on your data distribution.

Nice! If there is a foreign key from item (person_id) that references person(id), then the join to person can be removed as well (no idea if the query optimizer can do that.)

score 1 · Answer 2 · 2015-04-06 08:34:17Z

You can improve efficiency of your query by (for example) replacing:

 ,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 AND pv.quarantine_status = 'QUARANTINED'
 GROUP BY pv.item_id
 ) ct1
 ,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 GROUP BY pv.item_id
 ) ct2
 ,

with:

 ,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 GROUP BY pv.item_id
 ) ct2
 ,

This rewrite can be applied for other sub-selects in your query as well. That should give us:

 [...]
 ,nvl(ct2.cnt_quarantined, 0) + nvl(ct4.cnt_quarantined, 0) AS quarantined_violations_count
 [...]
,(
 SELECT /*+ NO_MERGE */ pv.item_id
 ,count(*) AS cnt
 ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined 
 FROM policy_violation pv
 WHERE pv.item_id IS NOT NULL
 GROUP BY pv.item_id
 ) ct2
,(
 SELECT /*+ NO_MERGE */ aa.item_id AS item_id
 ,COUNT(*) AS cnt
 ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined 
 FROM item_attachment aa
 ,policy_violation pv
 WHERE aa.id = pv.item_attachment_id
 AND pv.item_id IS NULL
 GROUP BY aa.item_id
 ) ct4

You might be able to further merge these two rewrite with the use of a left join, but it is difficult to say without knowing more details of you model. Anyhow, this rewrite should reduce the amount of work that your query has to do.

As an advise I would suggest that you for such large queries use ansi joins since it makes it easier to read and understand the query

Stack Exchange Network

Oracle query optimization

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Oracle query optimization

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions