Query against history table not using index (Oracle 12c)

Question 1

I have a history table with 'ID' and 'HIST_TIMESTAMP' columns, defined as follows:

CREATE TABLE hist (
 HIST_ID INTEGER,
 HIST_TIMESTAMP TIMESTAMP,
 ID INTEGER, -- this is the id of the table that is being tracked
 --OTHER COLS
);

I also have an index on this table as such

CREATE INDEX hist_ix ON hist (ID, HIST_TIMESTAMP);

This table has a lot of inserts against it and currently has about 30m rows in it.

When I run the following query, Oracle does a full table scan instead of using the index (which, at least as far as I can tell, it should be able to use).

SELECT ID, MAX(HIST_TIMESTAMP) FROM hist WHERE HIST_TIMESTAMP <= <<A TIMESTAMP>> GROUP BY ID;

It seems to me that Oracle should be able to walk the id/timestamp index on an id-by-id basis and quickly identify which id/timestamp pair sits just to the "left" of a specific point in time, but it insists on a full table scan.
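
The id-by-id lookup described above can also be written out explicitly, so that each ID gets its own probe into hist_ix. This is only a sketch under assumptions: tracked_table is a hypothetical name for the parent table holding one row per tracked ID (it is not shown in this question), and whether Oracle really answers the subquery with a short index range scan should be confirmed from the execution plan.

```sql
-- Assumption: tracked_table (hypothetical name) has one row per tracked ID.
-- The scalar subquery has an equality predicate on the leading index column
-- (ID) and a range predicate on the second column (HIST_TIMESTAMP), so each
-- execution can be served by a narrow probe into hist_ix.
SELECT t.id,
       (SELECT MAX(h.hist_timestamp)
          FROM hist h
         WHERE h.id = t.id
           AND h.hist_timestamp <= :b1) AS max_hist_timestamp
  FROM tracked_table t;
```

Unlike the GROUP BY version, this returns a row with a NULL timestamp for any ID that has no history entry at or before :b1, so the two queries are not strictly equivalent.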

Any help would be appreciated to get this query running quicker.

I have run the following to make sure the statistics are up to date:

EXEC DBMS_STATS.GATHER_TABLE_STATS('<meh>','hist');

Also, there are about 1k distinct ID values in the hist table.

With regard to data distribution: of the ~1k IDs, 50 have fewer than 100 entries in the table, 70 have between 100 and 1,000 entries, 146 have between 1,000 and 10,000 entries, and the rest range from 10k to 60k entries. Over half of the IDs have at least 30k entries.

Question 2

Are the statistics up to date? How many (approx) distinct IDs?

Question 3

The statistics ARE up to date, and there are about 1k distinct IDs. I will add this information to the question body.

Question 4

Hum, I hadn't looked at the query properly. Does the plan change if you "invert" the query (min / tstmp > foo)?

Question 5

No .. MIN and MAX both have the exact same explain plan including cardinality/cost values. :/ (MIN using <= and MAX using >=)

Question 6

Do I maybe need to gather some column statistics? (I was under the impression that table statistics included column statistics.) Or maybe run dbms_stats against the index?
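
One way to answer this is to look at what the data dictionary currently holds for the columns and the index. The queries below are a sketch that assumes the table and index live in the current schema (hence the USER_ views and ownname => USER); with default settings GATHER_TABLE_STATS collects column statistics too, and cascade => TRUE explicitly requests index statistics as well.

```sql
-- Column-level statistics for HIST; NUM_DISTINCT for ID should be near 1000.
SELECT column_name, num_distinct, histogram, last_analyzed
  FROM user_tab_col_statistics
 WHERE table_name = 'HIST';

-- Index-level statistics for HIST_IX.
SELECT index_name, num_rows, distinct_keys, clustering_factor, last_analyzed
  FROM user_indexes
 WHERE index_name = 'HIST_IX';

-- Regather table, column and index statistics in one call.
-- Assumption: HIST is owned by the current user.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'HIST',
    cascade => TRUE);
END;
/
```

If LAST_ANALYZED is recent for both the columns and the index, missing statistics are unlikely to be the reason for the full table scan.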

Question 9

Your index is a composite index on both ID and HIST_TIMESTAMP. Since ID is the leading portion of that index, the index can't be used when your WHERE predicate is on just the trailing HIST_TIMESTAMP portion.

Question 10

With a cardinality of about 1:30000 I suspect an index skip scan would be used so the fact that ID is the leading attribute shouldn't matter. Though I understand it can be more complicated than that and clearly the index isn't being used.

Question 11

I understand that Oracle MAY think like what you are saying .. however, I would hope that the GROUP BY clause on ID would give Oracle a hint that it could still use the index that starts with ID.

Balazs Papp · Accepted Answer · 2016-05-10 19:03:14Z

Index usage is obviously possible, but optional.

CREATE TABLE hist (
 HIST_ID INTEGER,
 HIST_TIMESTAMP TIMESTAMP,
 ID INTEGER -- this is the id of the table that is being tracked
 --OTHER COLS
);
CREATE INDEX hist_ix ON hist (ID, HIST_TIMESTAMP);
explain plan for SELECT ID, MAX(HIST_TIMESTAMP) FROM hist WHERE HIST_TIMESTAMP <= :B1 GROUP BY ID;
select * from table(dbms_xplan.display);
Plan hash value: 1027924405

--------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |     1 |    26 |     1   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT|         |     1 |    26 |     1   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN    | HIST_IX |     1 |    26 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("HIST_TIMESTAMP"<=TO_TIMESTAMP(:B1))
       filter("HIST_TIMESTAMP"<=TO_TIMESTAMP(:B1))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

It is not true that an index can be used only for the leading columns:

explain plan for SELECT /*+ INDEX_SS(hist hist_ix) */ ID, MAX(HIST_TIMESTAMP) FROM hist WHERE HIST_TIMESTAMP <= :B1 GROUP BY ID;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2669193891

--------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |     1 |    26 |     1   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT|         |     1 |    26 |     1   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN    | HIST_IX |     1 |    26 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("HIST_TIMESTAMP"<=TO_TIMESTAMP(:B1))
       filter("HIST_TIMESTAMP"<=TO_TIMESTAMP(:B1))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Another method would be an index fast full scan (INDEX_FFS hint).

If you force the use of your index with hints, compare the cost of the plan that does the full table scan with the cost of the plan that uses the index access path. In a simple case like this one, the choice is purely a cost-based decision.
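
A minimal sketch of that comparison, reusing the query from the question. The statement IDs 'hist_full' and 'hist_ss' are arbitrary labels chosen here for illustration, and the default plan table PLAN_TABLE is assumed.

```sql
-- Plan 1: force a full table scan.
EXPLAIN PLAN SET STATEMENT_ID = 'hist_full' FOR
SELECT /*+ FULL(hist) */ id, MAX(hist_timestamp)
  FROM hist
 WHERE hist_timestamp <= :b1
 GROUP BY id;

-- Plan 2: force an index skip scan on hist_ix.
EXPLAIN PLAN SET STATEMENT_ID = 'hist_ss' FOR
SELECT /*+ INDEX_SS(hist hist_ix) */ id, MAX(hist_timestamp)
  FROM hist
 WHERE hist_timestamp <= :b1
 GROUP BY id;

-- Display both plans and compare the Cost column on the top line of each.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'hist_full'));
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'hist_ss'));
```

The same pattern works for an INDEX_FFS(hist hist_ix) hint. If the hinted plan shows a higher cost than the full scan, the optimizer's unhinted choice is at least consistent with its own estimates.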

If you cannot force the use of your index even with hints, I would look for the problem elsewhere. For example, your index may be in an UNUSABLE state (check USER_INDEXES.STATUS), or it may have been made INVISIBLE (check USER_INDEXES.VISIBILITY).
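
Both columns can be checked in one query. This sketch assumes the index belongs to the current schema; the ALTER statements are left commented out because they change the index and should only be run deliberately.

```sql
-- STATUS should be VALID and VISIBILITY should be VISIBLE for the
-- optimizer to consider the index.
SELECT index_name, status, visibility
  FROM user_indexes
 WHERE table_name = 'HIST';

-- Possible fixes, depending on what the query above reports:
-- ALTER INDEX hist_ix REBUILD;   -- if STATUS = 'UNUSABLE'
-- ALTER INDEX hist_ix VISIBLE;   -- if VISIBILITY = 'INVISIBLE'
```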

Turns out I had my hint syntax not quite right. I can get it to use the index via a hint (either plain INDEX or INDEX_SS; it picks a full table scan when I try INDEX_FFS). Performance still isn't that great, though. Hum.. :/ Thank you for this answer, though; it does shed a bit of light on some things. I will likely accept it if nothing else comes in by the end of the day. :)
