I have a history table which has 'ID' and 'TIMESTAMP' columns, like so:
CREATE TABLE hist (
HIST_ID INTEGER,
HIST_TIMESTAMP TIMESTAMP,
ID INTEGER, -- this is the id of the table that is being tracked
--OTHER COLS
);
I also have an index on this table:
CREATE INDEX hist_ix ON hist (ID, HIST_TIMESTAMP);
This table has a lot of inserts against it and currently has about 30m rows in it.
When I try to run the following query, Oracle does a full table scan instead of using the index (which .. at least I believe .. it should be able to use).
SELECT ID, MAX(HIST_TIMESTAMP) FROM hist WHERE HIST_TIMESTAMP <= <<A TIMESTAMP>> GROUP BY ID;
It seems to me that Oracle should be able to use the index to quickly identify which id/timestamp pair is just to the "left" of a specific point in time by looking at the id/timestamp index on an id-by-id basis, but it's insisting on a full table scan.
Any help would be appreciated to get this query running quicker.
I have run the following to make sure the statistics are up to date:
EXEC DBMS_STATS.GATHER_TABLE_STATS('<meh>','hist');
Also, there are about 1k distinct ID values in the hist table.
With regard to data distribution.. Of the ~1k IDs, 50 have fewer than 100 entries in the table, 70 have between 100 and 1000 entries, 146 have between 1000 and 10000 entries, and the rest range from 10k to 60k entries. Over half of the IDs have at least 30k records.
2 Answers
Index usage is obviously possible, but optional.
CREATE TABLE hist (
HIST_ID INTEGER,
HIST_TIMESTAMP TIMESTAMP,
ID INTEGER -- this is the id of the table that is being tracked
--OTHER COLS
);
CREATE INDEX hist_ix ON hist (ID, HIST_TIMESTAMP);
explain plan for SELECT ID, MAX(HIST_TIMESTAMP) FROM hist WHERE HIST_TIMESTAMP <= :B1 GROUP BY ID;
select * from table(dbms_xplan.display);
Plan hash value: 1027924405
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 1 (0)| 00:00:01 |
| 1 | SORT GROUP BY NOSORT| | 1 | 26 | 1 (0)| 00:00:01 |
|* 2 | INDEX FULL SCAN | HIST_IX | 1 | 26 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("HIST_TIMESTAMP"<=TO_TIMESTAMP(:B1))
filter("HIST_TIMESTAMP"<=TO_TIMESTAMP(:B1))
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
It is not true that an index can be used only when the predicate is on its leading column:
explain plan for SELECT /*+ INDEX_SS(hist hist_ix) */ ID, MAX(HIST_TIMESTAMP) FROM hist WHERE HIST_TIMESTAMP <= :B1 GROUP BY ID;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2669193891
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 1 (0)| 00:00:01 |
| 1 | SORT GROUP BY NOSORT| | 1 | 26 | 1 (0)| 00:00:01 |
|* 2 | INDEX SKIP SCAN | HIST_IX | 1 | 26 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("HIST_TIMESTAMP"<=TO_TIMESTAMP(:B1))
filter("HIST_TIMESTAMP"<=TO_TIMESTAMP(:B1))
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Another method would be an index fast full scan (INDEX_FFS hint).
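As a rough sketch of that alternative, the hint would look like the following, reusing the table, index, and bind variable from the examples above. The optimizer may still decline the hint, so check the resulting plan rather than assuming an INDEX FAST FULL SCAN shows up.

```sql
-- Sketch only: request an index fast full scan of HIST_IX.
-- A fast full scan reads index blocks in no particular order, so the
-- GROUP BY step will typically no longer be the NOSORT variant.
explain plan for
SELECT /*+ INDEX_FFS(hist hist_ix) */ ID, MAX(HIST_TIMESTAMP)
FROM hist
WHERE HIST_TIMESTAMP <= :B1
GROUP BY ID;

select * from table(dbms_xplan.display);
```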
If you force the usage of your index with hints, then compare the cost of the plan with the full table scan against the plan with the index access path. With a simple example like this, it is purely a cost-based decision.
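One way to make that comparison, sketched here with the same table, index, and bind variable as above, is to explain the query twice with opposing hints and read the Cost column on the SELECT STATEMENT line of each plan. The FULL and INDEX hints are standard Oracle hints; which plan turns out cheaper depends entirely on your data and statistics.

```sql
-- Plan 1: force the full table scan.
explain plan for
SELECT /*+ FULL(hist) */ ID, MAX(HIST_TIMESTAMP)
FROM hist
WHERE HIST_TIMESTAMP <= :B1
GROUP BY ID;

select * from table(dbms_xplan.display);

-- Plan 2: force access through HIST_IX and let the optimizer pick the scan type.
explain plan for
SELECT /*+ INDEX(hist hist_ix) */ ID, MAX(HIST_TIMESTAMP)
FROM hist
WHERE HIST_TIMESTAMP <= :B1
GROUP BY ID;

select * from table(dbms_xplan.display);
```

If the hinted index plan reports a higher cost than the full scan, the optimizer's unhinted choice is at least consistent with its own estimates.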
If you cannot even force the usage of your index, I would look for the problem somewhere else. For example, your index may be in the UNUSABLE state (check USER_INDEXES.STATUS) or it may have been made INVISIBLE (USER_INDEXES.VISIBILITY).
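A quick dictionary check along these lines would show both attributes. This is a minimal sketch that assumes the table lives in your own schema and was created with the unquoted name hist, so it is stored as HIST.

```sql
-- For a usable, visible non-partitioned index, expect
-- STATUS = 'VALID' and VISIBILITY = 'VISIBLE'.
SELECT index_name, status, visibility
FROM user_indexes
WHERE table_name = 'HIST';

-- Possible remedies, commented out because whether they are appropriate
-- depends on your environment (an assumption, not part of the question):
-- ALTER INDEX hist_ix REBUILD;
-- ALTER INDEX hist_ix VISIBLE;
```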
-
Turns out I had my hint syntax not quite right - I can get it to use an index via a hint (via just INDEX or INDEX_SS .. it picks full table scan when I try INDEX_FFS). Performance still isn't that great, though. Hum.. :/ Thank you for this answer, though - it does shed a bit of light on some things. I will likely accept it if nothing else comes in by the end of the day. :) – Joishi Bodio, May 10, 2016 at 20:21
Your index is a composite index on both ID and HIST_TIMESTAMP. Since ID is the leading portion of that index, the index can't be used when your WHERE predicate is on just the trailing HIST_TIMESTAMP portion.
-
With a cardinality of about 1:30000 I suspect an index skip scan would be used so the fact that ID is the leading attribute shouldn't matter. Though I understand it can be more complicated than that and clearly the index isn't being used. – BriteSponge, May 10, 2016 at 8:32
-
I understand that Oracle MAY think like what you are saying .. however, I would hope that the GROUP BY clause on ID would give Oracle a hint that it could still use the index that starts with ID. – Joishi Bodio, May 10, 2016 at 16:18
min
/tstmp > foo
)?