Name Null Type
-------- -------- ---------
ID NOT NULL NUMBER(4)
GROUP_ID NUMBER(4)
TEXT CLOB
There is a btree index on group_id
. Here's how many rows each group_id
has and the corresponding percentage:
GROUP_ID COUNT PCT
---------------------- ---------------------- ----------------------
1 1 1
2 2 1
3 4 3
4 8 6
5 16 12
6 32 24
7 64 47
8 9 7
I ran this
EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname=>'TEST', cascade=>true);
Which, if I understand correctly, will gather stats for the optimizer.
Now, I remember seeing that Oracle will not use the index and perform a full table scan instead if it's retrieving more than 5% or so of all rows. However, when I ran this query, it only started performing a FTS when group_id
was 7, which has 47% of all rows.
Is this the way it's supposed to be?
2 Answers 2
I remember seeing that Oracle will not use the index and perform a full table scan instead if it's retrieving more than 5% or so of all rows.
This is a "rule of thumb" and shouldn't be taken as a prediction. The Oracle CBO chooses an execution plan based on the estimated 'cost' of the options. The estimated cost will depend on various parameters, and the complexity increases with each Oracle release.
You can use hints and explain plan
to get a more detailed idea of the relative cost of two plans - but this should be done in a single query because the 'cost' is not guaranteed to be an absolute measure outside of a single plan:
select /*+ FULL(foo) */ * from foo where group_id=10
union all
select /*+ INDEX(i_foo) */ * from foo where group_id=10;
example explain plan with relative costs
Is this the way it's supposed to be?
In short: Yes.
Unless you have identified a particular performance issue you should trust the CBO to choose the correct path (and not, for example, use hints except for testing and experimenting like above). If you have identified a problem, the next step is to start investigating whether the CBO is making incorrect assumptions and how to give it better information - rather than assuming it is broken and/or trying to circumvent it.
Oracle also tracks the number of blocks that will be retrieved using the index values. It may be your data is clustered by group_id. In that case, it may make sense to use the index when retrieving more than 5% of the rows. If the data is sufficiently clustered for values 5 and 6, then using the index may be roughly equivalent to doing full table scan of a few blocks.
The relevant calculation will involve the number of index blocks needing to be retrieved, the number of data blocks needing to be retrieved, and the CPU required to access the data. There are tuning parameters which adjust the relative costs of retrieving index versus data blocks. This can change the plan that is being used.
SELECT * FROM my_table WHERE group_id = X