Making the Oracle optimizer use a virtual column to find out about a partition

Question 1

I have a large table that has three columns like this:

"START_DATE" DATE,
"START_VALUE" NUMBER(10,7),
"START_DATE_VALUE" NUMBER(18,7) 
 GENERATED ALWAYS AS 
 (
 (extract(YEAR FROM START_DATE) * 10000 + 
 extract(MONTH FROM START_DATE)*100 + 
 extract(DAY FROM START_DATE))*power(10,3) + 
 (START_VALUE+180)
 ) VIRTUAL

The START_DATE_VALUE column is a virtual column that is used for partitioning. However, when I have a query like this:

select * 
from mytable
where
 start_date > to_date('02-01-2012', 'MM-DD-YYYY')
 and start_value > 120.23452

It scans all partitions for the result. How can I make Oracle use the virtual column and then pick just the right partition to work on it?

Sorry my table definition is very large, I can't copy it in here.

Question 2

What happens if you query by start_date_value > 20120201300.23452 ? Does it only scan a single partition?

Question 3

Which partition is the right partition for your query?

Question 4

So it should scan only the partitions that hold START_DATE_VALUE > 20120201300.23452.

Question 5

It only scans partitions that have start_date_value > 20120201300.23452.

Question 6

I think you have a problem with your virtual column definition. For your special values 2012-02-01 (in YYYY-MM-DD format) and 120.23452 the value of the virtual column is

2012*10000+2*100+1*1000+180+120.23452 = 20120000+200+1000+300.23452 = 20121500.23452

and not

20120201300.23452

as you expected.
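
The result depends on whether power(10,3) multiplies only the DAY term or the whole year/month/day sum, so one way to settle it is to let Oracle evaluate the expression. This is a minimal sketch: the aliases d and v are hypothetical stand-ins for START_DATE and START_VALUE, and the parentheses are copied from the column definition in the question.

```sql
-- Evaluate the virtual-column expression for the sample values.
-- d and v are illustrative aliases, not columns of the real table.
select (
         (extract(year  from d) * 10000 +
          extract(month from d) * 100 +
          extract(day   from d)) * power(10,3) +
         (v + 180)
       ) as start_date_value
from  (select date '2012-02-01' as d,
              120.23452         as v
       from   dual);
```

Comparing the returned number with the partition bounds shows which partition a row with those values would land in.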

Also check if your virtual column is the column your table is partitioned by.
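
One way to run that check is to query the data dictionary. This sketch assumes the table lives in your own schema and is named MYTABLE (the placeholder name used in the question); USER_PART_KEY_COLUMNS is the standard dictionary view that lists partition key columns.

```sql
-- List the partitioning key column(s) of the table.
-- 'MYTABLE' is an assumed name; dictionary names are stored in upper case.
select name, object_type, column_name, column_position
from   user_part_key_columns
where  name = 'MYTABLE'
order  by column_position;
```

If START_DATE_VALUE does not show up in COLUMN_NAME, predicates on it cannot drive partition pruning.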

From the VLDB and Partitioning Guide:

Virtual column-based partitioned tables benefit from partition pruning for statements that use the virtual column-defining expression in the SQL statement.

So I think the select statement

select * from mytable where start_date > todate('02-01-2012', 'MM-DD-YYYY') and start_value > 120.23452

should be something like

select * from mytable where (
 (extract(YEAR FROM START_DATE) * 10000 + 
 extract(MONTH FROM START_DATE)*100 + 
 extract(DAY FROM START_DATE))*power(10,3) + 
 (START_VALUE+180)
) > 20121500.23452

for partition pruning to take place.
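
Whether pruning really happens can be checked with EXPLAIN PLAN. The following is a sketch under assumptions: mytable is the placeholder table from the question, the literal is simply the threshold used above, and the expression is written with the same grouping as the column definition so that the optimizer has a chance to match it to the virtual column.

```sql
-- Explain the query that filters on the column-defining expression.
explain plan for
select *
from   mytable
where  (
         (extract(YEAR FROM START_DATE) * 10000 +
          extract(MONTH FROM START_DATE) * 100 +
          extract(DAY FROM START_DATE)) * power(10,3) +
         (START_VALUE + 180)
       ) > 20121500.23452;

-- Show the plan; look at the Pstart and Pstop columns.
select * from table(dbms_xplan.display);
```

A PARTITION RANGE ALL step covering every partition means no pruning took place; a narrower Pstart/Pstop range means the predicate was recognised.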

The todate function you use in your select statement does not exist in Oracle SQL. The name of the function is to_date.

Question 7

Thank you for pointing out my mistake in the calculation. In this test, I want to know whether Oracle is smart enough to figure out that it needs to use the virtual column to go to a subset of partitions instead of going to every single partition. I have more than 3500 partitions, and scanning all of them really hurts performance. Can we influence the query plan somehow to make it use the virtual column even though we don't specify it?

Question 8

No, I don't think it will be "smart enough". I think it will work if you write the predicate the way I do in my example; that is how the manual defines it.

Question 9

"How can I make oracle to use the virtual column and then pick just the right partition to work on it."

You know there's a relationship between (START_DATE, START_VALUE) and START_DATE_VALUE but alas the optimizer doesn't. All it knows is that you are querying on two columns which don't feature in your partitioning strategy.

You haven't posted your partition key, so I've made a guess at what you're doing in this test case:

create table t23
 (id number not null primary key
 , start_date date not null
 , start_value number not null
 , start_date_value number GENERATED ALWAYS AS
 (
 (extract(YEAR FROM START_DATE) * 10000 +
 extract(MONTH FROM START_DATE)*100 +
 extract(DAY FROM START_DATE))*power(10,3) +
 (START_VALUE+180)
 ) VIRTUAL
 )
 PARTITION BY range (start_date_value)
 (
 PARTITION range_10 values LESS THAN (1900000000) ,
 PARTITION range_20 values LESS THAN (2000000000),
 PARTITION range_30 values LESS THAN (2100000000),
 PARTITION range_40 values LESS THAN (2200000000),
 PARTITION range_50 values LESS THAN (11000000000),
 PARTITION range_60 values LESS THAN (12000000000),
 PARTITION range_70 values LESS THAN (13000000000),
 PARTITION range_80 values LESS THAN (14000000000),
 PARTITION range_90 values LESS THAN (15000000000),
 PARTITION range_100 values LESS THAN (16000000000),
 PARTITION range_110 values LESS THAN (17000000000),
 PARTITION range_120 values LESS THAN (18000000000),
 PARTITION range_130 values LESS THAN (19000000000),
 PARTITION range_140 values LESS THAN (20000000000),
 PARTITION range_150 values LESS THAN (21000000000),
 PARTITION range_160 values LESS THAN (22000000000),
 PARTITION range_mx values LESS THAN (maxvalue)
 )
/

It's quite easy to reproduce the scenario you describe. All the rows for the query are in the partition range_80 but the explain plan shows a search of all partitions:

SQL> explain plan for
 select * from t23
 where start_date between to_date('31-dec-1390') 
 and to_date('01-jan-1393')
 and start_value between 1050 and 4999
/
 2 3 4 5 6 
Explained.
SQL> select * from table(dbms_xplan.display)
 2 /
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 4042841927
--------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    25 |   250   (1)| 00:00:03 |       |       |
|   1 |  PARTITION RANGE ALL|      |     1 |    25 |   250   (1)| 00:00:03 |     1 |    17 |
|*  2 |   TABLE ACCESS FULL | T23  |     1 |    25 |   250   (1)| 00:00:03 |     1 |    17 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 2 - filter("START_VALUE"<=4999 AND "START_DATE">=TO_DATE(' 1390-12-31 00:00:00',
 'syyyy-mm-dd hh24:mi:ss') AND "START_DATE"<=TO_DATE(' 1393-01-01 00:00:00',
 'syyyy-mm-dd hh24:mi:ss') AND "START_VALUE">=1050)
16 rows selected.
SQL>

So you have two options. The first option is to use START_DATE_VALUE in your query. Presumably there's a reason why you're not applying this obvious solution; probably because that column has no business meaning.
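
As a sketch of that first option against the T23 test case: keep the business predicates and add a redundant range on START_DATE_VALUE derived from them. The bounds are computed by hand from the column expression and are an illustration only; explicit format masks are used so the statement does not depend on the session's NLS_DATE_FORMAT.

```sql
explain plan for
select *
from   t23
where  start_date between to_date('31-12-1390', 'DD-MM-YYYY')
                      and to_date('01-01-1393', 'DD-MM-YYYY')
and    start_value between 1050 and 4999
-- Redundant range on the partition key, implied by the predicates above:
--   lower bound: 13901231 * 1000 + (1050 + 180) = 13901232230
--   upper bound: 13930101 * 1000 + (4999 + 180) = 13930106179
and    start_date_value between 13901232230 and 13930106179;

select * from table(dbms_xplan.display);
```

Every row that satisfies the first two predicates also satisfies the third, so the result set is unchanged; the extra predicate only gives the optimizer something on the partition key to prune with.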

The alternative is to change the partitioning strategy. This should be quite simple to achieve. Your START_DATE_VALUE column basically orders rows by START_VALUE within START_DATE. You can get the same effect with sub-partitioning.

So here is table T42. It has range partitions for START_DATE and range subpartitions for START_VALUE (the template clause applies the same subpartitions to each partition):

create table t42
 (id number not null primary key
 , start_date date not null
 , start_value number not null
 )
 PARTITION BY range (start_date)
 SUBPARTITION BY range (start_value)
 SUBPARTITION TEMPLATE(
 SUBPARTITION lowval VALUES LESS THAN (1000) ,
 SUBPARTITION medval VALUES LESS THAN (5000) ,
 SUBPARTITION highval VALUES LESS THAN (MAXVALUE) 
 )
(
 PARTITION range_10 values LESS THAN (to_date('01-JAN-700')),
 PARTITION range_20 values LESS THAN (to_date('01-JAN-800')),
 PARTITION range_30 values LESS THAN (to_date('01-JAN-900')),
 PARTITION range_40 values LESS THAN (to_date('01-JAN-1000')),
 PARTITION range_50 values LESS THAN (to_date('01-JAN-1100')),
 PARTITION range_60 values LESS THAN (to_date('01-JAN-1200')),
 PARTITION range_70 values LESS THAN (to_date('01-JAN-1300')),
 PARTITION range_80 values LESS THAN (to_date('01-JAN-1400')),
 PARTITION range_90 values LESS THAN (to_date('01-JAN-1500')),
 PARTITION range_100 values LESS THAN (to_date('01-JAN-1600')),
 PARTITION range_110 values LESS THAN (to_date('01-JAN-1700')),
 PARTITION range_120 values LESS THAN (to_date('01-JAN-1800')),
 PARTITION range_130 values LESS THAN (to_date('01-JAN-1900')),
 PARTITION range_140 values LESS THAN (to_date('01-JAN-2000')),
 PARTITION range_150 values LESS THAN (to_date('01-JAN-2100')),
 PARTITION range_160 values LESS THAN (to_date('01-JAN-2200')),
 PARTITION range_mx values LESS THAN (maxvalue)
 )
/

Just to prove there's nothing up my sleeve I will populate T42 with the exact same data from T23:

SQL> insert into t42
 select id, start_date, start_value
 from t23
 /
 2 3 4 
232800 rows created.
SQL> commit;
Commit complete.
SQL> EXEC DBMS_STATS.gather_table_stats(USER, 'T42')
PL/SQL procedure successfully completed.
SQL> explain plan for
 select * from t42
 where start_date between to_date('31-dec-1390') 
 and to_date('01-jan-1393')
 and start_value between 1050 and 4999
/
SQL> SQL> 2 3 4 5 6 
Explained.
SQL>

As you can see explain plan now picks a single sub-partition within a single partition. Precision pruning!

SQL> select * from table(dbms_xplan.display)
 2 /
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 2460612001
------------------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     1 |    17 |    27   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE |      |     1 |    17 |    27   (0)| 00:00:01 |     8 |     8 |
|   2 |   PARTITION RANGE SINGLE|      |     1 |    17 |    27   (0)| 00:00:01 |     2 |     2 |
|*  3 |    TABLE ACCESS FULL    | T42  |     1 |    17 |    27   (0)| 00:00:01 |    23 |    23 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 3 - filter("START_VALUE"<=4999 AND "START_DATE">=TO_DATE(' 1390-12-31 00:00:00',
 'syyyy-mm-dd hh24:mi:ss') AND "START_DATE"<=TO_DATE(' 1393-01-01 00:00:00',
 'syyyy-mm-dd hh24:mi:ss') AND "START_VALUE">=1050)
17 rows selected.
SQL>

"The reason I don't want to use subpartition is because of my text index."

I suppose the pertinent question is, do you need subpartitions? I mean, how big is your table (number of rows)? How many rows per day? Could you scrape by with only one partition per START_DATE day and forget about START_VALUE?

Question 10

Thanks APC. The reason I don't want to use subpartition is because of my text index. I don't want to have a domain text index because I don't want to rebuild it when I split my max partition. I was trying to come up with an alternative by using this virtual column.

Accepted answer: miracle173, 2012-04-09 06:48:48Z
