I have a table with ~12M rows of data in it. Here is the table structure:
SYSTEM_ID
BATCH_ID
MEASUREMENT_INDEX,
PARAMETER_ONE,
PARAMETER_TWO
The primary key is made up of the first three columns. I also have an index on SYSTEM_ID ASC
, BATCH_ID ASC
.
If I do the following query:
SELECT PARAMETER_ONE
FROM RESULTS_TABLE
WHERE SYSTEM_ID=1
AND BATCH_ID=100;
The results come back in about 0.002 seconds.
However, when I try to do ANY aggregate function, e.g. SUM
, MAX
, AVG
, STDDEV
, the query time raises to around 400 seconds! I have tried various permutations of this query e.g.:
SELECT AVG(PARAMETER_ONE)
FROM ( SELECT PARAMETER_ONE
FROM RESULTS_TABLE
WHERE SYSTEM_ID=1
AND BATCH_ID=100
);
And:
SELECT DISTINCT
AVG(PARAMETER_ONE) over (partition by system_id, batch_id)
FROM RESULTS_TABLE
WHERE SYSTEM_ID = 123
AND BATCH_ID = 10;
Neither of these seems to make a difference. Has anyone else had this problem??
===========================
MORE DETAILS
The explain plan for the aggregate function query is:
Plan hash value: 2759933517
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 32 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 32 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| RESULTS_DATA | 1 | 32 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | RESULTS_DATA_INDEX1 | 1 | | 1 (0)| 00:00:01 |
Predicate Information (identified by operation id):
3 - access("SYSTEM_ID"='123' AND "BATCH_ID"='10')
The explain plan for just a simple select is:
Plan hash value: 1958859493
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 32 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| RESULTS_DATA | 1 | 32 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | RESULTS_DATA_INDEX1 | 1 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("SYSTEM_ID"='123' AND "BATCH_ID"='10')
What is surprising is that the explain plan for both seems to claim that it takes only 1 second to carry out that query, but it definitely does not, either in SQL Developer or SQL Plus...
==================================================
UPDATE
Here is the autotrace of the query with aggregate function as requested:
SQL> set autotrace traceonly explain statistics
SQL> SELECT avg(tail_intensity) FROM RESULTS_DATA WHERE SYSTEM_ID = '12
3' AND BATCH_ID = '10';
Execution Plan
----------------------------------------------------------
Plan hash value: 2759933517
--------------------------------------------------------------------------------
-----------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 32 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 32 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| RESULTS_DATA | 1 | 32 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | RESULTS_DATA_INDEX1 | 1 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("SYSTEM_ID"='123' AND "BATCH_ID"='10')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
437569 consistent gets
437129 physical reads
0 redo size
433 bytes sent via SQL*Net to client
416 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
1 Answer 1
Thanks for all your input - I seem to have solved it. I recalculated table statistics using:
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'RESULTS
_DATA');
After this, the queries worked fine.
DBMS_XPLAN
package? In particular, we're going to want to see columns for things like cardinality, cost, and estimated time as well as the filter predicates.set autotrace traceonly explain statistics
and run the queries)? Are your stats accurate?437129 physical reads
is 3Gb of disk IO (assuming 8k block size). Stats must be way out.