
When doing a count (aggregate) SQL query, what can speed up the execution time in these 3 database systems? I'm sure many things could speed it up (hardware for one), but I'm just a novice DBA, so I'm sure I'll be getting a few answers here. I migrated about 157 million rows to a SQL Server database, and this query is taking forever. But in my source Netezza database, it takes seconds.

For example:

Netezza 6:

SELECT COUNT(*) FROM DATABASENAME..MYTABLE

Oracle 11g:

SELECT COUNT(*) FROM MYTABLE

SQL Server 2012:

SELECT COUNT(*) FROM DATABASENAME.[dbo].[MYTABLE]
asked Oct 20, 2012 at 21:37
  • Might look at this question: stackoverflow.com/questions/11130448/sql-count-performance Commented Oct 20, 2012 at 22:36
  • @JonSeigel we're doing incremental loads, and we're comparing records between database systems each day to make sure the counts add up. So repeatedly. Commented Oct 22, 2012 at 23:36

4 Answers


Netezza is an appliance that is designed to excel at large table scans, so that's why you're getting such fast results on that system.

For your SQL Server, you can greatly speed up the row count by querying from the sys.dm_db_partition_stats DMV.

SELECT s.name AS [Schema],
       o.name AS [Table],
       SUM(p.row_count) AS [RowCount]
FROM sys.dm_db_partition_stats p
JOIN sys.objects o ON p.object_id = o.object_id
JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE p.index_id < 2
  AND o.object_id = OBJECT_ID('MyTable')
GROUP BY o.name, s.name;

In a high transaction environment, this DMV is not guaranteed to be 100% accurate. But from your question, it sounds like you are just doing row counts to verify each table after your migration, so this query should work for you.
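For a quick one-off check, the built-in `sp_spaceused` procedure reads the same metadata and reports the row count along with space usage (the table name `dbo.MyTable` is a placeholder here, and the same accuracy caveat applies):

```sql
-- Metadata-based row count plus reserved/data/index space
EXEC sp_spaceused N'dbo.MyTable';
```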

Jack Douglas
answered Oct 21, 2012 at 16:27
  • @Phil why? If you loop through the tables and perform an expensive SELECT COUNT(*) from each one - how accurate is the first result once you've reached the last table? Commented Oct 22, 2012 at 20:33
  • For clarity, Phil had said: "Using the data dictionary, which does not provide 100% accurate results, is bad advice. In my opinion the answer should either be edited to remove the suggestion or deleted - remember people google for such answers and will blindly cut and paste..." I agree that the disclaimer is important (and there are allegedly some edge cases where the metadata does not return sensible results), but I disagree that using the metadata views in general is bad advice. Commented Oct 22, 2012 at 23:19

Here's a SQL Server solution that uses COUNT_BIG inside an indexed view. This gets you a transactionally-consistent count without the cost of scanning a large table or index, and without the storage such an index would require:

CREATE TABLE [dbo].[MyTable](id int);
GO
CREATE VIEW [dbo].[MyTableRowCount]
 WITH SCHEMABINDING
AS
 SELECT
 COUNT_BIG(*) AS TableRowCount
 FROM [dbo].[MyTable];
GO
CREATE UNIQUE CLUSTERED INDEX IX_MyTableRowCount
 ON [dbo].[MyTableRowCount](TableRowCount);
GO
SELECT
 TableRowCount
 FROM [dbo].[MyTableRowCount] WITH(NOEXPAND);

This will require a single initial scan (no getting away from this), and add a bit of overhead to incremental table data manipulations. If you're doing big operations with lots of data (as opposed to many small operations), I think the overhead on changes should be negligible.
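As a sketch of how the view behaves (assuming the objects above have been created), any DML on the base table maintains the materialized count within the same transaction:

```sql
INSERT INTO dbo.MyTable (id) VALUES (1), (2), (3);

-- Reads the single materialized row; no base table scan
SELECT TableRowCount
FROM dbo.MyTableRowCount WITH (NOEXPAND);
-- returns 3 if the table was empty before the insert
```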

answered Oct 22, 2012 at 18:08
  • @SQLKiwi: How come reads are blocked pre-2012? SQL Server bug? Commented Oct 23, 2012 at 2:53
  • @JonSeigel - My 0ドル.05: a normal clustered index created offline on a regular table takes an Sch-M lock on the table. On a view that lock isn't really needed, but avoiding it meant altering the CREATE INDEX operation to special-case indexed views - which was done for SQL Server 2012. IMHO, of course. Commented Oct 23, 2012 at 18:00

In Oracle, a B-tree index on a NOT NULL column can be used to answer a COUNT(*). It will be faster than a full table scan in most cases because indexes are usually much smaller than their base table.

However, a regular B-tree index will still be huge at 157 million rows. If your table is not updated concurrently (i.e. it is only modified by batch load processes), you might want to use a bitmap index instead.

The smallest bitmap index would be something like this:

CREATE BITMAP INDEX ix ON your_table(NULL);

Null entries are taken into account by a bitmap index, so the resulting index will be tiny (20-30 8 KB blocks per million rows) compared to either a regular B-tree index or the base table.

The resulting plan should show the following operations:

----------------------------------------------
| Id | Operation | Name | 
----------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT AGGREGATE | |
| 2 | BITMAP CONVERSION COUNT | |
| 3 | BITMAP INDEX FAST FULL SCAN| IX |
----------------------------------------------

If your table is updated concurrently, a single-value bitmap index like this will be a point of contention and shouldn't be used.
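To confirm that the optimizer actually chooses the bitmap index (the table name here is a placeholder), you can inspect the plan with DBMS_XPLAN:

```sql
EXPLAIN PLAN FOR SELECT COUNT(*) FROM your_table;

-- Display the plan just captured in the plan table
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```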

answered Oct 22, 2012 at 9:57

In Oracle, a simple count query is often executed by scanning an index instead of the whole table. The index must either be a bitmap index or be defined on a column with a NOT NULL constraint. For more complex queries that require a full table scan, you can use parallel query.

To enable parallel query (Enterprise Edition required), you can use an optimizer hint:

select /*+ PARALLEL(mytable, 12) */ count(*) from mytable;

Or enable parallel query for all queries on the table:

alter table mytable parallel 12;
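Note that the table-level setting affects every query against the table, so (as a sketch, reusing the same table name) you may want to revert it once the batch window is over and verify the current setting:

```sql
-- Revert to serial execution
ALTER TABLE mytable NOPARALLEL;

-- Check the table's current degree of parallelism
SELECT degree FROM user_tables WHERE table_name = 'MYTABLE';
```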
answered Oct 22, 2012 at 9:07
