
When doing a count (aggregate) SQL query, what can speed up the execution time in these 3 database systems? I'm sure many things could speed it up (hardware for one), but I'm just a novice DBA, so I'm sure I'll be getting a few answers here. I migrated about 157 million rows to a SQL Server database, and this query is taking forever. But in my source Netezza database, it takes seconds.

For example:

Netezza 6:

SELECT COUNT(*) FROM DATABASENAME..MYTABLE

Oracle 11g:

SELECT COUNT(*) FROM MYTABLE

SQL Server 2012:

SELECT COUNT(*) FROM DATABASENAME.[dbo].[MYTABLE]
asked Oct 20, 2012 at 21:37
  • Might look at this question: stackoverflow.com/questions/11130448/sql-count-performance Commented Oct 20, 2012 at 22:36
  • @JonSeigel we're doing incremental loads, and we're comparing records between database systems each day to make sure the counts add up. So repeatedly. Commented Oct 22, 2012 at 23:36

4 Answers


Netezza is an appliance that is designed to excel at large table scans, so that's why you're getting such fast results on that system.

For your SQL Server, you can greatly speed up the row count by querying from the sys.dm_db_partition_stats DMV.

SELECT s.name AS [Schema],
       o.name AS [Table],
       SUM(p.row_count) AS [RowCount]
FROM sys.dm_db_partition_stats p
JOIN sys.objects o ON p.object_id = o.object_id
JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE p.index_id < 2
  AND o.object_id = OBJECT_ID('MyTable')
GROUP BY o.name, s.name;

In a high transaction environment, this DMV is not guaranteed to be 100% accurate. But from your question, it sounds like you are just doing row counts to verify each table after your migration, so this query should work for you.
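For a quick one-off check, the built-in `sp_spaceused` procedure reads the same metadata and reports the row count along with space usage (the table name `dbo.MyTable` is a placeholder here, and the same accuracy caveat applies):

```sql
-- Metadata-based row count plus reserved/data/index space
EXEC sp_spaceused N'dbo.MyTable';
```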

Jack Douglas
answered Oct 21, 2012 at 16:27
  • @Phil why? If you loop through the tables and perform an expensive SELECT COUNT(*) from each one - how accurate is the first result once you've reached the last table? Commented Oct 22, 2012 at 20:33
  • For clarity, Phil had said: "Using the data dictionary, which does not provide 100% accurate results, is bad advice. In my opinion the answer should either be edited to remove the suggestion or deleted - remember people google for such answers and will blindly cut and paste..." I agree that the disclaimer is important (and there are allegedly some edge cases where the metadata does not return sensible results), but I disagree that using the metadata views in general is bad advice. Commented Oct 22, 2012 at 23:19

Here's a SQL Server solution that uses COUNT_BIG inside an indexed view. This gets you a transactionally-consistent count without the cost of scanning a large table or index, and without the storage such an index would require:

CREATE TABLE [dbo].[MyTable](id int);
GO
CREATE VIEW [dbo].[MyTableRowCount]
 WITH SCHEMABINDING
AS
 SELECT
 COUNT_BIG(*) AS TableRowCount
 FROM [dbo].[MyTable];
GO
CREATE UNIQUE CLUSTERED INDEX IX_MyTableRowCount
 ON [dbo].[MyTableRowCount](TableRowCount);
GO
SELECT
 TableRowCount
 FROM [dbo].[MyTableRowCount] WITH(NOEXPAND);

This will require a single initial scan (no getting away from this), and add a bit of overhead to incremental table data manipulations. If you're doing big operations with lots of data (as opposed to many small operations), I think the overhead on changes should be negligible.
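As a sketch of how the view behaves (assuming the objects above have been created), any DML on the base table maintains the materialized count within the same transaction:

```sql
INSERT INTO dbo.MyTable (id) VALUES (1), (2), (3);

-- Reads the single materialized row; no base table scan
SELECT TableRowCount
FROM dbo.MyTableRowCount WITH (NOEXPAND);
-- returns 3 if the table was empty before the insert
```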

answered Oct 22, 2012 at 18:08
  • @SQLKiwi: How come reads are blocked pre-2012? SQL Server bug? Commented Oct 23, 2012 at 2:53
  • @JonSeigel - My 0ドル.05: a normal clustered index created offline on a regular table takes an Sch-M lock on the table. On a view that lock isn't really needed, but avoiding it meant altering the CREATE INDEX operation to special-case indexed views - which was done for SQL Server 2012. IMHO, of course. Commented Oct 23, 2012 at 18:00

In Oracle, a B-tree index on a NOT NULL column can be used to answer a COUNT(*). It will be faster than a full table scan in most cases because indexes are usually much smaller than their base table.

However, a regular B-tree index will still be huge at 157 million rows. If your table is not updated concurrently (i.e. it is only modified by batch load processes), you might want to use a bitmap index instead.

The smallest bitmap index would be something like this:

CREATE BITMAP INDEX ix ON your_table(NULL);

Null entries are taken into account by a bitmap index, so the resulting index will be tiny (20-30 8 KB blocks per million rows) compared to either a regular B-tree index or the base table.

The resulting plan should show the following operations:

----------------------------------------------
| Id | Operation | Name | 
----------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT AGGREGATE | |
| 2 | BITMAP CONVERSION COUNT | |
| 3 | BITMAP INDEX FAST FULL SCAN| IX |
----------------------------------------------

If your table is updated concurrently, a single-value bitmap index like this will be a point of contention and shouldn't be used.
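To confirm that the optimizer actually chooses the bitmap index (the table name here is a placeholder), you can inspect the plan with DBMS_XPLAN:

```sql
EXPLAIN PLAN FOR SELECT COUNT(*) FROM your_table;

-- Display the plan just captured in the plan table
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```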

answered Oct 22, 2012 at 9:57

In Oracle, a simple count query is often executed by scanning an index instead of the whole table. The index must either be a bitmap index or be defined on a column with a NOT NULL constraint. For more complex queries that require a full table scan, you can use parallel query.

To enable parallel query (Enterprise Edition required), you can use an optimizer hint:

select /*+ PARALLEL(mytable, 12) */ count(*) from mytable;

Or enable parallel query for all queries on the table:

alter table mytable parallel 12;
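Note that the table-level setting affects every query against the table, so (as a sketch, reusing the same table name) you may want to revert it once the batch window is over and verify the current setting:

```sql
-- Revert to serial execution
ALTER TABLE mytable NOPARALLEL;

-- Check the table's current degree of parallelism
SELECT degree FROM user_tables WHERE table_name = 'MYTABLE';
```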
answered Oct 22, 2012 at 9:07
