I have 3 indexed fields in a query: int, int, and varchar(250).
The query performs well when all 3 conditions are specified with real values. The int columns always have values, but there are plenty of empty-string varchar values. Queries with an empty-string varchar parameter run 2-3x slower than those that search for a real string (e.g. 'hello'). The condition on the varchar column is a plain equality in the WHERE clause (i.e. no LIKE, just =).
I've searched around a bit, but I only seem to find academic discussions of NULL versus empty string and, frankly, I don't really care about how they can mean different things. I only care about the performance of queries against a NULL or empty-string varchar column.
Is the empty string the cause of the slowness? Would a NULL in its place improve things? I can easily turn existing empty strings into NULLs and add some logic to make sure empty strings are always stored as NULLs. I just figured I'd ask here to get an expert opinion on this.
I'll be toying around with this anyway but it'd be nice to get an outside view telling me if I'm just spinning my wheels on it, if that's the case.
1 Answer
Using your info about the distribution of values (and some guessing), I would say that your problem is that MSSQL is doing a table scan when it receives a WHERE clause for the empty string.
Why? Supposing that your index on the varchar column does not cover the query (in other words, it does not include all the columns referenced in the query), MSSQL has to do a seek to get the PK columns (if the table is clustered) or the record ID (if not), and then use that in a bookmark lookup to get all the columns it needs.
Where does the table (or clustered index) scan come from? Simple: based on index statistics, if the number of pages touched to get that data is more than 25%-33% of the total pages of the table, MSSQL considers it too expensive to use the index and goes the table (or clustered index) scan way.
This is discussed in more detail in this dba.SE question.
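To check whether the empty string is common enough to push the optimizer toward a scan, you could measure its share of the rows. This is only a rough sketch: the table name dbo.MyTable and the column name VarcharCol are hypothetical placeholders, since the question does not give the real schema.

```sql
-- Hypothetical names: dbo.MyTable and VarcharCol stand in for the real schema.
-- Counts how many rows hold the empty string and what share of the table that is.
SELECT
    COUNT(*) AS total_rows,
    SUM(CASE WHEN VarcharCol = '' THEN 1 ELSE 0 END) AS empty_rows,
    100.0 * SUM(CASE WHEN VarcharCol = '' THEN 1 ELSE 0 END)
          / NULLIF(COUNT(*), 0) AS empty_pct
FROM dbo.MyTable;
```

If empty_pct turns out to be large, that would be consistent with the scan explanation above. Comparing the actual execution plans for the '' and 'hello' cases would confirm whether one uses a seek and the other a scan.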
To solve the problem, you can:
- Create a covering index for your query (or modify an existing index to be one)
- Reduce, if possible, the columns in the query so that it becomes covered by an existing index
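As an illustration of the first option, a covering index could look something like the sketch below. All names here (dbo.MyTable, IntCol1, IntCol2, VarcharCol, OtherCol1, OtherCol2) are assumptions made up for the example; the INCLUDE list should hold whatever extra columns your query actually selects.

```sql
-- Hypothetical schema: replace the table and column names with your own.
-- The three filtered columns form the index key; the remaining columns the
-- query returns go in INCLUDE so no bookmark lookup is needed.
CREATE NONCLUSTERED INDEX IX_MyTable_Int1_Int2_Varchar
ON dbo.MyTable (IntCol1, IntCol2, VarcharCol)
INCLUDE (OtherCol1, OtherCol2);
```

With every referenced column available in the index, MSSQL can answer the query from the index alone, so the number of matching rows no longer forces the choice between lookups and a full scan.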
empty string is, for example, a third of the total values on the column? More? Less? We need more information...