5

Well, I have read many articles, tried different techniques, and done some fairly random things to get this working.

My problem is that I have some big tables (over one million rows) and some small ones (a few hundred rows), with 8 inner joins involved.

What reduced the execution time from over 2 minutes to 30 seconds seemed very strange to me, and I could not find out why it happens.

When I select columns from the tables, I cast them. What I did was cast each column to the smallest possible type.

For example:

  • nvarchar(4000) to nvarchar(50) or nvarchar(25)
  • bigint to int
  • int to tinyint
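
A minimal sketch of the pattern described above. The tables dbo.Orders and dbo.Customers and all column names are hypothetical; none of them come from the original view:

```sql
-- Hypothetical schema: dbo.Orders (large) and dbo.Customers (small).
-- Each selected column is cast down to the smallest type that still fits its data.
SELECT CAST(o.OrderNote  AS nvarchar(50)) AS OrderNote,   -- declared as nvarchar(4000)
       CAST(o.OrderId    AS int)          AS OrderId,     -- declared as bigint
       CAST(c.RegionCode AS tinyint)      AS RegionCode   -- declared as int
FROM dbo.Orders AS o
     INNER JOIN dbo.Customers AS c
       ON c.CustomerId = o.CustomerId;
```

A narrowing cast is only safe when the data is known to fit: a numeric value that is too large raises an overflow error, and a string that is too long is truncated.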

The result was an execution time more than 1 minute and 30 seconds better. Why does this happen?

For example, if my value is a string of length 10, both nvarchar(4000) and nvarchar(50) will store it as roughly nvarchar(10). So why do things go better when I reduce the target type of the cast?
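
The observation above can be checked with a small sketch. The variable name and sample value are made up for illustration; DATALENGTH and SQL_VARIANT_PROPERTY are standard T-SQL functions:

```sql
-- A 10-character value cast to a wide and to a narrow nvarchar type.
DECLARE @s nvarchar(10) = N'abcdefghij';

SELECT DATALENGTH(CAST(@s AS nvarchar(4000))) AS stored_bytes_wide,
       DATALENGTH(CAST(@s AS nvarchar(50)))   AS stored_bytes_narrow,
       -- Declared maximum length in bytes of each result type.
       SQL_VARIANT_PROPERTY(CAST(@s AS nvarchar(4000)), 'MaxLength') AS declared_bytes_wide,
       SQL_VARIANT_PROPERTY(CAST(@s AS nvarchar(50)),   'MaxLength') AS declared_bytes_narrow;
```

The stored byte count should be the same in both cases, while the declared maximum differs. That suggests the difference comes from how the engine plans for the declared type rather than from the stored data itself.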

One more thing: I ran a lot of tests to check which function is better, CAST or CONVERT (string to string, string to number, number to string), but could not determine which one performs better. Sometimes CONVERT gave me an execution time a few seconds better, but not enough to draw a conclusion. Has anyone tried something like this and reached a result?
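
The kind of comparison described here can be sketched as follows. The variable names and values are illustrative only, and this is not the actual test that was run:

```sql
-- The same conversions written with CAST and with CONVERT.
DECLARE @n int = 12345,
        @t nvarchar(20) = N'678';

SELECT CAST(@n AS nvarchar(10))  AS cast_num_to_str,
       CONVERT(nvarchar(10), @n) AS convert_num_to_str,
       CAST(@t AS int)           AS cast_str_to_num,
       CONVERT(int, @t)          AS convert_str_to_num;
```

For plain conversions like these, the two forms express the same operation. CONVERT additionally accepts an optional style argument, which matters mainly for date and number formatting.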

Thanks in advance for your time.

asked Jan 26, 2012 at 13:36
2
  • 1
    Do you have explicit foreign keys between tables? I assume the CASTs are on these columns... Commented Jan 26, 2012 at 13:43
  • @Martin Smith You are right. I compared the execution plans and they were the same. Maybe the difference I got (a few seconds) is the result of traffic to the server. Anyway, I have read a lot of new things about how to optimize my view. The one solution is to make it an indexed view. I was amazed by the big difference in performance I got in my test. Perfect. Unfortunately, in my real case (the one I wrote about in this thread) I cannot create an indexed view because of its limitations... So I am waiting for this year's SQL Server release and hope it will make this easier. Commented Jan 28, 2012 at 9:42

4 Answers

10

The reason the performance is better is that the smaller data types have a much smaller working set - see the execution plan (http://www.red-gate.com/our-company/about/book-store/assets/sql-server-execution-plans.pdf)

I expect you get the largest benefit from the nvarchar(4000) to nvarchar(50) change: that's an 80x reduction, and nvarchar(4000) can use up to 8 KB of space. For a key, that is not a good idea.
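
A rough back-of-the-envelope sketch of why the declared width can matter. It rests on two assumptions that are not in the answer: the commonly described heuristic that SQL Server estimates a variable-length column at about half its declared byte size when sizing memory for sorts and hashes, and a hypothetical one million rows flowing through such an operator:

```sql
-- Assumption: estimated width of a variable-length column ~ half its declared byte size.
-- nvarchar(4000) = 8000 bytes declared -> ~4000 bytes estimated per row
-- nvarchar(50)   =  100 bytes declared -> ~50 bytes estimated per row
SELECT CAST(1000000 AS bigint) * 4000 / 1048576 AS approx_mb_wide,
       CAST(1000000 AS bigint) * 50   / 1048576 AS approx_mb_narrow;
```

Under these assumptions the wide column is planned for on the order of gigabytes and the narrow one for tens of megabytes, for that single column alone.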

In addition, the fact that there are no foreign keys probably means you don't have a very good indexing strategy either. If you did have indexes (even on ridiculously large columns), you would probably find they outperform the cast, since the query probably wouldn't spool as much.

In general, you don't want any operations on your keys in the join if at all possible, especially for large data sets.
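
To illustrate the point about operations on join keys, here is a sketch with hypothetical tables dbo.A and dbo.B joined on a column named KeyCol (all names are assumptions):

```sql
-- Wrapping the key in a function can prevent an index seek on a.KeyCol:
SELECT a.KeyCol, b.Payload
FROM dbo.A AS a
     INNER JOIN dbo.B AS b
       ON CAST(a.KeyCol AS nvarchar(50)) = b.KeyCol;

-- Comparing the bare columns leaves the optimizer free to use indexes on both sides:
SELECT a.KeyCol, b.Payload
FROM dbo.A AS a
     INNER JOIN dbo.B AS b
       ON a.KeyCol = b.KeyCol;
```

The second form is only a drop-in replacement when both columns already have matching types, which is a reason to fix the column definitions rather than cast at query time.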

answered Jan 26, 2012 at 13:59
0
1

To find the reason for the time difference, compare the two query plans (with and without the casts, or with CAST versus CONVERT) and their relative costs; you will be able to identify in which phase the cost changes.
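
One way to make that comparison, sketched with hypothetical table and column names. SET STATISTICS IO and SET STATISTICS TIME are standard session options, and the actual execution plan can be switched on separately in SQL Server Management Studio:

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: without casts
SELECT o.OrderNote
FROM dbo.Orders AS o
     INNER JOIN dbo.Customers AS c
       ON c.CustomerId = o.CustomerId;

-- Variant 2: with casts
SELECT CAST(o.OrderNote AS nvarchar(50)) AS OrderNote
FROM dbo.Orders AS o
     INNER JOIN dbo.Customers AS c
       ON c.CustomerId = o.CustomerId;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The Messages tab then shows logical reads and CPU/elapsed time per statement, which can be set against the operator costs in the two plans.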

answered Jan 26, 2012 at 13:56
1

If you need to join an nvarchar(4000) to an nvarchar(10) and millions of rows are involved, I'd use a persisted computed column where you do a LEFT(long_column, 10). Depending on the query, you can even index that computed column. I'll bet your join will perform much better.
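
A minimal sketch of that suggestion. Only long_column comes from the answer; dbo.BigTable, dbo.SmallTable, short_key, short_code and SomeValue are hypothetical names:

```sql
-- Add a persisted computed column holding the first 10 characters.
ALTER TABLE dbo.BigTable
    ADD short_key AS LEFT(long_column, 10) PERSISTED;
GO  -- batch separator (SSMS/sqlcmd), so later statements see the new column

-- Index the computed column so the join can seek on it.
CREATE NONCLUSTERED INDEX IX_BigTable_short_key
    ON dbo.BigTable (short_key);
GO

-- Join on the narrow column instead of the wide one.
SELECT b.short_key, s.SomeValue
FROM dbo.BigTable AS b
     INNER JOIN dbo.SmallTable AS s
       ON s.short_code = b.short_key;
```

This only gives the same results as joining on the full column if the first 10 characters are enough to identify a match.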

answered Jan 26, 2012 at 14:02
2
  • 1
    True, but it's good to remember that a persisted computed column will basically make a copy of the original column on disk. Commented Jan 26, 2012 at 14:09
  • @Diego, other than some sort of redesign, not much else you can do Commented Jan 26, 2012 at 14:14
0

You can also keep the point below in mind for inner joins.

Compare these two queries:

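-- Query 1: filter T1 in a derived table, then join.
-- The derived table must expose ID so the join condition can use it.
SELECT T1.ColumnName1,
       T2.ColumnName2
FROM (SELECT ID, ColumnName1
      FROM T1
      WHERE ID = 10) T1
     INNER JOIN T2
       ON T2.ID = T1.ID;

-- Query 2: join first, then filter in the WHERE clause.
-- ID is qualified because both tables have a column with that name.
SELECT T1.ColumnName1,
       T2.ColumnName2
FROM T1
     INNER JOIN T2
       ON T2.ID = T1.ID
WHERE T1.ID = 10;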

You can also add a nonclustered index to avoid scanning the whole table.
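
For instance, assuming T1.ID is not already covered by an index (the index name and the INCLUDE column are illustrative choices, not part of the answer):

```sql
-- Nonclustered index on the filter/join column; ColumnName1 is included
-- so the example queries above can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_T1_ID
    ON T1 (ID)
    INCLUDE (ColumnName1);
```

If ID is already the clustered primary key of T1, this extra index is unlikely to help.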

But this is normally useful when the query returns one record, as in my subquery example :)

Martin Smith
answered Jan 26, 2012 at 14:05
2
  • I read about this when I was trying to find a way to optimize my joins. Some said there is no difference; others said that moving the WHERE after the join will definitely help. In my case, I moved all WHERE clauses after the last join and the result was the same. Commented Jan 26, 2012 at 14:30
  • 2
    This rewrite makes no difference other than adding a layer of obfuscation. The QO will easily transform one to the other. The point about NCIs seems somewhat generic - how are you relating this to the question? Commented Jan 26, 2012 at 15:24
