Performance issue in SQL Server

Question 1

I have a master product table like the one below:

CREATE TABLE dbo.[products](
 [id] INT NOT NULL IDENTITY(1, 1),
 [product_code] VARCHAR(100) NOT NULL UNIQUE,
 [price] FLOAT NOT NULL,
 [brand] VARCHAR(100) NOT NULL,
 [colour] VARCHAR(100) NOT NULL
);

So, if I create this table, as far as I understand, a clustered index will be created on the id column and a nonclustered index on the product_code column.

I am using this table on a website to show the products, with a SQL query like the one below.

Query 1

SELECT * FROM dbo.[products]
WHERE [brand] IN ('brand1', 'brand2', 'brand3', 'brand4', 'brand5')
AND [colour] IN ('colour1', 'colour2', 'colour3')
ORDER BY [product_code];

There is also an option to search for a batch of product codes, like below.

Query 2

SELECT * FROM dbo.[products]
WHERE [product_code] IN ('product_code1', 'product_code2', 'product_code3');

Query 1 may contain more conditions, and Query 2 may contain more product codes.

Did I create the indexes properly?

Or is there a better way to improve performance?

Question 2

I hope you are doing a more selective select than select * from .... Why? Because of performance: why transport data that you don't require? See "Why is "Select * from table" considered bad practice", "Why is SELECT * considered harmful?" and "What is the reason not to use select *?".
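As a rough illustration of that point, Query 1 could name only the columns the page actually displays. Which columns the website needs is an assumption here, not something stated in the question:

```sql
-- Sketch only: the column list is assumed; list whatever the page really shows.
SELECT [product_code], [brand], [colour], [price]
FROM dbo.[products]
WHERE [brand] IN ('brand1', 'brand2', 'brand3')
  AND [colour] IN ('colour1', 'colour2', 'colour3')
ORDER BY [product_code];
```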

Question 3

@hot2use: Yes, I will not be using select *.

Question 4

Short answer: No, you need more indexes.

If you're going to run a significant number of queries involving [brand] and [colour], you should index both columns:

CREATE INDEX products_colour_idx ON dbo.products (colour); 
CREATE INDEX products_brand_idx ON dbo.products (brand);

If you mostly query using both columns, and there are no queries involving colour that do not also involve a brand, a multi-column index would be better, because the database can then retrieve all the relevant rows using just one index. You would create it this way:

CREATE INDEX products_brand_colours_idx ON dbo.products (brand, colour);

The order of the columns [(brand, colour) vs. (colour, brand)] should normally be chosen so that the most selective column comes first (basically: the one with more distinct values goes first).
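One way to compare how many distinct values each column holds is simply to count them. This is only a sketch: it scans the whole table, so on a large table it is best run off-peak, and the column aliases are illustrative names.

```sql
-- Compare the number of distinct values per candidate index column.
SELECT COUNT(DISTINCT [brand])  AS distinct_brands,
       COUNT(DISTINCT [colour]) AS distinct_colours,
       COUNT(*)                 AS total_rows
FROM dbo.[products];
```

Under the rule of thumb above, the column with the higher distinct count would be listed first in the index.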

If you have a significant number of queries on brand alone, on colour alone, and on both brand and colour, you should have at least the first two indexes. If you need the fastest possible speed, have one index on (brand, colour) and another on (colour). You don't need a dedicated index on (brand); the (brand, colour) index also serves lookups by brand.

[product_code] will be indexed by the database automatically to enforce the UNIQUE constraint on it. See "Create Unique Constraints" and "Unique Constraints and Unique Indexes". You don't need to create an index explicitly, even if Query 2 is frequent.
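To confirm which indexes actually exist on the table, the sys.indexes catalog view can be queried. A minimal sketch, assuming the table lives in the dbo schema as in the question:

```sql
-- List every index (or heap entry) on dbo.products.
SELECT i.name,
       i.type_desc,            -- HEAP, CLUSTERED or NONCLUSTERED
       i.is_unique,
       i.is_primary_key,
       i.is_unique_constraint
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.products');
```

Note that IDENTITY on its own does not create an index. For the table as posted, which declares no PRIMARY KEY, this should show a HEAP row alongside the unique nonclustered index on product_code; a clustered index appears only once a clustered primary key or an explicit clustered index is added.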

Question 6

For the multicolumn index, see joanolo's post. I'd like to draw attention to your clustered index, however.

A clustered index is generally the most efficient index a table can have, because its leaf level is the table data itself. Since your product code is already unique, you might want to consider making it the clustered index instead. But there are considerations.

There are two main reasons why it's usually advised to create the clustered index on an identity column. a) A clustered index controls the order of the data on disk, so you want it on a column with an ever-increasing value to avoid fragmentation. b) The identity column is usually also the primary key, which means other tables may hold foreign key references to it. If it is clustered, queries that join on those references perform much faster, because the clustered index is not only the fastest to seek but also contains all the other columns of the same row.

The full picture is difficult to explain briefly here, so I suggest you read up on it more. But basically the point is this:

If you know that you will NOT get many new rows, OR you know you can handle the fragmentation (for example with an appropriate index fill factor or reasonably regular index maintenance), OR you know that, from a business logic point of view, product codes will be added in ever-increasing numerical or alphabetical order, so there will be no fragmentation in the first place... AND you know that there aren't many references to the identity column, or that the queries using those references can handle it... then it might be optimal to create the clustered index on product_code instead and, if required, place a nonclustered index (primary key or otherwise) on the identity column.
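To make that alternative concrete, here is a sketch of what such a table could look like. The table name products_alt, the constraint names and the fill factor of 90 are all assumptions chosen for illustration, not recommendations:

```sql
-- Alternative layout: clustered on product_code, nonclustered primary key on id.
CREATE TABLE dbo.products_alt (
    [id]           INT          NOT NULL IDENTITY(1, 1),
    [product_code] VARCHAR(100) NOT NULL,
    [price]        FLOAT        NOT NULL,
    [brand]        VARCHAR(100) NOT NULL,
    [colour]       VARCHAR(100) NOT NULL,
    CONSTRAINT pk_products_alt PRIMARY KEY NONCLUSTERED ([id]),
    -- Leave some free space per page to absorb out-of-order inserts.
    CONSTRAINT uq_products_alt_code UNIQUE CLUSTERED ([product_code])
        WITH (FILLFACTOR = 90)
);

-- Periodic maintenance: check fragmentation of the table's indexes ...
SELECT index_id, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.products_alt'), NULL, NULL, 'LIMITED');

-- ... and rebuild the clustered index when it has grown too fragmented.
ALTER INDEX uq_products_alt_code ON dbo.products_alt
    REBUILD WITH (FILLFACTOR = 90);
```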

Note: do NOT do any of this unless you know what you're doing. But from an optimization point of view, I figured it would be good to mention the options. Your current setup, with joanolo's suggestions, matches the default recommendations. Anything more, including everything I've said in this post, assumes you have a far more detailed understanding of what's going on and of how the data is structured and used.

Question 7

Thanks a lot for your answer. I will be moving forward with a primary key on the identity column, a unique key on the product_code column, and a multicolumn index on the other columns that are frequently used.
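That plan could be applied to the existing table roughly as follows. The UNIQUE constraint on product_code is already part of the original CREATE TABLE; the constraint name pk_products is an assumed name, and the index name is the one from joanolo's answer:

```sql
-- Make id the clustered primary key (the posted DDL does not declare one).
ALTER TABLE dbo.products
    ADD CONSTRAINT pk_products PRIMARY KEY CLUSTERED ([id]);

-- Multicolumn index for the brand/colour filters used by Query 1.
CREATE INDEX products_brand_colours_idx
    ON dbo.products ([brand], [colour]);
```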

joanolo · Accepted Answer · 2017-01-18 08:56:42Z

