Wednesday, May 26, 2010

When to use the INCLUDE option in the CREATE INDEX statement.

In SQL 2005\2008\2008 R2 you can create non clustered indexes with the INCLUDE option. Sometimes I got questions why you should use the INCLUDE option. In this blog I will explain why and when you should use it. An index will store all data in the order of the content of the indexed columns. Assume we have created next index:

CREATE INDEX MyIndexName ON MyTable (ColumnA, ColumnB, ColumnC)

All data will be stored first on ColumnA and then on ColumnB and then on ColumnC. Let's take next records:
ColumnA, ColumnB, ColumnC
Record 1: XX,YY,1
Record 2: XX,YY,6
Record 3: XX,YY,10
Record 4: XX,YY,15
Record 5: XX,YY,50

Assume the value 50 of columnC of record 5 will change to 2. With the MyIndexName on MyTable the records in the index will be re-ordered.

ColumnA, ColumnB, ColumnC

Record 1: XX,YY,1
Record 5: XX,YY,2
Record 2: XX,YY,6
Record 3: XX,YY,10
Record 4: XX,YY,15

This will cost disk I/O to change the index order and can result in index defragmentation.

With the INCLUDE option the index records will not be re-ordered for changes in the values of INCLUDE columns.
CREATE INDEX MyIndexName2 ON MyTable (ColumnA, ColumnB) INCLUDE (ColumnC)

So a change in the value of ColumnC will not result in a index re order. Only the value of Column C is updated.

ColumnA, ColumnB, ColumnC

Record 1: XX,YY,1
Record 2: XX,YY,6
Record 3: XX,YY,10
Record 4: XX,YY,15
Record 5: XX,YY,2

When to use the INCLUDE option? Answer to this question is really simple. In case you want to create a covering index:
Anything in the WHERE clause would be a key, anything in the SELECT would be an included column.

Example next query:
SELECT ColumnC FROM MyTable WHERE ColumA = 'XX' AND ColumnB = 'YY'

For this query MyIndexName2 is much more efficient.

No comments: