In this article I will make attempt to outline differences between clustered and non-clustered indexes in SQL Server. Be fully aware that it is not going to be comprehensive explanation, but rather brief overview highlighting purpose of both types of indexes.
Indexes are one of the most important part of database optimization, especially for large and growing database tables. They work in very similar way to index in a book – just imagine massive address book, with thousands pages but no index. Searching this book for specific address is nearly impossible without index.
When query is ran against SQL Server database, Query Optimizer analyses all available indexes for tables involved, and creates most efficient execution plan to run statement. SQL Server reads data in 8KB pages, so if table is small enough to fit on one or two pages, there is no need for an index at all, as SQL Server will only have to make one or two read operations. This operation is called Full Table Scan and for small table it is the most efficient way of pulling the data out. On the other hand, if table has 1,000,000 records and size of row allows reading only 50 rows per page, number of read operations rises to 20,000. That clearly will take a lot of time and this is where indexes come into play. Indexes can massively improve SQL query, however is worth mentioning, that it does not come for free – indexes can take a lot of storage. When index is involved in selecting data from table, this operation is called Index Seek. In general we can distinguish two types of indexes: clustered and non-clustered.
This index assumes that data in a table is physically sorted in specific order. This means that when new row is inserted, it physically goes to the place determined by index, allowing data after new row to be moved to next page if required. The table can only have one clustered index. This index is specifically useful if we use BETWEEN statement in WHERE clause of our statement (i.e. SELECT * FROM Table WHERE Col1 BETWEEN ‘A’ AND ‘Z’).
When this type of index is used logical order of index does not match physical order of data stored on the disk. Index plays a role of “pointer to row”, allowing faster access to the row that is requested.
A non-clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a non-clustered index does not consist of the data pages. Instead, the leaf nodes contain index rows. This type of index works best when used having = in WHERE clause (i.e. SELECT * FROM Table WHERE Col1 = ‘A’) or if the column is used in JOIN.
- http://msdn.microsoft.com/en-us/library/ms188783.aspx – MSDN on indexes
- http://en.wikipedia.org/wiki/Index_(database) – Wikipedia on indexes
- http://www.sql-server-performance.com/articles/per/index_data_structures_p1.aspx – good article explaining difference between clustered and non-clustered indexes much deeper