Difference between Clustered and Non-Clustered Indexes in SQL Server?
Indexes are used to speed-up query process in SQL Server, resulting in high performance. They are similar to textbook indexes. In textbooks, if you need to go to a particular chapter, you go to the index, find the page number of the chapter and go directly to that page. Without indexes, the process of finding your desired chapter would have been very slow.
The same applies to indexes in databases. Without indexes, a DBMS has to go through all the records in the table in order to retrieve the desired results. This process is called table-scanning and is extremely slow. On the other hand, if you create indexes, the database goes to that index first and then retrieves the corresponding table records directly.
There are two types of Indexes in SQL Server:
- Clustered Index
- Non-Clustered Index
What are clustered indexes in the SQL Server?
Every table in the database has a natural order. If the order is random and it is not defined explicitly then the table is known as the heap here. Except for the few cases, heaps are not considered as a good programming practice because the SQL server has no meta-knowledge about how data is stored within a heap. If there is not a random heap, it means table data is stored in some order and the physical storage of data is defined by the clustered indexes in SQL. Every table can have only one clustered index because data can be stored in a single order only. It signifies that data in a table stored physically on the disk cannot have more than one order.
What are nonclustered indexes in the SQL Server?
A nonclustered index contains the ordered data for the columns specified in that index with pointers that will guide where to find the rest of the data from that row. The major difference between clustered index and nonclustered index is that nonclustered indexes are a two-step process where you should first the value of interest then go and look up the rest of the data of that row where it actually exists on the disk. The non-clustered index is an index structure separate from the data stored in the table that renders one or more selected columns. The non-clustered index is created to improve the performance of frequently used queries that are not managed well by clustered indexes. It is like a notebook where the index page is put separately at the beginning of the book.
Characteristic of Clustered Index
- Default and sorted data storage
- Use just one or more than one columns for an index
- Helps you to store Data and index together
- Fragmentation
- Operations
- Clustered index scan and index seek
- Key Lookup
Characteristics of Non-clustered Indexes
- Store key values only
- Pointers to Heap/Clustered Index rows
- Allows Secondary data access
- Bridge to the data
- Operations of Index Scan and Index Seek
- You can create a nonclustered index for a table or view
- Every index row in the nonclustered index stores the nonclustered key value and a row locator
Example of Clustered Indexes
Take the example of the “Student_info” table to understand the practical aspects of clustered indexing in the SQL server. This table contains “Roll_No” as the primary key and the clustered index is self-created here on the primary key that will sort the “student_info” table as per the “Roll_No”. In other words, clustered indexing works as the dictionary where sorting order is alphabetical and there is no need for defining any additional index page.
CREATE TABLE Student_info
(
ROLL_NO int(10) primary key,
NAME varchar(20),
DEPARTMENT varchar(20),
);
insert into Student_info values(154395262, 'Shivam Anand', 'CSE')
insert into Student_info values(154395256, 'Shaiv Singhaniya', 'CSE')
insert into Student_info values(154395222, 'Suman Kapoor', 'ECE')
SELECT * FROM Student_info
ROLL_NO | NAME | DEPARTMENT |
---|---|---|
154395222 | Suman Kapoor | ECE |
154395256 | Shaiv Singhaniya | CSE |
154395262 | Shivam Anand | CSE |
If you want to define clustered indexes for other columns then first you should remove the primary key then you can remove the first index too. Keep in mind that defining a column as the primary key will make that column the clustered index of that table. To set any other column as the clustered index, we should remove the previous one with the following procedure:
ROLL_NO | NAME | DEPARTMENT |
---|---|---|
154395262 | Shivam Anand | CSE |
154395256 | Shaiv Singhaniya | CSE |
154395222 | Suman Kapoor | ECE |
Output: Index
NAME | ROW_ADDRESS |
---|---|
Shivam Anand | 1 |
Suman Kapoor | 3 |
Shaiv Singhaniya | 2 |
The syntax for creating a non-clustered index is similar to that of clustered index. However, in case of non-clustered index keyword “NONCLUSTERED” is used instead of “CLUSTERED”. Take a look at the following script.
USE schooldb
INSERT INTO student
VALUES
(6, 'Kate', 'Female', '03-JAN-1985', 500, 'Liverpool'),
(2, 'Jon', 'Male', '02-FEB-1974', 545, 'Manchester'),
(9, 'Wise', 'Male', '11-NOV-1987', 499, 'Manchester'),
(3, 'Sara', 'Female', '07-MAR-1988', 600, 'Leeds'),
(1, 'Jolly', 'Female', '12-JUN-1989', 500, 'London'),
(4, 'Laura', 'Female', '22-DEC-1981', 400, 'Liverpool'),
(7, 'Joseph', 'Male', '09-APR-1982', 643, 'London'),
(5, 'Alan', 'Male', '29-JUL-1993', 500, 'London'),
(8, 'Mice', 'Male', '16-AUG-1974', 543, 'Liverpool'),
(10, 'Elis', 'Female', '28-OCT-1990', 400, 'Leeds');
use schooldb CREATE NONCLUSTERED INDEX IX_tblStudent_NameON student(name ASC)
The above script creates a non-clustered index on the “name” column of the student table. The index sorts by name in ascending order. As we said earlier, the table data and index will be stored in different places. The table records will be sorted by a clustered index if there is one. The index will be sorted according to its definition and will be stored separately from the table.
id | name | gender | DOB | total_score | City |
1 | Jolly | Female | 1989-06-12 00:00:00.000 | 500 | London |
2 | Jon | Male | 1974-02-02 00:00:00.000 | 545 | Manchester |
3 | Sara | Female | 1988-03-07 00:00:00.000 | 600 | Leeds |
4 | Laura | Female | 1981-12-22 00:00:00.000 | 400 | Liverpool |
5 | Alan | Male | 1993-07-29 00:00:00.000 | 500 | London |
6 | Kate | Female | 1985-01-03 00:00:00.000 | 500 | Liverpool |
7 | Joseph | Male | 1982-04-09 00:00:00.000 | 643 | London |
8 | Mice | Male | 1974-08-16 00:00:00.000 | 543 | Liverpool |
9 | Wise | Male | 1987-11-11 00:00:00.000 | 499 | Manchester |
10 | Elis | Female | 1990-10-28 00:00:00.000 | 400 | Leeds |
name | Row Address |
Alan | Row Address |
Elis | Row Address |
Jolly | Row Address |
Jon | Row Address |
Joseph | Row Address |
Kate | Row Address |
Laura | Row Address |
Mice | Row Address |
Sara | Row Address |
Wise | Row Address |
Notice, here in the index every row has a column that stores the address of the row to which the name belongs. So if a query is issued to retrieve the gender and DOB of the student named “Jon”, the database will first search the name “Jon” inside the index. It will then read the row address of “Jon” and will go directly to that row in the “student” table to fetch gender and DOB of Jon.
Conclusion
From the discussion we find following differences between clustered and non-clustered indexes.
- There can be only one clustered index per table. However, you can create multiple non-clustered indexes on a single table.
- Clustered indexes only sort tables. Therefore, they do not consume extra storage. Non-clustered indexes are stored in a separate place from the actual table claiming more storage space.
- Clustered indexes are faster than non-clustered indexes since they don’t involve any extra lookup step.