Saturday, October 01, 2011

Does Clustered Index physically orders the data within a table?

Quite a lot of time I have come across people saying "Clustered Index Physically sorts the data inside the table based on the Clustered Index Keys". It's not completely true!

The short answer to them is there is something called "Row offset Array" in a PAGE which manages the order of rows in a PAGE.

For a long answer with a sample script read on...

There can lot of rows inserted one after the other and as we know each row can vary in size. That being the case how does SQL Server know where a row begins and where does it end?

That's where ROW OFFSET table/array comes into picture. A ROW OFFSET array starts at the end of the PAGE, and it contains one entry for each row on the PAGE. Each entry records how far the first Byte of the row is from the start of the PAGE. The entries in the row offset table are in reverse sequence from the sequence of the rows on the PAGE.

CREATE TABLE ci_order_example

INSERT INTO ci_order_example
SELECT 10, 'Vadivel' UNION ALL 
SELECT 12, 'Sailakshmi' UNION ALL
SELECT 31, 'Alpha' 

Lets see how the data has got stored internally. I am testing all this in my testing database named 'TestBed'.

--The below command will inform SQL Server to return the results to the client/console
--instead of to the error log!
DBCC TraceOn(3604)

--We got to make use of DBCC IND for finding out the page IDs to look at using DBCC PAGE
--Pick the PagePID (the page number in the file) whose IAMFID / IAMPID is not NULL
DBCC IND ('TestBed', 'ci_order_example', 1)

--The Offset Table section shows the contents of the row offset array at the end of the page
-- first row will be indicated as Slot 0
DBCC PAGE ('TestBed',1,243,2)

In the above example, while inserting, the records got inserted in these order. Because these records are already entered in a sorted fashion you can see it is in a sequence :)

0 - 96 - Vadivel
1 - 118 - Sailakshmi
2 - 143 - Alpha

Let's add few more records and then check how the records have got stored now.

INSERT INTO ci_order_example
SELECT 2, 'Amala Paul'

DBCC PAGE ('TestBed',1,243,2)

You can see that the place the record got added that location is not changed at all. Instead based on the clustered index key it maintains the correct order within this Row Offset table/array.

0 - 163 - Sneha
1 - 183 - Amala Paul
2 - 96 - Vadivel
3 - 118 - Sailakshmi
4 - 143 - Alpha

Hope this post would help get this misconception cleared (or) Can i say we have tried answering to one of the many SQL Server Myths :)

1 comment:

Dhananjay said...

then why "select * from tbl" when clustered indexed shows the rows in sorted order ?