Monday, November 13, 2006

Example for Creating and using Partitions in SQL Server 2005

Lets assume that we have table which contains records of our company transaction starting from the date when our company was started 15 years ago! Hope you would understand that the table would have hell a lot of data as it would be holding 15 years of data. But effectively we might be using only last 2 months or 1 year data at the max (very frequently).

For each query, it would be processing through this huge data. Bottomline as the table grows larger the performance would go for a toss, also scalabiity and managing data would also be difficult.

Hope I have made the point clear. With the help of partitioning a table we can achieve a great level of performance; also managing tables easier. In our case, one of the way to increase the performance would be to “Partition” the data on a yearly basis and stored on a different filegroup (SQL 2005 allows you to partition tables based on specific data usage patterns using defined ranges or lists).

For further theoritical knowledge on this subject check out my earlier post on this topic.

OK enough of theory :) let’s get into the example right away.

--Add a filegroup
Alter Database VadivelTesting Add FileGroup FileGrp1
Alter Database VadivelTesting Add FileGroup FileGrp2

--Add few Files into the existing FileGroups.
Alter Database VadivelTesting
Add File
(
Name = Vel2006Q3,
Filename = 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\Vel2006Q3.ndf',
Size = 1MB,
Maxsize = 100MB,
Filegrowth = 1MB
)
To FileGroup FileGrp1

Alter Database VadivelTesting
Add File
(
Name = Vel2006Q4,
Filename = 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\Vel2006Q4.ndf',
Size = 1MB,
Maxsize = 100MB,
Filegrowth = 1MB
)
To FileGroup FileGrp2

-- Creating a partition function
Create Partition Function PF_DOB_Range (DateTime)
As Range Right
For Values
(
'01-01-1960',
'01-01-1980'
)


While creating a “Partition Function” ranges are of two types. They are, “Range Right” and “Range Left”. If it is mentioned as "Range Right" then it works like this:

1. All records whose DOB is < Jan 1, 1960 would fall into Partion 1
2. All records whose DOB is >= Jan 1, 1960 and < Jan 1, 1980 would fall on Partion 2
3. All records whose DOB is >= Jan 1, 1980 would fall under Partition 3.

If it is mentioned as "Range Left" then it works like this:

1. All records whose DOB is <= Jan 1, 1960 would fall into Partion 1
2. All records whose DOB is > Jan 1, 1960 and <= Jan 1, 1980 would fall on Partion 2
3. All records whose DOB is > Jan 1, 1980 would fall under Partition 3.

-- Creating a partition scheme
Create Partition Scheme PS_DOB_1
As Partition PF_DOB_Range To
(FileGrp1, FileGrp2, [PRIMARY])

-- Create a Partioned Table using the above Scheme.
Create Table EmpDetails
(
EmpID int identity,
EmpName Varchar(25),
DateOfBirth datetime,
Salary int
)
On PS_DOB_1(DateOfBirth)

Tip: In the create table syntax after ON keyword normally FileGroup name would be specified. But in the above example we have mentioned a Partition Scheme name. On seeing a script the way to identify whether it is a Scheme name or FileGroup name is by checking whether it has any parameters or not. That is, Scheme name comes with a parameter where-as Filegroup doesn't have a parameter.

-- Insert Dummy records
Insert into dbo.EmpDetails values ('Vadivel','1976-03-23',5000)
Insert into dbo.EmpDetails values ('Sailakshmi','1975-03-23',6000)
Insert into dbo.EmpDetails values ('Vicky','1986-04-30',7000)
Insert into dbo.EmpDetails values ('Test','1954-02-02',2000)
Insert into dbo.EmpDetails values ('Sivaji','1960-02-02',3000)
Insert into dbo.EmpDetails values ('Xyz','1960-01-01',2000)
Insert into dbo.EmpDetails values ('SSS','1980-01-01',3000)
Insert into dbo.EmpDetails values ('AAA','1959-12-31',3000)


That’s all. Going forward based on the select query “SQL Engine” internally decides on the “Partition” to query and fetches the record automatically. Bottomline, we need not do anything specific to segregate records from different partition while writing to the T-SQL statements or Stored Procedure.

That said, if at all you want to verify the partition number into which a set of partitioning column values would be mapped for any specified partition function, you can make use of $Partition function. In other words, $Partition is used to find the data’s in appropriate partition.

i) Fetch all records which got inserted into First Partition
Select * from dbo.EmpDetails Where $PARTITION.PF_DOB_Range(DateOfBirth) = 1

ii) Fetch all records which got inserted into Second Partition
Select * from dbo.EmpDetails Where $PARTITION.PF_DOB_Range(DateOfBirth) = 2

iii) Fetch all records which got inserted into Third Partition
Select * from dbo.EmpDetails Where $PARTITION.PF_DOB_Range(DateOfBirth) = 3

iv) This query would fetch you the partition number and the number of records in it.
Select * from Sys.Partitions where object_id = object_id('VadivelTesting.dbo.EmpDetails')

Or

Select object_name(object_id), partition_number, rows from Sys.Partitions where object_id = object_id('VadivelTesting.dbo.EmpDetails')

Another way to verify the partition ranges!

Select $partition.PF_DOB_Range(DateOfBirth) As 'Partition Number', count(*) As 'Rows' From dbo.EmpDetails Group by $partition.PF_DOB_Range(DateOfBirth) Order by 'Partition Number'

Or

Select EmpID,EmpName, Salary, $partition.PF_DOB_Range(DateOfBirth) AS 'Partition Number' From dbo.EmpDetails Order by 'Partition Number', EmpID

v) To display the list of filegroups in a database and to know which FileGroup is the default.
Select * from sys.filegroups


I strongly suggest you to read this MSDN article without fail.

Technorati tags:

3 comments:

Anonymous said...

Great Information Vadivel!

I am working with an existing database with large tables and have been searching on the best method of partitioning an exisiting table full of data. I have yet to find info with regards to either altering the table or if you need to create a temp table to load the data, drop existing table, create a new partitioned table and then reload the data.

I guess i'm not sure if i'm missing anything and am just making things more difficult.

Anyways, any advise you can provde would be greatly appreciated.

Thanks

Vadivel said...

Thanks!

Just a thought regarding your problem:

1. Create an empty table of similar schema and on the same file group

2. Load data to this new table
3. Use 'Split Partition' to create a new partition
4. Use 'Switch To' to push records from the new table to the new partition.

Hope this helps!

Anonymous said...

Your split partitioning example has helped me alot.

There is one problem that I was wondering about.

I'm using a modified version of table partitionig where there will no sliding window or merging of partitions. We just keep adding on partitions and file groups based on a monthly date range.

With that said I'm a junior programmer, and I have just modified your code a bit to use montly ranges instead of quaterly ranges.(partitioning on a montly basis)

All worked fine until I used the Split Range function. What it did was move the next month of data into PRIMARY TABLE, instead of the FileGroup specified.

Any help would be greatly appreciated.