Lee posted on June 6, 2010 18:07

Where I am consulting we’re getting ready to set up a full-blown partitioning design for a fact table that will hold data from, roughly, 50 countries. Someone asked the other day about locking and DML so I experimented in another post on the new (LOCK_ESCALATION = AUTO) feature for SQL Server 2008.  Worked great doing simultaneous DML operations to each partition scheme, definitely good news especially for the setup that we’re going since each country may be reading, writing, loading data at various points throughout the day.  I was able to insert chunks of data into one partition while updating all of the data in another, and deleting data from yet another while witnessing no problems in locking or blocking which is what was anticipated. I also skewed the data greatly as well, since previous versions you might expect escalated locking situations if you acted upon the greater majority of the data in a table, on up to a table lock.

Another question asked - how well would indexing work and are we certain to experience fewer reads and writes, leading to fewer I/Os, for the “current” partitions which hold smaller data sets?  (The setup may have an archive partition, a current partition, and various partitions by year in between so that queries would have to traverse the most recent, say, six months of data.  Probably a common setup in a lot of shops).  Because I haven’t tested this and really haven’t read a lot lately about partitioning – I know there is a ton of good stuff out there, such as the article by Kimberly Tripp – I just dove with an example to verify.

I set up the following and inserted lots of nice data using my new VS 2010 install and the Data Generator.  Here’s what I did:

USE Test
BEGIN TRY
    DROP TABLE dbo.PartitionReadsTest
END TRY

BEGIN CATCH
END CATCH
GO

CREATE PARTITION FUNCTION funct_Partition_yr(INT) 
AS RANGE RIGHT FOR VALUES (2007,2008,2009,2010)
GO
CREATE PARTITION SCHEME partition_yr 
AS PARTITION funct_Partition_yr   TO ([PRIMARY], secondary, tertiary, quaternary, quinary)
GO

CREATE TABLE PartitionReadsTest 
(    
      [Year] int NOT NULL
    , FName varchar (40)
    , LName varchar (40)
    , BirthDate datetime NOT NULL 
    , Country nvarchar (4) NOT NULL
) ON partition_yr([Year])
GO
 
CREATE CLUSTERED INDEX cl_idx_PartitionReadsTest ON PartitionReadsTest (Year) 
ON partition_yr(Year) GO

Here’s the distribution of my data:

2007 812064
2008 181261
2009 6468
2010 207

 

The following query gives me this portion of the actual execution plan:

SELECT COUNT(1)
FROM PartitionReadsTest 
WHERE [Year] = 2007
AND BirthDate between '1/1/1960' and '2/1/2007'

Table 'PartitionReadsTest'. Scan count 3, logical reads 10078, physical reads 186, read-ahead reads 9152, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

image

Creating this index improves the query:

CREATE NONCLUSTERED INDEX idx_PartitionedIndex ON PartitionReadsTest (year, BirthDate)

Table 'PartitionReadsTest'. Scan count 1, logical reads 805, physical reads 0, read-ahead reads 6, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

image

If I create the index on the partition, it appears that I get fewer rows to traverse and a better subtree cost:

CREATE NONCLUSTERED INDEX idx_PartitionedIndex ON PartitionReadsTest (year, BirthDate) 
ON partition_yr (Year) GO

image

Definitely worth looking into indexing of these partitions a bit further. 

Oh sorry, got off subject here.  For certain SQL Server does do less work on a partitioned table when selecting from one with fewer rows, which answers their original question.

Lee

 

------------------------

Oh, tarter sauce.

sponge


Posted in: SQL Server 2008 , TSQL  Tags:
blog comments powered by Disqus

by Lee Everest, M.S.

Poll

Do you use Azure or cloud in your organization?



Show Results

Ads

Search


Month List

Calendar

«  February 2012  »
MoTuWeThFrSaSu
303112345
6789101112
13141516171819
20212223242526
2728291234
567891011
View posts in large calendar

Tags

Disclaimer
The opinions, code, examples, et.al. expressed herein are my own personal opinions and do not represent my employer's view in any way, shape form, or fashion.  All code for demonstration purposes - no guarantees, either written or implied, are made.

© Copyright 2012 Lee Everest's SQL Server, etc. weblog