In SQL Server, there is a little-known switch…–E…which is a startup option to increase the number of extents allocated per file within a filegroup during each iteration of writes. Here’s an excerpt from the link from the KB on it:

The Microsoft SQL Server 2000 Enterprise Edition (64-bit) and Microsoft SQL Server 2005 Enterprise Edition releases support a new server startup option, -E. With the -E startup option, an administrator can increase the number of extents allocated per file in the filegroup during each iteration of the round robin cycle used to distribute extents across files. This extension may increase the performance for data warehouse (DW) applications by providing better data contiguity, and allowing the read-ahead mechanism to reduce the number of I/Os processed during sequential scans executed for DW queries.

 

image

Figure 1. Startup SQL Server with -E

This is one of the cool things that was intriguing to me at the Fast Track Data Warehouse sessions that I attended (mostly because I am at a gig doing data warehousing at this time).   The KB also says:

In the SQL Server 64-bit release, you can allocate four consecutive extents per given file, before switching to the next file in the filegroup. Lowering the frequency of file switches allows for larger contiguous ranges of disk space allocation, although it does not guarantee it, especially in the multiuser environment. Scans benefit from this contiguity because the data will be less fragmented allowing for fewer file switches and faster access. Additionally, the new allocation extension may decrease the number of I/Os executed during sequential scans.

In the Fast Track Data Warehouse, the architecture is optimized for scans, so reading data that is layed down in a sequential fashion is highly optimal. (I made a joke about this in the session, noting that we’ve spent the last 12+ years trying to get rid of scans, but here they’re a-OK).  You can read more about Fast Track and data layout in the Fast Track Data warehouse - Microsoft Corporation documentation.

Anyway, I thought this was really cool so I wanted to test it on my box.  I’ve actually been dorking with this thing for about three weeks but haven’t been able to see what I wanted to up until now.  I’ve run this over and over again and believe that I am getting some benefit from –E, even on my Hyper-V setup.  I run the following code, first without –E and then with –E.

DROP TABLE test_stg 
go 
CREATE TABLE test_stg 
    (id int identity (1,1) primary key clustered 
    ,fname char(100) 
    ,lname char(100) 
    ,dt date default getdate() 
    ) 
GO

INSERT test_stg with (tablock) (fname, lname) 
SELECT fname, lname 
FROM    
    ( 
     SELECT top 1000000 '' as fname, '' as lname 
     from sys.syscolumns a, sys.syscolumns b, sys.syscolumns c    
    ) x 
go 
DROP TABLE test 
go 
CREATE TABLE test 
    (id int primary key clustered 
    ,fname char(1000) 
    ,lname char(1000) 
    ,dt date 
    ) 
GO 
TRUNCATE TABLE Test; 
GO 
-- INSERT TEST SELECT * FROM test_stg 
SET NOCOUNT ON 
DECLARE @id int=1 
DECLARE @chunk int = 100000

WHILE 1=1 
BEGIN 
    INSERT Test WITH (TABLOCK) 
    SELECT * FROM test_stg 
    WHERE id BETWEEN @id and @chunk 
    --OPTION (MAXDOP 1) 
    IF @@ROWCOUNT=0 
        BREAK 
    SET @id += 100000    
    SET @chunk += 100000 
END 
GO

 

Note that when I add the –E startup option I remove the comment for OPTION (MAXDOP 1) so I get a nice sequential write to the disk.  Here is the perfmon using only the Logical Disk – Avg. Disk Bytes/Write counter, and then STATISTICS IO running the following query, which guarantees me both a scan along with only a few rows returned.

SELECT TOP 20 *
FROM test
WHERE (id=3 OR id BETWEEN 3000000 AND 3000010)
OR fname like '%lee'

image

Figure 2. Without -E

Table 'test'. Scan count 3, logical reads 283114, physical reads 2443, read-ahead reads 175470, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


Here’s the same with –E.

image

Figure 3. With -E

Table 'test'. Scan count 3, logical reads 283114, physical reads 614, read-ahead reads 171751, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

You can see that I had the same number of logical reads with –E but had quite a few fewer physical reads, which is a characteristic of –E.  I did not expect to have fewer read aheads, actually thought I would get more here. Need to revisit this one for sure. As far as the perfmon graphs go, I did not get quite the same 256k block size that Microsoft got in their Data Loading Performance Guide documentation, but with –E I got an average of about 197k read size and without it was roughly 159k as can be seen.  This was fairly substantial to me; as mentioned I’ve run this thing at least 100 times and see consistency in my results.  Again, this is on my Dell server but on a Hyper-V setup.

-E isn’t for everyone as Microsoft says, and certainly not for an OLTP database. For some data warehousing installations it seems to be pretty cool.  If you want to read up more on this, check out their website and search for Fast Track Data Warehouse.   I plan on geeking with this a bit more ‘cause I think it’s pretty slick!

Thanks

Lee

 

----------------------------

Oh yeah, we're gonna bring in some entry-level graduates, farm some work out to Singapore, that's the usual deal.

angry

 

 

 

 

Links

http://support.microsoft.com/kb/329526

http://msdn.microsoft.com/en-us/library/dd425070.aspx

http://technet.microsoft.com/en-us/library/ee730351.aspx

Fast Track Data warehouse - Microsoft Corporation


Posted in:   Tags:
blog comments powered by Disqus

by Lee Everest, M.S.

Poll

Do you use Azure or cloud in your organization?



Show Results

Ads

Search


Month List

Calendar

«  February 2012  »
MoTuWeThFrSaSu
303112345
6789101112
13141516171819
20212223242526
2728291234
567891011
View posts in large calendar

Tags

Disclaimer
The opinions, code, examples, et.al. expressed herein are my own personal opinions and do not represent my employer's view in any way, shape form, or fashion.  All code for demonstration purposes - no guarantees, either written or implied, are made.

© Copyright 2012 Lee Everest's SQL Server, etc. weblog