In SQL Server, there is a little-known switch…–E…which is a startup option to increase the number of extents allocated per file within a filegroup during each iteration of writes. Here’s an excerpt from the link from the KB on it:
The Microsoft SQL Server 2000 Enterprise Edition (64-bit) and Microsoft SQL Server 2005 Enterprise Edition releases support a new server startup option, -E. With the -E startup option, an administrator can increase the number of extents allocated per file in the filegroup during each iteration of the round robin cycle used to distribute extents across files. This extension may increase the performance for data warehouse (DW) applications by providing better data contiguity, and allowing the read-ahead mechanism to reduce the number of I/Os processed during sequential scans executed for DW queries.

Figure 1. Startup SQL Server with -E
This is one of the cool things that was intriguing to me at the Fast Track Data Warehouse sessions that I attended (mostly because I am at a gig doing data warehousing at this time). The KB also says:
In the SQL Server 64-bit release, you can allocate four consecutive extents per given file, before switching to the next file in the filegroup. Lowering the frequency of file switches allows for larger contiguous ranges of disk space allocation, although it does not guarantee it, especially in the multiuser environment. Scans benefit from this contiguity because the data will be less fragmented allowing for fewer file switches and faster access. Additionally, the new allocation extension may decrease the number of I/Os executed during sequential scans.
In the Fast Track Data Warehouse, the architecture is optimized for scans, so reading data that is layed down in a sequential fashion is highly optimal. (I made a joke about this in the session, noting that we’ve spent the last 12+ years trying to get rid of scans, but here they’re a-OK). You can read more about Fast Track and data layout in the Fast Track Data warehouse - Microsoft Corporation documentation.
Anyway, I thought this was really cool so I wanted to test it on my box. I’ve actually been dorking with this thing for about three weeks but haven’t been able to see what I wanted to up until now. I’ve run this over and over again and believe that I am getting some benefit from –E, even on my Hyper-V setup. I run the following code, first without –E and then with –E.
DROP TABLE test_stg
go
CREATE TABLE test_stg
(id int identity (1,1) primary key clustered
,fname char(100)
,lname char(100)
,dt date default getdate()
)
GO
INSERT test_stg with (tablock) (fname, lname)
SELECT fname, lname
FROM
(
SELECT top 1000000 '' as fname, '' as lname
from sys.syscolumns a, sys.syscolumns b, sys.syscolumns c
) x
go
DROP TABLE test
go
CREATE TABLE test
(id int primary key clustered
,fname char(1000)
,lname char(1000)
,dt date
)
GO
TRUNCATE TABLE Test;
GO
-- INSERT TEST SELECT * FROM test_stg
SET NOCOUNT ON
DECLARE @id int=1
DECLARE @chunk int = 100000
WHILE 1=1
BEGIN
INSERT Test WITH (TABLOCK)
SELECT * FROM test_stg
WHERE id BETWEEN @id and @chunk
--OPTION (MAXDOP 1)
IF @@ROWCOUNT=0
BREAK
SET @id += 100000
SET @chunk += 100000
END
GO
Note that when I add the –E startup option I remove the comment for OPTION (MAXDOP 1) so I get a nice sequential write to the disk. Here is the perfmon using only the Logical Disk – Avg. Disk Bytes/Write counter, and then STATISTICS IO running the following query, which guarantees me both a scan along with only a few rows returned.
SELECT TOP 20 *
FROM test
WHERE (id=3 OR id BETWEEN 3000000 AND 3000010)
OR fname like '%lee'
Figure 2. Without -E
Table 'test'. Scan count 3, logical reads 283114, physical reads 2443, read-ahead reads 175470, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Here’s the same with –E.
Figure 3. With -E
Table 'test'. Scan count 3, logical reads 283114, physical reads 614, read-ahead reads 171751, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
You can see that I had the same number of logical reads with –E but had quite a few fewer physical reads, which is a characteristic of –E. I did not expect to have fewer read aheads, actually thought I would get more here. Need to revisit this one for sure. As far as the perfmon graphs go, I did not get quite the same 256k block size that Microsoft got in their Data Loading Performance Guide documentation, but with –E I got an average of about 197k read size and without it was roughly 159k as can be seen. This was fairly substantial to me; as mentioned I’ve run this thing at least 100 times and see consistency in my results. Again, this is on my Dell server but on a Hyper-V setup.
-E isn’t for everyone as Microsoft says, and certainly not for an OLTP database. For some data warehousing installations it seems to be pretty cool. If you want to read up more on this, check out their website and search for Fast Track Data Warehouse. I plan on geeking with this a bit more ‘cause I think it’s pretty slick!
Thanks
Lee
----------------------------
Oh yeah, we're gonna bring in some entry-level graduates, farm some work out to Singapore, that's the usual deal.
Links
http://support.microsoft.com/kb/329526
http://msdn.microsoft.com/en-us/library/dd425070.aspx
http://technet.microsoft.com/en-us/library/ee730351.aspx
Fast Track Data warehouse - Microsoft Corporation
a7b7d69b-42c9-4fd3-83fc-64ce213aa93e|0|.0