Dropping indexes when bulk loading a table is a rule #1 kind of deal, beginner stuff – if you remove the indexes, you’ll be better off nine times out of ten. Dropping a clustered index makes sense on a natural key, because worst case SQL Server will have to deal with reordering the key on the fly. For nonclustered indexes, well, this is one that a lot of folks don’t consider; sometimes, the results are dramatic. Myself and fellow associate Harry Doan ran into this the other day, where the developer left indexes on an empty table in an SSIS process.  In this blog I’m going to demo a scenario where I keep the nonclustered indexes in place, and then drop them for a data load. Next time I will test the scenario when the table has n number of rows already in the table.

One consideration for dropping nonclustered indexes on an empty table is the answer to the question…”What’s faster – loading a table with the indexes, or loading the table without the indexes, and then rebuilding indexes after the data has been landed?” For my test, I have a super-slammin’ box (Dell R900 with 200GB RAM).

Run the following code:

USE tempdb
GO
DROP TABLE TestNC 
GO
CREATE TABLE TestNC(
        ID int IDENTITY (0,1) NOT NULL
        ,Col1 varchar (4)
        ,Col2 varchar (10)
        ,Col3 int
        ,Col4 int
        ,Col5 varchar (7)
        )
GO
CREATE NONCLUSTERED INDEX idx1 ON TestNC (Col1, Col2, Col3)
GO
CREATE NONCLUSTERED INDEX idx2 on TestNC (Col4, Col5)
GO
DROP TABLE #data
GO
SELECT x.C1, x.C2, ROW_NUMBER() OVER (ORDER BY C3) AS C3, ROW_NUMBER() OVER (ORDER BY c4) 
as C4, c5 INTO #data FROM (SELECT TOP 10000000 SPACE(4) AS C1 ,'AAAAAAAAAA' AS C2 ,0 AS C3 ,1 AS C4 ,'AAAAAAA' AS C5 FROM sys.syscolumns AS a, sys.syscolumns AS b, sys.syscolumns as c )x GO

So now the table is ready, with indexes built and data staged to load. Run this script now:


INSERT INTO TestNC WITH (TABLOCKX) SELECT TOP 6000000 * FROM #data GO

Now drop the table, and recreate without the indexes, then load the table. Finally, go rebuild the indexes. Which is faster? A typical scenario such as this one with a decent-sized box will – most of the time – work as such. The  increase here is pretty dramatic because I’m using a strong SQL Server.

By the way, just the other day, someone else blogged about this, but really didn’t run a test like we have here. One thing that they mentioned is disabling rather than dropping indexes “is better than dropping indexes.”  Of course, when I read something like this, I have to go find out for myself, because more often than not proclaiming absolutes in SQL Server will more often than not come back and bite you in the behind. This is not trying to call someone out – we have too many SQL commandos and ninjas in world already trying to show someone up and flex their ego. Ego, man I left mine behind many years ago, and I’m much happier today. If you have one, my advice to you is lose it.  Egos are for classless losers and punks.

Additional code:

IF  EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[TestNC]') 
AND name = N'idx1') DROP INDEX [idx1] ON [dbo].[TestNC] WITH ( ONLINE = OFF ) GO CREATE NONCLUSTERED INDEX [idx1] ON [dbo].[TestNC] ( [Col1] ASC, [Col2] ASC, [Col3] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY =
OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY] GO USE [tempdb] GO /****** Object: Index [idx2] Script Date: 10/04/2011 16:43:06 ******/ IF EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[TestNC]')
AND name = N'idx2') DROP INDEX [idx2] ON [dbo].[TestNC] WITH ( ONLINE = OFF ) GO USE [tempdb] GO /****** Object: Index [idx2] Script Date: 10/04/2011 16:43:06 ******/ CREATE NONCLUSTERED INDEX [idx2] ON [dbo].[TestNC] ( [Col4] ASC, [Col5] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY =
OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY] GO ALTER INDEX idx1 ON TestNC DISABLE; ALTER INDEX idx2 ON TestNC DISABLE; ALTER INDEX idx1 ON TestNC REBUILD ALTER INDEX idx2 ON TestNC REBUILD

Here are my results, including the ALTERs rather than drops. Try for yourself…don’t believe me like I didn’t believe that dude.

 

TEST RUN

Indexes                                 insert time               index time (idx1/2)     index action          total

----------------------------------  ----------------------- ----------------------      ------                   ------

Indexes on table                     2:28                        n/a                                                       2:28           

Not on table                            0:06                       :31+:01                      drop                    0:38

Not on table                            0:06                       :32+:02                      disable                 0:40

 

So based on this minimal test, a load with indexes too two additional minutes, and the disable as opposed to the drop index too two seconds longer.  Try some tests for yourself and see what you can come up with. Next time I’m going to look at the loading chart in Book Online and “see for myself”.

Lee

 

-------------------------

:king:

 

Refs and other readings

http://blogs.technet.com/b/sql_server_isv/archive/2011/10/04/redprairie-removing-unneeded-table-indices.aspx

http://msdn.microsoft.com/en-us/library/dd425070(v=sql.100).aspx

http://msdn.microsoft.com/en-us/library/ms177445.aspx


Posted in: TSQL , Beginner  Tags:
blog comments powered by Disqus

by Lee Everest, M.S.

Info

Poll

Do you use Azure or cloud in your organization?



Show Results

Ads

Search


Month List

Calendar

«  May 2012  »
MoTuWeThFrSaSu
30123456
78910111213
14151617181920
21222324252627
28293031123
45678910
View posts in large calendar

Tags

Disclaimer
The opinions, code, examples, et.al. expressed herein are my own personal opinions and do not represent my employer's view in any way, shape form, or fashion.  All code for demonstration purposes - no guarantees, either written or implied, are made.

© Copyright 2012 Lee Everest's SQL Server, etc. weblog