Sharing some old tricks for deleting rows in a table

I was looking around the SQL Blogs again this evening and stumbled on SQLCat and a blog entitled Fast ordered delete [1] from last May.  I won’t paste their code – get it at the link - but apparently the MySpace boyz were in Redmond, and had a delete problem;  they were faced with deleting large amounts of older data and were having an issue – apparently it took too long on large tables.  This is a problem that I faced a long, long time ago, and did a lot of experimentation and work on it.  At a local, large retail company located here in the Dallas/Ft. Worth metroplex, they had production tables in the 60-100 million row range that needed data removed upon archival.  (Actually, the problem in this exercise is not the deletes themselves, but really is the blocking that the deletions can cause.  I’m very surprised that this dude Kevin Stephenson didn’t bring that up as a concern in his shop.  Chopping large amounts of data out of a table is a no-no for the most part, while taking nice chunks is best.  Maybe this is the reason that they choose the 10000 value. My part 2 of this blog series will show you how to maximize what I call the “chunk size” while avoiding the blocking).

SQLCat’s solution was a fairly simple one, and that was to use a view with a TOP clause to delete a chunk of data.  Not bad IMHO. Their reasoning for doing this can be seen in the below image.  Using a delete via a derived table that represented 10,000 rows, the got this execution plan [2]:

The results were the following:

 

Table 't1'. Scan count 2, logical reads 30564, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 1, logical reads 20152, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 78 ms,  elapsed time = 80 ms.

 

Changing to a view with a TOP, the significantly reduced their time and read, since they didn’t have to hit the table twice (see above plan).  That plan look like so [3]

 

 

The results were the following:

Table 't1'. Scan count 1, logical reads 654, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 31 ms,  elapsed time = 75 ms.

 

And now for mine. You may have seen something like this before, maybe not. The first time that I did was about 5 or so years ago and I thought that it was slick.  Yeah, forget the “best practice” garbage about not using goto. This is a script and not a stored procedure, so anything is fair game.

set rowcount 1000
declare @cnt int
set @cnt=0
label:
delete from t1
set @cnt = @cnt + @@rowcount
if @cnt < 10000
    goto label

 

I won’t past the statistics because for each iteration, several rows will be printed. Summing them, you will see if you run for yourself that while their logical reads were about 200 less, my time was 63ms…16% faster.  Not only that, but mine takes smaller chunks, and is much easier on the transaction log and blocking on other processes (I removed 1000 at a time and they 10000).  Uh, whassat you say?  Rowcount is deprecated? Very astute, you are, but no worries, because you should remember that DELETE, INSERT, AND UPDATE now can use TOP; simply change the code to this:

 

declare @cnt int
set @cnt=0
label:
delete top (1000) from t1
set @cnt = @cnt + @@rowcount
if @cnt < 10000
    goto label

 

Running this one we get actually 840 reads and 64ms, so it’s comparable to the old one using ROWCOUNT, and, yes, still faster than theirs. Come back next time for part 2 and I’ll show you how we can 1) maximize the chunk size for deletion and 2) monitor the process while it’s running.

 


Lee

-------------------------

 

 

 

“My mind is a raging torrent, flooded with rivulets of thought cascading into a waterfall of creative alternatives”

 

 

References


[1] Stephenson, K, and Kollar, L.  Fast Ordered Delete. Retrieved on 9/2/2009 from http://sqlcat.com/msdnmirror/archive/2009/05/20/fast-ordered-delete.aspx 

[2] Stephenson, K. and Kollar, L. Fast Ordered Delete FBB4/Pic1_2. Retrieved on 9/2/2009 from http://blogs.msdn.com/blogfiles/sqlcat/WindowsLiveWriter/Fastordereddelete_FBB4/Pic1_2.jpg

[3] Stephenson, K. and Kollar, L. Fast Ordered Delete FBB4/Pic2_2. Retrieved on 9/2/2009 from http://blogs.msdn.com/blogfiles/sqlcat/WindowsLiveWriter/Fastordereddelete_FBB4/Pic2_2.jpg

 

 


Posted in: The Best of MSDN Blogs , SQLServerPedia , TSQL  Tags:
blog comments powered by Disqus

by Lee Everest, M.S.

Poll

Do you use Azure or cloud in your organization?



Show Results

Ads

Search


Month List

Calendar

«  February 2012  »
MoTuWeThFrSaSu
303112345
6789101112
13141516171819
20212223242526
2728291234
567891011
View posts in large calendar

Tags

Disclaimer
The opinions, code, examples, et.al. expressed herein are my own personal opinions and do not represent my employer's view in any way, shape form, or fashion.  All code for demonstration purposes - no guarantees, either written or implied, are made.

© Copyright 2012 Lee Everest's SQL Server, etc. weblog