Geeking with table variables. Again.

 

 

Table variables vs. Temporary tables. Ugh, here we go again

 

You often hear discussions on table variables and temporary tables in SQL Server, many of which relate to performance, which one is faster, when to use either based on the number of rows that you are working with, etc.  Ask ten people and you will probably get ten different answers.  One argument that the pro-temp table faction seems to hold steadfast is that you can created indexes on temp tables but you cannot on table variables. This is of course not true as we know. You can in fact create a constraints on table variables, both unique clustered and nonclustered, which create indexes on the table; you can also create a unique constraint on a table variable with multiple columns as well, hence creating a covering index if you need to index columns within your SELECT list. Very nice!  Illustration 1 shows a few ways to add these to a table variable.

 

 

DECLARE @tbl table

       (id int identity (1,1) unique

       ,Col1 int primary key clustered

       ,Col2 char(1)

       ,Col3 char (2)

       ,Col4 char (3)

       ,Col5 varchar (10)

       ,unique (Col2, Col3, Col4)

       ,unique (Col5)

       )

 

Illustration 1.  A table variable with multiple constraints

 

 

Here we’ve added several unique constraints (indexes) and primary key constraint (clustered) to our table variable.  Additionally, we added a unique index with more than one column, essentially to be used as a “covering constraint”, so to speak.  This table is obviously for demonstration purposes, so don’t load up a table variable like this one looking for spectacular performance.

 

 

Statistics? We don’t need no stinkin’ statistics!

 

That same faction may respond; “Well, you can I guess, but SQL Server doesn’t allow statistics on table variables, and if you don’t have statistics, then your indexes don’t help”. Wrong again!  In fact, unique constraints, either clustered or nonclustered, on table variables can and do help dramatically.  Run Illustration 2 in SQL Server 2008. We can clearly see the difference – without a clustered index the execution time was 3.4 seconds, and 35 milliseconds with the index for the given query.  A clustered index was chosen here because they work well with range seeks. This was shown by Moreau [1]. 

 

 

DECLARE @tbl table

       (Col0 bigint PRIMARY KEY CLUSTERED -- comment/uncomment this line and execute

       ,Col1 varchar (30)

       )

 

INSERT @tbl

SELECT ROW_NUMBER() OVER (ORDER BY c.id) as Col0

       ,CAST(NULL AS varchar (30)) as Col1

FROM

       (SELECT TOP 10000000 1 as id

        FROM sys.syscolumns a

              , sys.syscolumns b

              , sys.syscolumns) c

      

SET STATISTICS TIME ON

BEGIN

       SELECT *

       FROM @tbl

       WHERE (Col0 BETWEEN 5674540 AND 5684540)

       OR (Col0 = 78494)

       OR (Col0 BETWEEN 2298 AND 2329)

       OR Col0 = 3434666

END

SET STATISTICS TIME OFF

GO

 

 

--SQL Server Execution Times:

 --  CPU time = 0 ms,  elapsed time = 35 ms.

 

--SQL Server Execution Times:

 --  CPU time = 3484 ms,  elapsed time = 3474 ms.

 

Illustration 2.  Table variable results with/without a clustered unique constraint

 

 

 

Here’s another query for your contemplation, this one with a constraint on two columns – we also see much improved performance. Smokin!

 

DECLARE @tbl table

       (Col0 bigint

       ,Col1 varchar (30)

       ,UNIQUE (Col0, Col1) -- comment/uncomment this line

       )

 

INSERT @tbl

SELECT ROW_NUMBER() OVER (ORDER BY c.id) as Col0

       ,CAST(NULL AS varchar (30)) as Col1

FROM

       (SELECT TOP 10000000 1 as id

        FROM sys.syscolumns a

              , sys.syscolumns b

              , sys.syscolumns) c

      

      

SET STATISTICS TIME ON

BEGIN

       SELECT Col0, Col1

       FROM @tbl

       WHERE Col0 = 3434696

END

SET STATISTICS TIME OFF

GO

 

 

--SQL Server Execution Times:

--   CPU time = 0 ms,  elapsed time = 0 ms.

 

--SQL Server Execution Times:

--  CPU time = 719 ms,  elapsed time = 717 ms.

 

Illustration 3.  Table variable with/without a unique constraint on multiple columns.

 

 

 

Oh. Well then, it’s constraints on all of my table variables from here on out!

 

Uh, hold that thought. This is all well and good, but let’s look at it from a different perspective.  Let’s say that you have done your homework, read your books, read Dr. Toms article, and decided that you’re going to throw some of these constraints on your table. This may or may not be advisable; you might do well to understand the Accumulated Cost of Query as opposed to simply the time-to-completion, cost, or cpu given a statement. Specifically, when I refer to this cost, I consider compilation to be the 1) creation and population of an entity (table and data), and 2) execution for a given body of work, which can be expressed it in a simple mathematical equation:

 

Let’s take a look at an example.  Suppose that I have the following query. I will first build a table build without an index and then run the entire process with an index on the table variable. (We’ll do only one iteration, let i=1, but you could easily run this simulation to represent work via several thousand calculations, as if you were testing an OLTP process).  For this example, we only run the “compilation” component. That is, we only create the table variable and populate it with our data.

 

 

SET STATISTICS TIME ON

DECLARE @tbl table

       (Col0 bigint

       ,Col1 varchar (30)

       )

 

INSERT @tbl

SELECT ROW_NUMBER() OVER (ORDER BY c.id) as Col0

              ,CAST(NULL AS varchar (30)) as Col1

FROM

       (SELECT TOP 10000000 1 as id

        FROM sys.syscolumns a

              , sys.syscolumns b

              , sys.syscolumns) c

SET STATISTICS TIME OFF          

GO           

SET STATISTICS TIME ON

DECLARE @tbl table

       (Col0 bigint

       ,Col1 varchar (30)

       ,UNIQUE (Col0, Col1)

       )

 

INSERT @tbl

SELECT ROW_NUMBER() OVER (ORDER BY c.id) as Col0

              ,CAST(NULL AS varchar (30)) as Col1

FROM

       (SELECT TOP 10000000 1 as id

        FROM sys.syscolumns a

              , sys.syscolumns b

              , sys.syscolumns) c

SET STATISTICS TIME OFF

 

 

-- SQL Server Execution Times:

--   CPU time = 4515 ms,  elapsed time = 4531 ms.

 

--(10000000 row(s) affected)

 

-- SQL Server Execution Times:

--   CPU time = 99469 ms,  elapsed time = 100476 ms.

 

--(10000000 row(s) affected)

 

Illustration 4.  Compilation time, which includes creation of table and data population

 

 

 

From the above, we can conclude that for our example it is not worth the time, nor effort and resource, to include our unique constraint on this table. It took roughly 4.5 seconds to build the table without the constraint and .7 seconds to execute, and over 100 seconds to build with the constraint with a negligible amount of time to execute. In our example using our “covering constraint”, we could fire this off, all things being equal, approximately 19 times while waiting on the table to build with the constraint.  This can be expressed in the following terms: 

 

Conclusion 

We’ve seen that we can index table variables via constraints, and that they do in fact make a difference in reducing execution times for queries.  More importantly, though, is that we need to consider other factors aside from query time when deciding whether or not to add indexes to our table variables (or even temp tables for that matter). In a data warehouse setting, we might run one large batch and easily obtain these comparative values, while in an OLTP application, we might have to run a comparison over several days for processes that execute tens or hundreds of times per second. Either way, we can derive an Accumulated Cost of Query to help us understand the relative value of indexes which will then allow us to make a fact-based decision given their performance contribution.

 

 

Lee Everest

 

 

 

--------------


Contrary to popular belief, thinking is a requirement for this job.

 

 

 

  

 

 

 

 

[1]. Moreau, T. Dr. Tom's Workshop: How Table Variables Can Speed Up Your Queries. Retrieved on 12/10/09 from http://msdn.microsoft.com/en-us/library/aa175774(SQL.80).aspx

 

 

 


Comments


December 11. 2009 15:55
Table Variables, Indexes, and Accumulated Cost of Query

You've been kicked (a good thing) - Trackback from DotNetKicks.com

http://www.dotnetkicks.com/database/Table_Variables_Indexes_and_Accumulated_Cost_of_Queryhttp://www.dotnetkicks.com/database/Table_Variables_Indexes_and_Accumulated_Cost_of_Query

Add comment




  Country flag

biuquote
  • Comment
  • Preview
Loading




by Lee Everest, M.S.

Search


Ads

Calendar

«  July 2010  »
MoTuWeThFrSaSu
2829301234
567891011
12131415161718
19202122232425
2627282930311
2345678
View posts in large calendar

Tags

Disclaimer
The opinions, code, examples, et.al. expressed herein are my own personal opinions and do not represent my employer's view in any way, shape form, or fashion.  All code for demonstration purposes - no guarantees, either written or implied, are made.

© Copyright 2010 Lee Everest's SQL Server, etc. weblog