Column Store indexes are particularly interesting in SQL Server 2012, so I thought I would take a look at them. The first thing that you notice in BOL is the following:

  • They’re for read-only tables at this time…this may change in the future. If you insert or change the data, you get the following:

    Msg 35330, Level 15, State 1, Line 2
    INSERT statement failed because data cannot be updated in a table with a columnstore index. Consider disabling the columnstore index before issuing the INSERT statement, then rebuilding the columnstore index after INSERT is complete.

  • They build a view on top of the existing table
  • The view fetches only those columns needed
  • Performance is 10x to 100x.  (Really 100x? Wow).

There are a couple of things that I want to check out with these indexes, so I’ll look at one here and save the others for later. I want to find out if column store indexes on a bit field actually helps since in BOL is states that bit columns are legit columns to index with Column Store. Indexing a bit column may or may not help with a traditional nonclustered index as you might know, depending on the particular situation and a host of other variables, so let’s see if one of these bad boys helps in my example. I’m expanding here on a vid that I saw today over on Channel 9; check it out for yourself to see where I’m getting my ideas.

An interesting point in BOL:

The columnstore index does not support SEEK

If the query is expected to return a small fraction of the rows, the optimizer is unlikely to select the columnstore index. For example: needle-in-the-haystack type queries. If the table hint FORCESEEK is used, the optimizer will not consider the columnstore index.

So I expect here that I will probably get a seek with a traditional nonclustered index, but a scan with column store. Mine here is not of the ‘needle-in-the-haystack’ variety.

Run the following to get a table with 150 million rows:

BEGIN TRY
    DROP TABLE BitTable
END TRY
BEGIN CATCH
END CATCH
 
SELECT CAST(CASE WHEN x.id%2=0 THEN 1 ELSE 0 END AS bit) AS ID
       ,CASE WHEN x.id%2=1 then 'b'
          WHEN x.id%3=1 then 'c'
            ELSE 'd' 
        END AS charVal
INTO BitTable
FROM 
    (SELECT TOP 150000000 a.id from sys.syscolumns a
        ,sys.syscolumns b
        ,sys.syscolumns c) AS x

 

Create the indexes. This will take a few minutes.

 
CREATE NONCLUSTERED INDEX IDX_x ON BitTable (ID)
GO
CREATE NONCLUSTERED COLUMNSTORE INDEX IDX_csi ON BitTable (ID);
GO

Run the following:

 
SET STATISTICS TIME ON
SET STATISTICS IO ON
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

Make sure and run both DBCC statements before running each query. Now let’s run the query, first with the traditional NC index. The execution plan does show an index seek, btw.

SELECT COUNT(1) 
FROM BitTable WITH (INDEX(IDX_x))
WHERE   ID=1 

 

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 22 ms.

(1 row(s) affected)
Table 'BitTable'. Scan count 3, logical reads 151900, physical reads 0, read-ahead reads 151732, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 5437 ms,  elapsed time = 20817 ms.

 

Now let’s check out column store:

SELECT COUNT(1) 
FROM BitTable WITH (INDEX(idx_csi))
WHERE   ID=1 

 

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 23 ms.

(1 row(s) affected)
Table 'BitTable'. Scan count 2, logical reads 728, physical reads 1, read-ahead reads 290, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 7829 ms,  elapsed time = 5983 ms.

 

Some interesting things here. Quite a few less physical/logical reads as well as 1 more scan count in the nonclustered index (along with ORDERED FORWARD in the showplan text). The traditional nc index is definitely doing some more work, even though the CPU time is less. CPU time for both shown in task mgr:

image

I note that the column store index hit the CPUs much harder than the traditional NCI; both are in this view with the column store indicated by the large spike. CS pegged them at 100% while the traditional NCI hit them at roughly 10%.

Just for fun, let’s look at the query without an index:


-----------
81560849

(1 row(s) affected)

Table 'BitTable'. Scan count 3, logical reads 278294, physical reads 0, read-ahead reads 278294, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
   CPU time = 14500 ms,  elapsed time = 24880 ms.

 

No index is almost as fast as the large NC index on the table.

Thanks,
Lee

 

------------------------

 

http://channel9.msdn.com/posts/SQLShorts-SQL-2012-Apollo-Column-Store


Posted in: SQL Server 2012  Tags:
blog comments powered by Disqus

by Lee Everest, M.S.

Info

Poll

Do you use Azure or cloud in your organization?



Show Results

Ads

Search


Month List

Calendar

«  May 2012  »
MoTuWeThFrSaSu
30123456
78910111213
14151617181920
21222324252627
28293031123
45678910
View posts in large calendar

Tags

Disclaimer
The opinions, code, examples, et.al. expressed herein are my own personal opinions and do not represent my employer's view in any way, shape form, or fashion.  All code for demonstration purposes - no guarantees, either written or implied, are made.

© Copyright 2012 Lee Everest's SQL Server, etc. weblog