Just geeking some more.
There are a few caveats to using SELECT INTO, the singleton-select statement that is used often for fast loading data and creating a copy of a table on the fly. First off, it doesn’t create an exact copy of your source table. Indexes, constraints, defaults, computed columns, persisted columns, sparse columns, and others, do not get pushed to the new table. There are even restrictions on IDENTITY as well, and this may not end up in your destination table. (See Books Online for this one).
DROP TABLE SELECT_INTO
GO
CREATE TABLE SELECT_INTO
(ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
,STRING CHAR (100) UNIQUE NONCLUSTERED
,DEFAULT_VALUE CHAR(1) DEFAULT 'A'
,DEFAULT_VALUES VARCHAR (20) CONSTRAINT DF CHECK (DEFAULT_VALUES IN ('AAA', 'BBB', 'CCC'))
,STRING1 AS STRING + 'A' PERSISTED
,STRING2 CHAR (1) SPARSE
)
GO
SELECT *
INTO SELECT_INTO_DEST
FROM SELECT_INTO
Figure 1. We got nothin’. Notice I’ve expanded the keys, constraints, indexes, etc. :((
CREATE TABLE [dbo].[SELECT_INTO_DEST](
[ID] [int] IDENTITY(1,1) NOT NULL,
[STRING] [char](100) NULL,
[DEFAULT_VALUE] [char](1) NULL,
[DEFAULT_VALUES] [varchar](20) NULL,
[STRING1] [varchar](101) NULL,
[STRING2] [char](1) NULL
) ON [PRIMARY]
GO
These you may already know. Here are two others that I found that I did not see in BOL. Were you aware that expressions cause the NULLability of a column to change? Consider this:
SELECT CAST(ID AS INT) AS NEW_ID
,*
INTO SELECT_INTO_DEST2
FROM SELECT_INTO
Figure 2. Expression changed NULL when using the ID column
Were you aware that using a ROW_NUMBER() actually changes the data type? Check this one out:
SELECT
ROW_NUMBER() OVER (ORDER BY a.id) as ID
INTO SELECT_INTO_DEST3
FROM SELECT_INTO IDas a
Figure 3. Int went away, and now we have a bigint.
Here’s a jim-dandy:
SELECT
UPPER(STRING2) AS NEW_STRING2
INTO SELECT_INTO_DEST4
FROM SELECT_INTO
Figure 4. Type changed to a varchar from a char. Is this really necessary? It’s deterministic, right?
This BIGINT conversion was probably crafted to act in this manner so that you won’t have to worry about running out of values for int; remember that with something like COUNT() you have to revert to COUNT_BIG() if you have too many values that exceed integer. Either way, it makes you think about using SELECT INTO and to consider exactly what you end up with in your destination table before using expressions, casting, etc.
SELECT INTO is really great, but more and more I am using INSERT SELECT with (TABLOCK) for fast-loading tables when I’m in SQL Server 2008 and beyond. Remember that I’m working right now in a data warehouse setting and using these all of the time; neither are used much in an OLTP environment, or if they are, maybe in dev whilst moving data around, but certainly not in the online production system since SELECT INTO could cause locking of system tables. The inability of SELECT INTO to pass certain objects to the destination table makes me look to use it even more, and comparing the speed of the two, there appears to be no difference when loading data.
Thanks for reading
Lee
---------------------
Is it time to present at another SQLSaturday yet???
566976de-2704-46a1-a9f0-cabe23abeeae|0|.0