lee posted on August 9, 2011 22:32
In the first three parts of the series, we looked at setup for dynamic packages. It is very important that you read these first before attempting to download and run this package…it won’t work for you until you have an understanding of this concept, and have the prerequisites already installed. In this post, I attach the .dtsx file for you to download and run, along with a short video to "demo" the functionality. As mentioned, suppose that you had 2000 files to import. You certainly would not want to write 2000 SSIS packages; you wouldn't want to rewrite those same packages somewhere down the road when Microsoft decides make a slight change to SSIS, either. With this package that I have here, you write only one to import, and one to export - parameters that you set up prior to executing the package directs the generic package on the fly. Simply create the drive tables with data to feed the parameters, and the package creates the sources, pipelines, and destinations in memory. To gain this type of functionality, we leverage the powerful SSIS API. As a reference, check out Building Packages Programmatically on the Microsoft site. While they don't show you how to do exactly what we do here, you can see how this package was created by combining all of the information from their pages. For watching the video below, I wanted to keep the screen resolution at full, so make sure and use full screen to view. You can also flip to HD for a clearer image. Now that I try it though, I see that it doesn’t expand to full screen here in my blog, so go ahead and just click through the image below and watch it on YouTube (you can right-click it as well and choose “Watch on Youtube”). Make sure and go to 720 HD and full screen for best viewing.  Maybe someday I’ll have a good screen capture program for videos like all of the SQL Server MVPs and gurus have, but right now I have to use the freebies.   Dynamic SSIS Part 4 Pretty sweet, eh? I believe that this is the slickest SSIS package that I’ve ever seen, certainly, and I hope that you like it as well. I’ve seen blogs and articles on the subject of ‘dynamic’ packages, but maybe not one to this extent. I am planning the following posts as follow-ups to this series: Import Package Combining both packages Dissecting the SSIS API and the code for each dynamic SSIS package Expanding the scope of these packages (variables, logging, debugging, error handling) Others Give me a shout if you have problems running this package, and again, if you have not read the first three parts, please do so before trying to wire this package up. Thanks for visiting, Lee   --------------------------- Edit: 8/10/2011 – I’ve added the database scripts and zipped up the SSIS package here. Scripts and SSIS package Test database Configuration database Extract.dtsx

Posted in: SSIS , Vids , SQLServerPedia  Tags:
Lee posted on February 19, 2011 08:58
I’ve been tasked to generate a bunch of files at work for an actuarial application to consume; without getting into the details of what the app does with the files (because I have no idea, frankly), I need to query SQL Server, generate a file for each “key” whereby each corresponding foreign key has approximately forty records, and one column of type decimal. Data looks like so in a SQL Server table: 1  1.119 1  2.225 1  3.330 37 more times … 2 10.3 2 1.1 2 4.5 37 more times I’ve tried eight or ten methods up to now, and I’m going to cut to the chase and skip right to the fastest method that I have found, and then go back and jot down the others in subsequent blogs.  Check the first post to get an idea of some of the ideas that I have brainstormed to accomplish this task.   The method to generate files the fastest up to this point? A c# console application. I could have written it in c++ or c as well of course, but c# seems just as good, so I chose this language. Also, because I may try to take this code and put in an SSIS script task as additional method, I’d be set since you can use c# and vb.net there as well. Anyway, here’s the code. I’m still tweaking this so I’ll be working on it as well for improvements.  And, if you’re thinking that it could be much better than below, by all means send your thoughts to me while keeping the hate mail to yourself, even if you are 1) the “de-facto” expert among us on c# I/O, 2) you buddies with the guy that wrote the c# compiler 3) I happen to be treading on someone’s “sacred turf”, or 4) you just don’t like me or my blog.  Fair enough?  I think we’re gold then. This thing generates approximately 25-35 files per second; the fastest SSIS package that I have built does about 12 files/second, and I have yet to find anything to beat this time, including bcp.  Come up with your own method and share with us if you like. Thanks for reading! Lee Everest, MS -------------------------- He was wearing my Harvard tie. Can you believe it? My Harvard tie. Like oh, sure he went to Harvard using System; using System.IO; using System.Collections.Generic; using System.Text; using System.Data.SqlClient; using System.Data; public class Program { static StringBuilder sb = new StringBuilder(); public static void Main() { SqlConnection sqlConnection = null; int maxvalue=151000; string dbconn = "Application Name=SqlTest;Data Source=SQLVM1;Trusted_Connection=yes; database=tempdb"; using(sqlConnection = new SqlConnection(dbconn)) { sqlConnection.Open(); SqlParameter param = new SqlParameter("@id", SqlDbType.Int, 2); SqlDataReader reader; using (SqlCommand cmd = new SqlCommand(null, sqlConnection)) { param.Value = 1; cmd.CommandType = CommandType.Text; param.Direction = ParameterDirection.Input; cmd.Parameters.Add(param); while ((int)param.Value<maxvalue) { cmd.CommandText = "SELECT id FROM deleteTablexus WHERE idx = @id"; reader = cmd.ExecuteReader(); while (reader.Read()) sb.Append(reader[0].ToString() + "\r\n"); SaveToFile(param.Value.ToString()); reader.Close(); param.Value = (int)param.Value + 1; } } } } public static void SaveToFile(string fileName) { System.IO.TextWriter w = new System.IO.StreamWriter(@"c:\temp\files\"+fileName+".dat"); w.Write(sb.ToString()); w.Flush(); w.Close(); } }      

Posted in: Etc. (Off-topic) , .Net , SQLServerPedia  Tags:
Introduction I’m at a client getting ready to give my .02 cents on why their SQL Servers are not running to well, and I happened to notice multiple transaction log files for their databases.  We know that multiple data files for some databases, including tempdb, has advantages in performance and administration. This is well documented. However, I was wondering why their setup included multiple log files, so I started thinking about this; I had not remembered reading about extra throughput or anything when adding multiple transaction log files to a database, so I wasn’t sure what their goal was in having this setup. Actually seemed like a waste of resources to me. So, I decided to find out more about the transaction log (can I call it the “tlog” from here on out? Thanks) as it relates to single vs. multiple files, and exactly what happens, or doesn’t happen, when these additional files present. Ah, yeah. It's just we're putting new coversheets on all the TPS reports before they go out now I thought that I would find some good info on MSDN or around the internet, but was surprised a bit to find little information on this particular subject.  So I called up a friend at Microsoft and his answer was: SQL only writes to one ldf file for a database at a time.  Multiple log files are only useful for working around running out of space on your log volume. [1] What he’s saying here is that extra log files are normally used as a fail safe for the first tlog – SQL Server does not write to all tlog files at the same time, nor does it do round-robin type writes as it does with multiple data files.  Then I did happen to come across some work from Paul Randal on sizing the transaction log, and he also mentions the same thing for the most part: Add more transaction log files. As a general strategy for managing the size of the log, this is not good. Extra log files have zero effect on performance (the myth that SQL Server writes in parallel to the log files is just that - a myth) and make management more tricky. However, if your log fills up for some out-of-the-ordinary reason, then adding another log file may be the only way to allow the database to keep running. If you have to do this though, make sure that you remove the extra log file(s) when you can to keep log management uncomplicated.[2] I’m not certain what management complication he’s referring to, other than you would have multiple log files to back up maybe? Not sure. Anyway, the first part of the paragraph pretty much matches what my buddy offered up. And I agree here – over-engineering anything is worthless, so why do it? And that’s exactly what I’m trying to root-out here. By the way, Paul Randal has got some really great stuff on his blog, very impressive.  I read his bio and see that he was with Microsoft for 1000 years, so I figure that he knows what he is talking about.  I’ve also read some of his articles as a subscriber to SQL Server Magazine for the past decade, so I know he’s good with SQL Server internals as well. Check out his blog postings on the transaction log [2] for some really cool threads. Want to go to Chotchkie's? Get some coffee? Being the curious one that I am, all of this info is good and wonderful, but I wanted to investigate just a bit deeper to see if SQL Server really just ignores these files, whether SQL Server writes to them or not, or whatever else I might be able to dig up. I created a database called Test, and it looks kind of like so: Notice that I’ve shrunk the logs to 1MB, but you can create them this size to start off.  I also have them on different vhds, so I know that I/O won’t be running together in case we find something here in our little investigation. Run the following to verify that the log files are all 1MB: SELECT (size * 1.0 * 8.0)/1024.0 AS size_in_mb,* FROM Test.sys.database_files WHERE data_space_id = 0 I now execute the following script and insert some data into a table called LogTable. It has two columns, one of decimal data type, and the other char data type: SET NOCOUNT ON DECLARE @cnt decimal(10,4)=0 DECLARE @rows int=0 BEGIN TRAN WHILE 1=1 BEGIN INSERT INTO LogTable VALUES (ROUND((RAND()* 1000000),0), SPACE(1024)) SELECT @rows+=1 SELECT @cnt = (size * 1.0 * 8.0)/1024.0 FROM Test.sys.database_files WHERE data_space_id = 0 AND [FILE_ID]=5 IF @cnt>1.0 BREAK END SELECT @rows; GO I randomly choose [FILE_ID]=5, but I already know that I could have used 3 or 4 as well.  When the script complete s, I inserted a total of 23461 rows, but yours will most likely vary. Notice now the size of my files when I run the first script: Interestingly, all of the log files have been written to – obviously the first tlog file has most of the writes, but the others are the same size by the time the script hit the BREAK.  Apparently SQL Server did write to more than one log file, but it’s strange that it wrote only 1MB to 32MB for the first transaction log that was created. Subsequent runs indicate a similar behavior, but I ran a few times and found that it hit the BREAK with considerably fewer rows than 23K from the first run. And then there's Tom Smykowski... He's useless. What is SQL Server doing here, and what’s in these additional tlog files? I can only answer the second question, and may have to ping the storage engine team to explain it I suppose. In order to find out what’s going, I went to ApexSQL and downloaded the ApexSQL Log reader tool;  go download the 14-day trial and check it out if you are following along with me. When I consulted at Rent-A-Center a few years ago by the way, they had the Apex suite of tools and I really liked them. I installed and fired up the tool and here’s what I get when I have it inspect the Test_Log2, 3, and 4.ldf files: I get nothin!  So obviously SQL Server is writing some internals stuff here and not info from our INSERT statement(s).  Cool!  Running the tool with only that first log file, I get this, so we know that the tool is working correctly: And there’s IO as well for those log files for disk drives e, f, and g…as we would expect:   Conclusion We’ve seen that 1) SQL Server does not round-robin multiple transaction log files, 2) additional files do get written to, but not with transactional information per-se, 3) whatever is getting written to these tlog files happens at the same time, and is only a fraction of what is written to the first tlog file created for the database, and 4) there is I/O present on the disk when the additional tlog files are written to, but not much. I’m going to keep looking for some answers here, so let’s call this part 1 and I’ll investigate in some more and see what I can find…definitely interesting to me, and I hope you enjoyed the read. Right now I’m fairly convinced at this point that additional transaction log files are of no use in a normal Microsoft SQL Server database setup. Thanks much, Lee Everest ------------------------- I can't believe what a bunch of nerds we are. We're looking up "money laundering" in a dictionary References [1] Unnamed resource from Microsoft haha [2] http://www.sqlskills.com/BLOGS/PAUL/post/Importance-of-proper-transaction-log-size-management.aspx [3] http://www.sqlskills.com/BLOGS/PAUL/category/Transaction-Log.aspx [4] http://www.apexsql.com/default.aspx

If you’re like me, you enjoy random geeking with Microsoft SQL Server, and since most of our geeking is done at home you probably have some sort of setup for research, experimentation, or just for fun.  I am lucky to have a full-blown Active Directory domain similar to what you might have at work, along with three small servers running Windows Server 2008 R2, and one with Hyper-V installed.  I also have about 2 TB of storage laying around, most of it slow and not redundant.  One thing that you need for doing some heavy-duty stuff in SQL Server is a fast disk I/O subsystem;  obviously it is impractical or unaffordable to have anything like what one might have at the workplace, but we can have a really nice RAID setup for a nominal cost like the one that I just built this afternoon.  I’m amazed how inexpensive hardware is these days, and a super cool little setup for large databases for experimentation, proof-of-concept, presentations, performance tuning, additional portable storage, or .vhds is what I want to discuss in this blog post. Essentially, you need only four components to create a little external RAID array: Two 2.5” disks – I picked up two Hitachi Travelstar disks, which are pretty nice because they have a three-year warranty.  You pay on average $3/year for the warranty over the cheaper model, so I went with these.  Mine are 250GB, but you can certainly buy larger ones (right now up to 750GB at www.newegg.com).  I bought mine at Frys, by the way. One eSATA II PCIe i/e card – this is to “easily add internal and external Serial ATA II (3 Gb/s) ports to PCIe equipped desktops”. Since my current hardware lacks an eSATA port, I opted to buy a PCIe card to provide eSATA connectivity.  As an option, I could use the USB port included, but for my purposes, eSATA provided the performance I need, and we’re all about speed here.  I got a SIIG for super-cheap! One 2.5” double drive mobile rack for SATA HDD/SSD drives – This is the case that will serve as our RAID switch, and can handle two drives.  If you want to spend the extra money, you can get these for a reasonable price up to 6,8,10+ bays as well, and doing this you can actually do various RAID levels for some very cool stuff. (Mine holds two disks, so I can do JBOD, RAID O, and RAID 1, but I saw a few there that do RAID 5, 10, etc.).  Because the RAID controller is built-in to this enclosure, it can also do various splitting of RAID so you don’t have to use the entire disk space for just one RAID level.  I bought this one and it’s perfect for what I’m going for. Also, if you decide to get SSD, this handles those as well.  Remember – we’re talking true hardware RAID here, not software RAID that you build in disk administrator in Windows.       Once you get all of this stuff, plug your PCIe card in, fire up the drivers, plug the disks, set the dip switches to what RAID level you desire, and you have a RAID array in a box the size of an external drive!  Awesome!!!  You can use it locally, or get a nice case for it and take it with you to work, clients, presentations, wherever.  Load up SQL Server database files, .bak files, .vhds, etc.  If you decide to go portable with a laptop, you will probably need an eSATA express card for your laptop express slot, also very reasonably priced.  Performance?  This thing should run 2-times as fast using RAID 0 as the external hard drive that you use now. And cost?  About $200.00 for my setup, and I’m lucky in that I get to bill my employer for most of it :)) If you put one together, give me a shout and let me know how it goes! Thanks for reading, Lee   ------------------------

Posted in: Etc. (Off-topic) , SQLServerPedia  Tags:
Training for SQL Server. That's all that I see anymore everywhere that I look. From Microsoft's sites (developer resources, TechNet, Channel 9, MSDN blogs) to SQL Saturday, love boats, blogs that list training resources, companies everywhere offering training, national conferences, SQL Lunch, TSQL Tuesday, virtual conferences, SQL Server user groups, videos at sites such as SQLShare or Midnight DBA, even collaborative web sites chock full of blogs and training material.  You name it, and there's dozens or hundreds of venues to get training.  Everybody is getting into the training game!   Why?  Because it's fun!  These resources are all very good each in their own way no doubt, and many boast some well known "big-name" guys in the SQL community (SQL Server MVPs, MCMs, industry experts, etc.) who put in lots of work and effort to keep new and old SQL Server practitioners up to speed; we're all lucky that the community is vibrant and strong. One source that is really never mentioned that I offer-up is an alternative place to learn SQL Server - credit courses at your local community college. I'm not sure why it's never in the listing of places to learn, but this is an excellent route to get some really good training while earning college credit at the same time.  As the school year is now upon us, I know that places such as the Dallas County Community College District and North Lake College, for instance, are once again offering several SQL Server courses for aspiring students and current SQL Server folks. With this avenue of learning, forget about the one-hour, two-hour, three or four day time span, because instruction here is either a full 16 weeks, or something less but with 16 weeks of material compressed into a shorter time frame.  My first class this Fall, for instance, is 5 weeks in duration on Tuesday-Thursday nights from 6pm to 10pm.  On the one hand, yeah, that's a lot of time right after the work day, but on the other, it's a deep-dive...no brushing over concepts or lightly touching material.  Full training, hands on, learning SQL Server concepts, at the keyboard working through problems with others, sharing experiences, all in the atmosphere of the "community of learners".  I suppose I should make note of the concept of 'trainer' vs. 'teacher'.  You get training at most of the aforementioned outlets.  A trainer, however, is much different than a teacher [1]; a trainer uses his or her work experience and knowledge to share or convey concepts and ideas to an audience that are sometimes market-driven, or driven by a time-specific demand, oftentimes in bullet-list format, while a teacher meticulously creates a learning experience designed to engage and lead the student down an "inspirational" path through contemplation and thought.  Usually a trainer has intimate knowledge of a repetitive process or set of steps, while a teacher has instruction in psychology, sociology, and education in addition to the skill-set being conveyed.  Two separate concepts, both have their places in learning.  I believe college courses at a community college offer each - you get experienced practitioners from the field who share concepts based on industry experience, along with persons with advanced-level degrees who have professional instruction to teach others. I know, because I have experience in both.  My dad once said: "College doesn't necessarily teach you a skill, but rather teaches you how to think". Well said, dad. I suggest that you investigate this option if you are considering SQL Server training, and very soon - the school year is upon us so now is the time. Go check out your local community college, four-year university, or continuing education department for SQL Server training and see what's available. Courses are usually inexpensive and a lot of fun, and the learning experience is rewarding.  Find out if this avenue for learning fits into your schedule and budget, and seriously consider it the next time you want to learn SQL Server. Lee Everest   ----------------------------------     [1] http://socyberty.com/education/whats-the-difference-between-a-teacher-and-a-trainer/

Lee posted on July 15, 2010 12:17
Recently I met with a couple of guys from Microsoft over lunch to discuss what it might take to be considered for the award of Microsoft SQL Server MVP, and I thought that I would share this information for those interested while it is fresh on my mind. After doing some research on the program I have found out some things that you might consider helpful if you strive to work towards this goal. A little background first, about six months ago I had a couple of friends at nearly the same time tell me that I should investigate what steps were needed to become an MVP. They knew that I geek with SQL Server morning, noon, and night, weekends and holidays included, and might be a good candidate for this status. Actually I never thought much about it, because I really didn’t know anything about the program and wasn’t terribly curious to go investigate. Then shortly thereafter, someone from work mentioned it, which prompted my curiosity - one of the directors at the office said that it would be nice to have a SQL MVP at our Dallas office. Seeing that the company where I work has 200 employees at our local office, over 8000 in the United States, and 96,000+ world-wide, I figured that we should have some MVPs floating around. Add to this the fact that we just won a distinguished partner award, the two seemed to go hand-in-hand. After doing some investigation though, I was a bit shocked to find only one SQL Server MVP (John Paul Cook, who is located in another office and by the way a genius at infrastructure as well as SQL. There very well may be more, but he is the only one that I am aware of). At that time, I did some more reading, asked questions here and there, and came to the conclusion that, yeah, I might go ahead and investigate this a little…it may be cool to work towards becoming a SQL Server MVP. So I pinged some resources at our office and was hooked up with a former Microsoft employee who is now working for us as our practice area lead. He in turn contacted a couple of folks from the local Microsoft office, and we all got together to chat about the SQL Server MVP program. Here are a few take-ways for you, in case you are interested in going for this award. These were suggestions directly from two Microsoft SQL Server folks out of the Irving, TX office; to what degree or route for a candidate to follow is up to him/her, and this serves as an outline from my notes: Certifications – Get your certifications. This was one of the very first things mentioned. You have to have these up-to-date as well. Certs from 1999 are probably not what they have in mind so much. Blog – You have to have a SQL Server blog that you regularly contribute to. Pick interesting topics here. Speaking regularly at engagements – Make it a point to speak at local and, if possible, national events. This can include local SQL Server groups, SQL Saturdays, or other SQL Server events. Attend national events – PASS, and DevConnections or SQLServerconnections. And, get to know the folks there. Make contacts. Get to know someone on the product team – This is important, and probably a difficult one to achieve. You may meet these folks from events attended as noted. Meet someone and work with that person; you may assist them by doing some research, testing, or other proof of concept work. Attend Microsoft Partner events – Such as the partner conference that is going on now in Redmond. Write or assist with a book – This is one that is probably difficult as well, but will help greatly. Find out from authors if you can contribute to a chapter in an upcoming book. Contribute to some of the popular web sites – such as SQLServerCentral or MSSqltips. Specialize in a product within SQL Server– SQL Azure and Parallel Data Warehouse come to mind. In other words, choose a feature in the product and specialize in it. Write a blog about it, etc. and become a source of information to help the community. I sort of pulled out of them that it wouldn’t hurt to pick a topic that a lot of folks aren’t already blogging about or researching. Get active on social.msdn.com – You have to contribute regularly to Microsoft’s forums; they will check this for your participation in helping customers resolve problems. One of the first things that the committee will go look at. Create training products – One of the local guys here in Dallas has a side business doing training videos. This is a major contribution to the community. Get out and meet local MVPs – While you can nominate yourself, there may be some SQL Server MVPs in your area to get to know. Find out how they got nominated, how they got their award, and specifics on what they did to achieve their recognition. Build something – This one was suggested, but to a lesser degree these days. You might create a set of products to put out on Codeplex, for instance, that will help the community solve something that is currently not in the delivered toolset. One thing that was not mentioned but I sort of figured out…it will help to work for a company that will support you in this effort. You obviously can’t afford to take time off of work and fly across the country on your own nickel, so it is imperative to work for someone willing to assist you with the process. This also requires them to know, to a degree, what they will get in return for having a SQL Server MVP on staff, and understand the benefits of their investment. If you have no support in this area, it will be more difficult, in my opinion. Working for an insurance company or maybe a clothing retailer (not that there is anything wrong with these industries) might not be as good as working for a consulting company or products specialty company. As you can see, there are a lot of tracks or avenues to go about becoming a SQL Server MVP; you probably can’t do all of them, but can certainly take-on many. So, what do you think? Are you interested in pursuing this status? Do you have the determination to achieve this goal? Do you live, eat, breathe, sleep, wake up thinking Microsoft SQL Server?  After meeting with the guys mentioned, I’ve sort of concluded that I’m not on this track at the present time. I tend to enjoy doing other things, stuff that you read about on my blog, and haven’t ventured out too far with some of the requirements that they have suggested to me. Remember that not everyone will become an MVP – you have to position yourself to ‘stand out’ amongst the crowd, so to speak, and at the end of the day you may be nominated but may not receive the reward. But, it’s definitely something to keep in mind for the future. No doubt that the folks that I have met who are MVPs are really super class-acts for sure; someone attaining the status will enjoy the camaraderie and friendships that will be made for a lifetime, and will open up very rewarding paths and opportunities throughout a career. Thanks for visiting, Lee Everest M.S.  texastoo.com   --------------------- “It feels so good to be back here at the Dexter Lake Club. We'd like to do a tune entitled: Shama Lama Ding Dong. So hit-it!”         Flickr Tags: sql server mvp,become a sql server mvp,Microsoft MVP program  

Start reading here, please If you happen to have caught any of my blogs before, I like to test (and challenge) SQL Server’s “sacred cows”;  I have always written code in a number of ways to see what particular design works for a given implementation.  I’ve concluded a long time ago that there are no absolutes in SQL Server development, and just when you think that one way is the right way to do something, you can certainly find another that challenges that method.  This may be due to several factors: different types of data, small or large amounts of data, low to high transaction environment, hardware, server or disk setup, etc.  The list goes on forever.  And, every time that a noob comes to me and asks, “Hey what’s the best way to write this?”, I am always quick to say, “I don’t know, I’d have to go write it. Why don’t you try it several ways and find out for yourself?”.  So, I’m geeking with another of these sacred cows, just for fun, to see what I can come up with – this one is whether explicitly dropping temporary tables is a hindrance to performance. Once Upon a Time About five years ago, I worked on a super-high transaction processing system that was fed by BizTalk. If you’ve ever worked with BizTalk, you know exactly what it can do to SQL Server as I do – kick off a bunch of orchestrations and and you can flat-out flood SQL Server. I talkin’ bring the database server to its knees, bra! And we did it all of the time until we learned the art of fine-tuning locks and blocks with BizTalk, and using query hints in just the right places.  It was grueling but a lot of fun.  At the time, Microsoft in Dallas stated that it was the largest BizTalk setup that they knew about, anywhere.  We fought with our system for hours, days, even weeks, until we had it down. Great learning experience for sure. PITA, but we had some fun times. During that time, a rep from Microsoft came by and looked at my code.  For some reason or another I had gotten into the habit of explicitly dropping temporary (temp) tables, which he quickly pointed out thank you.  Don’t ask me why, I have no idea at the time, I was just doing it.  This was very early on in SQL Server 2005 development.  Anyway, he goes on to tell me that this will cause things to run slower, and may create unnecessary CPU by doing so…essentially, his suggestion was to just go ahead let SQL Server get rid of the temp objects gracefully and without the “Drop Table #xxx”.  He never mentioned anything about GAM or SGAM, allocations, or reuse of temp objects.  And because we were suffering from locks, blocks, and deadlocks, I was willing to try anything to help speed things up.  We also had quite a few nice scripts to look into performance, so we could readily see if things helped or hurt our code. A guru chimes in on the subject This month’s SQL Server Magazine (July 2010), super-stealth Ninja SQL guru Andrew J. Kelly, a SQL Server MVP and practice manger at Solid Quality Mentors, had a paragraph about this very thing – explicitly dropping temp tables. He mentioned that in one situation a query or process decreased in execution time by 50ms from removing drops and/or truncates, however, he didn’t detail the situation much or offer up any test results or cases to speak of, only that he “confirmed that explicitly dropping a temporary table will defeat the ability of using caching mechanisms”.  I don’t doubt him a bit, he’s one helluva SQL Server technician, but wouldn’t it be fun to check this out for ourselves nonetheless? No guru here, but I did stay at a Holiday Inn Express last night I tested this notion of firing off scripts that create hundreds of transactions/second today to see if I could find out for myself whether or not there was any fact-based evidence that explicitly dropping temp tables caused slowdowns in a SQL Server.  As I type I am debating on if I should post my code or not because I don’t really think folks will try it out.  Maybe I’ll just skip that part, but if you want it send me an email. Essentially what I have put together for my test cases is the following: A batch file that fire off several sqlcmd commands using DOS start, so that I get multiple spids at a time running concurrently and hitting the same tables A stored procedure using crossing multiple databases that does a simple fetch of a customer, fetch a product, check to see if product amount >0, create an order, and insert the order into an Order table. All are done randomly by doing ORDER BYs using NEWID() The proc creates a temp table and then inserts some metadata at the end The proc either drops/does not drop the temporary table depending on the test In perfmon, I observe roughly 650 transactions/second in tempdb, and maybe 200 transactions/second on average across the three databases.  I actually added several counters in perfmon, including as work tables created and work table cache, none of which were terribly interesting If you want the code, ping me and I’ll zip it up and email it to you as mentioned, or try your own design out for comparative purposes I”m watching the Bachelorette while writing this, BTW. My meager eight tests are below. Note that I run them with the temp table drop, and without, and the last two are with a large column so that I get close to a page allocated for each insert. (A friend suggested this, why I have no idea. But I thought I’d see if it made any difference because it sounded reasonable). Interesting findings I guess: Notice that with the explicit temp table drop, I get an average execution time of 248 seconds, and without the drop and average time of 285 seconds. Again, this test is executing 600+ transactions/second across three spids running a loop of 10,000 rows per iteration. In all, I end up inserting between 150K and 190K records inserted during a single run, which lasts about three minutes.  In my tests – granted, probably not worth much – show that dropping a temp table is actually faster given my test, hardware, and stored procedure. The end, or just the beginning? What’s fun about what we do as SQL Devs and DBAs is that we are constantly facing change; change in SQL Server and how it performs, changes as new features are added, changes as the database engine is tweaked.  This is what is so interesting about working with this great tool!  One minute, just when you think you have something figured out, a whole new list of ideas opens up from something that someone experiences.  What I have presented here is just dorking with SQL Server…I’ve really not done more than a couple of hours worth of work, and certainly don’t have a well-thought presentation by any stretch. Yet I’m sort of surprised at what I found. And, what’s even more exciting is that you or someone else could take this idea and come up with, given a different set of constraints, the very opposite results that I have. Thanks for reading, Lee Everest   ------------------------ “Land's sakes, Wolfie; ain't you gonna eat me???”   EDIT:  Have some requests for the code.  Run this, and let me know what your results are. Download here Kelly,A (2010). Is Tempdb Affecting your Day-to-Day SQL Server Performance? SQL Server Magazine, pp. 27-28  Flickr Tags: dropping temporary tables,explicitly dropping temp tables,temp table allocation,temp table performance

Lee posted on May 27, 2010 13:28
Thoughts on some of the latest items removed. Perusing the list here are a few notable ones for me.  We can’t let these go by without a commentary on my favorites, so enjoy! Features not in the next version 1) Compatibility level 80 – Finally after ten years we’re getting away from SQL Server 2000. Unfortunately where I sit today, still seeing some SQL 2k boxes. Uh oh. 2) AWE – wasn’t this confusing as hell anyway? Glad it’s gone. 3) sp_dboption – just used this the other day. They’re seemingly rolling everything into ALTERs. 4) SET ROWCOUNT – still like using it, not sure why they have to get rid of it. Oh well. 5) *= and =* - the threat has finally become reality! 6) sqlmaint utility – working with SQL Server the past 13 years and I still haven’t figured this one out. Features to be removed within the next 20 yrs 1) Database compatibility level 90 – really? 2) The following SET ANSI_NULLS OFF and ANSI_NULLS OFF database option – Sure will miss geeking with nulls when this goes away. SET ANSI_PADDING OFF and ANSI_PADDING OFF database option – don’t care. SET CONCAT_NULL_YIELDS_NULL OFF and CONCAT_NULL_YIELDS_NULL OFF – always going to be set to ON now. Um, yeah probably a good idea. Never really wrote an expression where I actually wanted NULL as my result set. 3) DBCC DBREINDEX, DBCC INDEXDEFRAG, DBCC INDEXDEFRAG, DBCC SHOWCONTIG, DBCC PINTABLE – Sorry to see these old friends go away. 4) XP_API – isn’t this really the ODS API? What on earth is XP api? Anyway, hate to see this one go as well. There was something to be said for writing extended stored procedures. Ken Henderson spoke at a local sql user group 7-8 years ago; taught me how to write them. These were fun to write and filled lots of hours of mine, dorking with the c/c++ code. They were special. 5) sp_configure 'allow updates' – have run this one 14,000 times in my day it seems. 6) SQLMail – We can only thank Jesus, Mary, Joseph, and all of the Angels and Saints for this one. 7) WRITETEXT, UPDATETEXT, READTEXT – was there ever a worse implementation of something in SQL Server? 8) Three-part and four-part column references in SELECT list – yeeeichs! Better fire up SQL Server Upgrade Advisor/SQL Server Upgrade Assistant for new upgrades/migrations! 9) Numbered procedures – I always loved these. Most people never knew that you could number them! 10) Not ending Transact-SQL statements with a semicolon – see #8.  Ugh. 11) Use of #, ## as temporary table and temporary stored procedure names –they finally removed my favorite table name.  Only a few of us knew about this.  It’s the one… with no name! haha       Lee   -------------------------- You’ll not find these types of interesting facts anywhere but on this blog site.   Site: http://msdn.microsoft.com/en-us/library/ms143729.aspx Flickr Tags: SQL SERVER 2008 R2 deprecated features,deprecated items,SQL Server items discontinued

Posted in: SQL Server 2008 , SQLServerPedia  Tags:
admin posted on May 23, 2010 08:03
A smashing success! I certainly enjoyed my time at SQLSaturday and thought that the team, including Sri Sridharan, president of the North Texas SQL Server Users Group, did an excellent job putting it together.  From all accounts that looked like a tremendous amount of work to pull off such a great event, so congrats to the core team guys for all of the great work. The rooms were perfect, lunch was perfect, giveaways perfect,  total first-class job, guys.  It was also great to see my friend Tim Mitchell, visit with Sean and Jen McCown, and chat with a bunch of others throughout the day.  Let's have another one soon! For the folks that signed up for my session, I want to thank you again for attending. It was an honor to be able to present to you and I appreciate your enthusiasm and participation.  To take off from your weekend, away from loved ones and all that was going on in Dallas yesterday (The Byron Nelson golf tournament, Texas Rangers game, Taste of Addison, etc.) it was a privilege to spend this time with you. And, I was very pleased to see so many sign up, we had nearly a full room!  Mine was class in the beginner track but I could tell from the start that everyone was at a higher level for the most part;  I had to make adjustments on the fly, 'turn it up' a notch so to speak in order to make it a little more exciting. Kudos to you guys for making it an enjoyable day, and I hope to see you soon at a user group meeting or somewhere down the road! Lee Everest   --------------------

Posted in: Etc. (Off-topic) , SQLServerPedia  Tags:
A tutorial on the Script Component, continued. Introduction The thrilling conclusion to my three-part series is upon us!  Actually the series isn’t terribly thrilling, nor is this part all that interesting. (Well, it’s alright I guess, but nothing earth-shattering.  This is because if you have read the other two blog postings in this series you know how well we have taken the confusion out of this exercise :)).  In this blog effort, I drag the script component out and then select “Destination” for script component type; remember that we had three choices, either Source, Transformation, or Destination for using the Script Component to explore. Background Info – It gets technical here, so pay attention! My first idea for this type was to have a way to capture a max date from a data flow and then simply end the flow. The reason for this is because I want to use the date later on down the road.  But, after I did this I thought it would be cool to add another task to update an existing row in a table.  You could either use this updated row as informational, or read the table the next time the package runs (at the beginning of the package, via another task) and use the value in a WHERE clause of the data fetch, which might make for a good way to incrementally load the table.  There are all kinds of ways to do this sort of population,  and this one is about as good as any other for a Type 1 scenario.   Figure 1. Screenshot of the package   Figure 2. The aggregate task opened   Figure 3. The Script Transformation Editor (Script Component)   Figure 4. The Script Transformation Editor – Input Columns   Figure 5. The Script Transformation Editor – Inputs and Outputs   More Code Now, here’s the code for the Script Component: /* Microsoft SQL Server Integration Services Script Component * Write scripts using Microsoft Visual C# 2008. * ScriptMain is the entry point class of the script.*/ using System; using System.Data; using Microsoft.SqlServer.Dts.Pipeline.Wrapper; using Microsoft.SqlServer.Dts.Runtime.Wrapper; [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute] public class ScriptMain : UserComponent { public override void PreExecute() { base.PreExecute(); } public override void Input0_ProcessInputRow(Input0Buffer Row) { IDTSVariables100 variables = null; //Use VariableDispenser to write to variables outside of PostExecute() this.VariableDispenser.LockOneForWrite("MaxDate",ref variables); variables[0].Value = Row.TransactionDate; //Add a row for the output column OUTPUTBuffer.AddRow(); OUTPUTBuffer.MaxDate = DateTime.Parse(variables[0].Value.ToString()); //MesssageBox to see the variable value System.Windows.Forms.MessageBox.Show(variables[0].Value.ToString()); //Unlock the variable variables.Unlock(); } public override void PostExecute() { base.PostExecute(); } }   I use this.VariableDispenser.LockOneForWrite because you can’t access a variable outside of the PostExecute() method unless you do so. Not sure why, must be a “feature”!  For the database and tables, I am using Production.TransactionHistory table as a source from AdventureWorks 2008, and the below script is for the destination table. It’s simply a copy of TransactionHistory: USE [AdventureWorks] GO /****** Object: Table [dbo].[tbl] Script Date: 05/18/2010 09:56:06 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[tbl]( [TransactionDate] [datetime] NULL, [ModifiedDate] [datetime] NULL, [TransactionID] [int] NULL, [ProductID] [int] NULL, [ReferenceOrderID] [int] NULL, [ReferenceOrderLineID] [int] NULL, [TransactionType] [nvarchar](1) NULL, [Quantity] [int] NULL, [ActualCost] [money] NULL ) ON [PRIMARY] GO   Here are the screenshots for the OLE DB Command task, in case you haven’t used this one much. It’s cool, especially for INSERT/UPDATE type scenarios. Figure 6. The OLE task Connection Managers tab   Figure 7.  The OLE task Component Properties tab   Figure 8. The OLE task Column Mappings tab Code for the PROCESS_LOG table used in the OLE task: UPDATE dbo.PROCESS_LOG SET LAST_UPDATE_DATE = ? WHERE TABLE_NAME = 'dbo.tbl' GO CREATE TABLE PROCESS_LOG (PROCESS_ID INT IDENTITY (1,1) NOT NULL PRIMARY KEY ,TABLE_NAME VARCHAR (40) ,LAST_UPDATE_DATE DATETIME ,LAST_UPDATE_BY VARCHAR (40) DEFAULT SYSTEM_USER ) GO INSERT PROCESS_LOG (TABLE_NAME) VALUES ('dbo.tbl') GO SELECT * FROM PROCESS_LOG   OK!  So if I run this whole mess, here’s what I get: Figure 9.  MessageBox showing the max date of the stream   Figure 10.  Package Execution.   Figure 11.  Row gets updated with max date from the table Conclusion We checked out the Script Component in SQL Server 2008, and used the “Destination” type to capture a max(date) value in a stream. This was a nice way of doing this because our other option would be to simply write the query in an ExecuteSQLTask – doing this, however, causes another trip to the database in order to fetch the max date, and so because we can use the Aggregate task we can get the max date in one pass, and from there use the Script Component to assign to a variable. As you can see, I was quite verbose in my code and screen shots, so please download the package and use these as a guide. I sort of hesitate putting packages out here for folks to try, because I always get the inevitable “It doesn’t work” email from someone. Doh!  Just ping me if you can’t get it to work please. On a sidebar, these days I am located at a truly ETL shop, and I am working and interacting with folks that have used other ETL tools long before SSIS came out. (This place does $50 billion in sales/year, so they’re not messing with Northwind, pubs, library, credit, or any of the other Microsoft databases down here. Heavy-duty bro).  They contend that ETL is all about keeping your hands off of the keyboard, so to speak, and not coding in c# or vb.net.  And I have heard this argument before from hardcore ETL-types, those that have used Informatica or DataStage for years.  Their thinking is that..”who is going to support this?”, since all they do is ETL, and not .Net coding.  On the one hand, they have a point, but on the other, SSIS is still in its infancy, and until we get the hundreds of transformations in SSIS that they have in those other apps, we will need to continue using the SSIS Script Task and Script Component to leverage the great tasks and APIs that are available to us. I hope that these three simple and straightforward blogs on the SSIS Script Component have helped someone out in their shop.  Remember that you have to make sure and fire up VS 2008/SQL Server 2008 BIDS, create the process_log table, and change the connections accordingly.  As always, thanks for visiting my site! Lee   -------------------------- My orders from General Lee is to hold the Mason-Dixie line, and no Yankees are crossing it!     Resources SSIS Package (SQL Server 2008) File download here Facebook SQL Saturday Rulz! Blog del.icio.us Tags: ssis script component,ssis destination script component

Posted in: SSIS , SQL Server 2008 , SQLServerPedia  Tags:
admin posted on May 12, 2010 19:54
Here's my list of what needs to go. Ok, everyone seems to be creating a list of five things that SQL Server should get rid of, so I'll do one of those as well.  (I already did a "five things that SQL Server needs", but apparently I jumped the gun on that one. Oh well).  Actually I'm fairly happy with everything in the product right now, and really don't have anything that's hanging around driving me crazy.  Having said that, sure here's a few that I wouldn't miss: Model Database - Does anyone really use this thing?   The ability to make changes to it so that when you create a new database it is "modeled" after a template, so to speak,  is a good idea I guess, but I never seem to use it. Database Diagrams - This facility is weak, so getting rid of it is no hair off of my <backside>.  Make it really slick or just trash it. Backup Devices - I can't remember the last time I actually created a device for a backup.  Not needed. OLE Automation - What I thought was once cool, back in my wilder and more daring days, is a garbage. This stuff needs to go. Can't believe that it's still around.  What's worse - I can't believe that it's not deprecated! I'll always appreciate that piece of ingenious code by the late, great Ken Henderson on programming the Microsoft Word API to spell-check before inserts into a table, but I'd never use something like that in a production setting. Auto Shrink - Probably the worst 'feature' eva!  I think they leave it in just so that they can warn beginning DBA- types to go shut it off, and fill up chapters in books and articles.  It's gotta go!!!   Lee   ----------------------- "He's not a man - a machine. A Terminator. A Cyberdyne Systems Model 101"   Flickr Tags: 5 things that SQL Server should remove

admin posted on May 11, 2010 20:44
Here's my list...enjoy! Super-stealth DBA/SSIS expert Jamie Thomson came up with a great list entitled "Five things SSIS should drop", and Aaron Bertrand offered the blog "5 things SQL Server should drop"; I thought I'd throw in my .02 cents and give my own list...Five things SQL Server should add.  Here they are and in no particular order: Object Restores from a backup - How many times have you wanted to just restore a proc, a table, or some other some other object in SQL Server from a backup?  I have.  Who cares if a table is "transactionally consistent"!  Just give me the table as-is and let me fix it.  We had table restores in SQL Server 6.5 but they removed them, remember? Bring 'em back!!! More minimally-logged operations - With the advent of the very popular document and key-value pair databases roaming the landscape (MongoDB, Cassandra, Hadoop, et. al.), we need to add more minimally or better yet completely non-logged operations at some point in SQL Server to compete with these guys. The new INSERT WITH (TABLOCK) is great - so what's next?  Again, just let me decide if I want to send the data without logging, and let me be responsible to see that the rows got there.  I already check for them anyway.  Some operations just don't need to be logged.  For example, do you think uploading a comment on facebook needs to be logged?  Uh no.  Just retry if it fails.  Give me the option to be "reasonably sure" rather than 100% positive at my discretion. Linked Servers with optimized distributed transactions - If we're going to scale out, let's scale out the right way. If you have a multi-server architecture that uses the MSDTC, you automatically have a non-optimized query. Congratulations.  Give me a query optimizer that can look at both sides of the fence and make heads or tails of query optimization.  Also, give me high-performance linked servers. MSDTC code hasn't been changed in 10-15 years;  Let's chuck it and redo the whole process.  (Either that or just get rid of it entirely). "Shared-something" or "Shared Everything" Architecture - Wouldn't it be cool to have one database file across multiple servers? Wouldn't it be cool to have a pool of RAM, disk, CPU that could be dynamically moved around a farm of servers?  Wouldn't it be nice to have a cluster that does more than just "high availability"?  Give me the processing power of RAIW. (Redundant Array of Inexpensive Workstations).  It won't work?  Uh, don't tell the boys from Terradata - they have it and it's great.  And, maybe we will as well someday soon ;) Add Ctl-B to SSMS - This was the greatest feature ever for those of us who used it in SQL 7 and 2000 Enterprise Manager.  Who took it out and why?  Huh? CTL + B was the shortcut that grabbed the query results window; when you wanted to move it, it snagged the bar between the results pain and the query window allowed you to slide it up or down with a mouse. Bring it back ASAP!!!   Thanks for reading! Lee   ------------------------- Buford T. Justice: Well, thank you, Mr. Bandit. And as the pursuer, may I say you're the ***dam*dest pursuee I've ever pursued. Now that the mutual bullsh*t is over, WHERE ARE YOU, YOU SOMB*TCH?   Flickr Tags: 5 things that sql server should add,5 things sql server meme

Posted in: SQLServerPedia , Etc. (Off-topic)  Tags:
INTRODUCTION The folks at SQL Cat (SQL Customer Advisory Team) post some really great stuff, and had a blog a couple of months back [1] on partitions and locking.  I haven’t played much with partitions yet in SQL 2005 or 2008 -  haven’t really had a need to up to this point, and haven’t randomly geeked with the technology since I’ve been busy geeking with other stuff.   But I knew that I would eventually come around to them, and now is that time - the client where I am located is wanting to use the technology in a consolidation effort in their data warehouse.  For their implementation, they want to deploy partitioning to segment country codes, and specifically, they want to make sure that DML on one country will not affect another country.  For example,  they want to know if inserts or deletes that are occurring for China affect the people who are querying data from Indonesia. The goal of my post is to expand what SQLCat did by doing the following: Change the INSERT to a DELETE statement (theoretically, changing DML type won’t matter) Increase the percentage of rows to be affected to those of all of the rows in the table Investigate what SQL Server Profiler captures during this operation Test a concurrent set of DML actions to the partitions Test it using a heap table BACKGROUND, AND AN OPINION Maybe I haven’t wanted to jump into the feature due to prior experiences.  I started working with SQL Server 6.5, so I am familiar with and have used all of the various flavors that Microsoft has offered to scale through the years, and frankly I was never impressed with any of them.  Federated servers, horizontal partitioning, distributed partitioned views, vertical partitioning, anything using the MSDTC, constraints to direct rows, etc.  I’ve never liked, all of which I believed were a small step short of a hack. Personal opinion here, so take it FWIW.  I am very skeptical when using “scale” and “SQL Server” in the same sentence; I am, however, very impressed thus far with partitions and table partitioning. Let’s gen up a scenario and do this thing. SQL CAT’S CODE We’ll set up the tables and indexes using the following, somewhat similar to their script but slightly modified: USE tempdb GO BEGIN TRY DROP TABLE dbo.Data END TRY BEGIN CATCH END CATCH GO CREATE PARTITION FUNCTION pf_year(INT) AS RANGE RIGHT FOR VALUES (2007,2008,2009,2010) GO CREATE PARTITION SCHEME ps_year AS PARTITION pf_year ALL TO ([PRIMARY]) GO CREATE TABLE Data ( [year] INT NOT NULL , UpdateValue FLOAT NOT NULL , PayLoad CHAR(200) NOT NULL ) ON ps_year([year]) GO -- drop index data.cix CREATE CLUSTERED INDEX CIX ON Data(Year) ON ps_year(Year) GO Notice that I have four years in my partition function, which will be really five logical partitions in SQL Server.  We can verify this by running the following query, as this shows our partition ids for our object_id for our “Data” table.  This is very important, because we will look into SQL Profiler to make sure that DML for one partition doesn’t escalate locks to one of the other partitions; if it did, we may experience accessibility/concurrency problems for those other users.  In the screenshot below we can see the partition_id column, and this is what we’ll search for to verify either way. SELECT * FROM sys.partitions WHERE OBJECT_ID = OBJECT_ID('Data') Figure 1 – Table with our partitions set up YES, WE WILL NEED SOME DATA Let’s populate the table now.  For this test I am deleting data.  What I am going to do is kill two birds with one stone: I am going to check for both locking at the partition level, and am also going to skew the table values just to verify that it has affect on SQL Server by inserting quite a bit more data in the partition that I am going to delete. (Remember that SQL Server will try to escalate up to a table lock if the number of rows being affected is percentage-wise is high compared to the number of rows in the table. SQL Cat used 10,000 rows for all partitions).  Run the following and you will see [rows] in the above query populated. --/* populate the table */ TRUNCATE TABLE Data GO DECLARE @y INT = 2007 WHILE (@y < 2011) BEGIN INSERT Data SELECT @y, 0, REPLICATE('X', 200) FROM (SELECT TOP (CASE WHEN @y = 2008 then 1000000 else 10000 end) o1.id FROM sys.syscolumns o1 CROSS JOIN sys.syscolumns o2 CROSS JOIN sys.syscolumns o3 ) AS BIG SET @y +=1 END   Figure 2 – Screenshot while table is being populated Figure 3 – Screenshot while table is being populated   This is cool – you can see each partition filling up from the above statement.  Because I can use expressions in TOP, I insert more records for year 2008 than the others to get the data skewed via the CASE statement; again, my thinking here is that we will verify that partitions don’t bother each other as they did in their test, but also make sure that deleting a large chunk in one partition will also not affect the entire table. I am going to use a read committed isolation level since that is what they use at the client (the SQL Server default) and leverage the new feature in SQL Server 2008 to set the lock escalation to AUTO at the table level.  These are worth changing up, by the way, to see the various affects on locking when setting the LOCK_ESCALATION for a given table. --SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED SET TRANSACTION ISOLATION LEVEL READ COMMITTED --SET TRANSACTION ISOLATION LEVEL REPEATABLE READ --SET TRANSACTION ISOLATION LEVEL SERIALIZABLE ALTER TABLE Data SET (LOCK_ESCALATION = AUTO) --ALTER TABLE Data SET (LOCK_ESCALATION = DISABLE) GO   ARE WE THERE YET? Alright, it’s time to run the script and look at this thing.  As mentioned, SQL Cat did not fire up SQL Server Profiler, so here we will so that we can 1) see lock escalations and 2) view the types of locks.  They also mentioned that you can add the Lock:Escalation event, but I am also going to add Lock:Acquired and Lock:Release as well, since these will give us additional information to get a better picture of the process.  Here’s the code: BEGIN TRAN DELETE Data WHERE YEAR=2008 SELECT lck.request_session_id AS session_id , resource_type , CASE resource_type WHEN 'OBJECT' THEN OBJECT_NAME(lck.resource_associated_entity_id) WHEN 'KEY' THEN OBJECT_NAME(p.object_id) WHEN 'PAGE' THEN OBJECT_NAME(p.object_id) WHEN 'RID' THEN OBJECT_NAME(p.object_id) WHEN 'HOBT' THEN OBJECT_NAME(p.object_id) ELSE CAST(resource_associated_entity_id AS NVARCHAR(MAX)) END AS object_name , lck.request_mode , i.name AS index_name FROM sys.dm_tran_locks lck WITH (NOLOCK) LEFT JOIN sys.partitions p WITH (NOLOCK) ON p.hobt_id = lck.resource_associated_entity_id LEFT JOIN sys.indexes i WITH (NOLOCK) ON p.object_id = i.object_id AND p.index_id = i.index_id LEFT JOIN sys.databases db WITH (NOLOCK) ON lck.resource_database_id = db.database_id WHERE lck.request_mode <> 'Sch-S' AND lck.request_session_id = @@SPID AND lck.resource_type <> 'DATABASE' COMMIT TRAN First I’m going to run this with LOCK_ESCALATION = AUTO. Notice in screenshots and that I have something similar to SQLCat; in the profiler shot I see that there is a HOBT lock type at the end.  Figure 4. Profiler with LOCK_ESCALATION = AUTO   Figure 5. Profiler with LOCK_ESCALATION = DISABLE The difference here is that HOBT is not seen in Figure 5, but it is in Figure 4, so I’m reasonably sure that we did get the HOBT granularity (Heap or B-Tree) for the first run.  Good deal. INTERESTING FIND While the above output was expected, the below two screenshots from the output of the queries for my two tests are somewhat strange and unexpected – they’re the same.  With AUTO and DISABLE, I get both HOBT granularity. After running this a dozen times or more, it appears that with tables with one very large partition over the others, you get the HOBT locking and not row-level locks even with DISABLE.  I retested this over and over as well, starting back with 10K rows for each partition, and then upping the 2008 partition at various increments. Only when I got to about 500,000 rows did it exhibit this behavior, and you could see that the page locks went away and were replaced by the output in Figure 6. Something to test further for sure; maybe I’ll ping the SQLCat team to comment on this.  What I am finding is that ALTER TABLE seems to be working inconsistently, because every time that I drop the table it seems to work OK.  I’ll follow up here. Figure 6. With LOCK_ESCALATION = AUTO   Figure 7. With LOCK_ESCALATION = DISABLE   TESTING A CONCURRENT INSERT WITH DELETE Here’s a test that they did not do – I am going to run the above DELETE, but at the same time run a large INSERT into another partition. Set up the first query, and then run this one at the same time in another window. BEGIN TRAN DECLARE @Counter int = 0 BEGINNING: INSERT Data SELECT 2007,0,' ' SELECT @Counter +=@@ROWCOUNT IF @Counter < 50000 GOTO BEGINNING SELECT p.partition_id ,lck.request_session_id AS session_id , resource_type , CASE resource_type WHEN 'OBJECT' THEN OBJECT_NAME(lck.resource_associated_entity_id) WHEN 'KEY' THEN OBJECT_NAME(p.object_id) WHEN 'PAGE' THEN OBJECT_NAME(p.object_id) WHEN 'RID' THEN OBJECT_NAME(p.object_id) WHEN 'HOBT' THEN OBJECT_NAME(p.object_id) ELSE CAST(resource_associated_entity_id AS NVARCHAR(MAX)) END AS object_name , lck.request_mode , i.name AS index_name FROM sys.dm_tran_locks lck WITH (NOLOCK) LEFT JOIN sys.partitions p WITH (NOLOCK) ON p.hobt_id = lck.resource_associated_entity_id LEFT JOIN sys.indexes i WITH (NOLOCK) ON p.object_id = i.object_id AND p.index_id = i.index_id LEFT JOIN sys.databases db WITH (NOLOCK) ON lck.resource_database_id = db.database_id WHERE lck.request_mode <> 'Sch-S' AND lck.request_session_id = @@SPID AND lck.resource_type <> 'DATABASE' COMMIT TRAN No problems here – we were able to delete a big chunk out of one partition while inserting into another, and both partitions display HOBT locking granularity. Very nice. Figure 8.  Concurrent INSERT and UPDATE YET ANOTHER INTERESTING TEST Let’s try a heap with the above test.  Drop and recreate the table without the clustered index, and run with 10K rows for each partition. Here’s what I come up with: Figure 9.  Heap – it doesn’t work.   CONCLUSION From my test, partitioning using ALTER TABLE and setting the LOCK_ESCALATION looks to be good news for concurrency in a large table.  I found no problems or collisions from doing various DML to the partitions at the same time.  One thing that I could not explain was why I got HOBT granularity locks from time-to-time with a large amount of data in one partition;  I noticed this when I did not DROP the target table.  This is worth looking into a bit more. Thanks for reading, Lee Everest, M.S.   ---------------------------- “Redo those buttons. Dress that belt buckle. Straighten that cap. And ***damnit tuck up those pajamas!”   Refs: [1]  Kejser, 2008. Enabling Partition Level Locking in SQL Server. Retrieved on 5/10/10 from http://sqlcat.com/msdnmirror/archive/2010/03/03/enabling-partition-level-locking-in-sql-server-2008.aspx Flickr Tags: table partitioning sql server,lock escalation in table partitioning sql server 2008,partitioning performance sql server 2008 facebook:

Posted in: SQL Server 2008 , TSQL , SQLServerPedia  Tags:
SSIS-related posts getting the most traffic by far. I was looking at Google Analytics the other day on my blog and wanted to know what topics and pages generate the most traffic;  I sort of had a good idea but wanted to verify by checking over the past year or so. Here’s the list of the top five pages on this site (not including off-topic and default.aspx page) for distinct hits.  Kind of interesting I guess. Use variables in the SSIS data flow tab   Reason Token Based Server Access Violation Failed Error sp_executesql string size in SQL Server 2005, 2008  Create a custom assembly in SSIS  SSIS Pivot, a simple and easy explanation It turns out that three of the top five posts (60% for those who are math-challenged) that gain the most page views are SSIS-related posts.  I think it’s cool to know that so many people are using SSIS and are looking for assistance and examples to help them in their shops.  When I need some help, a blog post with a generic and straightforward example with screen shots is imperative, and I have tried to make my examples easy to read without a lot of clutter as well.  I'm looking forward to doing more SSIS posts in the future alongside some of our great SSIS practitioners and experts, such as SQL Sever MVP Jamie Thomson and SQL Server MVP Tim Mitchell;  it's guys like these who give us all great ideas and motivation to go explore SSIS further. I hope that I have helped someone out along the way and, as always, I appreciate your visiting my blog and for the great feedback that you send.  This blog is my scratchpad for jotting stuff down, but truth be told I do this for you guys, not myself. Thank you, Lee ------------------------------------   Flickr Tags: popular blog pages,top sql server blog posts on my site,sql server blog posts

Posted in: SSIS , SQLServerPedia  Tags:
admin posted on April 12, 2010 22:14
Interesting finding, I reckon. I see some posts on the internet about the cost of updating a column to itself; you might find a statement (or maybe you have used one) that looks like so: UPDATE TABLE SET id = ISNULL(@var,id) Essentially, if the variable is NULL, you update the column to itself. (Real tricky, I know right?).  Someone asked me the other day what exactly happens here, and I told them that I wasn't quite sure to be honest.  Although I've heard of folks doing experiments on and claiming this or that, my best bet is to always test it myself regardless; glad I tested this myself as well. Let's check this out and see what's up. To clarify, the person's exact question was whether or not the transaction log grows during an update such as this.  This is really a good place to start for this action, because if the column updates itself, the log should grow in theory - a DML statement in Transact-SQL (save SELECT) will write to the transaction log by definition.  I first run this to create my table, and then run sp_helpfile: DROP TABLE TLOG_TEST GO CREATE TABLE TLOG_TEST (Id INT IDENTITY (1,1) NOT NULL PRIMARY KEY ,CharValue char (8000) default ' ' ) GO   Figure 1.  sp_helpfile before we've done anything Now, let's run our statement to make the TLog go!  Run the following slick statement to insert a few records: DECLARE @count int=0 START: INSERT INTO TLOG_TEST DEFAULT VALUES SET @count+=@@ROWCOUNT IF (@count <= 50000) GOTO START   Now we run sp_helpfile again.  Figure 2.  sp_helpfile after running the above statement.   Here we see that the datafile grows for sure, nothing yet for the tlog.  Now, run the following statement. UPDATE TLOG_TEST SET CharValue = CharValue For the sake of not pasting the screenshot again (you'll have to trust me on this one) the log did not grow. Somewhat unexpected, but sort of makes sense, really. While a SQL Transaction shows up in Profiler, SQL Server really isn't doing much here.   Just for grins and giggles...try this one. DECLARE @var char(1) = 'a' UPDATE TLOG_TEST SET CharValue = ISNULL(@var,'a')   Figure 3. sp_helpfile after updating the row to something else   Based on my test here, I don't see the log grow when you update a column to itself. To verify, Profiler revealed Reads, CPU, and duration, but no Writes;  I conclude here that the operation takes a toll on the server but that's about it. Test this for yourself under different conditions and see what you can come up with.  A follow-up might be to see what locks SQL Server actually takes out on the table. Thanks for reading, Lee   ----------------------- Sarah Connor: Kyle, the women in your time, what are they like? Kyle Reese: Good fighters.     Flickr Tags: sql update a column value to itself

Lee posted on April 4, 2010 14:58
Geeking with MongoDB and .Net. Time for another off-topic post?  Well, maybe not completely I suppose.  I had a .Net expert contact me this week and ask me if I had done any performance comparisons with a file or document-based database - often termed "NOSQL" databases - to SQL Server.   Geeking with all of this other stuff on my plate, I promptly responded no but would check into it.  Knowing very little about these systems, it wasn't terribly difficult to guess that for some DML operations one of these databases (Cassandra, MongoDB, others) would probably beat SQL Server hands-down based on what they were.  Let's check it out nonetheless. Using the Sam Corder (samus) driver I ran the following simple test in .Net, inserting 500K rows into MongoDB, and then running the TSQL that follows, inserting 500K rows.  public static void Main(string[] args) { using (var mongo = new Mongo()) { mongo.Connect(); var db = mongo.GetDatabase("Data"); var DataTest = db.GetCollection("DataTest"); for (int i = 1; i < 500001; i++) { var Data = new Document(); Data["GUID"] = new Guid(); Data["value"] = i; DataTest.Insert(Data); } } DROP TABLE MongoCompare GO CREATE TABLE MongoCompare (guid uniqueidentifier ,value int ) GO DECLARE @id int = 1 WHILE (@id < 500001) BEGIN INSERT INTO MongoCompare VALUES (newid(), @id) SET @id+=1 END GO   Using SQL Server 2008 Development Edition, the TSQL script took about 1.5 minutes, and the MongoDB took roughly 30 seconds (.5 minutes) to complete, which I sort of expected. We know that implicit transactions fire for each row in an RDBMS, and MongoDB, a non-relational document based system, has no ACID properties to keep things "transactionally-consistent" so to speak, although you could create this in the business logic layer of a rich or web client for sure, which would undoubtedly make the comparison a little closer I would say.  I should also mention that Perfmon CPU utilization went nutso during the MongoDB insert. It's funny and amusing to read all of the comments around this technology these days, by the way; competition from either MySQL, Oracle, DB2, or even these NOSQL databases, such as Hadoop, HBase, Cassandra, Hypertable, etc. will probably make our product even better in some form or fashion. Need to do some more stuff on this in the future. Who knows - some department or team somewhere down the road may have one of these databases that I'll have to source and merge with data from SQL Server. Lee   ------------------------       Flickr Tags: Mongodb,Mongodb vs. SQL Server,Insert comparison MongoDb to SQL Server

Posted in: Etc. (Off-topic) , TSQL , SQLServerPedia  Tags:
Ran across this one the other day. I was searching for something the other day on Google!, can’t remember what right now, and happened upon an old post [1] (old for the internet – 1998) entitled The EXISTS Flaw.  This referenced a C.J. Date from September, 1989 and his postulate on a flaw with the EXISTS syntax of the structured query language.  I’ve been using SQL and TSQL now for about 15 years and I never heard of it…glad I learn something new every day, right?.  So I look at this posting and right off determined that the author 1) isn’t comparing apples-to-apples in his two queries, and 2) is actually witnessing how each operator – IN and EXISTS – is supposed to work, by definition.  That is beside the point, because this particular blog posting is not to engage anyone in a debate (I might save that one for another occasion) but to instead look into the behavior of why NOT IN works as it does. While checking this out I discovered something interesting when you use NOT IN that I wanted to share. Run the following to create the table and insert the rows which he has done. DROP TABLE sp go CREATE TABLE sp (sno char(2) ,pno char(2) ,qty int ) GO INSERT sp VALUES ('S1', 'P1', NULL) INSERT sp VALUES ('S2', 'P1', 200) INSERT sp VALUES ('S3', 'P1', 1000) The question here is: “Find supplier numbers for suppliers who are known to supply part P1, but not in a quantity of 1,000”.  SELECT DISTINCT spx.sno FROM sp spx WHERE spx.pno = 'P1' AND 1000 NOT IN ( SELECT spy.qty FROM sp spy WHERE spy.sno = spx.sno AND spy.pno = 'P1'); SELECT DISTINCT spx.sno FROM sp spx WHERE spx.pno = 'P1' AND NOT EXISTS ( SELECT spy.qty FROM sp spy WHERE spy.sno = spx.sno AND spy.pno = 'P1' AND spy.qty = 1000);   Now, look at the result set.  You will notice that a S2 returns for the first query, but that suppliers S1 and S2 return in the second query. What’s up with this?  The author states that EXISTS does not return the correct result set.  Let us ask now, does NULL not equal 1000?  Without question yes, NULL does not equal 1000, and in fact NULL equals nothing, not even NULL.  So in this light you could make a case that in fact the first query is incorrect because our question wants to know who carries part P1 in a quantity other than 1000, and NULL is not a quantity equal to 1000. What I find interesting here, however, is to understand why the first query returns only S2 and not why the second returns both S1 and S2, and we do this with none other than a query execution plan.  Let’s look at them both:   Figure 1. Execution Plan (est) for query 1 Figure 2. Execution Plan (est) for query 2 For Figure 2, notice how for the predicates we find that qty=1000 and pno=’P1’ as expected. Now look at the same plan in Figure 1 and check out the differences – NOT IN also includes pno=’P1’ , but it’s interesting that it includes the compound predicate AND (qty IS NULL OR qty = 1000).  I am quite certain here that in my statement I have nothing that tells NOT IN to disregard NULLs.  This I cannot explain.  Why does NOT IN handle this as such?  Why is NOT IN doing me a favor and checking for (and throwing out of the recordset) the NULL?  Additionally, it would seem that the IN and NOT IN work differently.  If you remember, IN actually flattens out a TSQL statement to a series of ORs (Figure 3), but a NULL won’t be handled as it appears to be under the NOT IN scenario because nothing can equal NULL. If you were to add “, NULL” to the SELECT list you would not get sno equal to 1.   Figure 3.  IN flattens out to OR SELECT * FROM sp WHERE qty IN (200, 1000)   Notice when comparing Figure 1 to Figure 3 the vast difference in behavior for an IN and NOT IN. Wouldn’t one expect that IN would also include the following, since NOT IN adds this caveat?  Makes sense to me… OR [Test].[dbo].[sp].[qty] IS NOT NULL   Try some of these experiments with IN and NOT IN, and NOT EXISTS and see what you can come up with.  I find that geeking with these peculiarities is a lot of fun. Thanks for reading! Lee   ------------------------ “I haven't felt this awful since we saw that Ronald Reagan film.”   Flickr Tags: TSQL NOT IN,query plans,execution plans [1]  FFE Software (1998). The Exists Flaw. Retrieved 4/2/10 from http://firstsql.com/iexist2.htm

Posted in: TSQL , SQLServerPedia  Tags:

by Lee Everest, M.S.

Poll

Do you use Azure or cloud in your organization?



Show Results

Ads

Search


Month List

Calendar

«  February 2012  »
MoTuWeThFrSaSu
303112345
6789101112
13141516171819
20212223242526
2728291234
567891011
View posts in large calendar

Tags

Disclaimer
The opinions, code, examples, et.al. expressed herein are my own personal opinions and do not represent my employer's view in any way, shape form, or fashion.  All code for demonstration purposes - no guarantees, either written or implied, are made.

© Copyright 2012 Lee Everest's SQL Server, etc. weblog