Introduction
The Script Component in SQL Server Integration Services (SSIS) seems to be a somewhat inauspicious data flow task that is oftentimes avoided or misunderstood by some. When you open the task, you don’t see much - a few methods, a couple of references, and that’s about it. Truth be told, it’s a very easy task to use and incorporate within the Data Flow of a package, and it can serve some really useful purposes; all that you need to do is get the basics down, and then start imagining possibilities for its’ use. There are a few good examples on the internet, and here I will hopefully shed some light as well. In this series I will explain the three ways to use the task, and include a basic example as well as one that solves a real-world problem. In part one, I will focus on the Script Component as a source. Part two will cover transformations, and part three I’ll see what I can dig up for Script Component destinations. These divisions coincide with the three ways to use the Script Component, by the way; when you open one in SSIS you’ll be required to choose how you will use the component.
The powerful Script Component The Script Component (let’s abbreviate SC, not necessarily an industry standard, but works here) falls in between the baked-in transformation tasks that come with SSIS and a custom component. You can almost do anything in an SC that you can in a custom component, but that someone might choose the latter if they were going to shrink wrap a custom transformation and sell it to the public, or offer it up on www.Codeplex.com for instance. Here’s what BOL says about the SC:
The Script component simplifies the development of a custom data flow source, transformation, or destination when the sources, transformations, and destinations included with Integration Services do not fully satisfy your requirements.
This is an important point – don’t use this if one of the delivered tasks will work. Use it when one of the canned SSIS tasks shipped with the product won’t handle the job at-hand. Otherwise, you will be using it in an ill-advised fashion and may run into other consequences down the road. My example packages are included as a .zip file, so download these and check them out in your own environment.
The Script Component as a Source
In this first part, we’ll use the SC as a source, so we will connect to a data source, make the necessary transformations, and then push them to an output buffer. Note that this “output buffer” is nothing more than what you would see by connecting a green line from a source task to a destination task in SSIS. That line represents the path that your data moves through. BOL discusses the Script Component as a Source:
You use a source component in the data flow of an Integration Services package to load data from a data source to pass on to downstream transformations and destinations. Ordinarily you connect to the data source through an existing connection manager.
One thing should be mentioned here; you should not use the SC as a source without connecting to something, such as a text file, database, XML document, etc. Sending 3.4145926 or some other hard-coded value to an output buffer, for example, is not a good use of the SC. (If you have the formula for Coca-Cola, well then this might be an exception…hard-code away. For most cases, however, this is not using the SC as it was intended).
A Simple Example
For demonstration purposes, I’ll do the ill-advised aforementioned and display a hard-coded value…how about pi? We’ll follow this up with a real-world example as mentioned. For now, open up BIDS (SQL Server 2008), go to the Control Flow, add a Data Flow task, and then drag a Script Component SSIS Data Flow Item to the Data Flow workspace. We’re going to be using c# in these examples. See Figure 1.
Figure 1. Adding a Script Component
Double-click on the SC and then notice the three choices on the left – Script, Inputs and Outputs, and Connection Managers. Here are their uses:
Script – this does nothing more than point you to the “Edit Script” button so that you can get to the Visual Studio Tools for Applications (VSTA), the place where you will code the SC
Inputs and Outputs – This is where you define what goes in and comes out of the SC. For our example we specified that we’re using the SC as a source, so we only get “Output 0” with no choice for inputs. This by the way is the output buffer that we mentioned before. You can have multiple buffers.
Connection Managers – Where you will optionally create and reference local connection managers.
Figure 2. Specifying how the script will be used in the data flow.
While we’re in the Inputs and Outputs window, go ahead and rename "Output 0" to OurBuffer. Then, expand Output Columns, click “Add Column” to create a new column, rename it piVALUE, and give it a data type of numeric [DT_NUMERIC] in the Data Type Properties window for the column. Make sure and change the scale to a number other than zero (mine is 6) so that you get the decimal values of the output. Yours should look like Figure 3 at this point. You have just created the underlying methods and properties for your buffer and column which will be the output of the component.
Figure 3. Creating the Buffer and Output Column
Now click back on Script option in the Script Transform Editor dialog box, and then click the Edit Script Button. We’re now in the Microsoft Visual Studio Tools For Applications 2.0 (VSTA); we’ll use this workspace for our custom code for our SC. You will find three methods:
public override void PreExecute()
{
}
public override void CreateNewOutputRows()
{
}
public override void PostExecute()
{
}
For now, forget about PreExecute() and PostExecute() methods, because we’re going to concentrate only on CreateNewOutputRows(). Since our task is to take the hardcoded value of pi, place it in a column within a buffer, and then make it available to a downstream to a data flow destination, we will make modifications here and thus define a new output row via this overrideable method. Remember that when we were in the Inputs and Outputs section, above, we created the buffer (the temporary holding place for our data), and set the number of columns and types that it would include (we set one and defined it as a numeric value). Now we code the CreateNewOutputRows() method to actually assign values to our buffer. In the CreateNewOutputRows() method, type in the following code:
public override void CreateNewOutputRows()
{
OurBufferBuffer.AddRow();
decimal vpi = OurBufferBuffer.piVALUE= 3.4145M;
System.Windows.Forms.MessageBox.Show(vpi.ToString(), "Value: pi"
, System.Windows.Forms.MessageBoxButtons.OK
, System.Windows.Forms.MessageBoxIcon.Information);
}
The first line is to call the AddRow() method. Here we are simply adding a row to our buffer, and fortunately for us we don’t have to worry about anything else, such as which buffer to use, or which columns to reference, or what order to reference them. Because there is no get accessor for our buffer, we’ll need to assign a decimal variable equal to the value that we’re putting into the piValue so we can show it in a message box. Close the VSTA, close the Script Transform Editor dialog box by clicking OK, and then execute the task. You should get the following. Note that the box won’t “turn green” because we didn’t set the output to a destination (Figure 4).
Figure 4. Simple example execution results
Go ahead and add a flat file destination so that we can see our value actually make it to a data flow destination; I have added a flat file connection manager and created a text file called piValue.txt within the project folder. Run the package again and see your output in piValue.txt
Figure 5. Simple example execution to a flat file.
A real-world example for using the SSIS Script Component as a source
The previous example was ok for showing the very basics of the SC, but our main tenet for using this task was to leverage it only when the shipped tasks will not work. In this example, it would be difficult to consume the following, in Figure 6:
Figure 6. Test data file to consume
Create a new package, add a Data Flow, double-click on it, add a Flat File Source data flow source, and try to map to the file called “DATA.txt” from the .zip download that I have included. Doesn’t work, does it? Try Fixed Width, Delimited, Ragged Right, or any of the others. So what is this mish-mash of data? Looks like a mess to me. Actually I saw something similar a work the other day (I'm consulting right now at a medical products company based in Switzerland that does $8 billion in annual sales). Basically, this is a file that contains Header and Item-level data. Look at it closely. The header “HDR” includes a line followed by one or more “ITM” lines, which are items for that particular header record. What our example includes is a set of fictitious sales; within the header is a store name, store number, invoice number and total rollup amount, and in the detail line we have the line number, store number, invoice number, item number, description, number purchased, and price. So how do we deal with this? Let’s try the Script Component.
Figure 7. Test data file - problems mapping DATA.txt in a Flat File Editor
Add a Script Component to a new package. Double-click and go into Connection Managers, replace Connection with FF in the Name, drop down the Connection Manager drop-down, create a flat file connection, and point to DATA.txt. Set it to a Ragged Right and close the dialog box.
No go to the Inputs and Outputs and configure it to look like I have mine in Figure 8. I am going to change all of the data types for this one to string [DT_STR] for the sake of brevity. Notice that I have two outputs – one each for the header records (HDR) and one for the detail item records (ITM). These two outputs will be the buffers for the various record types. If we happen to run across a header record, we’ll move it off to the header destination, and likewise for a detail record. Sweet!
Figure 8. Script Component mappings for Data.txt
Replace you code now in VSTA with the following snippet.
/* Microsoft SQL Server Integration Services Script Component
* Write scripts using Microsoft Visual C# 2008.
* ScriptMain is the entry point class of the script.*/
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.IO;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
private StreamReader _reader;
private string _dataFile;
public override void AcquireConnections(object Transaction)
{
IDTSConnectionManager100 connMgr = this.Connections.FF;
_dataFile = (string)connMgr.AcquireConnection(null);
}
public override void PreExecute()
{
base.PreExecute();
_reader = new StreamReader(_dataFile);
}
public override void PostExecute()
{
base.PostExecute();
_reader.Close();
}
public override void CreateNewOutputRows()
{
string _token;
string _nextLine;
string[] _columns;
char[] delimiters;
_token = "HDR";
delimiters = ",".ToCharArray();
_nextLine = _reader.ReadLine();
while (_nextLine != null)
{
_columns = _nextLine.Split(delimiters);
{
if ((_columns[0].ToString()) == _token)
{
HDRBuffer.AddRow();
HDRBuffer.LINETYPE = _columns[0];
HDRBuffer.STORENAME = _columns[1];
HDRBuffer.STORENUM = _columns[2];
HDRBuffer.INVOICENUM = _columns[3];
HDRBuffer.TOTALAMT = _columns[4];
}
else
{
ITMBuffer.AddRow();
ITMBuffer.LINETYPE = _columns[0];
ITMBuffer.LINENUM = _columns[1];
ITMBuffer.STORENUM = _columns[2];
ITMBuffer.INVOICENUM = _columns[3];
ITMBuffer.PRODUCTNUM = _columns[4];
ITMBuffer.PRODUCTDESC = _columns[5];
ITMBuffer.QTY = _columns[6];
ITMBuffer.AMT = _columns[7];
}
}
_nextLine = _reader.ReadLine();
}
}
}
What have we done here. Very simply we have expanded what we did before by simply adding more column placeholders where our data can flow. I take a peek at the first column and determine if it's a header record or a detail record, send it to the appropriate buffer block, and there add the values for the columns that we did in the previous step when we mapped for Data.txt. This is a very simple example - as we go further in the blog series I'll add more features to make the code robust, such as error handling, logging, and maybe throw in an Interface or two to help out the .Net runtime.
Close this up, and go back to the Data Flow workspace. Add two OLE destination tasks like I have done in Figure 9, and while in each, create a destination table so you can run the package. Notice that the buffers that we created "HDR" and "ITM" coincide with the outputs of the Script Component. Also note that there is no error buffer - we'll create one of these in another blog. Map your data and then execute the package. You should get three HDR rows and 14 ITM rows.
Figure 9. Script Component with data flow destinations
Figure 10. Data moved to SQL Server tables
Conclusion
In this first blog, we looked at the Script Component in SSIS 2008, and created two packages. The first was a very simple to check out the features of the package. In the second part we looked at a real-world example, and mapped the mixture of rows in a text file to SQL Server destinations. Hopefully these examples were straightforward and, beginner or otherwise, you got something from this blog.
Thanks for reading,
Lee Everest, M.S.
------------------------------------
Enjoy this posting? Check out my blog site at www.texastoo.com for research, cool stuff, and humor.

Refs: Books Online Script Component
Code/Solution: Solution