Programmatically Import Delimited Text File with Derived Columns into Flat File Destination Using SSIS’s Script Task Tool

Abstract

This article is in response to a request that was sent by one of my blog’s subscribers regarding scripting an SSIS package with derived columns that sources text file and writes data into another text file. Thus, this article demonstrates creating a SQL Server Integration Services package that imports a text file with derived column transformation into flat file destination using a Script Task component.

Requirements

Article

We begin by launching Microsoft Visual Studio and create a new Integration Services Project which is located under Business Intelligence Projects category.

After you have assigned a project name, proceed to click and drag the Script Task in Control Flow pane of the package’s toolbox.

Right click the script task and click on “Edit”

Under the Script Task Editor change the “ScriptLanguage” to “Microsoft Visual C# 2008″.

In Project Explorer, ensure the following references are added:
Microsoft.SqlServer.Dts.Design;
Microsoft.SqlServer.DTSPipelineWrap;
Microsoft.SQLServer.DTSRuntimeWrap;
Microsoft.SqlServer.ManagedDTS;
Microsoft.SqlServer.ScriptTask;
System;
System.AddIn;
System.Data;
System.Windows.Forms;
System.Xml;

Back to the code window, ensure that the following namespaces are declared:
using System;
using System.Data;
using System.Data.Sql;
using System.Data.SqlTypes;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;
using Microsoft.SqlServer.Dts.Tasks;
using Microsoft.SqlServer.Dts.Runtime;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using RuntimeWrapper = Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.IO;

After the above declarations, proceed to creating an instance of application:
Application selectSIFISO_app = new Application();

The next step is to create an SSIS package object:
Package sS_pkg = new Package();

Assign relevant package name and description:
sS_pkg.Name = “Load Flat File Source into Flat File Destination Using C#”;
sS_pkg.Description = “Programmatically create an SSIS 2008 package that loads a Flat File Source into Flat File Destination Using Script Task’s C# language”;

Insert the Data Flow Task with appropriate name and some buffer space for processing of file (the last part is optional – you can also use default buffer allocation):
sS_pkg.Executables.Add(“STOCK:PipelineTask”);
TaskHost taskHost = sS_pkg.Executables[0] as TaskHost;
MainPipe dataFlowTask = (MainPipe)taskHost.InnerObject;
taskHost.Name = “Dynamic Data Flow Task”;
taskHost.Properties["DefaultBufferMaxRows"].SetValue(taskHost, “1000000”);

Insert the Flat File connection
ConnectionManager connectionManagerFlatFile = sS_pkg.Connections.Add(“FLATFILE”);

You can change this path depending on where you have stored the flat file
connectionManagerFlatFile.ConnectionString = @”C:\Temp\flat_src.txt”;

Assign name to the flat file connection
connectionManagerFlatFile.Name = “TXT_FlatFile”;

Indicate that the flat file is delimited
connectionManagerFlatFile.Properties["Format"].SetValue(connectionManagerFlatFile, “Delimited”);

Indicate whether the source file has column headings or not – in this case, our sample data has column headings.
connectionManagerFlatFile.Properties["ColumnNamesInFirstDataRow"].SetValue(connectionManagerFlatFile, Convert.ToBoolean(true));

Get native Flat File connection
RuntimeWrapper.IDTSConnectionManagerFlatFile100 connectionFlatFile = connectionManagerFlatFile.InnerObject as RuntimeWrapper.IDTSConnectionManagerFlatFile100;

Declare local variable that will be used in this demonstration:
string line;
int k =0; ;

Determine the number of columns by reading the sample Flat File – line by line.
using (StreamReader file = new StreamReader(@”C:\Temp\flat_src.txt”))
{
try
{
while ((line = file.ReadLine()) != null)
{
char[] delimiters = new char[] { ‘|’ };
string[] parts = line.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);

for (int i = 0; i < parts.Length;i++)
{
RuntimeWrapper.IDTSConnectionManagerFlatFileColumn100 flatFileCol =
connectionFlatFile.Columns.Add() as RuntimeWrapper.IDTSConnectionManagerFlatFileColumn100;
sS_AssignColumnProperties(flatFileCol, parts[i], "|");
}
//Exit file after reading the first line
break;
}
}
catch (Exception ex)
{
throw ex;
}
finally
{
file.Close();
}
}

Edit the last Flat File column delimiter into NewLine instead of a Comma
connectionFlatFile.Columns[connectionFlatFile.Columns.Count - 1].ColumnDelimiter = Environment.NewLine;

Insert Flat File source component
IDTSComponentMetaData100 componentSource = dataFlowTask.ComponentMetaDataCollection.New();
componentSource.Name = “FlatFileSource”;
componentSource.ComponentClassID = “DTSAdapter.FlatFileSource”;

Insert source design-time instance and initialise component
CManagedComponentWrapper instanceSource = componentSource.Instantiate();
instanceSource.ProvideComponentProperties();

Set source connection
componentSource.RuntimeConnectionCollection[0].ConnectionManagerID = connectionManagerFlatFile.ID;
componentSource.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.ToConnectionManager90(connectionManagerFlatFile);

Reinitialize Flat File source metadata,
instanceSource.AcquireConnections(null);
instanceSource.ReinitializeMetaData();
instanceSource.ReleaseConnections();

Derived Column
IDTSComponentMetaData100 componentDerivedTransform = dataFlowTask.ComponentMetaDataCollection.New();
componentDerivedTransform.Name = “Derived Col Transform”;
componentDerivedTransform.ComponentClassID = “DTSTransform.DerivedColumn”;
CManagedComponentWrapper DesignDerivedTransformColumns = componentDerivedTransform.Instantiate();
DesignDerivedTransformColumns.ProvideComponentProperties();

design time
componentDerivedTransform.InputCollection[0].ExternalMetadataColumnCollection.IsUsed = false;
componentDerivedTransform.InputCollection[0].HasSideEffects = false;

Create the path from source to derived columns, further insert two derived columns; the first one displays a full name whilist the other derived column insert a date to record when the file/record was loaded.
IDTSPath100 SourceToDerivedPath = dataFlowTask.PathCollection.New();
SourceToDerivedPath.AttachPathAndPropagateNotifications(componentSource.OutputCollection[0], componentDerivedTransform.InputCollection[0]);

IDTSOutputColumn100 myConCatCol = componentDerivedTransform.OutputCollection[0].OutputColumnCollection.New();
myConCatCol.Name = “FullName”;

myConCatCol.SetDataTypeProperties(Microsoft.SqlServer.Dts.Runtime.Wrapper.DataType.DT_STR, 100, 0, 0, 1252);
myConCatCol.ExternalMetadataColumnID = 0;
myConCatCol.ErrorRowDisposition = DTSRowDisposition.RD_FailComponent;
myConCatCol.TruncationRowDisposition = DTSRowDisposition.RD_FailComponent;

IDTSCustomProperty100 myConCatProp = myConCatCol.CustomPropertyCollection.New();
myConCatProp.Name = “Expression”;
myConCatProp.Value = “[LastName] + \”\n\n,\” + [FirstName]\n”;

myConCatProp = myConCatCol.CustomPropertyCollection.New();
myConCatProp.Name = “FriendlyExpression”;
myConCatProp.Value = “[LastName] + \”\n\n,\” + [FirstName]\n”;

IDTSOutputColumn100 myCol = componentDerivedTransform.OutputCollection[0].OutputColumnCollection.New();
myCol.Name = “Loaddate”;

myCol.SetDataTypeProperties(Microsoft.SqlServer.Dts.Runtime.Wrapper.DataType.DT_DBTIMESTAMP, 0, 0, 0, 0);
myCol.ExternalMetadataColumnID = 0;
myCol.ErrorRowDisposition = DTSRowDisposition.RD_FailComponent;
myCol.TruncationRowDisposition = DTSRowDisposition.RD_FailComponent;

IDTSCustomProperty100 myProp = myCol.CustomPropertyCollection.New();
myProp.Name = “Expression”;
myProp.Value = “Getdate()”;

myProp = myCol.CustomPropertyCollection.New();
myProp.Name = “FriendlyExpression”;
myProp.Value = “Getdate()”;

IDTSInput100 DerivedColumnInput = componentDerivedTransform.InputCollection[0];
IDTSVirtualInput100 DerivedColumnVirtualInput = DerivedColumnInput.GetVirtualInput();
IDTSVirtualInputColumnCollection100 DerivedColumnVirtualInputColumns = DerivedColumnVirtualInput.VirtualInputColumnCollection;

Added the below to validate input columns
foreach (IDTSVirtualInputColumn100 virtualInputColumnDT in DerivedColumnVirtualInputColumns)
{
// Select column, and retain new input column
IDTSInputColumn100 inputColumn = DesignDerivedTransformColumns.SetUsageType(DerivedColumnInput.ID, DerivedColumnVirtualInput, virtualInputColumnDT.LineageID, DTSUsageType.UT_READONLY);
}

Declare new StreamWriter object and create a text file that will be used for output – provided it does not exists.

StreamWriter SW;
if (!File.Exists(“E:\\MyTextFile.txt”))
{
//Create output text file
SW = File.CreateText(“E:\\MyTextFile.txt”);
//Insert column row seperated by pipe
SW.WriteLine(“EmployeeKey|FirstName|LastName|Title|BirthDate|HireDate|FullName|Loaddate”);
//Close the file
SW.Close();
}

Insert the Flat File Destination connection
ConnectionManager connectionManagerFlatFileDestionation = sS_pkg.Connections.Add(“FLATFILE”);

You can change this path depending on where you have stored the flat file
connectionManagerFlatFileDestionation.ConnectionString = “E:\\MyTextFile.txt”;

Assign name to the flat file connection
connectionManagerFlatFileDestionation.Name = “Sample_TXT_FlatFile_Des”;

Indicate that the flat file is delimited
connectionManagerFlatFileDestionation.Properties["Format"].SetValue(connectionManagerFlatFileDestionation, “Delimited”);

Indicate whether the source file has column headings or not – in this case, our sample data has column headings.
connectionManagerFlatFileDestionation.Properties["ColumnNamesInFirstDataRow"].SetValue(connectionManagerFlatFileDestionation, Convert.ToBoolean(true));

Get native Flat File connection
RuntimeWrapper.IDTSConnectionManagerFlatFile100 connectionFlatFileDestionation = connectionManagerFlatFileDestionation.InnerObject as RuntimeWrapper.IDTSConnectionManagerFlatFile100;

Determine the number of columns by reading the sample Flat File – line by line.
using (StreamReader file = new StreamReader(“E:\\MyTextFile.txt”))
{
try
{
while ((line = file.ReadLine()) != null)
{
char[] delimiters = new char[] { ‘|’ };
string[] parts = line.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);

for (int j = 0; j < parts.Length; j++)
{
RuntimeWrapper.IDTSConnectionManagerFlatFileColumn100 flatFileCol =
connectionFlatFileDestionation.Columns.Add() as RuntimeWrapper.IDTSConnectionManagerFlatFileColumn100;
sS_AssignColumnProperties(flatFileCol, parts[j], "|");
k++;
}
//Exit file after reading the first line
break;
}
}
catch (Exception ex)
{
throw ex;
}
finally
{
file.Close();
}
}

Edit the last Flat File column delimiter into NewLine instead of a Comma
connectionFlatFileDestionation.Columns[connectionFlatFileDestionation.Columns.Count - 1].ColumnDelimiter = Environment.NewLine;

Insert Flat File Destination component
IDTSComponentMetaData100 componentDestination = dataFlowTask.ComponentMetaDataCollection.New();
componentDestination.Name = “FlatFileDestination”;
componentDestination.ComponentClassID = “DTSAdapter.FlatFileDestination”;

Insert source design-time instance and initialise component
CManagedComponentWrapper instanceDestination = componentDestination.Instantiate();
instanceDestination.ProvideComponentProperties();

Set source connection
componentDestination.RuntimeConnectionCollection[0].ConnectionManagerID = connectionManagerFlatFileDestionation.ID;
componentDestination.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.ToConnectionManager90(connectionManagerFlatFileDestionation);

Reinitialize Flat File source metadata,
instanceDestination.AcquireConnections(null);
instanceDestination.ReinitializeMetaData();
instanceDestination.ReleaseConnections();

Connect the Flat File source to the OLE DB Destination component
dataFlowTask.PathCollection.New().AttachPathAndPropagateNotifications(componentDerivedTransform.OutputCollection[0], componentDestination.InputCollection[0]);

Get input and virtual input for destination to select and map columns
IDTSInput100 destinationInput = componentDestination.InputCollection[0];
IDTSVirtualInput100 destinationVirtualInput = destinationInput.GetVirtualInput();
IDTSVirtualInputColumnCollection100 destinationVirtualInputColumns = destinationVirtualInput.VirtualInputColumnCollection;

Declare local integer variable that will be used for looping and map transformed columns to destination columns

int new_i=0;
foreach (IDTSVirtualInputColumn100 virtualInputColumn in destinationVirtualInputColumns)
{
if (new_i < k)
{
// Select column, and retain new input column
IDTSInputColumn100 inputColumn = instanceDestination.SetUsageType(destinationInput.ID,
destinationVirtualInput, virtualInputColumn.LineageID, DTSUsageType.UT_READONLY);
// Find external column by name
IDTSExternalMetadataColumn100 externalColumn =
destinationInput.ExternalMetadataColumnCollection[inputColumn.Name];
// Map input column to external column
instanceDestination.MapInputColumn(destinationInput.ID, inputColumn.ID, externalColumn.ID);

new_i++;
}
}

Reinitialize Derived Column Transformation component’s metadata
DesignDerivedTransformColumns.AcquireConnections(null);
DesignDerivedTransformColumns.ReinitializeMetaData();
DesignDerivedTransformColumns.ReleaseConnections();

Execute the package or disable the below code if you intend running the package later
sS_pkg.Execute();

Finally, save the package – in this case, we have opted to save the package into file system
selectSIFISO_app.SaveToXml(@”E:\newFFArticle.dtsx”, sS_pkg, null);

Dts.TaskResult = (int)ScriptResults.Success;}

In addition to the above code, you will notice that some part of the code references to the below function. This function is used to assign DTS column properties:
private static void sS_AssignColumnProperties(RuntimeWrapper.IDTSConnectionManagerFlatFileColumn100 flatFileCol, string getColName, string getDelim)
{

Assign delimiter:
flatFileCol.ColumnType = “Delimited”;
flatFileCol.ColumnDelimiter = getDelim;

Indicate column data type – in this case, all the source columns will be set to String Data Type:
flatFileCol.DataType = RuntimeWrapper.DataType.DT_STR;

Indicate column width – in this case, width of all source columns will be set to a length of 100:
flatFileCol.ColumnWidth = 100;

Assign column name:
RuntimeWrapper.IDTSName100 columnName = flatFileCol as RuntimeWrapper.IDTSName100;
columnName.Name = getColName.ToString();
}

Conclusion

It’s that simple!

You can now execute your script task and the package will be created in location you specified.

For the complete C# solution to this article, see “Requirements” section for a file name called “selectSIFISO_Flat_File_Source_into_Derived_then_FlatFile_Destination_Using_CSharp.txt”.

Thanks.

Sifiso.

7 Comments

  1. prash says:

    please send code for
    flat file source
    “retain null values from the source as null values in the dataflow”
    this is chekbox we have in flat file connection manager

    we have property there in flatfile source

    please provide that code i am littile bit confusing in that

  2. Dirk says:

    I followed your example and am at the part where you give the flat file connection manager its columns. In the GUI for SSIS, the flat file connection manager automatically retrieves the column names and data types when you click on it. In your example you manaully create column names and then later rename and set their attributes. Isn’t there a way to get SSIS to derive the column names and data types for you without having to do this? Please advise. Thanks.

    • Sifiso says:

      Hi There,

      I appreciate the feedback.

      Unfortunately, every click event that is performed in the GUI of SSIS has to be programmed when you creating the package using a script task.

      It depends also on what you would like to achieve by programmatically creating an SSIS package. If it is convenient for you to use the GUI than rather use the GUI.

      Regards,

      Sifiso.

  3. Dnyaneshwar says:

    Thanks Sifiso. It was really helpful for me. I must say it is a smart and hence rare exercise.

    I am done with flat file and now heading towards excel file import. I was wondering how to use this script to import data from excel file. Could you please guide me here? Thanks.

    Dnyaneshwar

  4. vishal says:

    Error: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. —> System.Runtime.InteropServices.COMException (0xC020801C): Exception from HRESULT: 0xC020801C
    at Microsoft.SqlServer.Dts.Pipeline.Wrapper.CManagedComponentWrapperClass.AcquireConnections(Object pTransaction)
    at ST_a2036a0d6b0a45c8813c58848d40f65a.csproj.ScriptMain.Main()
    — End of inner exception stack trace —
    at System.RuntimeMethodHandle._InvokeMethodFast(Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeTypeHandle typeOwner)
    at System.RuntimeMethodHandle.InvokeMethodFast(Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeTypeHandle typeOwner)
    at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)
    at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    at System.RuntimeType.InvokeMember(String name, BindingFlags bindingFlags, Binder binder, Object target, Object[] providedArgs, ParameterModifier[] modifiers, CultureInfo culture, String[] namedParams)
    at System.Type.InvokeMember(String name, BindingFlags invokeAttr, Binder binder, Object target, Object[] args, CultureInfo culture)
    at Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTATaskScriptingEngine.ExecuteScript()

    this error at acqiureconnection method
    please solve this .

  5. Danny says:

    Thanks- this example showed me exactly what I was needed.

Leave a comment

%d bloggers like this: