Copying and Updating a Tab Delimited Text File with SSIS

This post will focus on how you can use an SSIS script task to "update" a text file. Update is a bit of a misnomer. We will actually be copying the text file, parsing the text file using an array, and updating the first column of the text file, and prepending the file path to the sixth column. The column contains the name of a PDF file in its current state. We we will be prepending the path to the PDF file in the this column. Additionally, we will be placing the file in a staging area, C:\Index\. I arbitrarily chose the name for the index directory and the file names for this example. I am also using local resources for this example. In the real world, you can use any resource in your network as long as you have the appropriate rights and permissions. You may be asking why would I want to perform these actions on the file. That is a valid question. We have a process on our ECM server that will pick up this file and use it to import each of the files listed in the index document. It will use the index document to import the correct file into our system. This post is part of a larger project, however, the concepts presented in this post can standalone.

1.Drag a script task onto the SSIS canvas and name it appropriately.

2.Create the following project level variables. If you have been reading my other posts, you may already have created them.
Name           Data Type        Value

vSftpInboundFilePath         String Our local staging area - C:\Staging\

vIndex String Our local directory for storing the index files - C:\Index\

3.Double click on the script task to open the Script Task Editor. Verify that C# is the chosen script language.

4. Identify the ReadOnlyVariables and click on the "OK" button.

5. Your Script Task Editor should now resemble the image below. Click on the "Edit Script" button to enter the code editor.

6.Import the following namespaces.

#region Namespaces

using System;

using System.Data;

using Microsoft.SqlServer.Dts.Runtime;

using System.Windows.Forms;

using System.IO;

using System.Text;

#endregion

7..Create a Try...Catch block.

try

{

  Dts.TaskResult = (int)ScriptResults.Success;

}

catch (Exception ex)

{

  Dts.Events.FireError(0, "ERROR", ex.Message, null, 0);

  Dts.TaskResult = (int)ScriptResults.Failure;

}

8. Create the following variables within the block.

//Populate the variable using the project level variable value.

string sourceDirectory = Dts.Variables["User::vSftpInboundFilePath"].Value.ToString();

//Create an array to store all the filenames that are prefaced with the text "Test" in our source directory.  

//This will ensure that we only searching within a specific subset of folders in the directory.

string[] folders = System.IO.Directory.GetDirectories(sourceDirectory, "Test*", System.IO.SearchOption.AllDirectories);

9.Create a foreach loop.

   foreach (string dir in folders)

               {

             

               }

10.Create the following variables within the loop and call the System.IO.File.Copy method to copy the source document to a staging area.

// Full path to source document.

string fullFilePath = dir + @"\Export.txt";

//Path to directory for staging the index (rewritten index files).

string indexFolderPath = Dts.Variables["User::vIndex"].Value.ToString();

//New file name path pointing to the directory for index files.

string newIndexFileName = indexFolderPath + @"TestRename" + Path.GetFileName(Path.GetDirectoryName(fullFilePath)) + ".txt";

                   

//Filename substring to be used in the creation of the new file name.

string fileNameOnly = Path.GetFileNameWithoutExtension(fullFilePath);

//Extension to append to the new file name.

string extension = Path.GetExtension(newIndexFileName);            

//Path that will be inserted into the line array.  It will contain the path to the PDF file .

string lastFolderName = Path.GetFileName(Path.GetDirectoryName(fullFilePath)).Trim();

//Provdes the full path for the copying the text document.

string copyPath = indexFolderPath + fileNameOnly + extension;

//Will be used in the creation of lines  a tab delimited file.

string row = "";

//Copy the export file to the staging location.

System.IO.File.Copy(fullFilePath, copyPath);

//Instantiate an array of lines from copy of the source file.

string[] lines = File.ReadAllLines(copyPath);

                   

                 
11. Embed the following for loop within the foreach loop.

     for (int i = 0; i < lines.Length; ++i) //Parses the arrays into lines.

                   {

                       //Create an array for storing the contents of the file.

                       string[] line = lines[i].Split('\t'); //Parses the line into tab delimited substrings.

                       //Assigns the value of "testing" to the first element of the array, in our case, this is the first column. This is an arbitrary value for demo

                       //purposes.                  

                       line[0] = "Testing";

                       

                       //Adds the path to the sixth element of the array, or the sixth column.        

                       line[5] = @"C:\Staging\" + lastFolderName + @"\" + line[5];

                       foreach (string value in line)

                       {

                           row = (string.Join("\t", line));

                           row += "\t\n";

                       }

                       //Populate the file.

                       File.AppendAllText(newIndexFileName, row);

                       //Delete the copy of the file if it exists.

                       if (File.Exists(copyPath))

                       {

                           File.Delete(copyPath);

                       }

                   }

12.Save your changes, close out of the Script Task Editor and execute the task. You should see results similar to the image below.

For reference we will examine the files before and after execution.

We will be copying and updating the file into the staging area C:\Index\. The script task will delete the copy at run time after updating the first and sixth columns.  This is our source file. Notice the text "PLACHOLDER" text in the first column and the filename "XXXXXXXX.pdf" in the sixth column. All of the values are separated by spaces and tabs. We will replace the text after running the script task.

I am using Notepad++ to view the file. Oded Maoz Erell provides a nice tutorial on how to view white space and tabs in Notepad++, if you have not already activated this option.

The newly created index files in our C:\Index\ directory.

Note that both the first column and the sixth column have been updated with the values we provided in the script. The index file now contains the full path to the PDF file, and our ECM system can now locate the file.

In this example, we copied a text file, from a subdirectory, to a new staging area C:\Index\. We parsed the tab delimited text file using an array, and updated the values of the first column of the text file, and prepended the file path to the sixth column. We also deleted the file from the staging area before the script completed. Finally, we renamed the file and staged it in C:\Index\ where it will await further processing. In my next post, I will demonstrate how to use an event handler to send out error notifications if the SSIS package fails. See you there.

Here is the code in its entirety.

#region Help:  Introduction to the script task

/* The Script Task allows you to perform virtually any operation that can be accomplished in

* a .Net application within the context of an Integration Services control flow.

*

* Expand the other regions which have "Help" prefixes for examples of specific ways to use

* Integration Services features within this script task. */

#endregion

#region Namespaces

using System;

using System.Data;

using Microsoft.SqlServer.Dts.Runtime;

using System.Windows.Forms;

using System.IO;

using System.Text;

#endregion

namespace ST_eebb6f2193604825b29970b992402b00

{

[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]

public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

{

       #region Help:  Using Integration Services variables and parameters in a script

       /* To use a variable in this script, first ensure that the variable has been added to

        * either the list contained in the ReadOnlyVariables property or the list contained in

        * the ReadWriteVariables property of this script task, according to whether or not your

        * code needs to write to the variable.  To add the variable, save this script, close this instance of

        * Visual Studio, and update the ReadOnlyVariables and

        * ReadWriteVariables properties in the Script Transformation Editor window.

        * To use a parameter in this script, follow the same steps. Parameters are always read-only.

        *

        * Example of reading from a variable:

        *  DateTime startTime = (DateTime) Dts.Variables["System::StartTime"].Value;

        *

        * Example of writing to a variable:

        *  Dts.Variables["User::myStringVariable"].Value = "new value";

        *

        * Example of reading from a package parameter:

        *  int batchId = (int) Dts.Variables["$Package::batchId"].Value;

        *  

        * Example of reading from a project parameter:

        *  int batchId = (int) Dts.Variables["$Project::batchId"].Value;

        *

        * Example of reading from a sensitive project parameter:

        *  int batchId = (int) Dts.Variables["$Project::batchId"].GetSensitiveValue();

        * */

       #endregion

       #region Help:  Firing Integration Services events from a script

       /* This script task can fire events for logging purposes.

        *

        * Example of firing an error event:

        *  Dts.Events.FireError(18, "Process Values", "Bad value", "", 0);

        *

        * Example of firing an information event:

        *  Dts.Events.FireInformation(3, "Process Values", "Processing has started", "", 0, ref fireAgain)

        *

        * Example of firing a warning event:

        *  Dts.Events.FireWarning(14, "Process Values", "No values received for input", "", 0);

        * */

       #endregion

       #region Help:  Using Integration Services connection managers in a script

       /* Some types of connection managers can be used in this script task.  See the topic

        * "Working with Connection Managers Programatically" for details.

        *

        * Example of using an ADO.Net connection manager:

        *  object rawConnection = Dts.Connections["Sales DB"].AcquireConnection(Dts.Transaction);

        *  SqlConnection myADONETConnection = (SqlConnection)rawConnection;

        *  //Use the connection in some code here, then release the connection

        *  Dts.Connections["Sales DB"].ReleaseConnection(rawConnection);

        *

        * Example of using a File connection manager

        *  object rawConnection = Dts.Connections["Prices.zip"].AcquireConnection(Dts.Transaction);

        *  string filePath = (string)rawConnection;

        *  //Use the connection in some code here, then release the connection

        *  Dts.Connections["Prices.zip"].ReleaseConnection(rawConnection);

        * */

       #endregion

public void Main()

       {

       

 

           try

           {

               //Populate the variable using the project level variable value.

               string sourceDirectory = Dts.Variables["User::vSftpInboundFilePath"].Value.ToString();

               //Create an array to store all the filenames that are prefaced with the text "Test" in our source directory.  

               //This will ensure that we only searching within a specific subset of folders in the directory.

               string[] folders = System.IO.Directory.GetDirectories(sourceDirectory, "Test*", System.IO.SearchOption.AllDirectories);

               foreach (string dir in folders)

               {

                   // Full path to source document.

                   string fullFilePath = dir + @"\Export.txt";

                   //Path to directory for staging index files.

                   string indexFolderPath = Dts.Variables["User::vIndex"].Value.ToString();

                   //New file name path pointing to the directory for index files.

                   string newIndexFileName = indexFolderPath + @"TestRename" + Path.GetFileName(Path.GetDirectoryName(fullFilePath)) + ".txt";

                   

                   //Filename substring to be used in the creation of the new file name.

                   string fileNameOnly = Path.GetFileNameWithoutExtension(fullFilePath);

                   //Extension to append to the new file name.

                   string extension = Path.GetExtension(newIndexFileName);            

                   //Path that will be inserted into the line array.  It will contain the path to the PDF file .

                   string lastFolderName = Path.GetFileName(Path.GetDirectoryName(fullFilePath)).Trim();

                   //Provdes the full path for the copying the text document.

                   string copyPath = indexFolderPath + fileNameOnly + extension;

                   //Will be used in the creation of a tab delimited file.

                   string row = "";

                 

                   //Copy the export file to the staging location.

                   System.IO.File.Copy(fullFilePath, copyPath);

                   //Instantiate an array of lines from copy of the source file.

                   string[] lines = File.ReadAllLines(copyPath);

                   

                 

                   for (int i = 0; i < lines.Length; ++i) //Parses the arrays into lines.

                   {

                       string[] line = lines[i].Split('\t'); //Parses the line into tab delimited substrings.

                       line[0] = "Testing"; //Assigns the value of testing to the first element of the array, in our case, this is the first column.

                       line[5] = @"C:\Staging\" + lastFolderName + @"\" + line[5]; //Adds the path to the sixth element of the array, or the sixth column.

                       foreach (string value in line)

                       {

                           row = (string.Join("\t", line));

                           row += "\t\n";

                       }

                       //Populate the file.

                       File.AppendAllText(newIndexFileName, row);

                       //Delete the copy of the file if it exists.

                       if (File.Exists(copyPath))

                       {

                           File.Delete(copyPath);

                       }

                   }

               }

           

             

               Dts.TaskResult = (int)ScriptResults.Success;

           }

           catch (Exception ex)

           {

               Dts.Events.FireError(0, "ERROR", ex.Message, null, 0);

               Dts.TaskResult = (int)ScriptResults.Failure;

           }

}

       #region ScriptResults declaration

       /// <summary>

       /// This enum provides a convenient shorthand within the scope of this class for setting the

       /// result of the script.

       ///

       /// This code was generated automatically.

       /// </summary>

       enum ScriptResults

       {

           Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,

           Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure

       };

       #endregion

}

}