In a data warehousing/BI solution, many times we need to develop SSIS packages with a parent-child relationship. When we use parent-child relationship packages with a parent package variable configured in the child package for logging events, we get logging information in our expected location as well as some other location.
In this article, we will discuss why we get the logging information in some other location when we use a parent package variable configuration. We will also discuss one approach to get logging information in the expected location only.
Let’s create an SSIS solution with 1 master package and 2 child packages. Using these packages we will insert a data into “Employee.xls” and “Client.xls” from the employee and client tables respectively.
Below is our master package.
In this package, we have used 2 execute package tasks, which will execute the Employee_Load.dtsx package and the Client_Load.dtsx package. If we look in the connection managers, we see three connections. Two connections are for the package connection and one is for capturing the log events.
Below you can see the variables we have used in our master package.
These variables are:
- DBConnectionString : Data Source=SG-IND-2691SQL2008;Initial Catalog=TEST;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;
- ExcelFileDestination : C:ProjectsSSIS_Test_ProjectSSIS_Test_ProjectExcel Files
- LogFolderLocation : C:ProjectsSSIS_Test_ProjectSSIS_Test_ProjectLog Folder
- PackageLocation: C:ProjectsSSIS_Test_ProjectSSIS_Test_Project
All these variables are configured using XML files.
Now we will use the SSIS enabled logging feature to log all event entries when master package executes. This is how we have configured logging into the master package:
If we look under the configuration column (last column) we have specified an SSIS_Log connection manager.
This SSIS_Log connection manager is configured using “LogFolderLocation” and we have appended the string “Master_Pkg_Log” with the current date to the log folder at runtime. When this package will run, we will get a log file for this package under the location: C:ProjectsSSIS_Test_ProjectSSIS_Test_ProjectLog FolderMaster_Pkg_Log_20121018.
So far we have seen the master package. Now let’s have a look on child package “Employee_Load.dtsx”.
As we see, this is a very simple package that is just inserting data from the Employee table into an Excel file. In this package, we have used 3 connection managers: one is for the database connection, one is for log data, and one is for the Excel destination.
let’s have a look at the variables that we have used for this package and see how they are configured.
As we see, we have configured all variables of this package using a parent package variable. That means when this package executes, all these variables will have values from the parent package.
Now let’s have a look at how we have configured logging for this package (the same way as the master package).
If we see under the configuration column (last column), we have specified the SSIS_Log connection manager. This SSIS_Log connection manager is configured using the LogFolderLocation variable and we have appended the string Employee_Pkg_Log with the current date to the log folder at runtime. When this package runs, we will get the log for this package under the location: C:ProjectsSSIS_Test_ProjectSSIS_Test_ProjectLog FolderEmployee_Pkg_Log_20121018.
In the same way, we will create the other child package, Client_Load.dtsx, which will insert data into the Client.xlsx file from the Client table.
All other things such as logging, variables, and configuration are same as Employee_Load.dtsx.
When we execute the master package, we get log files created under two location :
- 1st location : C:ProjectsSSIS_Test_ProjectSSIS_Test_ProjectLog Folder (expected)
- 2nd location : C:ProjectsSSIS_Test_Project (not expected)
You can see the results below.
1st location :
2nd Location :
is this happening? Why do we have log files under two different locations?
The log files under the 1st location are fine because this is where we want to have log files. That means we have two question here.
- Why do we have log files under the 2nd location (C:ProjectsSSIS_Test_Project/)
- If “Employee_Pkg_Log_20121018” and “Client_Pkg_Log_20121018” log files are under 2nd location then why don’t we have “Master_Pkg_Log_20121018” log file under this 2nd location?
The answer to these questions is the order of events fired in the SSIS package. When we use the parent package variable configuration, the events are fired in the following way.
- Logging starts
- Package is Validated
- Parent Package Variable Configurations are loaded
- Expression is applied
- Package Runs
In all other cases, the configuration is the first event. As our master package does not have the parent package variable configuration, the configuration event executes first in this package. Hence the SSIS_Log Connection manager gets the value for Log Folder Location from the config file and therefore the log for the master package gets created in our dedicated log folder only i.e under 1st location: C:ProjectsSSIS_Test_ProjectSSIS_Test_ProjectLog Folder and not under 2nd Location
This explains question No.2. Now about question No.1.
When the child package Employee_Load.dtsx runs, as we have parent package variable configuration in this package, the logging event executes first and then the configuration event. Hence the SSIS_Log Connection manager does not get the value for LogFolderLocation. Below is the expression that we have used for SSIS_Log Connection manager.
@[User::LogFolderLocation] + "Employee_Pkg_Log_" + (DT_WSTR,4)YEAR(GETDATE()) + RIGHT("0"+(DT_WSTR, 2) MONTH(GETDATE()) ,2) + RIGHT("0"+(DT_WSTR, 2) DAY(GETDATE()) ,2)
In this case, the SSIS_Log Connection manager will have only the value for Employee_Pkg_Log_20121018 and therefore the log file with this name gets created under the solution, i.e. under the 2nd location (C:ProjectsSSIS_Test_Project/).
Once the configuration event gets executed after logging event, then it also creates the log file under the dedicated log folder, i.e under the 1st Location: C:ProjectsSSIS_Test_ProjectSSIS_Test_ProjectLog Folder
Hence we have log files for child packages under two locations.
The approach to have log files under the expected location only is to use environment variables. We can create an environment variable and set its value to LogFolderLocation path, i.e. C:ProjectsSSIS_Test_ProjectSSIS_Test_ProjectLog Folder. Then we can configure the LogFolderLocation variable in child packages using this environment variable. Now run the master package, we will have the log for our child packages in our expected folder only.