Wednesday, June 8, 2011

Unpivot Transformation with Multiple Destination Columns

The Unpivot transformation makes an unnormalized dataset into a more normalized version by expanding values from multiple columns in a single record into multiple records with the same values in a single column. For example, a dataset that lists customer names has one row for each customer, with the products and the Qty purchased shown in columns in the row. After the Unpivot transformation normalizes the data set, the data set contains a different row for each product that the customer purchased.

The need to unpivot data during an ETL operation is quite common. One method to perform the unpivoting is writing the source query in such a way that the data starts off unpivoted using the UNPIVOT T-SQL statement in SQL Server. This however would obviously not work out on a source that does not support unpivoting such as a text file or Office Excel or even older versions of popular RDBMSs. The other method of course is to use the Unpivot Transformation in Integration Services.

The Unpivot Transformation is quite a simple component, especially if you need to unpivot a bunch of columns to a single destination column. Unpivoting to multple destination columns however, seems a tad challenging the first time round, at least for me it did. Until I figured out how to. This post is all about unpivoting a columns to multiple destination columns.

Consider this scenario. A simple Product table (which I shall call Source this point forward):


Needs to be transformed into the following Product table (which I shall call Destination this point forward):


Configuring the Unpivot transformation for this would be quite simple, and would look like this:

The problem starts when you need the Destination table looking like this, with multiple destination columns (Qty and Cost):

You add the Cost columns as well (as in the image below) to the Unpivot transformation,

and get the following validation error (encountered by a lot of people):

PivotKeyValue is not valid. In an UnPivot transform with more than one unpivoted DestinationColumn, the set of PivotKeyValues per destination must match exactly.

The reason is this. If you look closely at the Pivot Key Value column of the Unpivot transformation you would notice that for the Qty destination column, the set of values are ‘Qty1’, ‘Qty2’ and ‘Qty3’. Whereas, for the Cost destination column, the set of values are ‘Cost1’, ‘Cost2’ and ‘Cost3’, which are clearly not matching. The solution to the problem here is quite trivial. Change the default values of the Pivot Key Value column of the offending destination column to match the values of the other destination columns. Like this:


And its Done....

Instead of Qty1, Qty2, Qty3 give "Pivot Key Value" as 1, 2, 3.