Thursday, August 1, 2024

Pivoting and unpivoting data in T-SQL

 Pivoting and unpivoting data in T-SQL can be extremely useful for transforming your data into different shapes, particularly for reporting and analysis. Here's an overview of how to use the PIVOT and UNPIVOT operators in T-SQL.

Pivot

The PIVOT operator transforms rows into columns. This is particularly useful for creating summary tables or cross-tab reports.

Example: Pivoting Data

Let's assume we have a table named Sales with the following data:

YearQuarterSalesAmount
2022Q1100
2022Q2150
2022Q3200
2022Q4250
2023Q1110
2023Q2160
2023Q3210
2023Q4260

We want to pivot this data so that each quarter becomes a column.

SELECT Year, [Q1], [Q2], [Q3], [Q4]
FROM
(
    SELECT Year, Quarter, SalesAmount
    FROM Sales
) AS SourceTable
PIVOT
(
    SUM(SalesAmount)
    FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS PivotTable
ORDER BY Year;

Result:

YearQ1Q2Q3Q4
2022100150200250
2023110160210260

Unpivot

The UNPIVOT operator performs the reverse transformation, turning columns into rows. This is useful for normalizing data or preparing it for certain types of analysis.

Example: Unpivoting Data

Suppose we have a table named QuarterlySales with the following data:

YearQ1Q2Q3Q4
2022100150200250
2023110160210260

We want to unpivot this data so that each quarter becomes a row.

SELECT Year, Quarter, SalesAmount
FROM
(
    SELECT Year, [Q1], [Q2], [Q3], [Q4]
    FROM QuarterlySales
) AS SourceTable
UNPIVOT
(
    SalesAmount FOR Quarter IN ([Q1], [Q2], [Q3], [Q4])
) AS UnpivotTable
ORDER BY Year, Quarter;

Result:

YearQuarterSalesAmount
2022Q1100
2022Q2150
2022Q3200
2022Q4250
2023Q1110
2023Q2160
2023Q3210
2023Q4260

Explanation

Pivot

  • Source Table: The inner query selects the source data.
  • PIVOT: The PIVOT clause aggregates the SalesAmount values and spreads them across new columns specified in the FOR Quarter IN ([Q1], [Q2], [Q3], [Q4]) clause.
  • SUM(SalesAmount): This is the aggregation function applied to the SalesAmount values.

Unpivot

  • Source Table: The inner query selects the source data.
  • UNPIVOT: The UNPIVOT clause transforms the column values into rows. The FOR Quarter IN ([Q1], [Q2], [Q3], [Q4]) clause specifies the columns to be unpivoted.
  • SalesAmount FOR Quarter: This specifies the new column names for the values (SalesAmount) and the labels (Quarter).

Practical Considerations

  • Performance: Pivoting and unpivoting can be resource-intensive, especially on large datasets. Indexing and query optimization are essential.
  • Data Integrity: Ensure that the data being pivoted or unpivoted is clean and consistent to avoid unexpected results.
  • Dynamic Pivoting: For scenarios where the columns to be pivoted are not known in advance, dynamic SQL can be used to construct the pivot query at runtime.

These techniques are powerful tools in T-SQL for data transformation, making it easier to reshape data for analysis and reporting.

Dynamic Pivoting in t-sql

 Dynamic pivoting in T-SQL allows you to create pivot tables where the columns are not known beforehand and need to be generated dynamically based on the data. This is useful when the set of values to be pivoted changes frequently or is not known at compile time.

Here's a step-by-step example to demonstrate dynamic pivoting in T-SQL:

Scenario

Let's say we have a Sales table with the following data:

YearQuarterSalesAmount
2022Q1100
2022Q2150
2022Q3200
2022Q4250
2023Q1110
2023Q2160
2023Q3210
2023Q4260

We want to dynamically pivot this data so that each quarter becomes a column.

Steps for Dynamic Pivoting

  1. Retrieve the unique values for the columns to pivot.
  2. Construct the dynamic SQL query.
  3. Execute the dynamic SQL query.

Step 1: Retrieve Unique Column Values

First, we need to get the unique values from the Quarter column:

DECLARE @Columns NVARCHAR(MAX);
SELECT @Columns = STRING_AGG(QUOTENAME(Quarter), ',') 
FROM (SELECT DISTINCT Quarter FROM Sales) AS Quarters;

Step 2: Construct the Dynamic SQL Query

Next, we construct the dynamic SQL query using the retrieved column values:

SET @SQL = '
SELECT Year, ' + @Columns + '
FROM
(
    SELECT Year, Quarter, SalesAmount
    FROM Sales
) AS SourceTable
PIVOT
(
    SUM(SalesAmount)
    FOR Quarter IN (' + @Columns + ')
) AS PivotTable
ORDER BY Year;
';

Step 3: Execute the Dynamic SQL Query

Finally, execute the dynamic SQL query:


EXEC sp_executesql @SQL;

Complete Example

Putting it all together, the complete T-SQL script looks like this:

-- Step 1: Retrieve unique column values
DECLARE @Columns NVARCHAR(MAX);
SELECT @Columns = STRING_AGG(QUOTENAME(Quarter), ',') 
FROM (SELECT DISTINCT Quarter FROM Sales) AS Quarters;

-- Step 2: Construct the dynamic SQL query
DECLARE @SQL NVARCHAR(MAX);

SET @SQL = '
SELECT Year, ' + @Columns + '
FROM
(
    SELECT Year, Quarter, SalesAmount
    FROM Sales
) AS SourceTable
PIVOT
(
    SUM(SalesAmount)
    FOR Quarter IN (' + @Columns + ')
) AS PivotTable
ORDER BY Year;
';

-- Step 3: Execute the dynamic SQL query
EXEC sp_executesql @SQL;

Explanation

  1. Retrieve Unique Column Values: The STRING_AGG function is used to concatenate the distinct quarter values, separated by commas. The QUOTENAME function ensures that the column names are properly escaped.

  2. Construct the Dynamic SQL Query: The @Columns variable is used to dynamically insert the column names into the PIVOT clause of the SQL query.

  3. Execute the Dynamic SQL Query: The sp_executesql stored procedure is used to execute the dynamically constructed SQL query.

This approach allows you to pivot data dynamically based on the actual values present in the dataset, making it flexible and adaptable to changing data.