Thursday, August 1, 2024

Pivoting - tricky example in t-sql

 Let's create a tricky pivoting example in T-SQL that involves more complex transformations, such as incorporating additional calculations and handling potential NULL values.

Scenario

Suppose we have a Sales table with the following structure and data:

YearQuarterRegionSalesAmount
2022Q1East100
2022Q1West200
2022Q2East150
2022Q2West250
2022Q3East200
2022Q3West300
2022Q4East250
2022Q4West350
2023Q1East110
2023Q1West210
2023Q2East160
2023Q2West260
2023Q3East210
2023Q3West310
2023Q4East260
2023Q4West360

We want to pivot this data to show the total sales for each region by year, and include columns for the percentage of total sales for each quarter within each year.

Pivoting with Calculations Example

Step 1: Create a Temporary Table for Summed Data

First, create a temporary table that includes the total sales for each quarter by region and the overall total sales for each year.

-- Create a temporary table to store summed data
SELECT 
    Year, 
    Quarter, 
    Region, 
    SalesAmount,
    SUM(SalesAmount) OVER (PARTITION BY Year) AS TotalSalesYear
INTO #SalesSummary
FROM Sales;

Step 2: Construct the Dynamic SQL Query for Pivoting

Construct the dynamic SQL query to pivot the data and include the percentage calculations.

DECLARE @Columns NVARCHAR(MAX);
DECLARE @SQL NVARCHAR(MAX);

-- Get unique regions and construct column names
SELECT @Columns = STRING_AGG(QUOTENAME(Region), ', ')
FROM (SELECT DISTINCT Region FROM Sales) AS Regions;

-- Construct the dynamic SQL query
SET @SQL = '
SELECT Year, ' + @Columns + ',
    ' + @Columns + 'Q1_Percent, ' + @Columns + 'Q2_Percent, ' + @Columns + 'Q3_Percent, ' + @Columns + 'Q4_Percent
FROM
(
    SELECT 
        Year, 
        Region, 
        SUM(SalesAmount) AS TotalSales,
        SUM(CASE WHEN Quarter = ''Q1'' THEN SalesAmount END) AS Q1_Sales,
        SUM(CASE WHEN Quarter = ''Q2'' THEN SalesAmount END) AS Q2_Sales,
        SUM(CASE WHEN Quarter = ''Q3'' THEN SalesAmount END) AS Q3_Sales,
        SUM(CASE WHEN Quarter = ''Q4'' THEN SalesAmount END) AS Q4_Sales,
        SUM(SalesAmount) * 100.0 / NULLIF(SUM(SUM(SalesAmount)) OVER (PARTITION BY Year), 0) AS TotalSalesPercent
    FROM #SalesSummary
    GROUP BY Year, Region
) AS SourceTable
PIVOT
(
    MAX(TotalSales) FOR Region IN (' + @Columns + ')
) AS PivotTable
PIVOT
(
    MAX(TotalSalesPercent) FOR Region IN (' + @Columns + 'Q1_Percent, ' + @Columns + 'Q2_Percent, ' + @Columns + 'Q3_Percent, ' + @Columns + 'Q4_Percent)
) AS PercentTable
ORDER BY Year;
';

-- Execute the dynamic SQL query
EXEC sp_executesql @SQL;

Explanation

  1. Temporary Table with Summed Data: The #SalesSummary table includes the total sales for each quarter by region and the overall total sales for each year. The SUM(SalesAmount) OVER (PARTITION BY Year) calculates the total sales for the entire year for use in the percentage calculation.

  2. Dynamic Columns: The STRING_AGG function constructs a comma-separated list of regions to dynamically generate the pivot column names.

  3. Dynamic SQL Query: The @SQL variable contains the dynamic SQL query that performs the pivot operation and includes additional calculations for each quarter's sales percentage within the year. The CASE statements within the query handle the different quarters, and the NULLIF function prevents division by zero errors.

  4. Pivoting and Calculations: The PIVOT operation is used twice:

    • First, to pivot the total sales data.
    • Second, to pivot the percentage calculations for each quarter.
  5. Execution: The sp_executesql stored procedure executes the dynamically constructed SQL query.

This example showcases how to dynamically pivot data while incorporating additional calculations, handling potential NULL values, and generating the pivot columns based on the data at runtime.

Dynamic SQL in t-sql

 Dynamic SQL in T-SQL (Transact-SQL, used in Microsoft SQL Server) is used to construct and execute SQL statements at runtime. This is particularly useful when the SQL query needs to be generated dynamically based on varying conditions, user inputs, or other runtime parameters.

How to Write and Execute Dynamic SQL in T-SQL

There are two primary ways to execute dynamic SQL in T-SQL:

  1. Using EXEC: The simplest way to execute a dynamic SQL statement is by using the EXEC command.
  2. Using sp_executesql: This method is more flexible and secure, as it allows parameterization of the dynamic SQL, which helps prevent SQL injection.

Examples

Example using EXEC

DECLARE @TableName NVARCHAR(50)
DECLARE @SQL NVARCHAR(MAX)

SET @TableName = N'Employees'
SET @SQL = N'SELECT * FROM ' + @TableName

EXEC(@SQL)

Example using sp_executesql

DECLARE @TableName NVARCHAR(50)
DECLARE @SQL NVARCHAR(MAX)
DECLARE @Column NVARCHAR(50)
DECLARE @Value NVARCHAR(50)

SET @TableName = N'Employees'
SET @Column = N'LastName'
SET @Value = N'Smith'
SET @SQL = N'SELECT * FROM ' + @TableName + ' WHERE ' + @Column + ' = @Value'

EXEC sp_executesql @SQL, N'@Value NVARCHAR(50)', @Value

Using Parameters to Prevent SQL Injection

One of the main benefits of using sp_executesql over EXEC is the ability to parameterize the SQL query, which helps prevent SQL injection attacks. Here’s an example:

DECLARE @TableName NVARCHAR(50)
DECLARE @SQL NVARCHAR(MAX)
DECLARE @LastName NVARCHAR(50)

SET @TableName = N'Employees'
SET @LastName = N'Smith'
SET @SQL = N'SELECT * FROM ' + @TableName + ' WHERE LastName = @LastName'

EXEC sp_executesql @SQL, N'@LastName NVARCHAR(50)', @LastName

Dynamic SQL with OUTPUT Parameters

sp_executesql also supports output parameters. Here’s an example of using an output parameter:

DECLARE @SQL NVARCHAR(MAX)
DECLARE @AvgSalary MONEY
DECLARE @DepartmentID INT

SET @DepartmentID = 1
SET @SQL = N'SELECT @AvgSalary = AVG(Salary) FROM Employees WHERE DepartmentID = @DeptID'

EXEC sp_executesql @SQL, N'@DeptID INT, @AvgSalary MONEY OUTPUT', @DeptID = @DepartmentID, @AvgSalary = @AvgSalary OUTPUT

PRINT @AvgSalary

Best Practices for Using Dynamic SQL in T-SQL

  1. Use sp_executesql with Parameters: Always use sp_executesql with parameters to avoid SQL injection.
  2. Validate Input: Validate all user inputs and ensure they are properly sanitized before including them in dynamic SQL.
  3. Use QUOTENAME: When dynamically constructing SQL object names (like table or column names), use the QUOTENAME function to properly escape them and prevent SQL injection.
  4. Limit the Use of Dynamic SQL: Use dynamic SQL only when necessary. Static SQL is easier to read, maintain, and optimize.

Here’s an example that demonstrates the use of QUOTENAME:

DECLARE @TableName NVARCHAR(50)
DECLARE @SQL NVARCHAR(MAX)

SET @TableName = N'Employees'
SET @SQL = N'SELECT * FROM ' + QUOTENAME(@TableName)

EXEC sp_executesql @SQL

By following these best practices, you can safely and effectively use dynamic SQL in T-SQL to handle complex querying requirements that static SQL cannot easily address.