MSSQL Pivot
Pivots (turning a column’s values into actual columns) is a very common activity. Spreadsheet programs have robust support for it. But Standard SQL? Not so much.
The Problem
Create Table orders (
orderNumber int,
sku char(3),
quantity int,
salesPerson varchar(10)
);
insert into orders values
( 1, 'ZR34', 2, 'Mary'),
( 1, 'AS99', 1, 'Mary'),
( 2, 'ZR34', 1, 'Jim'),
( 2, 'MB01', 1, 'Jim');
The ubiquitous order table with the SKU, quantity and sales person. To keep this simple, I did not normalize. If that bothers you, then think of the orders table as the results of joining between the all the bits.
The ask, is to product a report that shows, for each sales persons, how many of each SKU they sold.
sku Mary Jim Kiki
AS99 1 0 0
MB01 0 1 0
ZR34 2 1 0
(Kiki was apparently on vacation.)
Now if your most people, you download the data to a spreadsheet and call it a day. But we’re not most people. We have a hammer (SQL) so we are going to hammer this flat. Plus, we know that we’ll get the same request next week, and the next, etc. And who has time for that?
Standard SQL
If we were to do this in standard SQL, it would look like:
select sku, sum([Mary]) as "Mary", sum([Jim]) as "Jim",
sum([Kiki]) as "Kiki"
from (
select
sku,
case when salesPerson = 'Jim' then quantity else 0 end as [Jim],
case when salesPerson = 'Mary' then quantity else 0 end as [Mary],
case when salesPerson = 'Kiki' then quantity else 0 end as [Kiki]
from orders
) as bySP
group by bySP.sku
A new case statement is required for every salesperson. That’s no fun.
MSSQL
MSSQL has a pivot statement that makes this a bit less painful
select sku, [Mary], [Jim], [Kiki]
from (select sku, quantity, salesPerson from orders) s
pivot (sum(quantity) for salesPerson in ( [Mary], [Jim], [Kiki])) pvt
Some notes about the syntax:
- The alias for the pivot (eg
pvt
) is required. - The alias for the subselect is also required, even though it isn’t used.
- The values that form the in-list are not strings - they are column names.
- You can use * in the outer select’s return list.
You can use a bare table in the from clause, but be careful. Any column (like sku
) that is not aggregated or used as the pivot column becomes a defacto group-by. In our example orderNumber
becomes another row label
select *
from orders
pivot (sum(quantity) for salesPerson in ( [Mary], [Jim], [Kiki])) pvt
Leading to:
orderNumber sku Mary Jim Kiki
1 AS99 1 NULL NULL
2 MB01 NULL 1 NULL
1 ZR34 2 NULL NULL
2 ZR34 NULL 1 NULL
It would be nice if we could do something like this:
-- THIS DOESN'T WORK
select *
from (select sku, quantity, salesPerson from orders) s
pivot sum(quantity)
for salesPerson in ( select distinct salesPerson from orders ) as pt
But unfortunately, the list of values needs to be given explicitly. The one good thing about this is that you will have a column for a value, even if there is no rows that match (think of poor Kiki).
Dynamic SQL
But this can be done using Dynamic SQL and exec. I’ve built dynamic queries such as this for the standard sql case and it is no fun. Doing so for the pivot operator is a piece of cake.
First create a salesForce
table so Kiki will make an appearance.
create table salesForce(
name varchar(10)
);
insert into salesForce values ('Kiki'), ('Mary'), ('Jim')
Then use a cursor to build our column list and presto!
declare sp cursor for select [name] from salesForce
declare @list varchar(500) = ''
declare @query varchar(1000)
declare @aName varchar(10)
open sp
Fetch next from sp into @aName
while @@FETCH_STATUS = 0
begin
if @list != ''
set @list = @list + ', '
set @list = @list + '[' + @aName + ']'
fetch next from sp into @aName
end
close sp
deallocate sp
set @query = 'select sku, ' + @list +
' from (select sku, quantity, salesPerson from orders) s ' +
'pivot (sum(quantity) for salesPerson in (' + @list + ')) pvt'
print @query
exec(@query)
Unfortunately, to get rid of the nulls means keeping two parallel list - one for the select that has the isnull
and one for the pivot value list.
I hope this helps you use the pivot statement effectively.