7

The awesome Pivot and Unpivot Features in Oracle SQL

I decided to write about the Pivot and Unpivot clauses, because I find them very useful and powerful.

In this article I will try to explain how they are used and show you some examples of the things you can do with them.

These are features that were added in version 11g of the database, and are used to transpose or rotate rows into columns and columns into rows very easily.

Converting Rows Into Columns (The Pivot Clause)

Let’s start by looking at the Pivot clause.

To show you why you might need to do this type of operation, I’m going to show you a very simple example.

This is how the employee table I’m going to use for the example looks like:

ID NAME BIRTHDATE DEPARTMENT_ID HIRE_DATE
7 CHARLES DEAN 08-JUN-1973 00:00:00 3 01-MAR-2002 00:00:00
8 EDDIE FARREL 28-JUL-1980 00:00:00 1 20-APR-2009 00:00:00
4 STUART GARCIA 12-APR-1965 00:00:00 4 15-FEB-2010 00:00:00
11 LUKE MINT 19-OCT-1985 00:00:00 4 08-JAN-2011 00:00:00
9 GEORGE HAYES 03-AUG-1982 00:00:00 2 22-SEP-2012 00:00:00
3 LUISA JACKSON 08-MAR-1970 00:00:00 3 29-AUG-2013 00:00:00
2 JAMES BOSH 15-FEB-1992 00:00:00 2 01-JUL-2014 00:00:00
12 NIGEL OAKS 05-NOV-1997 00:00:00 4 01-OCT-2014 00:00:00
10 IGOR KEYS OSBOURNE 11-SEP-1987 00:00:00 3 14-NOV-2014 00:00:00
1 JOHN SMITH 01-JAN-1995 00:00:00 1 28-MAR-2015 00:00:00
5 JUSTIN BLACK 16-MAY-1990 00:00:00 1 02-MAY-2015 00:00:00
6 ANGIE CROOD 22-JUN-1998 00:00:00 1 01-JUL-2015 00:00:00
13 LUKE GREEN JR 05-FEB-1995 00:00:00 4 01-SEP-2015 00:00:00

Now suppose I am asked to create a query that tells me the number of employees that were hired in the years of 2014 and 2015 for each department. They want to see: The department id, and then a column for the number of employees hired in 2014 and another column for the number of employees hired in 2015.

How can I do that?
How would you do it?

I can’t do a simple count grouped by department, because it would count the number of employees of each department, regardless of the date in which they were hired.

So, how would you do it? Think about for a moment.

Somehow, I need to count employees, but as we discussed earlier, a simple count won’t work. So, in one column, I need to count employees only if they were hired in 2014, and in another column I need to count those who were hired in 2015.

Since there is a condition involved here, it sounds like a job for a CASE expression.

So, my query using COUNT and CASE expressions would be something like this:

SELECT department_id, COUNT(
    CASE TO_CHAR(hire_date, 'YYYY')
      WHEN '2014'
      THEN 1
    END) AS "2014", COUNT(
    CASE TO_CHAR(hire_date, 'YYYY')
      WHEN '2015'
      THEN 1
    END) AS "2015"
FROM employee
GROUP BY department_id
ORDER BY department_id;
DEPARTMENT_ID 2014 2015
1 0 3
2 1 0
3 1 0
4 1 1

In this example, the CASE expressions will return NULL when the condition is not met, and since COUNT only counts rows where the column or expression passed as argument is not null, it will only count the rows where the CASE expression’s condition evaluated to true.

Now, there is an easier way to do it, that is available on version 11g or above, and it is by using the PIVOT clause.

Here is the equivalent query (we are going to review it together).

SELECT *
FROM
  (
    SELECT department_id, TO_CHAR(hire_date, 'YYYY') AS YEAR
    FROM employee
  )
  PIVOT (COUNT(*) FOR YEAR IN (2014, 2015))
ORDER BY department_id;

This produces the same results than our previous query, but this query looks simpler, don’t you think?

Okay, let’s review it!

The code in line # 7 is what does the magic, and it is called the PIVOT clause, and some people also call it “the pivot operator”.

Here is a short explanation:
The PIVOT keyword is mandatory, and then, enclosed in parentheses, you have to provide an aggregate function, which in this case was COUNT(*). After the aggregate function it comes the FOR keyword and then it comes a column whose rows you want to rotate into separate columns. In this case the column I used is “YEAR”, so each row in this column will become a separate column in the final result set. Then it comes the IN keyword, and then between parentheses, the list of values for which I want separate columns. In this case, I want a column for 2014 and one for 2015.

There is an important limitation here, regarding the list of values I put inside the parentheses: This list is static, or in other words, I have to specify each one of the values for which I want a separate column. I cannot use a subquery here. Please keep that in mind.

So, in natural language, this is what I’m telling the database to do: I want you to look at the YEAR column and count how many rows have 2014 in that column, and how many rows have 2015, and return the counts in separate columns.

Now, as you can see I used a subquery (starts in line # 3). Do you remember how this type of subqueries that appear in the FROM clause are called? They are called inline views.

So, I needed an inline view because where I have the “YEAR” in the pivot clause, it is mandatory that you use a column. I cannot use the hire_date column because I don’t want to pivot on the complete hire_date. I only want to pivot on the year, so I need a function to extract the year from the date (I used TO_CHAR but I could have used EXTRACT as well), but functions are not allowed here, so I use the function in the inline view and provide an alias for the new column, and then in the main query I can reference it as a column called YEAR.

Another thing that is important to notice is that in my inline view I have the department_id column, and in my results I have also the department id column, so, every column that is not used in the PIVOT clause works as if I were grouping by that column. In this case, I have the count of employees grouped by department id.

What else can I do here?

Well, I can provide an alias for the aggregate function. I’m going to call it “EMPLOYEES_HIRED” to show you how this alias is used.

SELECT *
FROM
  (
    SELECT department_id, TO_CHAR(hire_date,'YYYY') AS YEAR
    FROM employee
  )
  PIVOT (COUNT(*) AS EMPLOYEES_HIRED FOR YEAR IN (2014, 2015))
ORDER BY department_id;
DEPARTMENT_ID 2014_EMPLOYEES_HIRED 2015_EMPLOYEES_HIRED
1 0 3
2 1 0
3 1 0
4 1 1

As you can see, it added the alias to the name of each of the columns.

And I can also add an alias for the values inside the parentheses, for example:

SELECT *
FROM
  (
    SELECT department_id, TO_CHAR(hire_date,'YYYY') AS YEAR
    FROM employee
  )
  PIVOT (COUNT(*) AS EMPLOYEES_HIRED FOR YEAR IN (2014 AS year2014 , 2015 AS year2015))
ORDER BY department_id;
DEPARTMENT_ID YEAR2014_EMPLOYEES_HIRED YEAR2015_EMPLOYEES_HIRED
1 0 3
2 1 0
3 1 0
4 1 1

And as you see, those aliases are also used to construct the final column names.

So, you can add aliases for the aggregate function and for the values in the IN sub-clause, but they are optional. If you don’t provide aliases, the values in the IN sub-clause become the names of the final columns.

Okay, what else?

Well, I can have more than one aggregate function, for example, if besides the count of employees for each year I want the SUM of the salary of the employees that were hired in that year, I can add a SUM on the SALARY too:

SELECT *
FROM
  (
    SELECT department_id, TO_CHAR(hire_date, 'YYYY') AS YEAR, salary
    FROM employee
  )
  PIVOT (
    COUNT( * ) AS employees_hired,
    SUM(salary) AS salaries 
    FOR YEAR IN (2014 AS year2014, 2015 AS year2015))
ORDER BY department_id;
DEPARTMENT_ID YEAR2014_EMPLOYEES_HIRED YEAR2014_SALARIES YEAR2015_EMPLOYEES_HIRED YEAR2015_SALARIES
1 0 3 8050
2 1 3500 0
3 1 6000 0
4 1 4750 1 4750

Pretty cool, right?

Okay, what else?

You can have combinations of values in the IN clause!

For example, if I want to count the employees that were born in 1995 and hired in 2015 and the count of those that were born in 1997 and hired in 2014, I can do something like this:

I’m going to make some changes and remove all the aliases to simplify it.

SELECT *
FROM
  (
    SELECT department_id, TO_CHAR(hire_date, 'YYYY') AS hire_year,
           TO_CHAR(birthdate, 'YYYY') as birth_year
    FROM employee
  )
  PIVOT (
    COUNT(*)
    FOR (birth_year,hire_year) 
      IN ((1995,2015) , (1997,2014)))
ORDER BY department_id;
DEPARTMENT_ID 1995_2015 1997_2014
1 1 0
2 0 0
3 0 0
4 1 1

As you can see, I have birth_year and hire_year in the FOR clause in line 10, and I have the combination of values I’m interested in, in the IN clause in line 11.

This is awesome…

Now, let’s talk about Unpivot.

Converting Columns Into Rows (The Unpivot Clause)

As you can imagine, Unpivot help us do the opposite operation of Pivot. With UNPIVOT we can rotate columns into rows.

I’m going to use a very simple table called DEPARTMENT to build an unpivot example.

Here is my DEPARTMENT table. It has an id and name for every department. It has also a column for the monthly budget and one for the ID of the last employee hired in that department:

ID NAME MONTHLY_BUDGET LAST_EMPLOYEE_ID
1 ACCOUNTING 20000 8
2 MARKETING 15000 9
3 INFORMATION TECHNOLOGY 30000 10
4 HUMAN RESOURCES 25000 13
5 REGULATORY AFFAIRS 5000
6 CUSTOMER SERVICE 2000

Now, since the monthly budget and last employee id have both the same data type, which in this case is numeric data, I could use UNPIVOT to do something like this:

SELECT *
FROM department 
UNPIVOT (
  data_value
  FOR category
  IN (monthly_budget,last_employee_id)
);
ID NAME CATEGORY DATA_VALUE
1 ACCOUNTING MONTHLY_BUDGET 20000
1 ACCOUNTING LAST_EMPLOYEE_ID 8
2 MARKETING MONTHLY_BUDGET 15000
2 MARKETING LAST_EMPLOYEE_ID 9
3 INFORMATION TECHNOLOGY MONTHLY_BUDGET 30000
3 INFORMATION TECHNOLOGY LAST_EMPLOYEE_ID 10
4 HUMAN RESOURCES MONTHLY_BUDGET 25000
4 HUMAN RESOURCES LAST_EMPLOYEE_ID 13
5 REGULATORY AFFAIRS MONTHLY_BUDGET 5000
6 CUSTOMER SERVICE MONTHLY_BUDGET 2000

In my results I have a “category” column, and in that column I have “monthly budget” and “last employee id”, which are the titles of two separate columns in the department table, but now they are in the same column but in different rows, so I actually transposed columns into rows.

As you can see, now I have 2 rows for each department. In one row I have the monthly budget, and in the other row I have the last employee id, and in the column called DATA_VALUE I have the value that corresponds to the category I’m looking at, so, in the case of ACCOUNTING, in my original data I have 20,000 in the MONTHLY BUDGET column, and 8 in the LAST EMPLOYEE ID column, but now I have them in the same column, but in different rows.

The name of the category column was defined in line 5, in the FOR clause, and the name of the data_value column was defined in line 4.

There’s another thing to notice here: I have 2 rows for ACCOUNTING, 2 for MARKETING and 2 for IT and HUMAN RESOURCES, but I have only one for REGULATORY AFFAIRS and CUSTOMER SERVICE. Why is that?

If you look at the data in the department table again, you can see that for those departments I have NULL in the LAST_EMPLOYEE_ID column, and that is why I don’t have that category in my UNPIVOTed results. This is the default behavior, but if you want to get all of the rows, even if they have NULL in the original data, you just need to add “INCLUDE NULLS” in the UNPIVOT clause:

SELECT *
FROM department 
UNPIVOT INCLUDE NULLS (
  data_value
  FOR category
  IN (monthly_budget,last_employee_id)
);
ID NAME CATEGORY DATA_VALUE
1 ACCOUNTING MONTHLY_BUDGET 20000
1 ACCOUNTING LAST_EMPLOYEE_ID 8
2 MARKETING MONTHLY_BUDGET 15000
2 MARKETING LAST_EMPLOYEE_ID 9
3 INFORMATION TECHNOLOGY MONTHLY_BUDGET 30000
3 INFORMATION TECHNOLOGY LAST_EMPLOYEE_ID 10
4 HUMAN RESOURCES MONTHLY_BUDGET 25000
4 HUMAN RESOURCES LAST_EMPLOYEE_ID 13
5 REGULATORY AFFAIRS MONTHLY_BUDGET 5000
5 REGULATORY AFFAIRS LAST_EMPLOYEE_ID
6 CUSTOMER SERVICE MONTHLY_BUDGET 2000
6 CUSTOMER SERVICE LAST_EMPLOYEE_ID

I modified the Unpivot clause in line 3, and as you can see, with this change now I have both categories for those departments.

Okay, to show you a slightly more complex example of the Unpivot clause, I’m going to create a table from some pivoted data from my employee table:

CREATE TABLE pivoted_employee AS
SELECT department_id, COUNT( * ) as total_employees, SUM(salary) as total_salaries, 
  COUNT(
    CASE
      WHEN salary < 3000
      THEN 1
    END) as employees_3000,
  SUM(
    CASE
      WHEN salary < 3000
      THEN salary
    END) AS salaries_3000
FROM employee
GROUP BY department_id;
DEPARTMENT_ID TOTAL_EMPLOYEES TOTAL_SALARIES EMPLOYEES_3000 SALARIES_3000
1 4 11050 2 4050
2 2 6000 1 2500
4 4 16500 1 2000
3 3 12750 1 2250

As you can infer from the query, the total_employees column has the count of the employees for each department, and the total_salaries has the sum of the salaries of the employees in each department, and the other 2 columns have the count and sum of salaries only for employees whose salary is less than 3000.

Now, I want to unpivot this data, to get only one column for the counts and one for the sum of salaries, and another column that tells me the type of data that each row represents.

How can I do that?

Here is one way to do it:

SELECT *
FROM pivoted_employee
UNPIVOT (
  (employees, salaries) 
  FOR type IN 
  (
    (total_employees,total_salaries) as 'ALL EMPLOYEES', 
    (employees_3000,salaries_3000) as 'EMPLOYEES WHO EARN LESS THAN 3000'
  )
)
ORDER BY department_id,type;
DEPARTMENT_ID TYPE EMPLOYEES SALARIES
1 ALL EMPLOYEES 4 11050
1 EMPLOYEES WHO EARN LESS THAN 3000 2 4050
2 ALL EMPLOYEES 2 6000
2 EMPLOYEES WHO EARN LESS THAN 3000 1 2500
3 ALL EMPLOYEES 3 12750
3 EMPLOYEES WHO EARN LESS THAN 3000 1 2250
4 ALL EMPLOYEES 4 16500
4 EMPLOYEES WHO EARN LESS THAN 3000 1 2000

So, the "employees" and "salaries" that I have in line 4 of the query, are the columns that will have measure values in the final results. They will have the actual sums and counts from the pivoted_employee table. I called them employees and salaries, but they could be named however I like.

The "type" that I have in line 5 is the name of the column where the descriptors, or categories, or types will be returned, and it could also have any other name if I wanted to.

And the pairs of columns that I have in lines 7 and 8, define which data I want to include for each type, for example, in the row that will have the data for all of the employees, I want to include TOTAL_EMPLOYEES for the EMPLOYEES column in the result, and TOTAL_SALARIES for the SALARIES column in the result, and the string that I have after the "AS" keyword, is the actual text that will be displayed in the TYPE column for the row that includes data for all employees.

In a similar way, for the row that includes data only for employees who earn less than 3000, the type column will say 'EMPLOYEES WHO EARN LESS THAN 3000', and the EMPLOYEES column in the result will have the data from the EMPLOYEES_3000 column from my pivoted_employee table, and the salaries column will have data from the SALARIES_3000 column.

I Know! This is not very easy to memorize. I mean, remembering what the role of each part of the UNPIVOT clause is, is not something easy to do, but I guarantee you that it will become easier if you practice.

And to conclude the article, I just want to show you that you can combine PIVOT and UNPIVOT in the same query.

Take a look at this:

SELECT *
FROM pivoted_employee
UNPIVOT (
  (employees, salaries) 
  FOR type IN 
  (
    (total_employees,total_salaries) as 'ALL', 
    (employees_3000,salaries_3000) as 'EARN LESS THAN 3000'
  )
)
PIVOT (
  MAX(salaries) as sal, 
  max(employees) as emp 
  FOR department_id 
    IN (1 as d1,2 as d2,3 as d3,4 as d4)
);
TYPE D1_SAL D1_EMP D2_SAL D2_EMP D3_SAL D3_EMP D4_SAL D4_EMP
EARN LESS THAN 3000 4050 2 2500 1 2250 1 2000 1
ALL 11050 4 6000 2 12750 3 16500 4

So, I used UNPIVOT to rotate some columns into different rows (line 3), and then I used PIVOT to rotate the different rows I had for each DEPARTMENT ID into columns (line 11), as you see in the final result. Now I have 2 columns for each department: One for the salaries and one for the employees. And I did all of this in the same query, which is pretty awesome!

Okay, that's it for this article. I hope it helped you at least a little bit to understand how to use these awesome features.

Now, go ahead and have some fun practicing them.

See you later!

Opt In Image
Don't want to miss any Oracle SQL tips?

Subscribe to be informed about new posts, tips and more awesome things.

Carlos

Carlos

I've been working with Oracle databases on a daily basis for more than 10 years.

Leave a Reply

7 Comments on "The awesome Pivot and Unpivot Features in Oracle SQL"

avatar
Sort by:   newest | oldest | most voted
soni
Guest

Thanks Carlos..Very nicely explained

Roy
Guest

Thanks Carlos, you are a better instructor and has a brilliant ideas. I like your articles.

Maruthi
Guest

Very good concept on the Pivot and unpivot functions, very nicely articulated for easy understanding – Thank you so much CARLOS, i am following all your articls, you are doing great stuff — keep doing your innovative work and expecting more articles from you..

wpDiscuz