You can use a Pivot Table as a database in the same way that you can use Simple Tables or Excel Tables.
However, I’ve never seen a description of how to do it. That’s too bad, because Excel 2010 gave Excel users the ability to use one or more Pivot Tables as a massive and powerful spreadsheet database.
If you don’t set up your Pivot Table as a database, you typically must use the GETPIVOTDATA function to return data from it. That limits your power, because GETPIVOTDATA is a “screen-scraper” function. That is, it only can return the numbers and text you see on your screen.
On the other hand, if you do set up your Pivot Table as a database, you can use Excel’s more powerful functions with it, functions like SUMIFS, SUMPRODUCT, INDEX, MATCH, and so on. In future posts, I’ll show you how these functions can give you significantly more power to return results from your spreadsheet database.
Using Pivot Tables as a database offers many general advantages. But there’s one specific advantage that’s worth considering…
The PowerPivot Advantage
Microsoft’s introduction of PowerPivot offers a new and significant benefit to using Pivot Tables as a spreadsheet database.
PowerPivot allows workbooks to store millions of rows of data in a workbook. But unfortunately, formulas and PivotTables in one workbook can not return data directly from a PowerPivot dataset stored in another workbook.
This is a problem, because each time you save another generation of your PowerPivot report, and each time you create a new PowerPivot report workbook, you must save another copy of your PowerPivot data. So before you know it, you’re buried in multiple generations of the same data.
Using Excel Services under SharePoint can help you to manage this problem. But here’s how to do it if you don’t use SharePoint:
- Set up only one workbook to contain each PowerPivot dataset.
- In each of those workbooks, set up one or more Pivot Tables as a spreadsheet database.
- Set up your report workbooks to return data from the PivotTable database in each PowerPivot workbook.
With this approach, you need to maintain only one workbook to contain each PowerPivot dataset. But you can create any number of report workbooks that can return data from any number of PivotTable databases contained in any number of PowerPivot workbooks.
With this approach, you can give each report workbook access to a massive amount of data! And you only need to manage one copy of each PowerPivot dataset.
But for now, let’s set up a PivotTable database in a spreadsheet…
Set Up the Pivot Table as a Spreadsheet Database
For convenience, I defined this Pivot Table to get its data from the Excel Table I explained in Introducing Excel’s Three Types of Spreadsheet Databases. But your Pivot Table probably will be linked to your data warehouse, to Access, or to some other “real” database.
By default, your initial Pivot Table will look something like this figure.
You can’t use this version of the Pivot Table as a spreadsheet database, because the layout is too complex; it’s not in simple rows and columns. Here, using GETPIVOTDATA is your only practical option for returning its data.
But Excel 2010 gives you the option to change the format of this PivotTable into one you CAN use as a database. To do so, first select any cell in the Pivot Table. Then in Pivot Table Tools, choose…
- Design, Layout, Subtotals, Do Not Show Subtotals
- Design, Layout, Grand Totals, Off for Rows and Columns
- Design, Layout, Report Layout, Show in Tabular Form
- Design, Layout, Report Layout, Repeat All Item Labels
- Design, Layout, Blank Rows, Remove Blank Line After Each Item
- Options, Show, +/- Buttons (hiding the buttons)
When you’re done with these six changes, your Pivot Table will look something like this.
(The default Pivot Table style applies those borders between each group of items in column A. Because they don’t get in the way, I’ve never bothered to search for a style that will eliminate them.)
Here, the Pivot Table is arranged much like an Excel Table. It has orderly rows and columns that your formulas can reference easily, with one exception: There’s no practical way for your formulas to actually reference data in this Pivot Table. Here’s why:
In Excel Tables, Excel automatically names the Table and each column in it. This lets our formulas reference the data easily, by name. But with Pivot Tables, Excel offers no help to our formulas. We therefore must use range-naming tricks to allow worksheet formulas to reference Pivot Tables easily.
So let’s set up the names…
Set Up Range Names so Formulas Can Reference Your Pivot Table Easily
Let’s define this step carefully. Otherwise, you’ll work harder than you need to.
The PivotTable Fields List gives you great control over the shape of your Pivot Table. It controls both the horizontal and vertical dimensions of the table.
But when you use the Pivot Table as a database you’ll make your work easier if you can restrict your changes to the horizontal dimension. On the other hand, your solution must adapt to changes in the vertical dimension.
Restrict Changes to the Horizontal Dimension
Excel gives you the power to add or remove PivotTable fields, which causes your Pivot Table to expand or contract horizontally. And you also can change the order of the fields in your table.
But when you use a Pivot Table database, you should plan to limit both actions.
Obviously, if you remove fields that your formulas rely on, you’ll break your formulas. And there’s no way to avoid that problem.
On the other hand, if you add or rearrange fields in your PivotTable database, you can set up range names that adapt automatically to those changes. However, that takes more work to set up. Therefore, in this post, I’ll assume that you always add new fields to the far right of your Pivot Table, and that you don’t rearrange existing fields.
Adapt to Changes in the Vertical Dimension
As you change filters in your Pivot Table, it necessarily will expand or contract vertically. That’s exactly what you want it to do.
Excel formulas can adapt easily to these changes. You just need to use dynamic range names.
How to Set Up Dynamic Range Names for Pivot Tables
Like most normal range names, dynamic range names reference areas of your worksheet. But unlike normal range names, dynamic range names can change their references in response to changes in your data.
To illustrate, if your Pivot Table is ten rows tall, dynamic range names can reference only those ten rows. But if your Pivot Table grows to a million rows tall, dynamic range names can adjust automatically to reference all of those one million rows.
Dynamic range names are defined using spreadsheet functions that return references, usually the OFFSET or INDEX functions. The references that these functions return point to new areas of your worksheet in response to changes in the values of their arguments.
In the following figure, I’ll show you how I defined the Sales.DateTime range name using the OFFSET function. But first, I needed to set up a formula that returns the number of rows in the Pivot Table.
Set Up the Pivot Table’s Sales.NumRows Range
First, I inserted several rows above the Pivot Table. I changed the Pivot Table’s name to Sales. I added the text shown in cell A1 below. I used the Create Names dialog to assign the name in cell A1 to cell B1. Then I entered this formula in the cell shown:
This formula relies on two characteristics of Excel. First, the COUNT function counts only numbers in cells, not text. (COUNTA, on the other hand, counts both numbers and text.) Second, dates are numbers.
Therefore, because every row in this table has a date, we can count the number of rows in the Pivot Table by counting the number of dates in column A.
Here are three additional things you should know about this setup:
First, notice that the dates are in a different column than the COUNTA formula. That’s necessary to avoid a circular calculation error.
Second, it’s okay if your data doesn’t include dates. Often, you can count the numbers in a column of numeric values. If that’s not possible, you can use the COUNTA function to count the numbers and text in a column. In this case, however, you should subtract the number of cells with content that’s not in your database.
To illustrate, you also could use this formula in the cell shown:
Here, I subtract 3 to back out the counts for the content in cells A1, A3, and A4.
Third, notice that I assigned the name Sales.NumRows to cell B1. By starting all names associated with this Pivot Table with “Sales.”—where “Sales” is the Pivot Table’s name—your range names will be easier to manage in Excel. Also, when you add another Pivot Table to your workbook, you can have similar ranges based on the new table’s name. For example, you could have Costs.NumRows, SKUs.NumRows, and so on.
Here are answers to three questions that might occur to you…
Is it safe to use periods in range names?
Yes. Excel treats periods in range names just like any other character. But by using them as shown, you make it much easier to identify the names associated with each Pivot Table database.
If we change the name of a Pivot Table database, do we need to change the names of our range names?
Not necessarily. However, your formulas will be easier to understand if you do keep the names in sync. This is because your formulas are more clear if you return data from a Pivot Table named “Sales” using range names that begin with “Sales.”—particularly when you look at your formulas several months after you create them. But it really doesn’t matter to Excel what name you assign to your range names.
Why do I need to set up the COUNT or COUNTA calculation in a separate cell? Can’t I just use it as part of the formula that defines my dynamic range?
Because COUNT and COUNTA must examine every cell in the range they reference, they take a comparatively long time to calculate. That is, if we were to use them in the following formulas they would need to recalculate each time the formula is used. But when we put the calculation in one cell, it needs to calculate only once—which could reduce calculation time considerably.
Set Up the Dynamic Range Names
Now you can set up the Sales.DateTime dynamic range name by using the OFFSET function. In general, the function takes this form:
To define the dynamic name, first launch the New Name dialog by choosing Formulas, Defined Names, Define Name. In the New Name dialog, enter Sales.DateTime in the Name edit box, and enter the formula…
…in the Refers-to edit box. Then choose OK.
In this formula, the function starts in cell A4, which is the top-left cell of the Pivot Table. The reference points down one row (the second argument), and stays in the same column (the third argument). Then the reference expands vertically by the number of rows specified by the Sales.NumRows value. And finally, the reference is set to one column wide.
You set up the other rows similarly. When you’re done, your names for this sheet will be defined something like this:
(Here’s an easy way to set up the formulas for each name: When you set up the first name, copy the OFFSET formula in the Refers-to box of the New Name dialog. Next, when you create the other names, paste the formula into each new name’s Refers-to box, then edit the column letter in the first argument as needed.)
Taking the Next Steps
Now that the names have been set up, your formulas can reference the PivotTable database just as they reference the other databases, as these examples show:
- Simple Table: =SUM(Amount)
- Excel Table: =SUM(Table1[Amount])
- Pivot Table: =SUM(Sales.Amount)
But, of course, you’ll typically need to use more spreadsheet functions than merely SUM.
So that’s what I’ll cover in my next post.