You can use a Pivot Table as a database in the same way that you can use Excel Tables and other Excel-Friendly Databases (EFDs).
In this third article in a series about using EFDs, we move from Excel strategies to Excel hands-on. Here are the other articles in this series:
- The Excel-Friendly Database Strategy
- Introducing Excel-Friendly Databases
- Two Functions You MUST Know to Return Values from Excel Tables
I’ve never seen a description of how to use Pivot Tables as a database that works much like an Excel Table. That’s too bad, because PivotTable databases can give your formulas access to a massive amount of data.
If you don’t set up your Pivot Table as a database, you typically must use the GETPIVOTDATA function to return data from it. That limits your power, because GETPIVOTDATA works like a “screen-scraper” function. That is, it only can return the numbers and text you see on your screen.
On the other hand, if you set up your Pivot Table as a database, you can use Excel’s more powerful functions with it, functions like SUMIFS, SUMPRODUCT, INDEX, MATCH, and so on.
Using Pivot Tables as a database offers many general advantages. But there’s one specific advantage that’s more relevant now than ever before…
The Power Pivot Advantage
Microsoft’s Power Pivot offers a new and significant reason to use Pivot Tables as a spreadsheet database.
Power Pivot allows workbooks to store millions of rows of data in a workbook. But unfortunately, only Pivot Tables and CUBE functions can return data from the Power Pivot Data Model to cells in your workbook. And both methods can return data only from the workbook in which they reside. In fact, no Excel tool can return data directly from the Power Pivot Data Model stored in another workbook.
This is a problem, because each time you save another generation of your Power Pivot report, and each time you create a new Power Pivot report workbook, you naturally save another copy of your Power Pivot data. So before you know it, you’re buried in multiple generations of similar data.
Using Excel Services under SharePoint can help you to manage this problem. But here’s how to do it if you don’t use SharePoint:
- Set up only one workbook to contain each Power Pivot Data Model.
- In each of those workbooks, set up one or more Pivot Tables, each as a spreadsheet database.
- Set up your report workbooks to return data from the Pivot Table databases in each Power Pivot workbook.
With this approach, you need to maintain only one workbook to contain each Data Model. But you can create any number of report workbooks that can return data from any number of Pivot Table databases contained in any number of Power Pivot workbooks.
With this approach, you can give each report workbook access to a massive amount of data! And you only need to manage one copy of each Data Model.
But for now, let’s set up a Pivot Table database in a spreadsheet…
Set Up a Pivot Table to Look Like an Excel Table
For convenience, I defined this Pivot Table to get its data from the Excel Table I explained in Introducing Excel-Friendly Databases. But your Pivot Table probably will be linked to your data warehouse, to Access, to a Data Model, or to some other database.
By default, your initial Pivot Table will look something like this figure.
You can’t use this version of the Pivot Table as a spreadsheet database, because the layout is too complex; it’s not in simple rows and columns, like an Excel Table. Here, using GETPIVOTDATA is your only practical option for returning its data.
But beginning with Excel 2010, you have the option to change the format of this Pivot Table into one you CAN use as a database. To do so, first select any cell in the Pivot Table. Then in Pivot Table Tools, choose…
- Design, Layout, Subtotals, Do Not Show Subtotals
- Design, Layout, Grand Totals, Off for Rows and Columns
- Design, Layout, Report Layout, Show in Tabular Form
- Design, Layout, Report Layout, Repeat All Item Labels
- Design, Layout, Blank Rows, Remove Blank Line After Each Item
- Options, Show, +/- Buttons (hiding the buttons)
When you’re done with these six changes, your Pivot Table will look something like this.
(The default Pivot Table style applies those borders between each group of items in column A. Because they don’t get in the way, I’ve never bothered to search for a style that will eliminate them.)
Here, the Pivot Table is arranged much like an Excel Table. It has orderly rows and columns that your formulas can reference easily, with one exception: There’s no practical way for your formulas to actually reference data in this Pivot Table. Here’s why:
In Excel Tables, Excel automatically names the Table and each column in it. Additionally, we easily can define our own range names that reference columns in a Table—names that shrink and grow as the Table shrinks and grows. This lets our formulas reference the data easily and accurately.
On the other hand, Pivot Tables offer no such help. By default, our range names can’t adjust automatically as pivots shrink and grow. We therefore must use a special type of range name to allow worksheet formulas to respond to changes in our pivots.
So let’s set up the names…
How to Set Up Dynamic Range Names for Pivot Tables
Excel gives you the power to add and remove Pivot Table fields, which causes your Pivot Table to expand or contract horizontally. You also can change the order of the fields in your table.
But when you use a Pivot Table as a database, you must limit both kinds of changes. That is, you must set up your Pivot Table with a fixed layout of your columns—just as you do with an Excel Table.
On the other hand, you certainly can filter your data in a Pivot Table, which causes it to expand or contract vertically. But to allow your formulas to adapt to those changes you need to use dynamic range names.
Like most normal range names, dynamic range names reference areas of your worksheet. But unlike normal range names, dynamic range names can change their references in response to changes in your data.
To illustrate, if your Pivot Table is ten rows tall, dynamic range names can reference only those ten rows. But if your Pivot Table grows to a million rows tall, dynamic range names can adjust automatically to reference all of those one million rows.
Dynamic range names are defined using spreadsheet functions that return references, usually the OFFSET or INDEX functions. The references that these functions return point to new areas of your worksheet in response to changes in the values of their arguments.
In the following figure, I’ll show you how to define the Sales.DateTime range name using the OFFSET. But first, I needed to set up a formula that returns the number of rows in the Pivot Table.
Set Up the Pivot Table’s Sales.NumRows Range
First, I inserted several rows above the Pivot Table. I changed the Pivot Table’s name to Sales. I added the text shown in cell A1 in the figure below. I used the Create Names dialog to assign the name in cell A1 to cell B1. Then I entered this formula in the cell shown:
B1: =COUNT($A:$A)
This formula relies on two characteristics of Excel. First, the COUNT function counts only numbers in cells, not text. (The COUNTA function, on the other hand, counts both numbers and text.) Second, dates are numbers.
Therefore, because every row in this table has a date, we can count the number of rows in the Pivot Table by counting the number of dates in column A.
Here are three additional things you should know about this setup:
First, notice that the dates are in a different column than the COUNT formula. That’s necessary to avoid a circular calculation error.
Second, it’s okay if your data doesn’t include dates. Often, you can count the numbers in a column of numeric values. If that’s not possible, you can use the COUNTA function to count the numbers and text in a column, and then you subtract any labels that the count includes. In this case, for example, you also could use this formula in the cell shown:
B1: =COUNTA($A:$A)-3
Here, I subtract 3 to back out the counts for the content in cells A1, A3, and A4.
Third, notice that I assigned the name Sales.NumRows to cell B1. By starting all names associated with this Pivot Table with “Sales.”—where “Sales” is the Pivot Table’s name—your range names will be easier to manage in Excel. Also, when you add another Pivot Table to your workbook, you can have similar ranges based on the new table’s name. For example, you could have Costs.NumRows, SKUs.NumRows, and so on.
Here are answers to three questions that might occur to you…
Is it safe to use periods in range names?
Yes. Excel treats periods in range names just like any other character. But by using them as shown, you make it much easier to identify the names associated with each Pivot Table database.
If we change the name of a Pivot Table database, do we need to change the names of our range names?
Not necessarily. However, your formulas will be easier to understand if you do keep the names in sync. This is because your formulas are more clear if you return data from a Pivot Table named “Sales” using range names that begin with “Sales.”—particularly when you look at your formulas several months after you create them. But it really doesn’t matter to Excel what name you assign to your range names.
Why do I need to set up the COUNT or COUNTA calculation in a separate cell? Can’t I just use it as part of the formula that defines my dynamic range?
Because COUNT and COUNTA must examine every cell in the range they reference, they take a comparatively long time to calculate. That is, if we were to use them in the following formulas they would need to recalculate each time the formula is used. But when we put the calculation in one cell, it needs to calculate only once—which reduces your calculation time.
Set Up the Dynamic Range Names
The OFFSET function is volatile, which means that it calculates every time Excel does, even though its data hasn’t changed. The INDEX function, like most Excel worksheet functions, isn’t volatile. However, the OFFSET function is easier to use for dynamic range names.
In general, the function takes this form:
=OFFSET(reference, rows, cols, height, width)
To define the dynamic name, first launch the New Name dialog. To do so, in the Formulas, Defined Names group, choose Define Name. In the New Name dialog, enter Sales.DateTime in the Name edit box, and enter the formula…
=OFFSET(Data!$A$4, 1, 0, Sales.NumRows, 1)
…in the Refers-to edit box. Then choose OK.
In this formula, the function starts in cell A4, which is the top-left cell of the Pivot Table. The reference points down one row (the second argument), and stays in the same column (the third argument). Then the reference expands vertically by the number of rows specified by the Sales.NumRows value. And finally, the reference is set to one column wide.
You set up the other rows similarly. When you’re done, your names for this sheet will be defined something like this:
Sales.DateTime =OFFSET(Data!$A$4,1,0,Sales.NumRows,1)
Sales.Product =OFFSET(Data!$B$4,1,0,Sales.NumRows,1)
Sales.Color =OFFSET(Data!$C$4,1,0,Sales.NumRows,1)
Sales.Amount =OFFSET(Data!$D$4,1,0,Sales.NumRows,1)
(Here’s an easy way to set up the formulas for each name: When you set up the first name, copy the OFFSET formula in the Refers-to box of the New Name dialog. Next, when you create the other names, paste the formula into each new name’s Refers-to box, then edit the column letter in the first argument as needed.)
Taking the Next Steps
Now that the names have been set up, your formulas can reference the PivotTable database just as they reference other Excel-Friendly databases, as these examples show:
- Gray Cell Table: =SUM(Amount)
- Excel Table: =SUM(Table1[Amount])
- Pivot Table: =SUM(Sales.Amount)
But, of course, you’ll typically need to use more spreadsheet functions than merely SUM. So that’s what I cover in the last article in this series: The Two Functions You MUST Know to Return Values from Excel Tables.