You can raise your Excel productivity significantly and slash your Excel errors—if you keep your data in an Excel database.
Microsoft, Oracle, IBM, and others sell REAL database products. So what’s an Excel database?
If all those other database products were big-rig diesel trucks of various capacities and designs, Excel’s database would be a two-seat Jeep.
Big-rig databases can haul a massive amount of data down the information highway, which Excel can’t do. But Excel can deliver small, customized payloads, nearly anywhere, quickly and inexpensively—which big-rig databases can’t do.
Wikipedia defines a database as, “an organized collection of data, generally stored and accessed electronically from a computer system.” And that’s exactly what we can set up in Excel.
An Excel database is typically a spreadsheet with rows and columns of data—organized and formatted in a way that allows spreadsheet formulas to use the data easily. But it also could be a Power Pivot Data Model, as you’ll see shortly.
The following figures show you how Excel can use its various databases. But if you’ve not read the short article, An Introduction to Excel Data Plumbing, you might want to start there.
Excel Database 1: Excel Tables
Excel Tables are the most-frequently used Excel database. Strategically, here’s how they typically should fit into your productive reporting system:
That is, Power Query retrieves, shapes, and cleans your source data and writes it to Excel Tables. Then, Excel Formulas retrieve, transform, and summarize that data for your reports, analyses, and so on.
Power Query was the name of the add-in that Microsoft created for Excel 2010 and 2013 to import data from just about anywhere into Excel. In 2016 and above, you’ll find its features in your Ribbon’s Data, Get & Transform Data and your Data, Queries & Connections groups. But most Excel users still call its features Power Query.
To make your reports, plans, and so on interactive, the Excel Formulas could reference control cells, which could be updated by using data validation lists or slicers.
This figure, from my Recovery Tracker dashboard and training, illustrates several of the charts produced by the data flows in the plumbing diagram above.
For these charts, the Source Data is the Federal Reserve Economic Database (FRED). Power Query downloads, shapes, and cleans the data, then writes it to two Excel Tables. As soon as the Tables are updated, the data flows through staging tables to 27 charts, four of which are shown here. (The red dotted line shows where each measure was in February 2020, just before the beginning of the Covid recession, which is shown by the gray shading.)
To see the full Recovery Tracker dashboard or other examples, just click the link or the chart figure.
Excel Tables have at least four significant advantages for your Excel work.
1. They store your data in rows and columns, which allows SUMIFS, COUNTIFS, and similar functions to aggregate your data easily.
2. Tables are self-naming, a feature that makes your Excel formulas become somewhat self-documenting.
3. We can use the controls in a Table’s header area to sort or filter the data easily. Among other benefits, this makes it easier to debug your SUMIFS formulas.
4. We can set up calculated columns in Excel Tables, calculations that our formulas can reference in our reports.
This is why Excel Tables are Excel’s most commonly used database.
Excel Database 2: Pivot Tables
It might surprise you to learn that you also can use Pivot Tables as your database. And when you do so, you can use any Excel worksheet function you can use with Tables—without needing to use the GETPIVOTDATA function at all.
Here’s the plumbing for a Pivot Table database:
For interactive reporting, you could use slicers to control both the Pivot Tables and the Normal Formulas in this diagram. So, for example, you could use a slicer with a Pivot Table to return only sales detail for all divisions to a Tabular Range. You also could use a validation list with your formulas to choose a specific division. In either case, that would allow you to report and analyze sales by division…perhaps by product, or region, or whatever.
Of course, you could use your pivots to choose both sales and divisions in this case. But by including division detail in a Tabular Range, your formulas would have access to data for other divisions. This would allow you to compare each division to a similar division, or to division totals, or whatever.
However, to use Pivot Tables in this way, you must…
- Set up your pivot with a Tabular layout, as shown at the right.
- Use dynamic range names to reference the columns of your table.
My Excel productivity training explains this process.
Notice in the preceding plumbing diagram that we can use pivots with Data Models.
The Power Pivot Data Model offers an alternate location to store data in our workbooks. One advantage for doing so is that—theoretically—there’s no limit to the number of rows of data we can store in a Data Model. Another advantage is that the Data Model compresses its data, so that it requires less space in a saved file, and in memory.
However, Microsoft gives us only two ways to retrieve a Data Model’s data in Excel. We can use Pivot Tables, as the previous plumbing diagram shows, or we can use CUBE functions.
Excel Database 3: The Data Model with CUBE Functions
When we use the Power Pivot Data Model as our database we can set up formulas to return our data to either Excel Tables or directly to our reports, analyses, and so on, as you can see here:
However, to use formulas to retrieve data from the Data Model, we only can use formulas that rely on any of Excel’s seven CUBE functions: CUBEMEMBERPROPERTY, CUBESET, CUBEMEMBER, CUBERANKEDMEMBER, CUBESETCOUNT, CUBEKPIMEMBER, and CUBEVALUE.
That’s a challenge, because Excel’s CUBE functions are weird. They’re like no other function in Excel—which is why they’re challenging to learn. And that’s why I cover CUBE functions in my Excel Productivity training.
For interactive reporting, you can control CUBE Formulas just as you control Normal Formulas. That is, you can use validation lists or slicers to modify their control ranges.