If you aren’t careful, simple edits to your reports and analyses can cause significant errors.
Years ago, for example, I heard about an expensive error when I was writing for LOTUS magazine, before Excel was invented.
A contractor was preparing a bid using Lotus 1-2-3. At the last minute, he realized that he had overlooked a $100,000 expense. So he inserted a row in his list of costs, entered the expense, recalculated, and submitted his bid.
Unfortunately, he made the mistake shown above. When he inserted the new row he forgot to modify the SUM function at the bottom of the column. Here, the 100 looks like it’s included in the total, but it’s not.
Because the contractor won the bid but lost his shirt, he tried to sue Lotus. But the judge tossed the case out of court. Lotus couldn’t be responsible for their users’ stupid mistakes, the judge ruled.
This article describes three ways to protect yourself against errors like these. One uses an automatic setting in Excel; another uses a range-name trick that you’ve probably never seen before; and one uses very convenient spreadsheet formatting.
Using Excel’s Automatic Solution
Suppose you have a spreadsheet with this display. The formula in cell B8 is:
=SUM(B4:B6)
You could have arrived at this point by inserting row 7, as shown. Or you could have entered the numbers and formula shown in the range B4:B8.
When you enter a value cell B7, Excel can automatically modify the SUM formula to include that new value. To set up Excel to do this, choose Tools, Options. In the Edit tab, make sure there’s a check in the checkbox titled “Extend data range formats and formulas.”
This behavior has many quirks. Here are several:
- The original SUM function must include at least three cells in its range. That is, if the formula in cell B8 were =SUM(B5:B6), Excel would not modify the SUM formula when you enter a value in cell B7.
- This behavior works if the SUM formula is below a column or to the right of a row of data. It does not work if it’s above the column or to the left of a row.
- Excel modifies the SUM formula if it is within 20 cells of the new data. In the figure, for example, Excel will update a SUM formula in cell B27, but not in cell B28.
These aren’t all the quirks, but they’re enough to suggest that alternate approaches would be useful.
Using the ‘NextUp’ Relative Reference
Let’s take a look at this figure again. The original formula in cell B7 is =SUM(B4:B6). We can change that formula so that the entire problem goes away.
To see how this works, I need to explain two aspects of Excel range names.
First, Excel names can use absolute or relative references. Typically, we use absolute references. That is, we could define MyCell as =Sheet1!$A$1.
But we also could define a name using relative references. To illustrate, assume that cell B5 is the active cell. We could define the name NextUp as =Sheet1!B4. That is, cell B4 is the next cell above cell B5, the active cell.
Because NextUp uses a relative reference, the name is defined with reference to the active cell, and that relative reference applies to any cell in which the name is used. For example, if NextUp is used in cell M50 the name would refer to cell M49.
There’s one significant problem with this definition of NextUp, however: It’s defined in terms of Sheet1. Therefore if cell D5 of Sheet3 is active, NextUp will reference cell D4 of Sheet 1.
That won’t do at all.
To get around this problem, we must change the way that NextUp is defined. In addition to using a relative cell reference, we also must use a relative sheet reference. Again, assuming that cell B5 is the active cell, we define NextUp as:
=!B4
(Because relative sheet references are uncommon, you should know one unique characteristic of them. A name using a relative sheet reference and an absolute cell reference, like…
=!$A$3
…would refer to the specified cell in every active worksheet. Unlike normal cell references, this reference will not change if you insert rows above cell A3 or columns to its left, or if you cut and paste the referenced cell to a new location. That is, a name defined as shown always will reference cell A3 until you manually change its definition.)
With NextUp properly defined, we can use the name in our SUM formulas. In the figure, that is, we can change the SUM formula to be:
=SUM(B4:NextUp)
In fact, we should go one step further. We should anchor the SUM formula in the cell with the text above the column of numbers. That way, we could insert rows for new data above or below the column of numbers and ensure that the new numbers will be included in our SUM. That is, the SUM formula in cell B7 should be:
=SUM(B3:NextUp)
If there’s a chance that a date or some other numeric title would appear in cell B3, you could use the formula:
=SUM(B3:NextUp,-N(B3))
Here, the N() function returns the numeric value of cell B3. That is, it returns zero if the cell is a label, otherwise it returns the value.
Honestly, I seldom use this second approach, because I seldom include dates without a border row, as shown below. However, it’s useful to know that a solution exists if you ever need it.
After you enter one of the two formulas above, it’s always a good idea to check that the NextUp reference is working correctly. To do so, select the cell that contains the formula; copy the text of the reference (here, “B3:NextUp”); press the F5 function key to launch the Go To dialog box; paste the reference text into the Reference edit box; then press OK. After you do so, Excel should select the expected range.
Using Border Rows
This figure illustrates a simple approach that’s extremely useful when your display doesn’t need to look like it was printed in a magazine. More generally, I call this design a gray-cell table. And if it’s a large table, I call it a gray-cell database.
(When IT people make fun of me for calling it a database, I explain that I’m relying on the Wikipedia definition: A database is an organized collection of data, generally stored and accessed electronically from a computer system.)
In a gray-cell table, all summary formulas are anchored in the gray cells. For example, the formula in cell B8 is:
=SUM(B3:B7)
One useful feature of this approach is that it leaves no doubt about where to insert new data. You know that any data inserted between the gray borders always will be included in your summary formulas.
If you prefer to use range names in your formulas, rather than cell addresses, assign the names in row 2 to the areas bounded by the gray borders. That is, select the range B2:C7, choose Insert, Name, Create; ensure that Top Row is checked; then choose OK.
To assign the range names to formulas, choose Insert, Name, Apply; make sure that the names you want to apply are selected; then choose OK. This approach would change =SUM(B3:B7) to =SUM(Sales).
Wrapping Up
If the contractor who got into trouble using a Lotus spreadsheet had used any of these techniques with Excel, he probably would have saved himself some expense and embarrassment.
Perhaps they’ll save you some problems as well.