After I posted The Most Powerful Ways to Summarize Excel Data for Reporting and Analysis, a reader asked why I hadn’t discussed the SUMPRODUCT function.
I told him that I’ve played with it from time to time, but it seemed to have some problems. “Take another look,” he replied. “It can do everything that you discuss in the array article.”
So I did take another look, and he’s right.
Rather than directly explaining what I learned, I thought it would be useful also to explain the process I followed to learn more about SUMPRODUCT. This might help when you have other questions about Excel.
Getting Started With SUMPRODUCT
This figure shows a database of the sales data I used in the array article.
To start my SUMPRODUCT tests I opened the workbook that contains this figure. I selected an empty cell. And then I did what I always do when I’m unsure about how to use a function:
In the formula bar, I type the equal sign, the function name, and the left parenthesis, like this…
Then I pressed Ctrl+A to launch the Function Arguments dialog shown below.
Tip: After you type the function name and the left parenthesis, try pressing Ctrl+Shift+A. When you do so, Excel completes the function with the names of its arguments. To illustrate, if you type…
=pv( …into your formula bar and press Ctrl+Shift+A your formula bar will contain: =pv(rate,nper,pmt,fv,type) Press Enter to enter the formula into your cell. Excel returns a#NAME? error value, but that’s not a problem. Just replace the name of each argument with the appropriate range names, cell references, or data. Note that some functions, including SUMPRODUCT, use “,…” as the last argument. When they do so you must remove those characters before you can enter those functions. That is, you must change… =sumproduct(array1,array2,array3,…) |
In this case, after pressing Ctrl+A, I got this dialog:
Whenever I’m unsure about how to use a function, I follow a similar approach. I type in a function, enter the left parenthesis, and then press Ctrl+A.
One of the most useful features of the Function Arguments dialog is the link in its bottom-left corner. The link offers a quick way to launch the help topic about the current function.
In the case of the SUMPRODUCT function, however, the help topic wasn’t very helpful. It explained that the function could be used to find the sum of the products of several columns. The article explained that the function uses this syntax:
SUMPRODUCT(array1, array2, array3,…)
In the figure, for example, this formula…
=SUMPRODUCT(Units,Units)
…finds the sum of the squares of the Units range.
That information isn’t very useful. But maybe we could take this function at least one step further.
Beginning to Take SUMPRODUCT to the Next Level
One of the examples in The Most Powerful Ways to Summarize Excel Data for Reporting and Analysis was:
{=SUM(IF((Clients=”Smith”)*(Colors=”Blue”),Units,0))}
This formula returns the total number of blue products sold to Smith. Logically, we should be able to do the same thing with the SUMPRODUCT function. So I entered this formula in an empty cell:
=SUMPRODUCT(Clients=”Smith”,Colors=”Blue”,Units)
When used in an array, an argument like Clients=”Smith” causes Excel to create a temporary list of TRUE and FALSE values in RAM. (Here, for each cell in the Clients range, Excel creates a list of whether the contents of the cell equals Smith.) This list is created in array formulas and it’s also created with the SUMPRODUCT function, which expects array arguments.
As needed, Excel usually converts those values of TRUE and FALSE to 1 and 0. Therefore, I expected the formula above to multiply two arrays of ones and zeros by the Units column, and return the sum of the resulting array, as shown in the figure at the right.
To illustrate, row 8 in the figure at the beginning of this article is the first row where Smith bought a blue product. In this table, that row is represented by the marked row, which is the first with a non-zero value in the right-most column.
But unfortunately, my idea failed. The formula gave me a value of zero, rather than 15. (The sum of the four non-zero numbers in this figure’s right-most column is 15.)
After messing around with this formula for several minutes, I had an idea. Usually, when we multiply by TRUE or FALSE, Excel treats those values as 1 and 0, respectively. But at times, Excel fails to make that adjustment correctly. Could that be happening here?
That’s an easy thing to test. All we need to do is to coerce the TRUE and FALSE values in the first two arrays into values of 1 and 0.
But wait a minute, you might be asking. What’s coercion?
A Side Trip Into Coercion
Excel has the marvelous ability to change text that looks like a number into the number itself. For example, Excel has no problem with this formula:
=”3″*5
Even though “3” is text, Excel knows exactly what this formula is about, and correctly returns 15. (Coercion is an Excel invention. If you began your spreadsheet career using Lotus 1-2-3, you might remember that
1-2-3 returned an error if you tried to multiply “3” times 5.)
Similarly, Excel can coerce a number into text:
=”January ” & 13
Here, Excel correctly returns “January 13”.
And finally, Excel can coerce logical values into numeric values:
=0+TRUE returns the value of 1
=1*TRUE returns the value of 1
=0+FALSE returns the value of 0
=1*FALSE returns the value of 0
Taking SUMPRODUCT to the Next Level
So, to test whether Excel is having problems coercing TRUE and FALSE to 1 and 0, I merely needed to force the coercion within the formula.
Therefore, I changed the SUMPRODUCT formula to:
=SUMPRODUCT(0+(Clients=”Smith”),0+(Colors=”Blue”),Units)
This formula correctly returned 15.
Admittedly, I wasn’t thrilled with this version of the formula. Having to add zero to every logical test seems rather kludgy.
(Kludge: 1. A system, especially a computer system, that is constituted of poorly matched elements or of elements originally intended for other applications. 2. A clumsy or inelegant solution to a problem.)
Even so, this solution seemed to work until I tried it with a table like this one.
Here, I defined three range names:
Clients =$B$4:$B$10
Colors =$C$4$C$10
Data =$D$4:$G$10
Then I entered this formula:
=SUMPRODUCT(0+(Clients=”Smith”),Data)
I expected to get an answer of 4. But instead, I got #VALUE!.
Again, I tried a variety of workarounds. But they all gave me the same result.
Finally, I posted a question on the Microsoft.Public.Excel newsgroup. Two kind souls quickly gave me the answer.
Taking SUMPRODUCT to the Final Level
This formula works with the figure above:
=SUMPRODUCT((Clients=”Smith”)*Data)
So does this formula:
=SUMPRODUCT((Clients=”Smith”)*(Colors=”Blue”)*Data)
Notice that there isn’t a comma in either of these formulas. That is, to replace SUM-IF formulas with a SUMPRODUCT version, the correct syntax is:
=SUMPRODUCT(array_product)
Not only does this version work, it’s cleaner. We can forget about the commas and (usually) the coercion.
This approach also allows us to count the number of items. To do so, we sum the product of all of the tests. To illustrate, this formula returns the number of blue items sold to Smith:
=SUMPRODUCT((Clients=”Smith”)*(Colors=”Blue”))
This version works only if we have two or more tests. However, if we want to use SUMPRODUCT to find the number of items sold to Smith, we must coerce the test like this:
=SUMPRODUCT(0+(Clients=”Smith”))
Finally, out of curiosity, I did try one other version of the formula:
=SUMPRODUCT(IF((Clients=”Smith”)*(Colors=”Blue”),Units,0))
But this version returned a #VALUE! error. To get it to work I had to array-enter the formula. Doing so should not have been necessary.
This final problem looks like a bug to me. But it probably never will be fixed, because there is seldom a reason to use it.