Excel Aggregation Functions
Use Excel's SUMPRODUCT to Summarize Worksheet Data
Excel's SUMPRODUCT function offers much of the power of Excel array formulas, but without special treatment.
After I posted The Most Powerful Ways to Summarize Excel Data for Reporting and Analysis,
a reader asked why I hadn't discussed the SUMPRODUCT function.
I told him that I've played with it from time to time, but it seemed
to have some problems. "Take another look," he replied. "It can do
everything that you discuss in the array article."
So I did take another look, and he's right.
Rather than directly explaining what I learned, I thought it would be
useful also to explain the process I followed to learn more about
SUMPRODUCT. This might help when you have other questions about
Excel.
Getting Started With SUMPRODUCT
This figure shows the database of sales data I used in the
array article.
To start my SUMPRODUCT tests I opened the workbook that contains this
figure. I selected an empty cell. And then I did what I always do when
I'm unsure about how to use a function:
In the formula bar, I type the equal sign, the function name,
and the left parenthesis, like this...
Then I pressed Ctrl+A to launch the Function Arguments dialog shown
below.
Tip: After you type the left parenthesis that follows a
function, you also can press Ctrl+Shift+A. When you do so, Excel
completes the function with the names of its arguments.
To
illustrate, if you type...
=pv(
...into your formula bar and press Ctrl+Shift+A your formula bar
will contain:
=pv(rate,nper,pmt,fv,type)
Press Enter to enter the formula into your cell. Excel returns
a#NAME? error value, but that's not a problem. Just replace the
name of each
argument with the appropriate range names, cell references, or
data.
Note that some functions, including SUMPRODUCT, use "..." as
one of the arguments. When they do so you must remove those
characters before you can enter those functions. That is, you
must change...
=sumproduct(array1,array2,array3,...)
...to...
=sumproduct(array1,array2,array3)

In this case, after pressing Ctrl+A, I got this dialog:
One of the most useful features of the Function Arguments dialog is the link in
its bottomleft corner. The link offers a quick way to
launch the help topic about the current function.
In the case of the SUMPRODUCT function, however, the help topic wasn't very helpful. It explained that the
function could be used to find the sum of the products of several
columns.
The article explained that the function uses this syntax:
SUMPRODUCT(array1,array2,array3,...)
In the figure, for example, this formula...
=SUMPRODUCT(Units,Units)
...finds the sum of the squares of the Units range.
That information isn't very useful. But maybe we could take this function at least one step
further.
Beginning to Take SUMPRODUCT to the Next Level
One of the examples in The Most Powerful Ways to Summarize Excel Data for Reporting and Analysis was:
{=SUM(IF((Clients="Smith")*(Colors="Blue"),Units,0))}
This formula returns the total number of blue products sold to Smith.
Logically, we should be able to do the same thing with the SUMPRODUCT
function. So I entered this formula in an empty cell:
=SUMPRODUCT(Clients="Smith",Colors="Blue",Units)
When used in an array, an argument like Clients="Smith"
causes Excel to create a temporary list of TRUE and FALSE values in RAM.
(Here, for each cell in the Clients range, Excel creates a list
of whether the contents of the cell equals Smith.) This list is
created in array formulas and it's also created with the SUMPRODUCT
function, which expects array arguments.
As
needed, Excel usually converts those values of TRUE and FALSE to 1 and
0. Therefore, I
expected this formula to multiply two arrays of ones and zeros by the
Units column, and return the sum of the resulting array, as shown in the
figure at the right.
To illustrate, row 8 in the figure at the beginning of this article
is the first row where Smith bought a blue product. In this table, that
row is represented by the marked row, which is the first with a nonzero
value in the rightmost column.
But unfortunately, my idea failed. The formula gave me a value of zero,
rather than 15. (The sum of the four nonzero numbers in this figure's
rightmost column is 15.)
After messing around with this formula for several minutes, I had an idea.
Usually, when we multiply by TRUE or FALSE, Excel
treats those values as 1 and 0, respectively. But at times, Excel
fails to make that adjustment correctly. Could that be happening here?
That's an easy thing to test. All we need to do is to coerce
the TRUE and FALSE values in the first two arrays into values of 1 and
0.
But wait a minute, you might be asking. What's coercion?
A Side Trip Into Coercion
Excel has the marvelous ability to change text that looks like a
number into the number itself. For example, Excel has no problem with
this formula:
="3"*5
Even though "3" is text, Excel knows exactly what this formula is
about, and correctly returns 15. (Coercion is an Excel invention. If you
began your spreadsheet career using Lotus 123, you might remember that
123 returned an error if you tried to multiply "3" times 5.)
Similarly, Excel can coerce a number into text:
="January " & 13
Here, Excel correctly returns "January 13".
And finally, Excel can coerce logical values into numeric values:
=0+TRUE returns the value of 1
=1*TRUE returns the value of 1
=0+FALSE returns the value of 0
=1*FALSE returns the value of 0
Taking SUMPRODUCT to the Next Level
So, to test whether Excel is having problems coercing TRUE and FALSE
to 1 and 0, I merely needed to force the coercion within the formula.
Therefore, I changed the SUMPRODUCT formula to:
=SUMPRODUCT(0+(Clients="Smith"),0+(Colors="Blue"),Units)
This formula correctly returned 15.
Admittedly, I wasn't thrilled with this version of the formula.
Having to add zero to every logical test seems rather kludgy.
(Kludge: 1.
A system, especially a computer system, that is constituted of poorly
matched elements or of elements originally intended for other
applications. 2. A clumsy or inelegant solution to a problem.)
Even
so, this solution seemed to work until I tried it with a table like the
one shown here.
Here, I defined three range names:
Clients 
=$B$4:$B$10 
Colors 
=$C$4$C$10 
Data 
=$D$4:$G$10 
Then I entered this formula:
=SUMPRODUCT(0+(Clients="Smith"),Data)
I expected to get an answer of 4. But instead, I got #VALUE!.
Again, I tried a variety of workarounds. But they all gave me the
same result.
Finally, I
posted a question on the Microsoft.Public.Excel newsgroup.
Two kind souls quickly gave me the answer.
Taking SUMPRODUCT to the Final Level
This formula works with the figure above:
=SUMPRODUCT((Clients="Smith")*Data)
So does this formula:
=SUMPRODUCT((Clients="Smith")*(Colors="Blue")*Data)
Notice that there isn't a comma in either of these formulas. That is,
to replace SUMIF formulas with a SUMPRODUCT version, the correct syntax
is:
=SUMPRODUCT(array_product)
Not only does this version work, it's cleaner. We can forget about
the commas and (usually) the coercion.
This approach also allows us to count the number of items. To do so,
we sum the product of all of the tests. To illustrate, this formula
returns the number of blue items sold to Smith:
=SUMPRODUCT((Clients="Smith")*(Colors="Blue"))
This version works only if we have two or more tests. However, if we
want to use SUMPRODUCT to find the number of items sold to Smith, we
must coerce the test like this:
=SUMPRODUCT(0+(Clients="Smith"))
Finally, out of curiosity, I did try one other version of the formula:
=SUMPRODUCT(IF((Clients="Smith")*(Colors="Blue"),Units,0))
But this version returned a #VALUE! error. To get it to work I had to
arrayenter the formula. Doing so should not have been necessary.
This final problem looks like a bug to me. But it probably never will
be fixed, because there is seldom a reason to use it.
