Managing Excel Data
How to Use SUMPRODUCT to Find the Last Item in an Excel List
When you have a list with repeated items, here's how you can use SUMPRODUCT to find information about the last occurrence of any item in the list.
Suppose you have a Sequential List of items, as in the second table
here. And suppose you want to show the date of the first and last entry
in the list.
How would you do it?
This question isn't limited to dates, by the way. The Sequential List
could display numbers or text instead.
How would you write the formulas used in the range B4:C6 to return the
dates of the first and last items?
Many Excel users quickly could write a formula that returns the first
date for each item in the list. They would use an INDEXMATCH formula
like this for the cell shown:
B4: =INDEX(Date,MATCH(A4,Item,0))
If you've not used INDEXMATCH formulas before, search this site for
INDEX and MATCH to find many pages that describe this very useful
technique.
However, INDEXMATCH won't help us to find the last occurrence of
each item in the list. This is because MATCH (with a matchtype
equal to zero) returns the first item in an unsorted list. There's no
version that returns the last item.
However, as the title of this short article suggests, Excel's
SUMPRODUCT function provides just the power we need. Here's the formula
for the cell shown:
C4: =INDEX(B:B,SUMPRODUCT(MAX((Item=A4)*ROW(Item))))
Even if you use SUMPRODUCT a lot, you might find it difficult to
understand this formula at first glance. So let's take it one step at a time...
SUMPRODUCT
works with arrays of data. But it's not limited to mere multiplication,
as the "PRODUCT" part of the name implies. In fact, here's how I tend to
think of SUMPRODUCT:
SUMPRODUCT = (do stuff as though the formula were entered as an
array, and then return the results)
I know, that's not a very scientific explanation, but it's not a bad
rule of thumb.
Looking inside the SUMPRODUCT formula above, let's start with:
(Item=A4)
Here, the function compares the value of each cell in the Item range
(cells A10:A19) to the text in cell A4 ("Hats"). In its memory, Excel
sets up an array with TRUE where an Item equals "Hats" and FALSE where
it doesn't.
Now consider: (Item=A4)*ROW(Item)
Here, I've told the function to multiply the row number for each item
by the corresponding value in the TRUE/FALSE array.
Because TRUE evaluates as 1 in a formula, and FALSE as 0, this
product returns an array of numbers. Most of the numbers equal zero, where
the Item doesn't equal "Hats". But where an Item does equal "Hats", the
array contains the row number where the item is found.
The next step: MAX((Item=A4)*ROW(Item))
This is simple. MAX returns the value for the largest row number in
the array of numbers and zeros,
which is where "Hats" occurs last in the list.
And the next step: SUMPRODUCT(MAX((Item=A4)*ROW(Item)))
SUMPRODUCT is merely the function that tells Excel to handle the
earlier calculations as arrays. We use this section of the formula to
return one simple number: The number of the row in your worksheet where
"Hats" appears last.
Here's the whole formula again:
C4: =INDEX(B:B,SUMPRODUCT(MAX((Item=A4)*ROW(Item))))
This formula now should start to make more sense. There's only one
step left to go:
INDEX(B:B, [the last row number for "Hats"])
Here, INDEX is looking at all of column B in the spreadsheet. Among
other information, column B includes the date information in the
Sequential List. INDEX returns the date found in the row number
specified by the SUMPRODUCT function. And that row number is where the
last occurrence of "Hats" can be found.
Finally, you might consider a variation of the formula, a variation that
eliminates one risky aspect of this approach: The formula shown above
for cell C4 references an entire column.
Referencing an entire column is a problem in this case because if you
cut and paste the Sequential List to some other location, the formula
still will reference column B, but the list won't be in it. And the
formula will fail.
A modified formula corrects this problem:
C4: =INDEX(Date,SUMPRODUCT(MAX((Item=A4)*ROW(Item)))ROW(Date)+1)
Rather than referencing column B, this formula references the Date
range. To get the index value we need for this range, we subtract the
row number of its first row, and then add 1. (Because the Item and Date
ranges always must be in the same rows, we also could have used the
first row number of the Item range.)
Enter the formula in cell C4 and then copy the range B4:C4 down two rows.
When you do so, the new formulas will return the first and last occurrences of Shoes and Socks, respectively.
