Data Management
Five Ways to Calculate Frequency Distributions in Excel
Although Excel's FREQUENCY function was designed to calculate frequency distributions, you also can use the SUMIF, SUMPRODUCT,
INDEXFREQUENCY, and COUNTIFS functions. Here's a summary of the methods and your options.
Walt C. captures bloodpressure readings and wants to find how often the readings fall into
various ranges of values.
This is a common need for a wide
variety of data.

In sales, you might want to
find the distribution of the number of products
purchased per order.
 In accounting, you might want to find the distribution of the ages of all
outstanding invoices.
 In manufacturing operations, you might want to find the distribution of production
times for a particular assembly.
This figure shows my simplified view of Walt's data. As I usually do, I added the gray
borders shown in rows 2 and 13, and I anchored the formulas that look at this data in those
empty rows. By doing so, I always can insert new rows of data between the shaded rows.
I also created my own artificial
data in column C. To do so, I entered this formula in the cell shown and copied it down the
column of data:
C3: =INT(RAND()*70+50)
This formula returns a random integer between 50 and 119.
This table shows the frequency distribution for the data above. That is,
there is one item that is 60 or less; three items from 81 through 85, and so on.
I didn't include border rows in this figure, because the FREQUENCY function, which I'll
describe shortly, needs to reference only the actual Bin data. This function would give
different answers if it were to reference empty border rows.
After I set up this data I assigned range names to the Value and Bin ranges. To do so in
any version of Excel:
 Select the range C1:C13; press Ctrl+Shift+F3; make sure that only Top row is
checked; then choose OK.
 Select the range C16:C24 and follow the same approach.
I assigned these range names for two reasons. First, replacing cell references with
names makes my formulas much easier to understand. Second, using range names reduces the
chance that I'll reference an incorrect range in a formula.
With these preparations out of the way, I thought of five ways to calculate the counts
in column D. Although Walt only needed to use one of these, it's useful to have the other
methods in the back of your mind.
1. Excel's FREQUENCY Function, with the Standard Method
Take a look at Excel's help topic for the FREQUENCY function. To do so, type...
=frequency(
...into a cell, and then press Ctrl+A to launch the Function Arguments dialog. The
bottomleft corner of this dialog has a link titled Help on this function. When you
click that link, Excel launches the help topic about the function.
The FREQUENCY help topic explains that the function has this syntax:
FREQUENCY(data_array,bins_array)
Here, data_array is the data in the range C2:C13, and the bins_array is
the range C18:C25. The topic explains that because FREQUENCY returns an array, you would
need to select the range D18:D25, type in...
=FREQUENCY(Value,Bin)
...and then press Ctrl+Shift+Enter, to enter the formula as a multicell array.
The problem with this approach is that you can't modify the number of bins in your
distribution easily. To illustrate, if you select row 21 and then try to insert (Ctrl++) or
delete (Ctrl+) a row, Excel launches an error dialog that says, "You can not change part of
an array."
However, we can use a different approach with the FREQUENCY function.
2. Excel's FREQUENCY Function, with INDEX
We know that the FREQUENCY function returns an array. But consider the syntax for the
INDEX function:
=INDEX(array,row_num,column_num)
That is, INDEX accepts arrays. To illustrate, this formula...
=INDEX(FREQUENCY(Value,Bin),6)
...entered
normally (not arrayentered) returns the value 3. This corresponds with the value in
cell D23 in the results table repeated here.
Therefore, we can enter this formula in the cell shown...
D17: =INDEX(FREQUENCY(Value, Bin),$A17)
...and copy the formula down the column as needed. (You enter the formula in one line,
of course.)
The value of this approach is that if you want to adjust your distribution of bins by
inserting or deleting rows, you can do so easily.
3. Excel's SUMIF Array Formulas
After I first set up the counts using method 1, I wanted to find a quick way to check
the results. The SUMIF approach came immediately to mind.
To use this approach in the figure above, arrayenter these formulas in the cells shown:
D17: =SUM(IF((Value<>"")*(Value<=$C17),1,0))
The first test checks whether the source Value contains an empty cell. For each cell
that's not empty, the test returns TRUE, otherwise it returns FALSE. The second test returns
TRUE when values in the source data are less than or equal to the amount in cell C17.
Multiplying the tests returns an array of TRUEs and FALSEs that is TRUE whenever both test
results are TRUE.
The rest of the IF function returns an array of 1s and 0s; and the SUM function returns
the sum of these results.
Be sure to arrayenter this formula. That is, with only cell D17 selected, type in the
formula shown and then arrayenter the formula by pressing Ctrl+Shift+Enter. Follow the same
approach for the two formulas that follow.
D18: =SUM(IF((Value>$C17)*(Value<=$C18),1,0))
When the two test results are multiplied in this array formula, we get an array with
TRUE whenever numbers in the VALUE range need to be counted, the IF replaces the TRUE values
with 1s, and the SUM returns the sum of those 1s. After you arrayenter this formula, copy
it to the range D19 through D23.
D24:
=SUM(IF((Value>$C23),1,0))
Array enter this simple formula.
4. Excel's SUMPRODUCT Function
Excel's SUMPRODUCT function works much like the SUMIF approach. But it doesn't need to
be array entered when you use the syntax I'll demonstrate. (SUMPRODUCT does need to be
arrayentered if you use other syntax, however.)
To use this approach, enter the following formulas in the cells shown:
D17: =SUMPRODUCT((Value>0)*(Value<=$C17))
D18: =SUMPRODUCT((Value>$C17)*(Value<=$C18))
D24: =SUMPRODUCT((Value>$C23)+0)
At this point only the formula in cell D24 needs explanation. The test in this formula
returns TRUE and FALSE, not 1 and 0. Before the results can be summed, they need to be
convereted to 1 and 0. By adding zero to those results, Excel converts TRUE and FALSE to 1
and 0, as required. We would get the same result if we multiplied the test by 1, using this
formula:
D24: =SUMPRODUCT((Value>$C23)*1)
As with the SUMIF approach, the first and the last formulas need to be different from
the interior formulas. Copy cell D18 down the column to cell D23.
5. Excel's COUNTIFS Function
New Excel (Excel 2007) adds a fifth way to find the same results, the COUNTIFS function.
Classic Excel offers the
SUMIF and
COUNTIF functions, which accept only one test
criteria. But the SUMIFS and
COUNTIFS functions accept any number of criteria. They use this
syntax:
=COUNTIFS(criteria_range1,criteria1,criteria_range2,criteria2,...)
=SUMIFS(sum_range,criteria_range1,criteria1,criteria_range2,criteria2,...)
To use the COUNTIFS function in New Excel, enter these formulas for the cells shown:
D17: =COUNTIFS(Value,"<>0",Value,"<="&Bin)
D18: =COUNTIFS(Value,"<="&Bin,Value,">"&$C17)
D24: =COUNTIFS(Value,">"&$C23)
As before, copy cell D18 down the column through cell D23.
Which Version Should You Use?
There's no good answer to this question. Each version has advantages over the other
versions. Each version can be used in circumstances where the other versions probably
wouldn't be recommended.
However, if you're somewhat familiar with them all, you'll recognize what's happening
when you read workbooks created by others. And you'll have alternatives when you're not able
to make one approach work.
