An Excel Tutorial
How to Create Normal Curves With Shaded Areas In Classic Excel
You can create shaded areas in Excel charts to specify areas of special interest. Here, we show you how to do this using normal curves.
In
An Introduction to Excel's Normal Distribution Functions I
presented several figures somewhat like the one below. This
article explains how to create the figures in Classic Excel. If you
use Excel 2007 or after, you should read
How to Create Normal Curves with Shaded Areas in
New Excel.
Even
if you have no particular reason to chart a normal curve, you
might find the techniques interesting. This is because you might
need to use similar techniques when you create other charts.
One general technique I used in this workbook was to set up
two worksheets, a Control worksheet and a Data worksheet.
Let's address each of these separately.
First, however, you might wonder how Excel could produce the
colors used for this figure.
This
link explains how to get the free workbook that uses these
colors and the free workbook that I explain how to create below.
Set Up the Control Worksheet
In general, you create a normal curve just as you create any
other chart in Excel: You set up the data and then chart it.
To
set up the data, open a new workbook with at least two
worksheets. Name one of these Control and the other Data.
In the Control worksheet, first set up the area shown here.
After you enter the labels in column A, assign them as range
names in column B. To do so, select the range A2:B8 and choose
Insert, Name, Create. Ensure that only Left Column is checked,
then choose OK. Do the same for the range A11:B14.
The numbers in the Settings section are values; enter them as
shown. Here's a short explanation of each:
Mean. The average value of your data.
StdDev. The Standard Deviation of your data.
NumRows. The number of rows of data to be plotted.
Zmin. The smallest standard deviation to be plotted.
Zmax. The largest standard deviation to be plotted.
PctClear. The percentage of the unshaded area from the left side
of the curve, from .0001% to 99.999%.
PctShade. The percentage of the shaded area from the left side
of the curve, from .0001% to 99.999%.
The numbers in the Calculations section are returned by
formulas:
ShadeLeft =NORMSINV(PctClear)*StdDev+Mean
NORMSINV is the inverse of the NORMSDIST function. Given the
probability that a variable is within a certain distance of the
mean, it finds the z value...the number of standard deviations.
To calculate the extent of the shade at its left side, we
multiply the NORMSINV result by the amount of the standard
deviation, then add the mean.
ShadeRight =NORMSINV(PctShade)*StdDev+Mean
We use the same logic with the PctShade area as for the
PctClear area. By doing so, we define the right side of the
shaded area.
CurveMin =Zmin*StdDev+Mean
This formula defines the left side of the curve.
CurveMax =Zmax*StdDev+Mean
This formula defines the right side of the curve.
Set Up the Data Sheet
This
figure shows the top of the Data sheet. Let's take each of these
columns in turn:
The Sequence value in column A is a counter, from 1 to the
number of rows of data. The counter is for convenience only;
it's not used in the formulas. In this example, I used 100 rows.
Enter 1 in cell A2, then select the column from cell A2
through the bottom of your spreadsheet. To do so, hold down Ctrl
and Shift; and press the Down arrow key. To enter the values,
choose Edit, Insert, Series; enter 100 as the stop value in the
Series dialog; then choose OK.
Column B returns the Z values from the lowest value to the
highest value:
B2: =Zmin
B3: =(ZmaxZmin)/(NumRows1)+B2
Column C calculates the chart's X value for each Z value:
C2: =B2*StdDev+Mean
Column D calculates the chart's Y value:
D2: =NORMDIST(C2,Mean,StdDev,FALSE)
Column E calculates the Y values for the area to be shaded:
E2: =IF(C2>ShadeRight,NA(),IF(C2<ShadeLeft,NA(),D2))
Copy cell B3 and the formulas in the range C2:E2 down their
columns to the last sequence number, in row 101.
To complete this step, assign the labels in row 1 of the Data
sheet to the data beneath those labels. To do so, first select
the range A1:E101. Choose Insert, Name, Create. In the Create
Names dialog, make sure only Top Row is checked; then choose OK.
Set Up the Chart
You will use one data series to generate the normal curve and
another data series to display the shaded area.
To set up the chart of the normal curve, select the range
C2:D101. Choose Insert, Chart, or click the Chart Wizard icon.
Under Chart Type, choose XY (Scatter). Under Chart SubType,
choose "Scatter with data points connected by smooth Lines
without markers." Then choose Finish.
If you click on the curve, your formula should look something
like this:
=SERIES(,Data!$C$2:$C$101,Data!$D$2:$D$101,1)
You'll find this formula easier work with if you convert its
cell references to range names. To do so, edit the formula to:
=SERIES(,NormalCurve.xls!X,NormalCurve.xls!Y,1)
Here, NormalCurve.xls is the name of my workbook. Your SERIES
formula could have a different name.
At
this point, your chart should look something like the one shown
here.
To set up the data series for the shaded area beneath the
curve, it's easiest to copy the first data series and then
modify it.
To copy the data series, first select the curve in the chart.
Then select the SERIES formula in the formula bar. Copy this
formula by pressing Ctrl+C. Then press Esc to deselect the
formula.
To paste the data series, first click anywhere on the edge of
the chart object. After you do so, the SERIES formula will
disappear and the words "Chart Area" will appear at the far left
side of your formula bar. To paste the formula, click in the
formula bar area and then press Ctrl+V.
After
you paste the SERIES formula, your chart will look something
like this.
At this point, the chart displays two identical data series,
Series 1 and Series 2. Because we want the curve to be shown on
top of the shaded area, we'll modify Series 1 to display the
shading, and we'll leave Series 2 undisturbed.
To select Series 1, first click on the curve. Then look in
your formula bar to see which series you've selected. If you see
Series 2, press your Down Arrow key once to select Series 1.
With Series 1 displayed in your formula bar, change the
formula so that it looks like one of these two versions:
=SERIES(,Data!$C$2:$C$101,Data!$E$2:$E$101,1)
=SERIES(,NormalCurve.xls!X,NormalCurve.xls!Area,1)
You'll use the second version if you have assigned range
names as I suggested above.
At this point, your chart will look the same as it did
before. It's now time to change that condition.
With Series 1 selected, choose Format, Selected Data Series.
In the Y Error Bars tab, choose to display the Minus version of
the error bar, and set the Error Amount to a Percentage value of
100%. Then press OK.
Here's
your initial result. It doesn't look very good, but at least
we're heading in the right direction.
We next need to change the format of the error bars in two
ways.
First, we need to make them thicker, which will eliminate the
banding in the shaded area, and we typically will change the
color of the shaded area.
To format the error bars, first click on the errorbar area
in your chart. Choose Format, Selected Error Bars. In the
Patterns tab, modify the Line settings. Set its color to any
color you prefer. And set the line's weight to its thickest
setting. Then choose OK.
After
you make these changes, your chart should look something like
this.
Second, you need to do is to clean up the chart slightly.
To eliminate the legend at the right, select the Legend
object within the Chart Area, and then press Delete.
To remove the shading in the Plot Area, first select the Plot
Area. Choose Format, Selected Plot Area and set the Area shading
to None. Then choose OK.
Your chart now will look like this the next one below.
If you want to format the chart so that it looks more like
the figure at the beginning of this article, you need to take
several additional steps.
First,
you assign a color pattern to the Chart Area, and remove its
border. To do so, select the Chart Area. Choose Format, Selected
Chart Area, in the Patterns tab choose a color for the Area and
choose None for both the Border.
If you want to remove the gridlines, select the chart. Choose
Chart, Chart Options. In the Gridlines tab, deselect all axes.
Then choose OK.
If you want to remove the border, select the Plot Area.
Choose Border, Selected Plot Area. Set the Border pattern to
None, then choose OK.
Finally, to assign a number format to the Y axis, first
select the axis. Then choose Format, Selected Axis. In the
Number tab choose the Custom category and type in this number
format: #,###.00. Then choose OK.
Your
chart should look something like the one shown here.
Now, you merely need to adjust its size as needed.
As I mentioned at the beginning of this article,
use
this link to download a free copy of the workbook
described here.
Further Information About Charting Normal Curves
I'd like to thank Jon Wittwer, PhD, for introducing me to
this method for shading curves. He understands a lot more about
techniques for statistical analysis than I do. In
Graphing a Normal Distribution Curve in Excel, he provides a
brief description of this method, and provides a download file
with additional examples.
