Years ago, I read that Prof. William S. Cleveland had suggested that data could be smoothed by calculating a centered trendline through adjacent data—a method I immediately wanted to try.
Back then, unfortunately, I quickly learned that Excel’s TREND function offers no practical way for us to tell it to ignore certain data. Specifically, TREND returns the #VALUE! error if null strings or FALSE values are included in one of its arguments.
And therefore, there was no practical way to force the TREND function to ignore periods that needed to be excluded from its calculation.
So I set the idea aside until Excel could deal with that issue.
I recently realized that we now have the power to do what I wanted years ago. Specifically, Excel 365 and 2021 now have a way for TREND—and any other function—to ignore data we want it to ignore. And that’s how I was able to create the following chart.
Here, the fat blue line is the average daily price of gasoline over the past seven-plus years. And the skinny orange line is the smoothed version of that line, using a 31-day centered moving trend.
As you can see, the smoothed line tracks the original line very closely. Notice particularly that the turning points of both the original and smoothed lines are about the same.
And if we go to an extreme with a two-year trend, the orange line shows a broad trend that looks very reasonable:
How to Calculate the Centered Moving Trend
I started with a table of data that I downloaded from the Federal Reserve Economic Database (FRED)—which is the largest source of free economic data in the world.
Here, the DataID is the key field for the gasoline prices that I decided to plot. Because I used Power Query to download the data, I could easily filter out the rows for weekends and holidays, which had no data. (This is why the numbers for the 31-day and 731-day trends in the charts refer to the number of days with data, not to the number of calendar days.)
Then I set up the Staging Table shown here:
Because the formulas in the rest of the Table are fairly ordinary, and because you can download the example workbook for yourself, I’ll concentrate on the formulas in column J, which calculate the centered moving trend.
The formulas in column J rely on two functions that Excel 365 and 2021 support, and one function that’s been in Excel from the very beginning:
LET( name1, name_value1, [name2], [name_value2], […],[…], result )
The LET function allows us to assign names to calculated results and constants within the function, and then use those named values farther to the right. And finally, for the last argument, we can return a previously named value or a final calculation.
FILTER( start_array, include_array, [if_empty] )
The FILTER function relies on the start_array and the include_array. The start_array is the array you want to filter. It can have any number of columns. The include_array is a single-column array. It can’t contain text values, and it must have the same number of rows as the start_array. And then FILTER returns a new array for all rows where the include_array did not contain zero, FALSE, or an empty cell.
FILTER also works with columns of values, using similar logic.
TREND( known_ys, [known_xs], [new_xs], [const] )
TREND, which has been in Excel forever, uses the least-squares method to find the trend through the known_ys array for each point in the known_xs array. Optionally, it returns the value from that trend for the new_xs value.
Here’s the formula that calculates the smoothed value for cell J3:
X, FILTER(Date, Include),
Y, FILTER(Value, Include),
Result, TREND(Y, X, $F3),
Here, I wrapped the arguments in the formula. You could do the same in your worksheet, or you could set up one long formula.
To begin, the LET function defines the name Include. The Include formula tests whether each date in the Date column is within a range that’s defined by the user-specified number of OffsetDays. So for the first chart on this page, the OffsetDays value was 15, and for the second chart, the OffsetDays value was 365.
Because TRUE * TRUE equals 1, and TRUE * FALSE equals 0, the Include array contains the value of 1 for all rows where values in the Date column are within the desired range, and 0 otherwise.
Next, the LET function defines the X array to be a one-column array for each row in the Date column—where Include had a non-zero value. Then LET defines the Y array similarly.
Finally, LET defines the Result value to be the number for the current date from the array of values in the trendline. And then LET returns that Result. It’s the “centered” value because the size of the Include array is defined by adding and subtracting the same number of rows.
However, the first and last values in the Val2 column are not “centered.” In the Table above, for example, because Offset was equal to 365, the first 365 values in the table were NOT centered. Nor were the last 365 values in the table.
To see why those values weren’t centered, think about cell J3 again. It calculates a trendline that begins with cell F3 and ends 365 rows later, and then cell J3 returns the first value from that trendline. Cell J3 can’t return the centered trend value because the table doesn’t contain those 365 historical values.
Similarly, in the last cell of the Val2 column, the TREND calculation can’t predict the future. And therefore, its smoothed value is from a trendline that contains only historical data.
Even so, when you look at this chart again…
…and at this chart…
…you probably wouldn’t have noticed that issue. If the accuracy of the endpoints of the smoothed line are really important, then you won’t be able to show smoothed data for the last number of rows specified by the Offset value. And so that you have enough history for the beginning number of rows, you would need to set up your staging table differently.
Usually, however, that inaccuracy in the chart shouldn’t be an issue.
Finally, you might be interested in How to Smooth Monthly Trends with Centered Moving Averages. I’ve not compared the methods for smoothing lines yet, but I plan to do so in the future.