Why your forecast is wrong: a 5-minute audit

Most FP&A forecasts are wrong in the same way: they extrapolate the current trend with a linear model, fit one number, and ship it. The CFO asks "are we sure?" and the analyst says "yes" because the alternative — "let me run 5 algorithms and compare RMSE" — sounds like a week of work.

It isn't. It's a five-minute audit. Here's how.

Step 1: pull the last 24 months of the line you're forecasting

Just one line. Revenue is the obvious one to start with. You want at least 24 monthly data points so seasonality can be detected; 36 is better.
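A minimal pandas sketch of the pull, assuming your export is a CSV with `month` and `revenue` columns (hypothetical names; substitute whatever your system emits):

```python
import pandas as pd

def load_series(csv_source) -> pd.Series:
    """Load one monthly line into a Series indexed by month start."""
    df = pd.read_csv(csv_source, parse_dates=["month"])   # column names are assumptions
    s = df.set_index("month")["revenue"].asfreq("MS")     # "MS" = month-start frequency
    if len(s) < 24:
        raise ValueError(f"need at least 24 monthly points, got {len(s)}")
    return s
```

The `asfreq("MS")` call is deliberate: if a month is missing from the export, it shows up as NaN instead of silently shifting the seasonality.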

Step 2: hold out the last 6 months

Train your forecast on months 1–18. Compare its predictions against what actually happened in months 19–24. This is your out-of-sample error — the only number that matters.
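The split is two lines; the one rule is to never shuffle a time series, since the test set must be the most recent months:

```python
import pandas as pd

def split_holdout(series: pd.Series, holdout: int = 6):
    """Fit on everything except the last `holdout` months; score on those."""
    if holdout >= len(series):
        raise ValueError("holdout must be smaller than the series")
    return series.iloc[:-holdout], series.iloc[-holdout:]
```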

Step 3: try at least three algorithms

Do not pick one. Run:

  • Linear regression — your current default, almost certainly
  • ARIMA or SARIMA — handles seasonality and autocorrelation
  • Holt-Winters — exponential smoothing with trend + seasonality
  • (Bonus) Random Forest or Gradient Boosting — captures non-linearity if you have driver inputs

Compute R², RMSE, MAE, and MAPE for each on the holdout (MAPE breaks down when actuals are near zero, so don't lean on it alone). Whichever algorithm wins on RMSE is your front-runner.

Step 4: read the residuals

Plot the errors over time. If your model is biased, the errors will trend up or down. If they're heteroskedastic (variance changes over time), your confidence interval is lying. The "best" algorithm by RMSE may still have a residual pattern you don't want.
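If you'd rather read numbers than a plot, here are two crude stand-ins (a sketch; the cutoffs are judgment calls): the fitted slope of residuals over time flags bias, and the correlation of |residual| with time flags growing variance:

```python
import numpy as np

def residual_checks(actual, pred):
    """Numeric stand-ins for eyeballing the residual plot."""
    resid = np.asarray(actual, float) - np.asarray(pred, float)
    t = np.arange(len(resid))
    return {
        "mean_error": float(resid.mean()),             # far from 0 -> biased model
        "drift": float(np.polyfit(t, resid, 1)[0]),    # errors trending up or down
        "spread_vs_time": float(np.corrcoef(t, np.abs(resid))[0, 1]),  # near 1 -> variance growing
    }
```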

Step 5: pick the simplest model that wins

If ARIMA beats Random Forest by 0.4% MAPE, pick ARIMA. Simpler is more interpretable, more debuggable, and less likely to overfit. The forecasting reference everyone cites is Hyndman & Athanasopoulos's textbook Forecasting: Principles and Practice; read chapter 5.
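One way to encode the rule is a tolerance-based tie-break, sketched here; the ranking and the 0.5-point tolerance are assumptions, so order the candidates however your team defines "simple":

```python
# Assumed complexity ranking, simplest first -- adjust to taste.
SIMPLEST_FIRST = ["linear", "holt_winters", "sarima", "random_forest"]

def pick_model(mape_by_model: dict, tol: float = 0.5) -> str:
    """Return the simplest model within `tol` MAPE points of the best."""
    best = min(mape_by_model.values())
    for name in SIMPLEST_FIRST:
        if name in mape_by_model and mape_by_model[name] <= best + tol:
            return name
    # Fall back to the outright winner for models outside the ranking
    return min(mape_by_model, key=mape_by_model.get)
```

For example, `pick_model({"linear": 5.0, "sarima": 1.5, "random_forest": 1.9})` returns `"sarima"`: Random Forest is within tolerance of the best score, but SARIMA is both the winner and the simpler model.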

What you'll find

If you've never done this, the most likely outcome is one of:

  1. Linear was wrong by 4–7% on the holdout, ARIMA by 1.5%. You've been hosing the budget for months.
  2. All three algorithms agree to within 1%. Your series is well-behaved; pick the simplest.
  3. They diverge by 10%+. Your line has a structural break (acquisition, pricing change, COVID-equivalent shock). The model can't fix that — domain knowledge can.

In two of three cases, you have an actionable upgrade. The third tells you to stop trusting the model and call sales.

The five-minute version

EPM Lite ships 15 algorithms you can compare side-by-side with one click. R²/RMSE/MAE/MAPE on every run. The audit takes about 90 seconds when you're not doing the math by hand.

The point isn't to use the most algorithms. It's to never pick one without comparing. Linear-by-default is the silent killer of forecast accuracy.

See the comparison live →
