As mentioned in the Intermediate Lesson, historical CLV has limitations. Because the analysis is backward-looking, it can produce misleading results when the company, the market, or both have changed. It is also of limited use for measuring the CLV of customers acquired through new channels or tactics, since those customers have little or no history to analyze. Luckily, several methods can be used to predict CLV, including extrapolation, supervised learning algorithms, and probabilistic modeling.
Extrapolation, or forecasting, can take many forms. One such form is a simple moving average, in which you forecast a customer's future spending from that customer's spending over a recent period. Moving averages can be more useful than simple averages because they take into account recent trends in a customer's spending. Choose the moving average period carefully, though: the longer the period, the more the prediction is smoothed (the extreme case is an average over all of a customer's months); the shorter the period, the more the results vary.
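To make the trade-off concrete, here is a minimal Python sketch of a moving-average forecast. The function name, the window sizes, and the spending figures are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def moving_average_forecast(monthly_spend, window):
    """Forecast next month's spend as the mean of the last `window` months."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(monthly_spend) < window:
        raise ValueError("not enough history for this window")
    return mean(monthly_spend[-window:])


# Hypothetical monthly spend for one customer, oldest month first.
spend = [20.0, 22.0, 25.0, 30.0, 36.0, 42.0]

# A short window tracks the recent upward trend.
short_forecast = moving_average_forecast(spend, window=3)

# A window covering all six months smooths the trend away.
long_forecast = moving_average_forecast(spend, window=6)
```

With this rising spend pattern, the three-month forecast comes out higher than the six-month forecast, because the longer window gives equal weight to the older, lower months.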
Note that moving averages with short periods can miss seasonality, because the averaging window may not include the season you are trying to predict. For example, using a three-month moving average from August to October to predict November and December sales might cause you to underestimate CLV and underinvest in customer acquisition. Conversely, using holiday sales data to predict January and February sales might cause you to overestimate CLV and overinvest in customer acquisition.
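The underestimate in the August-to-October example can be worked through with a few lines of arithmetic. The revenue figures below are invented, and the size of the holiday lift is an assumption made purely to show the effect.

```python
# Hypothetical monthly revenue per customer; the holiday lift is assumed.
revenue = {"Aug": 40.0, "Sep": 42.0, "Oct": 44.0, "Nov": 70.0, "Dec": 90.0}

# Three-month moving average over the pre-holiday window.
window = ["Aug", "Sep", "Oct"]
forecast = sum(revenue[m] for m in window) / len(window)

# What actually happened, on average, in the two holiday months.
actual_holiday_avg = (revenue["Nov"] + revenue["Dec"]) / 2

# A positive shortfall means the forecast underestimated holiday revenue.
shortfall = actual_holiday_avg - forecast
```

Here the forecast sits well below the holiday-month average, so a CLV built on it would understate what these customers are worth during the season.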
Straight-line extrapolation, also known as simple linear regression, has trade-offs similar to those of moving averages but can also be useful. Since the extrapolation is a straight line, it does not take seasonality into account. However, a simple linear regression does produce a line of best fit, which minimizes the sum of squared differences between predicted values and actual values. Polynomial regressions can be used to fit data that have non-linear shapes, which can produce more accurate predictions. A big advantage of both linear and polynomial regressions is that they are available in Excel and do not require a data science team to create them.
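For readers who prefer code to a spreadsheet, the line of best fit can be computed directly from the ordinary least squares formulas. The sketch below uses only the Python standard library; the function name and the spending data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical spend for one customer over six months.
months = [1, 2, 3, 4, 5, 6]
spend = [20.0, 24.0, 27.0, 33.0, 36.0, 40.0]

slope, intercept = fit_line(months, spend)

# Extrapolate the straight line one month ahead (about 44.2 for this data).
forecast_month_7 = intercept + slope * 7
```

Because the fitted line has a constant slope, the forecast keeps rising by the same amount each month, which is exactly why this approach cannot capture seasonal peaks and dips.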
Advanced statistical methods can provide even more accurate predictive CLVs. Two popular methods used by Custora are Bayesian Inference and Pareto/NBD, both of which are described in the following sections. These summaries will give you context for how these advanced predictive methods work. You will also find links to more rigorous treatments of these topics interspersed in the text.