
Using Multivariate Regression to Forecast Retail Revenue

Brands that can forecast revenue accurately can make budget recommendations for banners and ad types, understand the incremental budget needed to reach campaign goals, and project the impact of budget shifts. These actions help optimize media performance and maximize profits. Multivariate regression is a data-driven method that unpacks the relationships between spend and revenue.

Forecasting without much historical data is challenging, but a simple starting point is the square root model, which can provide revenue forecasts for varying budgets. As more data accumulates, more robust models are needed to forecast by ad type for each banner, because return varies from one ad type to another. When spend is directed to the most efficient ad type first, the proportion invested in each ad type shifts as total spend scales. Multivariate regression gives a clearer picture of the relationship between spend and revenue and therefore leads to improved ROAS. It also shows whether additional inputs, such as seasonality, should be included in the model, which can help guide when to start planning campaigns.
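As a rough illustration of that starting point, here is a minimal sketch that assumes the square root model takes the form revenue ≈ a × √spend, with the single coefficient a estimated by least squares. The function names and the sample numbers are hypothetical and are not taken from the client data.

```python
import math


def fit_sqrt_model(spend, revenue):
    """Least-squares estimate of a in revenue ~ a * sqrt(spend)."""
    # Minimizing sum((r - a*sqrt(s))^2) gives a = sum(r*sqrt(s)) / sum(s).
    numerator = sum(r * math.sqrt(s) for s, r in zip(spend, revenue))
    denominator = sum(spend)
    return numerator / denominator


def forecast_revenue(a, budget):
    """Forecast revenue for a proposed budget under the fitted model."""
    return a * math.sqrt(budget)


# Made-up history: three periods of spend and the revenue they returned.
history_spend = [100.0, 400.0, 900.0]
history_revenue = [50.0, 100.0, 150.0]

a = fit_sqrt_model(history_spend, history_revenue)
for budget in (400.0, 1600.0):
    print(budget, forecast_revenue(a, budget))
```

Because revenue grows with the square root of spend, quadrupling the budget only doubles the forecast, which is why this form is a convenient first approximation of diminishing returns.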

The next few sections cover the details of the data and methodology for the model. For those who are inclined to skip these sections, please feel free to jump down to the interpretation section.

Figure 1. Data considerations and model limitations



The following variables were available for analysis: spend, revenue, competitive media spend, and number of social posts. However, the initial correlations indicated that neither competitive media spend (Figure 2) nor number of social posts (Figure 3) had a direct relationship with the brand's revenue. They neither helped nor hurt revenue and were therefore not included in the model. The analysis also showed monthly seasonality and indicated that the Black Friday holiday weekend should be its own input.

Figure 2

Figure 3
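The screening step described above can be sketched as a simple correlation check of each candidate input against revenue. This is only an illustration: the Pearson coefficient, the 0.3 cut-off, the function names, and the sample series are all assumptions, not the values or the exact procedure used for the client.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def screen_inputs(candidates, revenue, threshold=0.3):
    """Keep candidate inputs whose |correlation| with revenue meets the threshold."""
    kept = {}
    for name, series in candidates.items():
        r = pearson(series, revenue)
        if abs(r) >= threshold:
            kept[name] = r
    return kept


# Made-up series for illustration only.
revenue = [2.0, 4.0, 6.0, 8.0, 10.0]
candidates = {
    "spend": [1.0, 2.0, 3.0, 4.0, 5.0],
    "social_posts": [2.0, 1.0, 3.0, 1.0, 2.0],
}

kept = screen_inputs(candidates, revenue)
print(kept)
```

A low correlation only rules out a direct linear relationship, so a check like this is best treated as a first filter rather than proof that a variable has no effect.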


The method used was multivariate regression, with spend, month, and Black Friday holiday weekend as inputs to predict revenue (sixty output equations). This allowed revenue predictions to be made from the historical relationship between these variables and revenue. The model outputs equations that minimize the error between predicted and actual revenue. The relationship between spend and revenue was found to be linear for brand text ads and polynomial for non-brand ad types. In other words, non-brand ad types show diminishing returns: each additional dollar of spend is expected to bring in incrementally less revenue. After predictions are made (for months where data is available), the YoY % change in return on advertising spend (ROAS) is used to transform the predictions so that the model reacts to recent performance trends, which improved overall R². The final conceptual model is below in Figure 4.

Figure 4
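To make the regression step more concrete, here is a minimal sketch of a single output equation fitted with ordinary least squares in NumPy. It assumes a quadratic spend term for the polynomial case, month dummy variables with January as the baseline, and a 0/1 Black Friday flag. The data is synthetic and noise-free, the coefficients are invented, and the YoY ROAS adjustment is left out because its exact form is not described here.

```python
import numpy as np


def design_matrix(spend, month, black_friday, degree=2):
    """Columns: intercept, spend^1..spend^degree, month dummies (Feb-Dec), Black Friday flag."""
    spend = np.asarray(spend, dtype=float)
    month = np.asarray(month)
    cols = [np.ones_like(spend)]
    cols += [spend ** p for p in range(1, degree + 1)]
    # January is the baseline month, so only months 2..12 get a dummy column.
    cols += [(month == m).astype(float) for m in range(2, 13)]
    cols.append(np.asarray(black_friday, dtype=float))
    return np.column_stack(cols)


def fit(spend, month, black_friday, revenue, degree=2):
    """Ordinary least squares: coefficients minimizing squared prediction error."""
    X = design_matrix(spend, month, black_friday, degree)
    coef, *_ = np.linalg.lstsq(X, np.asarray(revenue, dtype=float), rcond=None)
    return coef


def predict(coef, spend, month, black_friday, degree=2):
    return design_matrix(spend, month, black_friday, degree) @ coef


# Synthetic data: 4 observations per month over 2 years, spend in $ thousands.
rng = np.random.default_rng(0)
idx = np.arange(96)
month = (idx // 4) % 12 + 1
black_friday = ((month == 11) & (idx % 4 == 3)).astype(float)
spend = rng.uniform(1.0, 5.0, idx.size)
seasonal = 0.5 * np.sin(2 * np.pi * month / 12)
revenue = 2.0 + 6.0 * spend - 0.4 * spend ** 2 + seasonal + 8.0 * black_friday

coef = fit(spend, month, black_friday, revenue)
fitted = predict(coef, spend, month, black_friday)
print("spend:", coef[1], "spend^2:", coef[2], "Black Friday:", coef[-1])
```

A negative coefficient on the squared spend term is what diminishing returns look like in this setup. Passing `degree=1` drops that term and gives the linear form described for brand text ads. In practice one such equation would be fitted per ad type and banner combination.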


The modelled revenue lined up well with the actual revenue, with some outliers occurring during holidays, tax return season, and certain promotions and sales. The following is a typical modelled result:

Figure 5

The following table compares the error (mean absolute error and mean absolute percentage error) between forecasted revenue and ROAS to the actuals for a typical month after modelling:


The models lined up with the actuals, and the forecasts continue to have an error of about 5.5%, which is well within reason. Furthermore, when actuals differ from forecasts (during sales, for example), the forecasts generally underestimate actual revenue and therefore do not overpromise. This work led to better spend optimization across channels and across ad types within each channel. It also revealed natural seasonal opportunities, which improved campaign and holiday planning by guiding the team on when to start spending and how much to spend on particular ad types and channels. ROAS also increased significantly across the brands.
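For readers who want to reproduce the two error measures mentioned above, here is a small sketch of mean absolute error (MAE) and mean absolute percentage error (MAPE). The function names and the sample figures are made up for illustration and are not the client's results.

```python
def mean_absolute_error(actual, forecast):
    """Average size of the forecast miss, in the same units as the data."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_percentage_error(actual, forecast):
    """Average miss as a percentage of the actual value (actuals must be non-zero)."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up actuals and forecasts for two periods.
actual_revenue = [100.0, 200.0]
forecast_revenue_values = [90.0, 210.0]

mae = mean_absolute_error(actual_revenue, forecast_revenue_values)
mape = mean_absolute_percentage_error(actual_revenue, forecast_revenue_values)
print(mae, mape)
```

MAE keeps the error in revenue units, while MAPE makes it comparable across banners or ad types of very different sizes.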

This particular client case is just one example of how multivariate regression can shed light on consumer behavior. In addition to making budget recommendations and discovering seasonal trends, this approach can, depending on the available data, also map the consumer decision journey via brand health tracking to promote conversion through the purchase funnel, connect the “indirect” contribution that upper/mid-funnel metrics (awareness, consideration, preference and recommendation) have on consumer actions (such as site visits and bookings), and identify synergies that exist across media and swim lanes (e.g. TV impact on search, social impact on brand metrics). The possibilities are endless.

Want to learn more about media analytics? Check out our other blogs here.