How much data do we need?
Recast asks for 27 months of data in order to provide a two-year look-back window into your historical data.
Why 27 months?
The initial months cannot be modeled directly because marketing spend affects sales in later months. For example, spend from 28 months ago would affect your sales 27 months ago, but we don’t have data from 28 months ago. Instead, the model uses the spend from those first few months as lead-in, so that it predicts sales accurately by month 3. By asking for 27 months, we’re able to accurately model two full years of data, and we exclude the months before that from the computation.
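As a rough illustration of the arithmetic above, the sketch below works out which months to supply and which months end up in the modeled window for a given most recent month. It assumes the lead-in is exactly 27 - 24 = 3 months; the helper names `shift_month` and `data_window` are made up for this example and are not part of any Recast tooling.

```python
from datetime import date

# Figures from the text: 27 months requested, 24 of them modeled.
TOTAL_MONTHS = 27
MODELED_MONTHS = 24
# Assumption: the remainder is the lead-in period that is not modeled.
LEAD_IN_MONTHS = TOTAL_MONTHS - MODELED_MONTHS  # 3


def shift_month(d: date, months: int) -> date:
    """Return the first day of the month that is `months` away from `d`."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def data_window(latest_month: date) -> dict:
    """Hypothetical helper: first month to supply and first month modeled.

    `latest_month` is the most recent month of data and counts as one of
    the 27 requested months.
    """
    first_requested = shift_month(latest_month, -(TOTAL_MONTHS - 1))
    first_modeled = shift_month(first_requested, LEAD_IN_MONTHS)
    return {
        "first_requested": first_requested,
        "first_modeled": first_modeled,
        "last_modeled": shift_month(latest_month, 0),
    }


window = data_window(date(2024, 6, 1))
print(window)
```

With data ending in June 2024, this asks for data starting in April 2022 and treats July 2022 through June 2024 as the two modeled years.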
In statistics, more data is generally better, but for several practical reasons we don’t look back further than two years:
- Our model reflects the reality that marketing performance changes over time. Estimates of how Facebook was doing three years ago would therefore have very little influence on estimates of how it is doing today, so going further back in time has diminishing returns.
- The further back you go, the more difficult it is to provide accurate, verifiable data. Ways of storing, tracking, and handling data change over time, so asking for more historical data would introduce more opportunities for data errors.
- More data takes more compute time. To refresh models in a timely manner each week, we limit the look-back period.
- Two years gives us a chance to observe seasonal patterns and holidays more than once, so we can distinguish between anomalies and strong seasonal behavior.