Data Guide
A model is only as useful as the data it is trained on. Learn how to make sure we are feeding your model the best data possible.
Data Transfer
Recast clients have a number of different options for exchange of data with Recast. The data exchange options listed below are already fully supported and can be implemented for new clients very quickly.
Recast Owned Data Sources
AWS S3
SFTP
Client Owned Data Sources
AWS S3
SFTP
Google Sheets
Google Cloud Storage
Google BigQuery
Google Drive
Snowflake
Postgres (including Redshift)
Visit Recast Data Sources for more information on your options for data exchange with Recast.
Data format
For a Recast MMM, we need to ingest a fairly simple table of historical marketing activity, ideally going back about two and a half years.
It looks like this:
One row per day
One column for each marketing channel, containing the amount of spend in that channel on that day
One column for the business's KPI on that day
Make sure every channel has its own column: branded search on Google, non-brand search on Google, Google Shopping, etc. You'll need the daily spend for each channel.
The business KPI can be revenue by day, profit by day, conversions by day, marketing qualified leads by day, or whatever metric the business is goaling the marketing against.
Long format is also acceptable. The only caveat for sharing data in long format is that it must include a single column, "Recast Category", containing the desired channel naming conventions. The data should contain all unique combinations of dates and "Recast Category" historically.
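As a rough illustration of how the long and wide layouts relate, the sketch below pivots long-format rows into one row per day with one column per channel. The field names `date`, `recast_category`, and `spend`, and the sample values, are assumptions made for this example and are not a required Recast schema.

```python
from collections import defaultdict

# Hypothetical long-format rows: one row per (date, Recast Category) pair.
long_rows = [
    {"date": "2024-01-01", "recast_category": "google_branded_search", "spend": 120.0},
    {"date": "2024-01-01", "recast_category": "google_shopping", "spend": 80.0},
    {"date": "2024-01-02", "recast_category": "google_branded_search", "spend": 95.0},
]

def long_to_wide(rows):
    """Pivot long rows into one dict per day with one key per channel."""
    channels = sorted({r["recast_category"] for r in rows})
    by_day = defaultdict(dict)
    for r in rows:
        day, channel = r["date"], r["recast_category"]
        # Sum in case a (date, channel) pair appears more than once.
        by_day[day][channel] = by_day[day].get(channel, 0.0) + r["spend"]
    wide = []
    for day in sorted(by_day):  # ISO date strings sort chronologically
        row = {"date": day}
        for channel in channels:
            # A missing pair becomes None rather than 0, so that
            # "no data" stays distinguishable from "no spend".
            row[channel] = by_day[day].get(channel)
        wide.append(row)
    return wide

wide_rows = long_to_wide(long_rows)
```

A `None` in the output points to a date and category combination that is absent from the long data, which is worth fixing before sharing the file.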
Template
Data Warehouses
We strongly recommend that every brand keeps all its data in a marketing data warehouse. You're going to need it for reporting and for multiple analyses that you will want to do no matter what. We think it's a worthwhile investment to get it set up, whether or not you work with Recast, and even if you don't do MMM.
Upload Cadence
We have a weekly upload schedule. We recommend sharing 27 months of data and re-uploading the entire dataset each week, so that any corrections to past marketing spend are captured.
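For a sense of what a 27-month window means on a given refresh date, here is a minimal sketch. The function name `history_start` and the choice to snap the window to the first day of a month are assumptions for illustration only.

```python
from datetime import date

def history_start(refresh_date: date, months: int = 27) -> date:
    """Return the first day of the month `months` months before refresh_date."""
    # Count months since year 0, step back, then convert back to (year, month).
    total = refresh_date.year * 12 + (refresh_date.month - 1) - months
    return date(total // 12, total % 12 + 1, 1)

start = history_start(date(2024, 6, 15))
```

Each weekly upload would then contain every day from `start` through the latest complete day, not just the most recent week.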
Different types of data Recast requires
Data Sets Overview
Data Set | Required? | Includes |
---|---|---|
Daily marketing spend and business KPIs | Required | One row per day of your marketing activity in every channel, with channels as columns and rows as days, with an additional column for your dependent variable (typically sales or new customer acquisitions) |
Promotional calendar | Required if you run promotions, have large product launches, or do other big non-spend marketing activity. This should also include product launches and other non-pricing promotions | A list of promotions you have run, their dates, and any helpful metadata (e.g. was it a 25% off sale, a buy-one-share-one, an Elon Musk flamethrower launch, an appearance on Shark Tank, etc.) |
Incrementality tests | Optional. Useful if you have run incrementality tests and would like to "pin" Recast to those estimates | A list of incrementality tests you have run, including the channel, the dates of the test, the type of test run, and the results |
Setting up the data feed
For the initial model run, it is okay to have an "ad hoc" dataset compiled manually. In order to set up subsequent refreshes of the model, Recast will need to have programmatic access to the daily marketing data.
Typically Recast clients transfer data to Recast in one of three ways:
A shared S3 bucket. This is Recast's preferred approach. We can set up a shared S3 bucket, and you can drop CSV files on a regular basis. To get started, we just need your AWS account number. Once we have that, we will set up the bucket and provide you with instructions to access the bucket.
Direct access to your data store (e.g. BigQuery or Snowflake)
A Google Sheet
In order to run regular refreshes, we require the following:
The access point (e.g. the Google Sheet URL) does not change from week to week
The schema of the data does not change from week to week
We need to know (or be able to determine from the data) when the data is complete. For example, it is often the case that some channels are updated more quickly than others. In this case, either the data needs to distinguish missing values from zeroes, or we need to be given a rule (e.g. "always use the data up until three days before the refresh date")
We need a day-of-week and a time-of-day to refresh the data and to kick off the model
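The schema and completeness requirements above can be sketched as two small pre-upload checks. The column list, the helper names, and the reading of "up until three days before" as an inclusive cutoff are all assumptions for illustration; this is not how Recast's ingestion is implemented.

```python
from datetime import date, timedelta

# Hypothetical agreed schema for the weekly file.
EXPECTED_COLUMNS = ["date", "revenue", "facebook", "podcast"]

def schema_unchanged(columns, expected=EXPECTED_COLUMNS):
    """True when this week's header matches the agreed schema exactly."""
    return list(columns) == list(expected)

def complete_rows(rows, refresh_date, lag_days=3):
    """Keep rows dated on or before refresh_date minus lag_days."""
    cutoff = refresh_date - timedelta(days=lag_days)
    return [r for r in rows if r["date"] <= cutoff]

rows = [
    {"date": date(2024, 1, 1), "revenue": 1000.0},
    {"date": date(2024, 1, 5), "revenue": 900.0},
    {"date": date(2024, 1, 7), "revenue": None},  # not yet reported, not zero
]
usable = complete_rows(rows, refresh_date=date(2024, 1, 8))
```

Using `None` for values that have not been reported yet, instead of `0`, is one way to keep missing data distinguishable from a genuine zero.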
Daily Marketing Spend
Recast requires a dataset in the format laid out below. It has:
One row per day
One column per marketing channel (below, Facebook and Podcast)
One column for the dependent variable (typically revenue or new customer acquisition)
We are able to accept datasets in either "long" or "wide" format, or to merge across multiple files. The key for Recast is to have a consistent data format that we can easily transform into the schema below.
Example Data
Date | Revenue | Facebook | Podcast |
---|---|---|---|
1/1/2024 | $1,000,000 | $30 | $1000 |
1/2/2024 | $500,000 | $30 | $0 |
1/3/2024 | $750,000 | $30 | $0 |
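Before sharing a file shaped like the example, it can help to confirm that the values parse as numbers and that no calendar day is missing. The sketch below assumes month/day/year dates and dollar-formatted strings as in the example rows; the helper names are hypothetical.

```python
from datetime import datetime, timedelta

def parse_money(text: str) -> float:
    """Convert a string such as '$1,000,000' to 1000000.0."""
    return float(text.replace("$", "").replace(",", ""))

def parse_day(text: str):
    """Parse month/day/year dates such as '1/2/2024' (assumed US order)."""
    return datetime.strptime(text, "%m/%d/%Y").date()

# Values copied from the example table above.
raw = [
    ("1/1/2024", "$1,000,000", "$30", "$1000"),
    ("1/2/2024", "$500,000", "$30", "$0"),
    ("1/3/2024", "$750,000", "$30", "$0"),
]
table = [(parse_day(d), *map(parse_money, rest)) for d, *rest in raw]

def missing_days(days):
    """Return calendar days absent between the first and last date."""
    present = set(days)
    first, last = min(present), max(present)
    span = (last - first).days + 1
    candidates = (first + timedelta(days=n) for n in range(span))
    return [day for day in candidates if day not in present]

gaps = missing_days([row[0] for row in table])
```

An empty `gaps` list means the file has one row for every day in its range.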
Notes on different data types
Channels that have "flights" or "drop dates", like podcast, direct mail, etc.
We would prefer to have the spend on the day the ads begin to be distributed, e.g. the date of the podcast, the first "in-home" date of the direct mail drop, and so on. This is in contrast to pre-spreading the spend out over the period of the campaign. Recast's time-shift curve accounts for the delay between the spend on the advertising and when it is received by consumers.
For long-term contracts (e.g. with influencer partners), we'd like to have the spend tied to each specific promotion. E.g. if you're doing 3 sponsored posts, we'd like to have three spend entries in the column representing influencer spend, one for each post.
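To make the drop-date convention concrete, the sketch below builds a daily spend column in which each post's cost lands on the day it goes live and every other day is zero. The dates, the per-post cost, and the function name are invented for illustration; how a contract's total is split across posts is up to you.

```python
from datetime import date, timedelta

def daily_spend_from_drops(drops, start, end):
    """Build a daily spend series with each drop's full cost on its drop date."""
    days = (end - start).days + 1
    series = {start + timedelta(days=n): 0.0 for n in range(days)}
    for drop_date, cost in drops:
        # Raises KeyError if a drop falls outside the start..end range.
        series[drop_date] += cost
    return series

# Hypothetical influencer contract: three sponsored posts, one entry each.
posts = [
    (date(2024, 1, 3), 2000.0),
    (date(2024, 1, 17), 2000.0),
    (date(2024, 1, 29), 2000.0),
]
influencer = daily_spend_from_drops(posts, date(2024, 1, 1), date(2024, 1, 31))
```

The resulting column has three non-zero entries, rather than the contract total spread evenly over the month.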
Affiliate spend
Depending on how your affiliate program is set up, Recast may be able to handle it like any other spend channel, or we may have to handle it in a different way. The crucial question is: do you pay your affiliate partner on a "commission basis" (a specified rate per conversion or sale), or do you pay upfront?
If you pay per conversion, then your spend in the affiliate channel is directly caused by conversions or revenue, rather than the other way around. To handle this, we have two options:
Recast can subtract the affiliate spend from your target variable, to account for the fact that some channels will drive more conversions/revenue through affiliates than others
We can include other variables (e.g. the number of active affiliates) to represent the size and intensity of the affiliate program
Non-spend channels (like email)
While Recast can handle these channels, our current recommendation is not to include them, as they are typically run very differently from the other channels.
Branded search
Recast currently handles branded search like any other channel, and we control the impact that branded search channels have on the model by constraining their incrementality to reasonable levels. An in-development version of the Recast model will handle these channels more explicitly as "outcomes" resulting from spend in other channels.
Naming/schema changes
When the names of channels change, e.g. from facebook_prospecting to facebook_prospecting_spend, our updating scripts will typically break. Please give us as much notice as possible for naming changes. These changes will typically require a week to incorporate and may delay model refreshes. When the schema of the input data changes it requires Recast to re-write the ingestion code; this will result in delays to the model refresh.
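A quick way to catch a rename before it reaches the weekly upload is to compare this week's column names with last week's. This is a minimal sketch with a hypothetical helper name; the `podcast`, `date`, and `revenue` column names are invented for the example.

```python
def diff_columns(previous, current):
    """Report columns that disappeared or appeared between two uploads."""
    prev, curr = set(previous), set(current)
    return {"removed": sorted(prev - curr), "added": sorted(curr - prev)}

last_week = ["date", "revenue", "facebook_prospecting", "podcast"]
this_week = ["date", "revenue", "facebook_prospecting_spend", "podcast"]
changes = diff_columns(last_week, this_week)
```

Any non-empty `removed` or `added` list is a signal to notify Recast ahead of the next refresh.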
Changes to past data
When Recast refreshes the model each week, it doesn't just run the last week of data; it re-estimates the entire model for all of history. This means that changes to past data can affect current estimates and can result in larger-than-expected changes to your results.
Adding a channel
It is common to add new channels for testing purposes. When new channels are added, a few things need to happen on our end in order to ensure that the channel is included in the refresh run.
Please give us one week's notice for any channels that are being added to the model. We need the exact naming format for the channel ahead of time to ensure that the model refreshes correctly.
Promotional Calendar
Since we include promotions in our forecasts, we would like to have your promotional calendar as far out as you can produce it (up to a year from the current date).
Name | Start Date | End Date |
---|---|---|
Mother's Day Promotion | May 3 2023 | May 11 2023 |
Spring Sale | March 2 2023 | March 18 2023 |
New Years Eve | December 29 2023 | January 2 2024 |
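One way to sanity-check a promotional calendar is to expand it into the individual days each promotion covers. The sketch below uses the rows from the table above and assumes that end dates are inclusive and that month names are in English; the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Rows copied from the table above: (name, start date, end date).
PROMOS = [
    ("Mother's Day Promotion", "May 3 2023", "May 11 2023"),
    ("Spring Sale", "March 2 2023", "March 18 2023"),
    ("New Years Eve", "December 29 2023", "January 2 2024"),
]

def parse(text):
    """Parse dates written like 'May 3 2023' (English month names)."""
    return datetime.strptime(text, "%B %d %Y").date()

def promo_days(promos):
    """Map each calendar day to the names of promotions active on it."""
    active = {}
    for name, start_text, end_text in promos:
        start, end = parse(start_text), parse(end_text)
        if end < start:
            raise ValueError(f"{name}: end date precedes start date")
        for n in range((end - start).days + 1):  # end date is inclusive
            active.setdefault(start + timedelta(days=n), []).append(name)
    return active

calendar = promo_days(PROMOS)
```

Days that map to more than one name reveal overlapping promotions, and the `ValueError` catches rows where the dates were entered in the wrong order.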
Incrementality Tests
Recast is able to "pin" the estimates for the incrementality of any channel to the results of incrementality tests.
Channel Name | Start Date | End Date | Experiment Type | Incrementality Estimate | Confidence Level |
---|---|---|---|---|---|
| | May 3 2023 | May 11 2023 | In-Platform | 1.5x | +/-0.1 |
YouTube | March 2 2023 | March 19 2023 | Geo Holdout | 2x | +/-0.3 |
Google Branded Search | January 5 2023 | January 26 2023 | Blackout | 0.7x | |
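If test results are recorded as text like the entries above, they can be converted to numbers before sharing. This sketch assumes that "+/-" denotes a symmetric margin on the same multiplier scale as the estimate, which the table does not state explicitly; the function name is hypothetical.

```python
def parse_estimate(estimate: str, margin: str = ""):
    """Turn '1.5x' and '+/-0.1' into (low, point, high); margin may be blank."""
    point = float(estimate.strip().rstrip("xX"))
    if not margin.strip():
        # No margin reported: keep the point estimate only.
        return (None, point, None)
    half_width = float(margin.replace("+/-", "").strip())
    return (point - half_width, point, point + half_width)

geo = parse_estimate("2x", "+/-0.3")
blackout = parse_estimate("0.7x")
```

A test with no reported margin, like the blackout row, keeps only its point estimate, so nothing is invented for the missing interval.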