📊 Data Collection
What information aside from raw data outputs do you typically request from partners before beginning any modeling?
Business context and the results of lift tests that we can incorporate as priors. A calendar of product launches, promotional events, or pricing promotions where applicable.
Who owns the Data QA/Validation process to ensure accuracy before the modeling process?
We share the model inputs back with the client visually and set the expectation that they validate the data. We get alerts for unexpected or erroneous-looking data (missing data, changed historical data, big unexpected shifts in spend patterns), which we communicate to the client.
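For illustration, here is a minimal sketch of the kinds of checks those alerts involve, assuming weekly spend data lands in a pandas DataFrame with a `date` column and one column per channel. The column names and thresholds are hypothetical, not Recast's actual implementation:

```python
# A minimal sketch of automated data QA checks: missing values,
# changed history, and unusual spend jumps. Thresholds are hypothetical.
import pandas as pd

def qa_checks(current: pd.DataFrame, previous: pd.DataFrame,
              spike_factor: float = 3.0) -> list[str]:
    """Compare this week's file against last week's and flag anomalies."""
    alerts = []
    channels = current.columns.drop("date")

    # 1. Missing data: any channel with NaNs in the latest file.
    for col in channels:
        if current[col].isna().any():
            alerts.append(f"missing values in {col}")

    # 2. Changing historical data: overlapping dates whose values differ.
    overlap = current.merge(previous, on="date", suffixes=("_new", "_old"))
    for col in previous.columns.drop("date"):
        if not (overlap[f"{col}_new"] == overlap[f"{col}_old"]).all():
            alerts.append(f"historical values changed in {col}")

    # 3. Big unexpected spend changes: latest week far above trailing average.
    for col in channels:
        recent = current[col].iloc[-1]
        baseline = current[col].iloc[:-1].tail(8).mean()
        if baseline > 0 and recent > spike_factor * baseline:
            alerts.append(f"spend spike in {col}: {recent:,.0f} vs ~{baseline:,.0f}")

    return alerts
```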
Do you have a process to ensure data standardization across multiple data sets?
We receive prepared data in a single dataset for each model and require that the column headers are consistent week to week.
We advise clients to use clear, consistent naming conventions to stay organized and facilitate decision-making, because the headers/names they choose will be used in their model output. Specifically, naming conventions should be set up so that they can be parsed for post-modeling aggregation of results on the client side.
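For example, a hypothetical `<channel>_<tactic>_<region>` convention keeps headers parseable, so model output can be rolled up to whatever level you need after the fact. A minimal sketch:

```python
# A minimal sketch of parsing a consistent naming convention, assuming
# hypothetical headers of the form "<channel>_<tactic>_<region>"
# (e.g. "meta_prospecting_us") and illustrative contribution numbers.
import pandas as pd

results = pd.DataFrame({
    "variable": ["meta_prospecting_us", "meta_retargeting_us", "tv_linear_uk"],
    "contribution": [120_000, 45_000, 80_000],
})

# Split each header into its components...
results[["channel", "tactic", "region"]] = (
    results["variable"].str.split("_", expand=True)
)

# ...then aggregate results at whatever level you need, e.g. by channel.
by_channel = results.groupby("channel")["contribution"].sum()
print(by_channel)
```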
What are your requirements for data passing and is there a way to automate this process?
We can work with pretty much any form of automated data access. We have full S3 support and by default set up a shared S3 location and issue credentials for all new clients. If that doesn't work easily with your infrastructure, we can almost certainly work with whatever you have (direct connection to your data warehouse, SFTP, etc.).
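As a sketch of what the default S3 setup might look like on the client side (the bucket name, key, and credential handling below are hypothetical; the actual location and credentials are issued during onboarding):

```python
# A minimal sketch of pushing a weekly extract to a shared S3 location
# with boto3. Bucket, key, and filename are hypothetical placeholders.
import boto3

# Reads credentials from environment variables or ~/.aws/credentials.
s3 = boto3.client("s3")

s3.upload_file(
    Filename="weekly_model_inputs.csv",         # local extract
    Bucket="recast-client-data",                # hypothetical shared bucket
    Key="acme-co/2024-01-15/model_inputs.csv",  # hypothetical key per refresh
)
```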
How much historical data is required for initial modeling phase?
Ideally we want at least 27 months of historical data. If you have less than that, we can discuss!
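A quick way to sanity-check how much history an extract covers, assuming a `date` column (the filename is hypothetical):

```python
# Check the span of history in a weekly extract; ~27+ months is ideal.
import pandas as pd

df = pd.read_csv("weekly_model_inputs.csv", parse_dates=["date"])
span_days = (df["date"].max() - df["date"].min()).days
print(f"{span_days / 30.44:.1f} months of history")
```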
🚚 Delivery and Timing
How frequently do you provide results and can the delivery cadence be customized at all?
By default we deliver fresh results weekly; we can discuss an alternate cadence if needed.
When should I schedule my refresh for?
Recast wants to refresh your model once a week. In order to do that, Recast needs comprehensive and complete data. Recast will set an automated schedule to ingest your data and launch a model on the day and time of your choice (it typically takes about 24 hours after a model run is launched before your dashboard will be updated). There are two choices to make when considering what day to refresh your model:
- What day of the week should the model run (refresh date)?
- What day of the week should we use data through (last data date)?
The refresh date should be the day before you want to review results. The last data date should be as close to the refresh date as possible, provided you have complete data for every channel. If, for example, you only get new TV numbers every Friday, the last data date should be a Friday, regardless of when the refresh date is.
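One way to think about it: the last data date is the most recent date through which every channel's data is complete. A minimal sketch, with hypothetical channel names and dates:

```python
# Pick the "last data date" as the most recent date through which
# every channel is complete. Channels and dates are hypothetical.
from datetime import date

complete_through = {
    "meta": date(2024, 1, 14),    # daily, always current
    "google": date(2024, 1, 14),
    "tv": date(2024, 1, 12),      # only reported every Friday
}

# Data is only complete up to the laggiest channel, so take the minimum.
last_data_date = min(complete_through.values())
print(last_data_date)  # 2024-01-12
```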
How should we report spend for direct mail?
When it comes to direct mail, clients wonder how they should report the spend. Three possibilities often present themselves:
1. Report all spend on the day that the mail gets put in the mailbox (the “drop date”). If you sent out 100,000 mailers at a cost of $200,000, report the entire $200,000 on a single day (e.g. Nov. 4th).
2. Report spend spread across estimated “in-home” dates. If you spent $200,000 on 100,000 mailers, the mailing company may estimate that 40% arrived in the mailbox on Nov. 7th, 30% on Nov. 8th, and 30% on Nov. 9th. Divide the spend over these three days.
3. Report spend based on the mailing company’s estimated response timetable. Mail companies can often provide response curves that estimate how long it takes for customers to respond to direct mail, typically over a 30-120 day window. You could spread the spend using the curve they provide.
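To make options 1 and 2 concrete, here is a small worked sketch using the $200,000 / 100,000-mailer example from the list above:

```python
# A worked example of options 1 and 2, using the figures from the list.
total_spend = 200_000

# Option 1: report everything on the drop date.
option_1 = {"Nov 4": total_spend}

# Option 2: spread across estimated in-home dates (40% / 30% / 30%).
option_2 = {
    "Nov 7": 0.40 * total_spend,  # $80,000
    "Nov 8": 0.30 * total_spend,  # $60,000
    "Nov 9": 0.30 * total_spend,  # $60,000
}
assert sum(option_2.values()) == total_spend  # splits must sum to 100%
```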
At Recast, we recommend option 1 and recommend against option 3. We prefer option 1 because Recast has built strong infrastructure to handle the delay between when money is spent in a channel and when customers purchase because of the ad they saw. The “time shift” curve estimates the amount of time it takes to see return on ad spend, which captures the time it takes for the mail to be delivered, the time it takes for mail to get picked up and read, and the time it takes for the customer to take action. It’s also generally the easiest method to implement.
Option 2 separates the time it takes for mail to get delivered from the time it takes to get picked up, read, and acted on. There’s nothing wrong with this in theory, except that it relies on estimates of “in-home” dates provided by a third party that may not be reliable. By letting Recast estimate the total time from “drop date” to return on ad spend, we cut out this potential source of bias from bad estimates.
Option 3 attempts to estimate the total time shift curve for us, so that the delay between “spend” and “return” is 0 days. We consider this a significant source of bias and discourage using this method. Direct mail companies compute these curves without knowledge of the other marketing efforts your company is engaged in, which makes it impossible to estimate the true “incremental” effect of the spend. The curve may be a useful reference point to compare against Recast’s results, but using it to determine when to place the spend can introduce significant bias.
Please be clear about which method you are using and be consistent over time. Recast will configure your model differently depending on the method, so please communicate it. If your reporting mixes methods over time, the time shift curves could lead to misattribution and poor results.