CPA Models Data Export

intercept.csv

The intercept (now titled baseline) is the estimate of non-marketing-driven sales (sometimes called organic sales). It can be read as a prediction of "what would our volume be if we turned all marketing off?" It models yearly seasonality to account for typical sales cycles in the business.

Use cases:

Compare marketing-driven vs. non-marketing-driven sales at different periods in time

Rows:

One row for each day in the model's historical dataset

Columns:

date: The date
channel: always "intercept"
p columns: median estimate and confidence interval for intercept in the same units as your outcome variable (e.g. new subscribers)

cpa.csv

The CPA is the estimated average incremental cost per acquisition for each channel on each day. Days with no spend will have missing values. The CPA depends on how much money is spent on that channel (i.e., it incorporates the effect of saturation). If you have “demand capture” channels in your model (like affiliate or branded search), the CPA also reflects how much additional spend we expect to flow into demand capture channels and how many additional conversions that spend will drive.

Use cases:

See historical performance of channels
Calculate the average cost per acquisition for a channel, both present day and historically

Rows:

One row for each day and each channel in the historical dataset

Columns:

date: The date
channel: The spend channel
cpa_p or total_effect_cpa_p columns: Median estimate and confidence interval for the incremental cost per acquisition in dollars.

me.csv

The ME is the estimated marginal effectiveness, expressed as the marginal cost per acquisition (MCPA): the cost to acquire the last (most expensive) customer from a given channel on a given day. Because of channel saturation, this cost will always be the same as or higher than the average cost per acquisition given in the cpa.csv file. If you have “demand capture” channels in your model (like affiliate or branded search), the MCPA also reflects how much additional spend we expect to flow into demand capture channels and how many additional conversions that spend will drive.

Use cases:

Calculate the cost of obtaining one additional customer for each channel, at the level of spend on that day

Rows:

One row for each day and each channel in the historical dataset

Columns:

date: The date
channel: The spend channel
mcpa_p columns: Median estimate and confidence interval of the incremental marginal cost per acquisition in dollars.

in_period_effect.csv

The in-period effect is the estimated number of customer acquisitions attributable to a given channel on that day. It accounts for the time shift: if you spent a lot of money in a channel yesterday but none today, today's in-period effect will still be greater than zero, representing the conversions that were driven by yesterday's spend but didn't happen until today.

Use cases:

Assign credit to a channel for driving a certain number of conversions in a time frame

Rows:

One row per day, per channel

Columns:

date: The date
channel: The marketing channel
in_period_effect_p: The median and confidence interval for the number of acquisitions attributable to that channel on that day
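
As a rough illustration of the use case above, the following sketch totals the median in-period effect per channel over a date range. The column name "in_period_effect_p50", the ISO date format, the channel names, and the sample values are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only. "in_period_effect_p50" is an
# assumed name for the median column.
SAMPLE = """date,channel,in_period_effect_p50
2024-01-01,search,10
2024-01-01,social,4
2024-01-02,search,12
2024-01-02,social,6
2024-01-03,search,8
"""


def credit_by_channel(rows, start, end, column="in_period_effect_p50"):
    """Sum the median in-period effect per channel for start <= date <= end.

    Dates are assumed to be ISO strings (YYYY-MM-DD), so plain string
    comparison orders them correctly.
    """
    totals = defaultdict(float)
    for row in rows:
        if start <= row["date"] <= end:
            totals[row["channel"]] += float(row[column])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
credit = credit_by_channel(rows, "2024-01-01", "2024-01-02")
```

Summing daily medians gives an approximate point estimate for the period. Summing the interval bounds the same way would generally overstate the uncertainty, so the sketch deliberately leaves them out.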

cumulative_shift_curves.csv

The cumulative shift curves summarize how long it takes to realize the return from money spent on a given day. For each channel, we estimate how much of the return has been realized within a given number of days of spending the money. These estimates are a percentage of the total realization, so they will always be at or below 100%.

Use cases:

Estimate how long you will need to wait to see signal from spend in a given channel

Rows:

For each channel, the number of days after the original spend (so days=3 indicates what percent of sales are realized on or before the third day after the money was spent).

Columns:

channel_name: The channel
days: Number of days after original spend
p columns: Median estimate and confidence interval for the total percentage realized by that point in time

Our default is to include 180 days out.
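
To make the "how long will I need to wait" use case concrete, here is a minimal sketch that finds the first day on which a channel's cumulative curve reaches a chosen threshold. It assumes the median column is named "p50" and that values are expressed on a 0 to 100 scale; both are assumptions, as are the channel name and sample values.

```python
import csv
import io

# Made-up sample rows for illustration only. "p50" is an assumed name for
# the median column, with values assumed to be on a 0-100 percentage scale.
SAMPLE = """channel_name,days,p50
tv,0,10
tv,1,35
tv,2,60
tv,3,82
tv,4,95
"""


def days_to_realize(rows, channel, threshold, column="p50"):
    """Return the smallest `days` value whose cumulative share >= threshold."""
    days = [
        int(r["days"])
        for r in rows
        if r["channel_name"] == channel and float(r[column]) >= threshold
    ]
    return min(days) if days else None


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
wait_days = days_to_realize(rows, "tv", 80.0)
```

The function returns None when the threshold is never reached within the exported window. If your file stores fractions (0 to 1) rather than percentages, pass the threshold in the same units.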

shift_curves.csv

The shift curves summarize how much return we expect to realize on a given day after the money has been spent into a channel. For a given channel, these numbers represent the % of the total realized gains that we will see on a particular day.

Use cases:

Calculate the additional sales you can expect x days after spending into a channel.

Rows:

For each channel, the number of days after the original spend (so days=3 indicates what percent of sales are realized on the third day after the money is spent).

Columns:

channel_name: The channel
days: Number of days after original spend
average_effect: the percentage of sales we expect to be realized that day if the return on investment happens about as fast as we expect
slow_roi_effect: the percentage of sales we expect to be realized that day if the return on investment is on the slow end of what we expect
fast_roi_effect: the percentage of sales we expect to be realized that day if the return on investment is on the fast end of what we expect

We include days until the return realized on a given day is below 0.1%.
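
The calculation behind the use case above can be sketched as follows: the total acquisitions expected from a day's spend (spend divided by that day's CPA) multiplied by the share of the return realized x days later. This is a sketch under assumptions: the effect columns are taken to be on a 0 to 100 percentage scale, the CPA is supplied by you (for example from cpa.csv), and the channel and sample values are made up.

```python
import csv
import io

# Made-up sample rows for illustration only. Effect values are assumed to
# be percentages on a 0-100 scale.
SAMPLE = """channel_name,days,average_effect,slow_roi_effect,fast_roi_effect
tv,0,30,20,40
tv,1,25,20,30
tv,2,15,15,12
tv,3,12,14,8
"""


def expected_on_day(rows, channel, day, spend, cpa, column="average_effect"):
    """Acquisitions expected `day` days after spending `spend` at a given CPA."""
    for row in rows:
        if row["channel_name"] == channel and int(row["days"]) == day:
            total_acquisitions = spend / cpa
            return total_acquisitions * float(row[column]) / 100.0
    return None  # the day is beyond the exported window for this channel


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
average_case = expected_on_day(rows, "tv", 3, spend=1000.0, cpa=20.0)
slow_case = expected_on_day(rows, "tv", 3, 1000.0, 20.0, column="slow_roi_effect")
```

Passing "slow_roi_effect" or "fast_roi_effect" as the column gives a rough range around the average case.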

predicted.csv

The model's predicted value of the outcome variable for each day in the historical dataset.

Use cases:

Compare the model's fit to what actually happened on a given day

Rows:

Each day in the historical dataset

Columns:

date: The date
p columns: the median and confidence interval for our prediction on a given day
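
One way to compare the model's fit with what actually happened is to join predicted.csv to the outcome column of clean_data.csv by date and compute the mean absolute percentage error. The sketch below assumes the median column is named "p50" and uses a hypothetical outcome column "new_subscribers"; the sample values are made up.

```python
import csv
import io

# Made-up sample data for illustration only. "p50" is an assumed name for
# the median column; "new_subscribers" is a hypothetical outcome variable.
PREDICTED_CSV = """date,p50
2024-01-01,110
2024-01-02,180
"""
CLEAN_CSV = """date,new_subscribers
2024-01-01,100
2024-01-02,200
"""


def mean_absolute_percentage_error(predicted_rows, clean_rows, outcome, column="p50"):
    """Average of |predicted - actual| / actual over dates present in both files."""
    actual_by_date = {r["date"]: float(r[outcome]) for r in clean_rows}
    errors = []
    for row in predicted_rows:
        actual = actual_by_date.get(row["date"])
        if actual:  # skip dates missing from clean_data or with zero actuals
            errors.append(abs(float(row[column]) - actual) / actual)
    return sum(errors) / len(errors) if errors else None


predicted_rows = list(csv.DictReader(io.StringIO(PREDICTED_CSV)))
clean_rows = list(csv.DictReader(io.StringIO(CLEAN_CSV)))
mape = mean_absolute_percentage_error(predicted_rows, clean_rows, "new_subscribers")
```

Both sample days are off by 10%, so the result is about 0.1. Days with a zero actual value are skipped because the percentage error is undefined there.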

clean_data.csv

The data fed into the model, including the outcome variable and the channel spend variables.

Use cases:

Compare what data was used in the model to external sources for validation

Rows:

Each day in the historical dataset

Columns:

date: The date
Outcome Variable: the name of your specific outcome variable
Channel names: The names of your specific marketing channels

spike_summaries.csv

A summary of the estimated effect of a spike in the model. Spikes can be anything that has a large effect on sales, typically a sale or a holiday that affects normal business operations. Spikes can have a positive component by increasing sales, and a negative component by cannibalizing sales both before and after the spike. The spike summary can be thought of as the difference between what did happen and what would have happened if the special event had never happened but your marketing spend had stayed the same.

Use cases:

See whether a sales event positively contributed to revenue

Rows:

One row for each spike in the model

Columns:

date: The "central" day of the spike, typically the day where sales jumped the most.
spike_group: The numerical group the spike belongs to
spike_group_name: The name of the spike group
p columns: the median and confidence interval for the effect of the spike on the outcome variable (in the same units as the outcome variable).

spike_group_summaries.csv

A summary of the estimated average effect for a group of spikes in the model. Spikes can be grouped if we expect multiple spikes to have similar behavior (for example, recurring 20% off sales). The summary indicates what effect an average spike in this group will have on the outcome variable, holding marketing spend constant.

Use cases:

Rank different types of sales events by their impact on sales

Rows:

Each spike group

Columns:

spike_group: An ID for that group
spike_group_name: The name of that spike group
mean column: The mean estimate for the average effect of that group on the outcome variable
p columns: A median estimate and confidence interval for the average effect of that group on the outcome variable
dates: the dates of the spikes in that spike group

spike_series.csv

The spike series describes how the effect of each spike plays out over time. Each spike can have an effect in the days leading up to the spike and in the days after it. This dataset shows whether that effect was positive, negative, or zero for each day.

Use cases:

Visualize the effect of a spike event

Rows:

One row for each day, for each spike

Columns:

spike_date: The date of the spike that the row is summarizing
spike_group: The numerical group the spike belongs to
spike_group_name: The name of the spike group
date: The date we're estimating the effect for
daily_p columns: The effect (median + confidence interval) on the outcome variable for that spike on that day
cumulative_p columns: The cumulative effect of the spike for all days up to and including that day

response_impact.csv

An estimate of the impact on the outcome variable if you spend a certain amount of money in a certain channel. This is an estimate at the current (non-historical) channel performance only and summarizes how many new customers you will obtain for a given level of spend. It demonstrates the effect of saturation, so as spending increases the estimated response's growth will slow. The impact is how many new acquisitions you will receive in total, not on the first day of spend or any specific number of days of spend.

Use cases:

Plan how you can scale a channel to achieve marketing goals

Rows:

One row for each channel, at each spend level. Each channel has 100 rows, with spend levels chosen so that they're in the same range as your historical spend.

Columns:

channel: The channel
spend: An amount of daily spend that could be spent in the channel
p columns: The median and confidence interval on the number of acquisitions you can expect from that spend.
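
Because the file gives the response at a fixed grid of spend levels, a spend level between two rows has to be estimated. A minimal sketch using linear interpolation on the median is shown below; the column name "p50", the channel name, and the sample values are assumptions, and linear interpolation is a simplification chosen for the example rather than how the model itself computes the curve.

```python
import csv
import io

# Made-up sample rows for illustration only. "p50" is an assumed name for
# the median column.
SAMPLE = """channel,spend,p50
search,100,10
search,200,18
search,300,24
"""


def acquisitions_at(rows, channel, spend, column="p50"):
    """Linearly interpolate the median response between neighboring spend levels."""
    points = sorted(
        (float(r["spend"]), float(r[column])) for r in rows if r["channel"] == channel
    )
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= spend <= x1:
            if x1 == x0:
                return y0
            return y0 + (y1 - y0) * (spend - x0) / (x1 - x0)
    return None  # spend is outside the exported range; do not extrapolate


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
estimate = acquisitions_at(rows, "search", 250.0)
```

The function returns None outside the exported spend range rather than extrapolating, since the spend levels are chosen to stay within your historical range.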

response_cpa.csv

An estimate of the CPA if you spend a certain amount of money in a certain channel. This is an estimate at the current (non-historical) channel performance only, and summarizes how much your average CPA will be for different levels of spend. It demonstrates the effect of saturation, so as spending increases the estimated CPA will increase.

Use cases:

Identify how much you can scale a channel while maintaining a certain CPA

Rows:

One row for each channel, at each spend level. Each channel has 100 rows, with spend levels chosen so that they're in the same range as your historical spend.

Columns:

channel: The channel
spend: An amount of daily spend that could be spent in the channel
p columns: The median and confidence interval of the CPA you can expect at that spend level
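
For the use case above, a simple approach is to look up the largest exported spend level whose median CPA is still at or below your target. The sketch assumes the median column is named "p50"; the channel and sample values are made up.

```python
import csv
import io

# Made-up sample rows for illustration only. "p50" is an assumed name for
# the median column.
SAMPLE = """channel,spend,p50
search,100,20
search,200,22
search,300,26
search,400,31
"""


def max_spend_under_target(rows, channel, target_cpa, column="p50"):
    """Largest exported spend level whose median CPA is <= target_cpa."""
    spends = [
        float(r["spend"])
        for r in rows
        if r["channel"] == channel and float(r[column]) <= target_cpa
    ]
    return max(spends) if spends else None


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
scalable_spend = max_spend_under_target(rows, "search", 25.0)
```

Using the upper bound of the interval instead of the median as the column gives a more conservative answer.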

response_mcpa.csv

An estimate of the marginal CPA if you spend a certain amount of money in a certain channel. This is an estimate at the current (non-historical) channel performance only, and summarizes how much your marginal CPA will be to acquire the last customer at a given level of spend. It demonstrates the effect of saturation, so as spending increases the estimated marginal CPA will increase.

Use cases:

Identify which channels can be scaled to acquire more customers at the lowest price.

Rows:

One row for each channel, at each spend level. Each channel has 100 rows, with spend levels chosen so that they're in the same range as your historical spend.

Columns:

channel: The channel
spend: An amount of daily spend that could be spent in the channel
p columns: The median and confidence interval of the marginal CPA you can expect at that spend level

marginal_effectiveness.csv

This file is a little different from the other files in that it’s not raw data from the model, but rather a helpful summary of your current state. To increase return on marketing investment, you want to shift money from high marginal CPA channels to those with low marginal CPA. This sheet ranks your currently active channels to highlight which ones you should be moving money into.

Use cases:

Identify which channels should be receiving a larger investment

Rows:

One row for each channel that is currently active

Columns:

channel: The channel
mcpa_p columns: The median and confidence interval of the average marginal CPA over the last 30 days.
last_30_spend: How much money you’ve spent in that channel in the last 30 days
better_than_median: A flag for whether that channel performs better than your median channel
upper_funnel_channel: A flag for whether it’s an upper funnel or lower funnel channel (Note: if you do not have lower funnel channels configured in your model, all channels will be upper funnel channels)
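
The ranking idea described above can be sketched as a sort on the median marginal CPA, lowest first, so that channels at the top of the list are candidates to receive budget and those at the bottom are candidates to give it up. The column name "mcpa_p50", the channel names, and the sample values are assumptions for illustration.

```python
import csv
import io

# Made-up sample rows for illustration only. "mcpa_p50" is an assumed name
# for the median marginal CPA column.
SAMPLE = """channel,mcpa_p50,last_30_spend
search,45,30000
social,30,12000
tv,60,50000
"""


def rank_by_marginal_cpa(rows, column="mcpa_p50"):
    """Channel names ordered from lowest to highest median marginal CPA."""
    return [r["channel"] for r in sorted(rows, key=lambda r: float(r[column]))]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
ranking = rank_by_marginal_cpa(rows)
```

A ranking on medians ignores the width of each interval, so two channels with overlapping intervals may not be meaningfully different.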