__STYLES__
In response to concerns about the simplicity of Shopify's Customer Lifetime Value (CLV) formula, I have recalculated CLV using a cohort-based approach. The cohort analysis divides customers into groups based on their first visit to our website and tracks their weekly spending behavior over a 12-week period. By incorporating all user registrations (including those who did not make a purchase), this method provides a more comprehensive view of customer retention and revenue growth over time.
I utilized the turing_data_analytics.raw_events
table, applying cohort analysis to derive two key metrics:
The main goal of the analysis is to predict future revenue for users who have just registered and provide a reliable estimate of their lifetime value (CLV) within a 12-week timeframe.
The Weekly Average Revenue by Cohorts gives me a clear picture of user behavior over 12 weeks following registration. The calculations account for every user who visited the site, including those who did not make any purchases.
To extract this data, I used the following SQL query:
WITH user_registrations AS (
SELECT
user_pseudo_id,
DATE_TRUNC(MIN(PARSE_DATE('%Y%m%d', event_date)), WEEK) AS registration_week
FROM
`tc-da-1.turing_data_analytics.raw_events`
GROUP BY
user_pseudo_id
),
weekly_revenue AS (
SELECT
user_pseudo_id,
DATE_TRUNC(PARSE_DATE('%Y%m%d', event_date), WEEK) AS revenue_week,
SUM(purchase_revenue_in_usd) AS revenue
FROM
`tc-da-1.turing_data_analytics.raw_events`
WHERE
event_name = 'purchase'
GROUP BY
user_pseudo_id,
revenue_week
)
SELECT
ur.registration_week,
wr.revenue_week,
DATE_DIFF(wr.revenue_week, ur.registration_week, WEEK) AS week_number,
COUNT(DISTINCT ur.user_pseudo_id) AS cohort_size,
SUM(wr.revenue) AS total_revenue,
SUM(wr.revenue) / COUNT(DISTINCT ur.user_pseudo_id) AS revenue_per_user
FROM
user_registrations ur
LEFT JOIN
weekly_revenue wr
ON
ur.user_pseudo_id = wr.user_pseudo_id
WHERE
DATE_DIFF(wr.revenue_week, ur.registration_week, WEEK) BETWEEN 0 AND 11
AND ur.registration_week <= DATE('2021-01-24')
GROUP BY
ur.registration_week,
wr.revenue_week
ORDER BY
ur.registration_week,
wr.revenue_week
This output provided me with weekly revenue per user for each cohort, showing how average spending evolves across a 12-week period. The table was formatted using conditional color formatting to highlight trends and identify any fluctuations in revenue.
In the second part of the analysis, I calculated the Cumulative Revenue by Cohorts, where revenue from previous weeks is added to each successive week to generate cumulative revenue.
This cumulative analysis allows me to track the growth in revenue per user over time and measure how much value a user generates in their first 12 weeks. The cumulative growth percentages are particularly useful for predicting future behavior and estimating CLV more accurately.
This cumulative analysis provided a clearer picture of how much revenue to expect from each cohort and offered insights into retention and spending behavior.
Based on historical data, I predicted future revenue for newly acquired user cohorts. Using the average cumulative growth percentage, I estimated future weekly revenue for each cohort. The idea is to project revenue for weeks beyond the current data by applying the growth rates from previous weeks.
For example:
By the 12th week, we expect a user from the 2021-01-24 cohort to generate approximately 0.35 USD in revenue, based on historical trends.
By applying cohort analysis to the customer lifecycle, we have gained a more detailed and actionable understanding of customer behavior and revenue trends. This method allows us to predict future revenue more accurately and estimate CLV for both purchasing and non-purchasing users, providing a solid foundation for data-driven decision-making.