__STYLES__

Cohort CLV Insights: A New Growth Path

Tools used in this project
Cohort CLV Insights: A New Growth Path

About this project

Cohort-Based Customer Lifetime Value (CLV) Analysis

Overview

In response to concerns about the simplicity of Shopify's Customer Lifetime Value (CLV) formula, I have recalculated CLV using a cohort-based approach. The cohort analysis divides customers into groups based on their first visit to our website and tracks their weekly spending behavior over a 12-week period. By incorporating all user registrations (including those who did not make a purchase), this method provides a more comprehensive view of customer retention and revenue growth over time.

Dataset and Methodology

I utilized the turing_data_analytics.raw_events table, applying cohort analysis to derive two key metrics:

  1. Weekly Average Revenue by Cohorts: These metric divides total weekly revenue by the number of users in each cohort, showing how much revenue each cohort generates per user over time.
  2. Cumulative Revenue by Cohorts: This metric builds on the first, calculating the cumulative revenue per user by summing up the weekly revenues over a 12-week period. This helps me understand revenue growth for each cohort in a more holistic manner.

The main goal of the analysis is to predict future revenue for users who have just registered and provide a reliable estimate of their lifetime value (CLV) within a 12-week timeframe.

Weekly Average Revenue by Cohorts (USD)

The Weekly Average Revenue by Cohorts gives me a clear picture of user behavior over 12 weeks following registration. The calculations account for every user who visited the site, including those who did not make any purchases.

Query:

To extract this data, I used the following SQL query:

WITH user_registrations AS (
  SELECT
    user_pseudo_id,
    DATE_TRUNC(MIN(PARSE_DATE('%Y%m%d', event_date)), WEEK) AS registration_week
  FROM
    `tc-da-1.turing_data_analytics.raw_events`
  GROUP BY
    user_pseudo_id
),
weekly_revenue AS (
  SELECT
    user_pseudo_id,
    DATE_TRUNC(PARSE_DATE('%Y%m%d', event_date), WEEK) AS revenue_week,
    SUM(purchase_revenue_in_usd) AS revenue
  FROM
    `tc-da-1.turing_data_analytics.raw_events`
  WHERE
    event_name = 'purchase'
  GROUP BY
    user_pseudo_id,
    revenue_week
)
SELECT
  ur.registration_week,
  wr.revenue_week,
  DATE_DIFF(wr.revenue_week, ur.registration_week, WEEK) AS week_number,
  COUNT(DISTINCT ur.user_pseudo_id) AS cohort_size,
  SUM(wr.revenue) AS total_revenue,
  SUM(wr.revenue) / COUNT(DISTINCT ur.user_pseudo_id) AS revenue_per_user
FROM
  user_registrations ur
LEFT JOIN
  weekly_revenue wr
ON
  ur.user_pseudo_id = wr.user_pseudo_id
WHERE
  DATE_DIFF(wr.revenue_week, ur.registration_week, WEEK) BETWEEN 0 AND 11
  AND ur.registration_week <= DATE('2021-01-24')
GROUP BY
  ur.registration_week,
  wr.revenue_week
ORDER BY
  ur.registration_week,
  wr.revenue_week

This output provided me with weekly revenue per user for each cohort, showing how average spending evolves across a 12-week period. The table was formatted using conditional color formatting to highlight trends and identify any fluctuations in revenue.

undefined

Cumulative Revenue by Cohorts (USD)

In the second part of the analysis, I calculated the Cumulative Revenue by Cohorts, where revenue from previous weeks is added to each successive week to generate cumulative revenue.

undefined

This cumulative analysis allows me to track the growth in revenue per user over time and measure how much value a user generates in their first 12 weeks. The cumulative growth percentages are particularly useful for predicting future behavior and estimating CLV more accurately.

This cumulative analysis provided a clearer picture of how much revenue to expect from each cohort and offered insights into retention and spending behavior.

Revenue Prediction by Cohorts (USD)

Based on historical data, I predicted future revenue for newly acquired user cohorts. Using the average cumulative growth percentage, I estimated future weekly revenue for each cohort. The idea is to project revenue for weeks beyond the current data by applying the growth rates from previous weeks.

undefined

For example:

  • For the cohort starting in the week of 2021-01-24, the first-week revenue per user was 0.19 USD. By applying the average cumulative growth rate (e.g., 23.29% for Week 1 and 12.26% for Week 2), we predicted revenue for subsequent weeks.

By the 12th week, we expect a user from the 2021-01-24 cohort to generate approximately 0.35 USD in revenue, based on historical trends.

Key Insights

  1. Lower Initial Revenue, but Gradual Growth: The average revenue per user starts relatively low, but we observe a steady increase over the first 12 weeks. This gradual growth suggests that users tend to make more purchases as they become more familiar with the site.
  2. Short Customer Lifespan: Most of the revenue is generated in the first few weeks. By the 12th week, the majority of cohorts have reached their peak, after which there is a decline or plateau in spending.
  3. Better Estimate of CLV: The cumulative revenue approach provides a more reliable estimate of CLV. For users who did not make purchases, their predicted CLV can be projected based on historical cohort behavior.

Recommendations

  1. Focus on Initial Weeks: Since most revenue is generated in the early weeks; marketing efforts should target the first 4 weeks after registration to maximize customer lifetime value.
  2. Segment High-Value Cohorts: Some cohorts exhibit higher-than-average revenue growth. Identifying and segmenting these high-value groups can inform targeted marketing and retention strategies.
  3. Predict Future Revenue for Strategic Planning: Using the predicted revenue model, we can project future sales and allocate resources more effectively based on expected customer behavior over time.

Conclusion

By applying cohort analysis to the customer lifecycle, we have gained a more detailed and actionable understanding of customer behavior and revenue trends. This method allows us to predict future revenue more accurately and estimate CLV for both purchasing and non-purchasing users, providing a solid foundation for data-driven decision-making.

Discussion and feedback(0 comments)
2000 characters remaining
Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.