
A/B Test Analysis of Marketing Campaign: A Comprehensive Evaluation

About this project

The goal of this analysis is to evaluate the performance of three marketing campaigns based on weekly sales data and determine which campaign performed best. The dataset includes sales data aggregated by LocationID, Promotion, and week.

Tools used:

  • SQL (aggregation of average sales by LocationID and Promotion)
  • Python (T-tests, Bonferroni correction & confidence intervals)

Data Description

I extracted the dataset used for this analysis with the following SQL query:

SELECT * FROM tc-da-1.turing_data_analytics.wa_marketing_campaign;

The dataset includes the following columns:

  • MarketID: Unique identifier for the market
  • MarketSize: Size of the market area
  • LocationID: Unique identifier for the store location
  • AgeOfStore: Age of the store in years
  • Promotion: Identifier for the promotion (1, 2, or 3)
  • Week: Week number when the promotion was run
  • SalesInThousands: Sales amount in thousands of dollars

Step-by-Step Analysis

Statistical Analysis

  1. Aggregating and Comparing Sales by Promotion: Using the dataset, sales were aggregated by LocationID and Promotion. Below are the average sales, variance, and sample size for each promotion:
-- Calculate stats for Promotion 1
WITH promo_1 AS (
  SELECT
    LocationID,
    AVG(SalesInThousands) AS avg_sales
  FROM
    `tc-da-1.turing_data_analytics.wa_marketing_campaign`
  WHERE Promotion = 1
  GROUP BY LocationID
),
stats_1 AS (
  SELECT
    AVG(avg_sales) AS mean_sales,
    VARIANCE(avg_sales) AS var_sales,
    COUNT(LocationID) AS n_count
  FROM promo_1
),

-- Calculate stats for Promotion 2
promo_2 AS (
  SELECT
    LocationID,
    AVG(SalesInThousands) AS avg_sales
  FROM
    `tc-da-1.turing_data_analytics.wa_marketing_campaign`
  WHERE Promotion = 2
  GROUP BY LocationID
),
stats_2 AS (
  SELECT
    AVG(avg_sales) AS mean_sales,
    VARIANCE(avg_sales) AS var_sales,
    COUNT(LocationID) AS n_count
  FROM promo_2
),

-- Calculate stats for Promotion 3
promo_3 AS (
  SELECT
    LocationID,
    AVG(SalesInThousands) AS avg_sales
  FROM
    `tc-da-1.turing_data_analytics.wa_marketing_campaign`
  WHERE Promotion = 3
  GROUP BY LocationID
),
stats_3 AS (
  SELECT
    AVG(avg_sales) AS mean_sales,
    VARIANCE(avg_sales) AS var_sales,
    COUNT(LocationID) AS n_count
  FROM promo_3
)

-- Combine the results into a single output
SELECT
  'Promotion 1' AS Promotion, mean_sales, var_sales, n_count
FROM stats_1
UNION ALL
SELECT
  'Promotion 2' AS Promotion, mean_sales, var_sales, n_count
FROM stats_2
UNION ALL
SELECT
  'Promotion 3' AS Promotion, mean_sales, var_sales, n_count
FROM stats_3;
  • Promotion 1: mean sales $58.10k, variance 256.90, sample size 43
  • Promotion 2: mean sales $47.33k, variance 210.17, sample size 47
  • Promotion 3: mean sales $55.36k, variance 268.43, sample size 47
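
As a cross-check, the same statistics can be reproduced in Python with pandas. The snippet below is a minimal sketch: the CSV file name is illustrative, and agg_df (average weekly sales per location) is the aggregated frame reused in the T-tests below.

import pandas as pd

# Load the exported query result; the file name here is illustrative
df = pd.read_csv('wa_marketing_campaign.csv')

# Average weekly sales per location within each promotion
agg_df = df.groupby(['Promotion', 'LocationID'], as_index=False)['SalesInThousands'].mean()

# Mean, variance and sample size per promotion
print(agg_df.groupby('Promotion')['SalesInThousands'].agg(['mean', 'var', 'count']))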

  2. Pairwise Comparisons Using T-tests: To determine whether the differences between the promotions were statistically significant, pairwise comparisons were conducted using T-tests.
from scipy import stats

# Filter the per-location average sales (agg_df, built above) for each promotion
campaign_1 = agg_df[agg_df['Promotion'] == 1]['SalesInThousands']
campaign_2 = agg_df[agg_df['Promotion'] == 2]['SalesInThousands']
campaign_3 = agg_df[agg_df['Promotion'] == 3]['SalesInThousands']

# T-test between Campaign 1 and Campaign 2
t_1_2, p_1_2 = stats.ttest_ind(campaign_1, campaign_2)
print(f"Campaign 1 vs Campaign 2: T-stat = {t_1_2}, P-value = {p_1_2}")

# T-test between Campaign 1 and Campaign 3
t_1_3, p_1_3 = stats.ttest_ind(campaign_1, campaign_3)
print(f"Campaign 1 vs Campaign 3: T-stat = {t_1_3}, P-value = {p_1_3}")

# T-test between Campaign 2 and Campaign 3
t_2_3, p_2_3 = stats.ttest_ind(campaign_2, campaign_3)
print(f"Campaign 2 vs Campaign 3: T-stat = {t_2_3}, P-value = {p_2_3}")

Results:

  • Promotion 1 vs Promotion 2: T-statistic = 3.35 P-value = 0.0012 (Significant at 99% confidence level)
  • Promotion 1 vs Promotion 3: T-statistic = 0.80 P-value = 0.426 (Not significant)
  • Promotion 2 vs Promotion 3: T-statistic = -2.52 P-value = 0.0135 (Not significant after applying the Bonferroni correction)

Interpretation:

  • Promotion 1 vs Promotion 2: There is a statistically significant difference, with Promotion 1 performing better than Promotion 2.
  • Promotion 1 vs Promotion 3: No statistically significant difference was observed.
  • Promotion 2 vs Promotion 3: Initially, a p-value of 0.0135 suggested a significant difference, but the Bonferroni correction multiplies it by the three comparisons made (corrected p-value = 0.0135 × 3 ≈ 0.0406), which is above the 0.01 threshold, so the result is not significant.

  3. Corrected P-values Using the Bonferroni Correction: Given the multiple comparisons, the Bonferroni correction was applied to control the family-wise error rate.
import pandas as pd
from statsmodels.stats.multitest import multipletests

# Labels and raw p-values from the three pairwise t-tests above
combinations = ['Promo 2 vs Promo 1', 'Promo 3 vs Promo 1', 'Promo 3 vs Promo 2']
p_values = [p_1_2, p_1_3, p_2_3]

# Correct for multiple testing using Bonferroni correction
corrected_pvals = multipletests(p_values, alpha=0.01, method='bonferroni')[1]

# Collect the results in a DataFrame
results = pd.DataFrame({
    'Comparison': combinations,
    'Original p-value': p_values,
    'Corrected p-value': corrected_pvals
})

# Print the results
print("\nCorrected Pairwise Comparison Results:")
print(results)

The following are the corrected p-values:

  • Promo 2 vs Promo 1: original p-value 0.001278, corrected p-value 0.003833
  • Promo 3 vs Promo 1: original p-value 0.425912, corrected p-value 1.000000
  • Promo 3 vs Promo 2: original p-value 0.013558, corrected p-value 0.040673

Only the comparison between Promo 2 vs Promo 1 remained statistically significant after applying the correction.
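
The Bonferroni correction is simple enough to verify by hand: each raw p-value is multiplied by the number of comparisons and capped at 1.0, as this quick check shows.

# Manual Bonferroni check: multiply each raw p-value by the number of tests, cap at 1.0
n_tests = len(p_values)
print([min(1.0, p * n_tests) for p in p_values])
# e.g. 0.013558 * 3 ≈ 0.040673, matching the corrected value above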

  4. Confidence Intervals for Pairwise Comparisons: To further evaluate the significance of the differences, 99% confidence intervals for the differences in means between the promotions were computed.
import numpy as np
from scipy import stats

# Average weekly sales per location and promotion
df_agg = df.groupby(['LocationID', 'Promotion']).agg(
    {'SalesInThousands': 'mean'}
).reset_index()

# Filter data for each promotion
campaign_1 = df_agg[df_agg['Promotion'] == 1]['SalesInThousands']
campaign_2 = df_agg[df_agg['Promotion'] == 2]['SalesInThousands']
campaign_3 = df_agg[df_agg['Promotion'] == 3]['SalesInThousands']

def confidence_interval(data1, data2, confidence=0.99):
    # Difference in means with a t-based margin of error
    diff_means = data1.mean() - data2.mean()
    se = np.sqrt(data1.var()/len(data1) + data2.var()/len(data2))
    margin_of_error = se * stats.t.ppf((1 + confidence) / 2., len(data1) + len(data2) - 2)
    return diff_means - margin_of_error, diff_means + margin_of_error

# Calculate 99% confidence intervals for each pairwise difference
ci_1_2 = confidence_interval(campaign_1, campaign_2)
ci_1_3 = confidence_interval(campaign_1, campaign_3)
ci_2_3 = confidence_interval(campaign_2, campaign_3)

print(f"Confidence Interval for Campaign 1 vs Campaign 2: {ci_1_2}")
print(f"Confidence Interval for Campaign 1 vs Campaign 3: {ci_1_3}")
print(f"Confidence Interval for Campaign 2 vs Campaign 3: {ci_2_3}")
  • Campaign 1 vs Campaign 2: confidence interval (2.26, 19.28). The interval excludes zero, so Campaign 1 significantly outperforms Campaign 2.
  • Campaign 1 vs Campaign 3: confidence interval (-6.27, 11.73). The interval contains zero: no significant difference between Campaign 1 and Campaign 3.
  • Campaign 2 vs Campaign 3: confidence interval (-16.43, 0.36). The interval contains zero: no significant difference at the 99% level.
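
As a sanity check, the Campaign 1 vs Campaign 2 interval is centred on the observed difference in means, 58.10 − 47.33 = 10.77, which is exactly the midpoint of (2.26, 19.28).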

  5. Market and Location-Level Insights

  • Market Size Impact: Promotion 3 performed poorly in medium-sized markets, with a strong negative correlation (-0.75) between sales and market size. It is roughly neutral in small markets but underperforms significantly in medium-sized ones (a sketch of this correlation computation follows the list).
  • Age of Store: Across all promotions, the age of the store had a minimal impact on sales, with a weak correlation (around -0.05).
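
For reference, here is a minimal sketch of how such correlations can be computed with pandas. The ordinal encoding of MarketSize below is an assumption for illustration; the encoding actually used in the analysis is not shown.

# Hypothetical ordinal encoding of MarketSize (assumed values 'Small', 'Medium', 'Large')
size_map = {'Small': 0, 'Medium': 1, 'Large': 2}

promo_3_rows = df[df['Promotion'] == 3].copy()
promo_3_rows['MarketSizeNum'] = promo_3_rows['MarketSize'].map(size_map)

# Correlation between sales and market size for Promotion 3
print(promo_3_rows['SalesInThousands'].corr(promo_3_rows['MarketSizeNum']))

# Correlation between sales and store age across all promotions
print(df['SalesInThousands'].corr(df['AgeOfStore']))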

Recommendations

  • Continue with Promotion 1: Promotion 1 is the best performer, significantly outpacing Promotion 2 and showing no statistical difference from Promotion 3.
  • Revise Promotion 3 for Medium Markets: Given the underperformance of Promotion 3 in medium-sized markets, consider tailoring the campaign to better target these regions.
  • Avoid Further Investment in Promotion 2: Promotion 2 consistently underperformed compared to both Promotions 1 and 3. It is recommended to discontinue or drastically revise this campaign.

Conclusion

This A/B test analysis provides clear evidence that Promotion 1 is the most effective campaign. While Promotion 3 performs similarly to Promotion 1 in small and large markets, its underperformance in medium-sized markets is a concern that requires further exploration. Promotion 2 was the least effective, and its discontinuation is advised. This comprehensive analysis and robust statistical evaluation provide actionable insights for optimizing marketing strategies.

Here is the link to the Python notebook used: https://drive.google.com/file/d/1h-tTuOouINHCQRZf9UIN6ZIbLG6BiFCR/view?usp=sharing
