Cracking the Code: Analyzing Customer Churn in the Banking Sector

Tools used in this project
Jupyter Notebook

About this project

Problem Statement

The goal is to identify and analyze the key factors that influence customer churn in the banking industry, using the provided dataset. By examining attributes such as age, credit score, tenure, account balance, product usage, and activity level, I looked for patterns and trends associated with churn, combining manual and automated exploratory data analysis (EDA).

Automated EDA was conducted using Sweetviz, and manual EDA was performed through custom visualizations and statistical analysis.

Project Workflow:

1. Data Understanding:

Dataset Overview: The project began by understanding the dataset, which included attributes such as Age, Balance, NumOfProducts, IsActiveMember, Geography, Gender, CreditScore, EstimatedSalary, Tenure, and Exited (indicating churn).

Initial Data Cleaning: Ensured the dataset was clean and ready for analysis. Focused on understanding if certain values like zero balances or zero tenures made sense in the business context.
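
The zero-value check described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy rows and the helper name `zero_value_report` are assumptions, and only the column names come from the dataset overview.

```python
import pandas as pd

# Hypothetical toy rows for illustration; not taken from the real dataset.
df = pd.DataFrame({
    "Balance": [0.0, 83807.86, 159660.80, 0.0, 125510.82],
    "Tenure": [2, 1, 8, 0, 2],
    "Exited": [1, 0, 1, 0, 0],
})

def zero_value_report(frame, columns):
    """Return the count and share of exact zeros for each listed column."""
    rows = []
    for col in columns:
        n_zero = int((frame[col] == 0).sum())
        rows.append({
            "column": col,
            "n_zero": n_zero,
            "share_zero": n_zero / len(frame),
        })
    return pd.DataFrame(rows).set_index("column")

# Zeros are counted, not dropped: whether a zero balance or zero tenure
# is valid is a business question, not a purely technical one.
report = zero_value_report(df, ["Balance", "Tenure"])

# Missing values per column, as a basic cleanliness check.
missing = df.isna().sum()

print(report)
print(missing)
```

A large share of zero balances may simply reflect customers who hold an account without keeping funds in it, so the sketch reports zeros without filtering them out.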

2. Exploratory Data Analysis (EDA):

Churn Rate Calculation: Calculated the overall churn rate to get a sense of how many customers left the bank.
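
Because Exited is a 0/1 flag, the overall churn rate is just its mean. A small sketch, assuming a pandas DataFrame and using made-up values for illustration:

```python
import pandas as pd

# Hypothetical Exited flags (1 = churned, 0 = retained); illustration only.
df = pd.DataFrame({"Exited": [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]})

# The mean of a 0/1 column is the proportion of ones, i.e. the churn rate.
churn_rate = df["Exited"].mean()

# Absolute counts of retained vs churned customers.
counts = df["Exited"].value_counts()

print(f"Churn rate: {churn_rate:.1%}")
print(counts)
```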

Univariate Analysis: Visualized the distribution of each feature in the dataset, using subplots with countplots for categorical features such as Geography and NumOfProducts, and distplots for continuous features such as Age and Balance.

Bivariate Analysis: Investigated the relationship between each individual variable and the Exited variable (churn) using countplots, boxplots, and KDE plots.
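
The numbers behind such plots can also be tabulated directly. The sketch below is an assumption about how one might do this with pandas, using invented rows: it computes the churn rate per Geography (what a countplot split by Exited shows) and the median Age per churn group (the center line of a boxplot).

```python
import pandas as pd

# Hypothetical rows for illustration; column names follow the dataset overview.
df = pd.DataFrame({
    "Geography": ["France", "Germany", "Spain", "Germany",
                  "France", "Spain", "Germany", "France"],
    "Age": [35, 48, 29, 52, 41, 45, 38, 33],
    "Exited": [0, 1, 0, 1, 0, 1, 0, 0],
})

# Categorical feature vs churn: churn rate and group size per category.
rate_by_geo = (
    df.groupby("Geography")["Exited"]
      .agg(["mean", "count"])
      .rename(columns={"mean": "churn_rate", "count": "customers"})
)

# Continuous feature vs churn: median Age for retained (0) and churned (1).
median_age = df.groupby("Exited")["Age"].median()

print(rate_by_geo)
print(median_age)
```

Reporting the group size next to each rate helps avoid over-reading a high churn rate that comes from only a handful of customers.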

Multivariate Analysis: Visualized the correlations among all features in the dataset, with particular attention to the correlation between each variable and the target variable (Exited).
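
One way to pull the target correlations out of the full matrix is sketched below. It assumes Pearson correlation (the pandas default) on numeric columns and uses invented values; the correlation method actually used in the notebook is not stated here.

```python
import pandas as pd

# Hypothetical numeric rows for illustration only.
df = pd.DataFrame({
    "Age": [35, 48, 29, 52, 41, 45, 38, 33],
    "Balance": [0, 120000, 50000, 98000, 0, 110000, 76000, 0],
    "IsActiveMember": [1, 0, 1, 0, 1, 0, 0, 1],
    "Exited": [0, 1, 0, 1, 0, 1, 0, 0],
})

# Full pairwise correlation matrix (Pearson by default).
corr = df.corr()

# Correlation of each feature with the target, strongest first by magnitude.
target_corr = (
    corr["Exited"]
    .drop("Exited")
    .sort_values(key=abs, ascending=False)
)

print(target_corr)
```

The `corr` matrix is what a heatmap would display. Sorting by absolute value keeps strong negative relationships near the top alongside strong positive ones.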

3. Hypothesis Testing:

For numerical variables like Age and Balance, conducted statistical hypothesis tests such as:

Mann-Whitney U Test to compare differences between churned and retained groups for non-normally distributed variables.

Independent two-sample t-test to compare differences between churned and retained groups for normally distributed variables.

Categorical variables were tested using Chi-square tests of independence to check their association with churn.
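
The three tests above map onto SciPy roughly as follows. This is a sketch on synthetic data, not the project's results: the samples, the contingency counts, and the choice of Welch's variant of the t-test (`equal_var=False`) are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic, roughly normal "Age" samples for churned and retained groups.
age_churned = rng.normal(45, 8, 200)
age_retained = rng.normal(37, 8, 200)

# Independent two-sample t-test (Welch's variant, an assumption here).
t_stat, t_p = stats.ttest_ind(age_churned, age_retained, equal_var=False)

# Synthetic, right-skewed "Balance" samples (clearly non-normal).
bal_churned = rng.exponential(120000, 200)
bal_retained = rng.exponential(40000, 200)

# Mann-Whitney U test for the non-normally distributed variable.
u_stat, u_p = stats.mannwhitneyu(
    bal_churned, bal_retained, alternative="two-sided"
)

# Hypothetical Geography x Exited counts. With a real DataFrame this table
# would come from pd.crosstab(df["Geography"], df["Exited"]).
table = pd.DataFrame(
    {"retained": [420, 340, 410], "churned": [80, 160, 90]},
    index=["France", "Germany", "Spain"],
)

# Chi-square test of independence between the category and churn.
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

print(f"t-test p-value:       {t_p:.3g}")
print(f"Mann-Whitney p-value: {u_p:.3g}")
print(f"Chi-square p-value:   {chi_p:.3g} (dof={dof})")
```

A p-value below the chosen significance level (commonly 0.05) suggests that the variable differs between churned and retained customers, or, for the chi-square test, that the category is associated with churn.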

4. Automated Validation (Sweetviz):

Ran Sweetviz for automated EDA to validate manually drawn insights. Observed that Sweetviz results supported the same conclusions reached through manual EDA.

5. Key Takeaways and Recommendations:

Reported the insights derived from the analysis, identified the key factors influencing customer churn at the bank, and recommended further investigation (where required) and actions to improve retention rates.
