Credit Score Logistic Regression

Tools used in this project

Jupyter Notebook Script

About this project

The primary goal of this project is to predict which customers have a low credit score, using features from a dataset of credit transactions. To do so, it develops a binary classification model that identifies low-credit-score customers.

This project is divided into three main parts, in line with the mid-course project requirements for the Maven Analytics Classification course. The parts cover data preparation, model development, and handling imbalanced data, and are followed by a final model evaluation.

Step 1: Data Preparation and Exploratory Data Analysis (EDA)

  1. Data Import and Conversion:

    • Load the CSV file containing credit transaction data.
    • Perform necessary datatype conversions to ensure consistency and accuracy.
  2. Target Variable Modification:

    • Modify the target variable by grouping the 'Standard' and 'Good' credit scores into a single 'High' class and treating the remaining scores as 'Low', which turns the task into a binary classification problem (Low vs. High).
  3. Data Exploration:

    • Analyse the dataset to identify which features most significantly impact credit scores.
    • Check for and address any high correlations between features.
    • Remove unnecessary features that do not contribute to the predictive power of the model.
  4. Data Preparation for Modelling:

    • Create dummy variables for categorical features.
    • Split the data into training and testing sets.
    • Scale features if necessary to ensure model stability and performance.
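
The preparation steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the column names (`credit_score`, `occupation`, `annual_income`), the sample rows, and the 'Poor' label for the low class are all hypothetical and not taken from the project's dataset. In a notebook, the same steps would more likely use `pandas.get_dummies`, `sklearn.model_selection.train_test_split`, and `sklearn.preprocessing.StandardScaler`.

```python
import random
from statistics import mean, pstdev

# Hypothetical records: column names and values are invented for illustration.
rows = [
    {"credit_score": "Poor", "occupation": "Teacher", "annual_income": 21000.0},
    {"credit_score": "Standard", "occupation": "Engineer", "annual_income": 54000.0},
    {"credit_score": "Good", "occupation": "Doctor", "annual_income": 98000.0},
    {"credit_score": "Poor", "occupation": "Engineer", "annual_income": 33000.0},
    {"credit_score": "Standard", "occupation": "Teacher", "annual_income": 41000.0},
    {"credit_score": "Good", "occupation": "Engineer", "annual_income": 76000.0},
    {"credit_score": "Poor", "occupation": "Doctor", "annual_income": 28000.0},
    {"credit_score": "Standard", "occupation": "Doctor", "annual_income": 62000.0},
    {"credit_score": "Good", "occupation": "Teacher", "annual_income": 47000.0},
    {"credit_score": "Standard", "occupation": "Engineer", "annual_income": 58000.0},
]


def binarize_target(label):
    # Assumption: 'Poor' is the low class (1); 'Standard' and 'Good' are grouped as high (0).
    return 1 if label == "Poor" else 0


def one_hot(values):
    """Dummy-encode a categorical column, dropping the first (sorted) category."""
    categories = sorted(set(values))[1:]
    return [[int(v == c) for c in categories] for v in values], categories


def split_indices(n, test_fraction=0.2, seed=42):
    """Shuffle row indices and split them into train and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def fit_scaler(values):
    """Return (mean, std) estimated from the training values only."""
    sd = pstdev(values)
    return mean(values), (sd if sd > 0 else 1.0)


y = [binarize_target(r["credit_score"]) for r in rows]
dummies, dummy_names = one_hot([r["occupation"] for r in rows])
train_idx, test_idx = split_indices(len(rows))

# Fit the scaler on the training rows, then apply it to every row,
# so that no information from the test rows leaks into the scaling.
income = [r["annual_income"] for r in rows]
mu, sigma = fit_scaler([income[i] for i in train_idx])
income_scaled = [(v - mu) / sigma for v in income]

# Feature matrix: scaled income followed by the dummy columns.
X = [[income_scaled[i]] + dummies[i] for i in range(len(rows))]
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]
```

Fitting the scaler on the training split alone is a design choice worth keeping when moving to a library implementation, since statistics computed on the full dataset would let test information influence training.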

Step 2: Logistic Regression

  1. Initial Model Fitting:

    • Fit a Logistic Regression model using default hyperparameters.
  2. Hyperparameter Tuning:

    • Tune the hyperparameters to optimize the model's performance.
  3. Performance Reporting:

    • Report key metrics: accuracy, precision, recall, and F1 score.
    • Adjust the decision threshold to maximize the F1 score.
  4. ROC Curve and AUC:

    • Plot the ROC curve for the tuned model.
    • Calculate and report the Area Under the Curve (AUC) to evaluate the model's ability to distinguish between classes.
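
As a rough illustration of the threshold adjustment and AUC steps above, the sketch below scans a grid of decision thresholds for the best F1 score and computes AUC as the probability that a randomly chosen positive is ranked above a randomly chosen negative. The labels and predicted probabilities are made up for the example, and the helper names are hypothetical. In practice, `sklearn.metrics.f1_score` and `sklearn.metrics.roc_auc_score` would usually be applied to the fitted model's predicted probabilities instead.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def best_f1_threshold(y_true, scores, thresholds):
    """Return the (threshold, F1) pair with the highest F1 on the given grid."""
    best_t, best_f1 = None, -1.0
    for t in thresholds:
        preds = [int(s >= t) for s in scores]
        f1 = precision_recall_f1(y_true, preds)[2]
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def roc_auc(y_true, scores):
    """AUC via pairwise ranking: positive-vs-negative wins, ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = low credit score) and predicted probabilities.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.40, 0.60, 0.80, 0.55]

thresholds = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
best_t, best_f1 = best_f1_threshold(y_true, scores, thresholds)
auc = roc_auc(y_true, scores)
print(f"best threshold={best_t}, F1={best_f1:.3f}, AUC={auc:.3f}")
```

Note that the threshold should be chosen on training or validation data; choosing it on the test set would make the reported F1 optimistic.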

Step 3: Addressing Imbalanced Data

  1. SMOTE Application:

    • Apply the Synthetic Minority Over-sampling Technique (SMOTE) to balance the dataset by oversampling the minority class until both classes have an equal number of instances.
  2. Model Re-tuning:

    • Re-tune the decision threshold after applying SMOTE to check whether performance improves.
  3. Performance Comparison:

    • Compare the model's performance (accuracy, F1 score, and AUC) before and after applying SMOTE to assess the impact of handling data imbalance.
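
To make the resampling idea concrete, here is a minimal sketch of SMOTE-style oversampling: each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, the data points, and the choice of `k=2` are assumptions for illustration only. The project itself would more plausibly rely on `imblearn.over_sampling.SMOTE`, applied to the training split only so that the test data stays untouched.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority points by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority points by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical training data: 6 majority points and 3 minority points.
majority = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.2], [0.1, 0.4]]
minority = [[0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]

# Create just enough synthetic points to equalise the class counts.
synthetic = smote_sketch(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + synthetic
print(len(majority), len(balanced_minority))
```

Because every synthetic point is an interpolation between two real minority samples, it always falls inside the region already spanned by the minority class, which is why SMOTE adds variety without inventing values outside the observed range.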

Final Model Evaluation

  • Fit the final model using the best-performing configuration and techniques identified during the project.
  • Evaluate the final model's performance on the test data to ensure its generalizability and reliability.

By following these steps, this project aims to build a classification model capable of predicting low-credit-score customers, providing insights and tools for managing credit risk.
