__STYLES__

Unlocking Customer Insights and Market Trends in Grocery Retail: A Comprehensive Data Analysis

Tools used in this project
Unlocking Customer Insights and Market Trends in Grocery Retail: A Comprehensive Data Analysis

Unlocking Customer Insights and Market Trends in Grocery Retail: A Comprehensive Data Analysis

About this project

Overview

The Instacart Market Basket Analysis project aims to explore and analyze customer shopping behavior using transactional data from the Instacart platform. This dataset provides a rich source of information that can be used to derive valuable insights for optimizing operations and enhancing customer experiences.

Problems to solve.

· How can we increase customer retention?

· How can we improve customer segmentation and targeting?

· How can we reduce customer churn?

· How can we identify and promote product associations?

Data Analyst Report: Grocery Store Analysis

In this analysis, I explored the grocery store data in-depth to derive meaningful insights. Below, I present the findings and insights from our data analysis.

The database contains six tables:

aisles.csv — Contains information about different product categories (aisles).

departments.csv — Provides details about various departments within the store.

order_products__prior.csv — Includes information about products included in prior customer orders.

order_products__train.csv — Contains details about products in the training set of customer orders.

orders.csv — Provides information about individual orders and customers.

products.csv — Contains details about products, including aisle and department IDs.

Data Dictionary:

Refer to the provided data dictionary for a detailed description of each dataset and its columns.

Data Quantity

The database contains the following approximate quantities of data:

PRODUCT: 49,755 rows

ORDERS: 1,048,575 rows

DEPARTMENTS: 21 rows

AISLES: 134 rows

ORDER_PRODUCT_PRIOR: 32,434,489 rows

ORDER_PRODUCT_TRAIN: 1,384,617 rows

Data Cleaning and Exploration

Missing Values

I began by checking for missing values in the ORDERS table, specifically in the ‘days_since_prior_order’ column. I’m pleased to report that all values were found in this column.

Data Integrity

To ensure data integrity, I undertook the task of removing duplicate rows from the PRODUCT table. I identified duplicates based on product names and retained only one entry with the minimum ‘product_id.’ This data cleaning process ensures the data is accessible from redundancy and maintains consistency.

Market Basket Analysis

I conducted a market basket analysis to uncover the top 10 product pairs customers most frequently purchase together. This information is vital for optimizing inventory management and fine-tuning our marketing strategies. The analysis involved joining the order_product_prior data to identify product pairs and their frequencies. To make the insights more actionable, I included product names.

Steps:

  • Step 1: Join the Orders, Order_products__prior, and Products tables to create a dataset that includes order information and product names.

  • Step 2: Group the data by order_id and create a list of products in each order.

  • Step 3: Use these lists to calculate the frequency of product pairs in orders.

  • Step 4: Rank and select the top 10 product pairs with the highest frequency.

The top 10 product pairs that are most frequently purchased together are:

  • Bag of Organic Bananas — Organic Hass Avocado
  • Organic Strawberries — Bag of Organic Bananas
  • Banana — Organic Strawberries
  • Organic Avocado — Banana
  • Banana — Organic Baby Spinach
  • Organic Baby Spinach — Bag of Organic Bananas
  • Strawberries — Banana
  • Banana — Large Lemon
  • Organic Hass Avocado — Organic Strawberries
  • Bag of Organic Bananas — Organic Raspberries

I also explored which products are most commonly added to the cart first. This insight is crucial for understanding customer preferences and planning promotional activities.

Steps:

  • Step 1: Analyze the Order_products_prior table to find the products added to the cart first.

  • Step 2: Count the occurrences of each product and rank them to find the top 5.

The top 5 products that are most commonly added to the cart first are:

  1. Banana
  2. Bag of Organic Bananas
  3. Organic Whole Milk
  4. Organic Strawberries
  5. Organic Hass Avocado

How many unique products are typically included in a single order?

Another aspect of our analysis involved examining the number of unique products typically included in a single order. This insight provides a deeper understanding of customer preferences and choices.

  • Step 1: Calculate the average number of unique products per order in the Order_products__prior table.

The average number of unique products included in a single order is 7.

Customer Segmentation

I also conducted customer segmentation based on purchase frequency, classifying them as ‘Frequent Buyers,’ ‘Regular Buyers,’ or ‘Occasional Buyers.’ This segmentation can guide targeted marketing efforts.

  • Step 1: Join the Orders and Order_products__prior tables to calculate the total amount spent by each customer.

  • Step 2: Segment customers based on their total spending.

Based on the total amount they’ve spent on orders; customers can be categorized as follows:

  • High Spender: 80 or more orders
  • Medium Spender: 30 to 79 orders
  • Low Spender: Less than 30 orders

To gain insights into customer spending behavior, I categorized customers into three groups: ‘High Spenders,’ ‘Medium Spenders,’ and ‘Low Spenders’ based on their total order count.

  • Step 1: Analyze the Orders table to calculate the purchase frequency of each customer.

  • Step 2: Segment customers based on their purchase frequency.

Based on purchase frequency, customers can be segmented as follows:

  • Frequent Buyer: 10 or more orders
  • Regular Buyer: 5 to 9 orders
  • Occasional Buyer: Less than five orders

How many orders have been placed by each customer?

Count the number of orders placed by each customer in the Orders table.

The average number of orders per customer is 16.

Seasonal Trends Analysis

I analyzed the distribution of orders placed on different days of the week to identify trends in customer behavior.

  • Step 1: Analyze the Orders table to calculate the distribution of orders by day of the week.

The distribution of orders placed on different days of the week is as follows:

  • Sunday: 183939
  • Monday: 180025
  • Tuesday: 143162
  • Wednesday: 133839
  • Thursday: 130367
  • Friday: 139183
  • Saturday: 138060

Customer Churn Prediction

Inactive Customers

One of the critical metrics in our analysis was calculating the number of customers who have yet to place an order in the last 30 days. This information is invaluable for identifying inactive customers and potentially re-engaging them.

  • Step 1: Analyze the Orders table to identify customers who haven’t placed an order in the last 30 days.

The number of customers who last placed an order in the last 30 days is 46,392.

To gauge customer churn, I calculated the customer churn rate. This metric represents the percentage of inactive customers relative to the total customer base.

  • Step: Define churn criteria (e.g., no orders in the last quarter) and calculate the percentage of customers meeting this criterion.

The customer churn rate is 74%.

Product Association Rules

Can we find products that are often bought together on weekends vs. weekdays?

I compared product associations on weekends and weekdays to determine if buying patterns varied between these two time periods.

  • Step 1: Analyze the Orders table to identify weekends and weekdays.

  • Step 2: Apply association rule mining separately for weekends and weekdays to find product combinations.

The following product combinations are often bought together on weekends vs. weekdays:

Product Affinity

Aisle and Department Analysis

I analyzed the top-performing categories to gain insights into product sales by department and aisle. This information is crucial for inventory management and marketing decisions.

The top 5 departments by sales are:

  1. Produce
  2. Daily eggs
  3. Snacks
  4. Beverages
  5. Frozen

The top 5 aisles by sales are:

  • fresh fruit
  • fresh vegetable
  • Packaged vegetables fruits
  • yogurt
  • packaged cheese

Solutions

To increase customer retention, we can:

  • Offer loyalty programs and rewards.
  • Provide excellent customer service.
  • Personalize the shopping experience.
  • Make it easy for customers to reorder their favorite products.

To improve customer segmentation and targeting, we can:

  • Collect customer data, such as purchase history, demographics, and interests.
  • Use this data to create customer segments.
  • Target each segment with relevant marketing messages and offers.

To reduce customer churn, we can:

  1. Identify customers who are at risk of churning.
  2. Reach out to these customers with personalized offers and support.
  3. Make it easy for customers to change their minds about.

Recommendations for Business Improvement

  1. Cross-Promotion of Frequently Purchased Products: Utilize the insights from frequently purchased product pairs to strategically place complementary items together on store shelves or in online product recommendations. This will boost sales by encouraging customers to buy related products.

  2. Personalized Marketing for Customer Segments: Leverage customer segmentation based on spending behavior and purchase frequency to tailor marketing campaigns. High spenders might benefit from loyalty programs, while occasional shoppers could be enticed with promotions to increase their engagement.

  3. Seasonal Campaigns: Capitalize on seasonal trends by launching targeted marketing campaigns during peak months. For example, the holiday season in December can be optimized with exclusive offers and promotions.

  4. Churn Prevention Strategies: Identify and engage with customers who last purchased in the last 30 days. Send personalized offers or reminders to re-engage them with your products and services.

  5. Product Bundling and Special Offers: Based on product association rules, consider creating bundles or special offers for product combinations that are frequently purchased together. This can increase the average order value.

  6. Data-Driven Inventory Management: Use insights on product popularity to optimize inventory levels. Ensure that high-demand items are well-stocked while low-demand items are managed more efficiently.

By implementing these data-driven suggestions, we can enhance our store layout, marketing strategies, and customer engagement efforts. This approach will not only improve customer satisfaction but also contribute to increased sales and profitability, setting our business on a path to long-term success.

In conclusion, this data analysis gives us the insights to make informed decisions and enhance the grocery store’s overall performance.

Discussion and feedback(0 comments)
2000 characters remaining
Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.