__STYLES__

8 Week SQL Challenge: Case Study #1 Danny’s Diner

Tools used in this project
8 Week SQL Challenge: Case Study #1 Danny’s Diner

About this project

Introduction

Danny's Diner, a Japanese restaurant, is seeking help to stay afloat after capturing basic data from their few months of operation. The restaurant wants to use the data to answer simple questions about customers, such as their visiting patterns, spending habits, and favourite menu items. The problem statement is to help Danny decide whether to expand the existing customer loyalty programme and generate basic datasets for his team to easily inspect. The data set contains three tables:

  • sales
  • members
  • menu

A link to the dataset can be found here.

Problem Statement

Danny wants to use the data to answer a few simple questions about his customers, especially about their

  • visiting patterns,
  • how much money they’ve spent, and
  • which menu items are their favourites.

Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers. He plans on using these insights to help him decide whether he should expand the existing customer loyalty programme . Additionally, he needs help generating some basic datasets so his team can easily inspect the data without needing to use SQL.

So, let's get started...

1. What is the total amount each customer spent at the restaurant?

Merge both sales and menu tables by matching the customer_id and price fields . To figure out how much each customer spent at the restaurant, we add up all their sales values with the SUM() function and, finally, group the joined results based on customer_id.

SELECT customer_id, SUM(price) as total_amount_spent FROM sales JOIN menu on sales.product_id = menu.product_id GROUP BY customer_id ORDER BY sales.customer_id

undefinedAnswer: Customers A, B and C spent $76, $74, and $36 respectively.


2. How many days has each customer visited the restaurant?

To find out how many times each customer visited, we use COUNT (DISTINCT order_date).

Note: To ensure accuracy when counting and to avoid duplicates, always include the DISTINCT function. For example, if Customer A went to the restaurant twice on '2021-01-07', counting without DISTINCT would incorrectly show 2 days instead of the correct count of 1 day.

SELECT customer_id, COUNT(DISTINCT order_date) as num_days_visited FROM sales GROUP BY customer_id;

undefinedAnswer: Customers A, B, and C visited 4, 6, and 2 times, respectively.


3. What was the first item from the menu purchased by each customer?

  • Create a special table part of the query called "first_order." In this table, add a new column called "Position" to figure out the row number, keeping things organised by customer and date. The plan is to focus on the first row for each customer.
  • Now, in the main part of the query, select the necessary details and only take the rows where it's the first visit for each customer.
  • After that, group the results to see how customers interact with different products.
WITH first_order AS (
  SELECT sales.customer_id, sales.order_date, menu.product_name,
    DENSE_RANK() OVER(PARTITION BY sales.customer_id 
    ORDER BY sales.order_date) AS position
  FROM sales
  JOIN menu ON sales.product_id = menu.product_id)
SELECT customer_id,product_name
FROM first_order
WHERE position = 1
GROUP BY customer_id, product_name;

undefinedAnswer:

  • Customer A’s first orders are curry and sushi.
  • Customer B’s first order is curry.
  • Customer C’s first order is ramen.

Here's what you need to know: Because we don't have the exact times when customers made orders, figuring out the very first thing they ordered is a bit tricky. Imagine you go to a restaurant twice on the same day; we can't tell which order came first.

To deal with this, we use a system that looks at the dates and gives each order a rank. If two orders happen on the same day, we consider them equally ranked as the first order. This way, even if we don't know the exact time, we can still get a good idea of what each customer's first order might be.

It's like saying, "Hey, these orders happened on the same day, so let's call them both the customer's first order."


4. What is the most purchased item on the menu and how many times was it purchased by all customers?

We select the product name from the menu table and count how many times each item has been purchased. We are using the COUNT(*) function to count the number of rows for each product.

SELECT menu.product_name,COUNT(*) AS most_purchased_item FROM sales JOIN menu ON sales.product_id = menu.product_id GROUP BY menu.product_name ORDER BY most_purchased_item DESC LIMIT 1;

undefinedAnswer: The most purchased item on the menu is ramen, which must be nice!.


5. Which item was the most popular for each customer?

Begin by creating a Common Table Expression (CTE) named fav_purchase. Within this table, join the "menu" and "sales" tables using the "product_id" column. Group the data by customer and product to calculate the count of occurrences for each product.

Utilize the DENSE_RANK() window function to rank each customer's orders based on the count in descending order. In the main query, select the customer ID, product name, and count of orders from the "fav_purchase" CTE.

WITH fav_purchase AS ( SELECT sales.customer_id, menu.product_name, COUNT(menu.product_id) AS count_of_fav_order, DENSE_RANK() OVER(PARTITION BY sales.customer_id ORDER BY COUNT(sales.customer_id) DESC) AS rank FROM menu JOIN sales ON menu.product_id = sales.product_id GROUP BY sales.customer_id, menu.product_name) SELECT customer_id, product_name,count_of_fav_order FROM fav_purchase WHERE rank = 1;

undefinedAnswer: Customer A and C’s favorite item is ramen while Customer B enjoys all items on the menu.


6. Which item was purchased first by the customer after they became a member?

  • Start by creating a CTE table called "order_as_member." Within this CTE, select the necessary columns and calculate the row number using the ROW_NUMBER() window function.
  • Utilize the PARTITION BY clause to partition the data by members.customer_id and the ORDER BY clause to arrange the rows within each members.customer_id partition by sales.order_date.
  • Join the members and sales tables based on the customer_id column, remember to add a condition that includes sales that occurred post-member's join_date (sales.order_date > members.join_date).
WITH order_as_member AS (
SELECT members.customer_id, sales.product_id,
ROW_NUMBER() OVER(PARTITION BY members.customer_id 
ORDER BY sales.order_date) AS first_item_after_membership
FROM members
JOIN sales ON members.customer_id = sales.customer_id AND sales.order_date > members.join_date)
SELECT customer_id,product_name 
FROM order_as_member
JOIN menu ON order_as_member.product_id = menu.product_id
WHERE first_item_after_membership = 1
ORDER BY customer_id ASC;     

undefinedAnswer: Customer A’s first order as a member is ramen while Customer B’s first order as a member is sushi.


7. Which item was purchased just before the customer became a member?

  • Start by creating a special table called "purchased_prior_member." In this table, pick the right details and figure out the order in which each customer bought things before they joined as members.
  • Then, connect the members' table with the sales table using the customer_id column. But only include purchases made before the customers became members.
  • Next, match the purchased_prior_member table with Danny's Diner menu based on the product_id column. After that, keep only the very first purchase made by each customer before they joined as members.
SELECT m.customer_id,
MAX(menu.product_name) AS last_item_before_membership
FROM members m
LEFT JOIN sales ON m.customer_id = sales.customer_id
LEFT JOIN menu ON sales.product_id = menu.product_id
WHERE sales.order_date < m.join_date
GROUP BY m.customer_id;

undefinedAnswer: Both customers’ last order before becoming members was sushi. What a coincidence!


8. What is the total items and amount spent for each member before they became a member?

  • SELECT: customer_id from the menu table and use COUNT() to count the total number of items purchased, while SUM() calculates the total amount spent before membership for each customer.
  • LEFT JOIN: This joins the members table with the sales table. It ensures that even if there are no matching records in the sales table for a customer (i.e., they haven't made any purchases), we still include them in the result. We will also be performing another left join, this time between the sales table and the menu table. This is to get the details of each item purchased.
  • COALESCE(): This function is used to handle null values, ensuring that even if there are no purchases made by a customer before joining, we still get a valid total amount spent (which will be zero).
SELECT m.customer_id,
  COUNT(*) AS total_items_before_membership,
  COALESCE(SUM(menu.price), 0) AS total_amount_spent_before_membership
FROM members m
LEFT JOIN sales ON m.customer_id = sales.customer_id
LEFT JOIN menu ON sales.product_id = menu.product_id
WHERE sales.order_date < m.join_date
GROUP BY m.customer_id;

undefinedAnswer: Before becoming members, Customer A spent $25 on 2 items while Customer B spent $40 on 3 items.


9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier — how many points would each customer have?

Here’s how the calculation is performed using a conditional CASE statement:

  • If product_id = 1, multiply every $1 by 20 points.
  • Otherwise, multiply $1 by 10 points.

Then, calculate the total points for each customer.

WITH points_customer AS (
SELECT menu.product_id, 
CASE WHEN product_id = 1 THEN price * 20 ELSE price * 10 END AS points
FROM menu)

SELECT sales.customer_id, 
SUM(points_customer.points) AS total_points
FROM sales
JOIN points_customer ON sales.product_id = points_customer.product_id
GROUP BY sales.customer_id
ORDER BY sales.customer_id;

undefinedAnswer: The total points for Customers A, B and C are $860, $940 and $360 respectively.


10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi — how many points do customer A and B have at the end of January?

We selected relevant data from the sales, menu, and members tables, including customer transactions, menu item details, and customer information.

  • Join these tables based on common identifiers (customer_id and product_id) to bring together related data.
  • Utilizea conditional statement to calculate points earned for each transaction, considering whether it occurred within the first week after a customer joined the program.
  • Filtered the data to include only transactions made by customers A and B before or on January 31, 2021.
  • Group the data by customer_id to calculate the total points earned by each customer.
  • Aggregate the points earned during the first week for each customer to obtain thetotal_points_first_week.
SELECT sales.customer_id,
SUM(CASE WHEN order_date <= join_date + INTERVAL '7 days' THEN 2 * price * 10 
ELSE price * 10 END) AS total_points
FROm sales
JOIN menu ON sales.product_id = menu.product_id
JOIN members ON sales.customer_id = members.customer_id
WHERE sales.customer_id IN ('A', 'B') AND order_date <= '2021-01-31'
GROUP BY sales. customer_id;

undefinedAnswer: Customer A has 1,520 points, while Customer B has 1,240 points.

Bonus Questions

Join All The Things

Recreate the table with: customer_id, order_date, product_name, price, member (Y/N)

SELECT sales.customer_id,sales.order_date, menu.product_name,menu.price,
CASE WHEN members.join_date > sales.order_date THEN 'N'
WHEN members.join_date <= sales.order_date THEN 'Y'
ELSE 'N' END AS membership_status
FROM sales
LEFT JOIN members ON sales.customer_id = members.customer_id
JOIN menu ON sales.product_id = menu.product_id
ORDER BY members.customer_id, sales.order_date

undefinedRank All The Things

Danny is also interested in knowing the product rankings among customers. However, he specifically doesn't need the rankings for purchases made by non-members. Therefore, he anticipates seeing null ranking values for these records, indicating that the customers haven't joined the loyalty programme yet.

WITH all_data AS (
SELECT sales.customer_id, sales.order_date,  menu.product_name, menu.price,
CASE WHEN members.join_date > sales.order_date THEN 'N' 
WHEN members.join_date <= sales.order_date THEN 'Y'
ELSE 'N' END AS membership_status
FROM sales
LEFT JOIN members ON sales.customer_id = members.customer_id
JOIN menu ON sales.product_id = menu.product_id
ORDER BY members.customer_id, sales.order_date)

SELECT*,
CASE WHEN membership_status = 'N' then NULL
ELSE RANK () OVER(PARTITION BY customer_id, membership_status
ORDER BY order_date) END AS customer_rank
FROM all_data;

undefinedInsights

We have discovered a few interesting insights that would certainly be useful for Danny's Diner and they're

  • Customer B visited Danny's Diner the most in January 2021, going there 6 times.
  • The most popular foods at Danny's Diner are ramen, curry, and sushi.
  • Customers A and C really like ramen, while Customer B enjoys sushi, curry, and ramen equally.
  • Customer A was the first person to sign up as a member, and their first order was curry, probably because they really like it.
  • Before becoming members, the last things Customer A and B ordered were sushi and curry, which might have convinced them to join because they enjoyed them so much.
  • Before joining, Customer A spent $25, and Customer B spent $40.
  • During January 2021, Customer A earned 860 points, Customer B earned 940 points, and Customer C earned 360 points.
Discussion and feedback(2 comments)
comment-988-avatar
Alice Zhao
Alice Zhao
4 months ago
This was a really well-documented walk through. Nicely done!

comment-995-avatar
Lenora Ejimofor
Lenora Ejimofor
Project owner
4 months ago
Project owner
Thank you very much, Alice. It would interest you to know that I am still working on refining the documentation, so feel free to check back later for more updates. Your feedback means a lot.
2000 characters remaining