__STYLES__
Context: In this scenario, I'm an analyst for Taste of the World Cafe, which offers dishes from various global cuisines. The business just released a new menu at the start of the year, and it wants to understand how its new menu items are performing.
The task: I'll be exploring two tables: first, I'll look at the " menu items" table to explore the menu. Then, I'll examine the "order details" table to understand the data that's been collected.
Using these two tables, I'll help the business understand how it's menu selections are performing among customers.
Part 1: Exploring the Menu Items Table:
As you can see, there are 4 columns in the menu_items table:
Q.1: Finding the number of menu items on the table.
I wanted to find the number of unique items listed in the table, so I used COUNT DISTINCT to eliminate duplicates. In total, there are 32 items on the menu.
Q.2: Find the least and most expensive menu items.
To find the least and most expensive menu items, I used COUNT DISTINCT to avoid duplicates. Because I used an aggregate, I know I need to use GROUP BY to give the corresponding dimension to this metric. I grouped by two items:
item name
item price
I also used ORDER BY DESC to sort results from most expensive to the least.
Shrimp Scampi was the most expensive dish, while Edamame was the least expensive.
Q.3: How many Italian dishes are on the menu? What are the least and most expensive Italian dishes on the menu?
For this questions, I used a WHERE clause to filter the results down to Italian. I also used ORDER BY to show the price from most expensive Italian dish to least.
As you can see, Shrimp Scampi is the most expensive and Fettuccine Alfredo is the least.
Q.4: How many dishes are in each category? What is the average dish price within each category?
For this, I used two queries. To find number of dishes in each category, I used COUNT DISTINCT to avoid any duplicates that could skew results. I also used GROUP BY.
SELECT
COUNT(DISTINCT menu_item_id),
category
FROM menu_items
GROUP BY category;
To find average dish price by category, I used GROUP BY.
SELECT
AVG(price) AS avg_price,
category
FROM menu_items
GROUP BY category;
Part 2: Exploring the Order Details Table
To explore the Order Details table, I used:
SELECT *
FROM order_details;
As you can see, there are five total columns in the table.
Q.1: Date range of orders in order_details table.
To find the date range, I used MIN and MAX to aggregate order_date to find the start and end dates.
SELECT
MIN(order_date) AS start_date,
MAX(order_date) AS end_date
FROM order_details;
Q.2: Find # of orders in this date range. Then find the # of items ordered in this date range.
There can be multiple items in an order, so orders can appear multiple times in the table. Therefore, I used COUNT DISTINCT to find the number of orders.
SELECT
COUNT(DISTINCT order_id) AS total_orders
FROM order_details;
This resulted in 5,370 total orders.
To find the number of items, I used regular COUNT since multiple items can be in an order.
SELECT COUNT(item_id) AS total_items_ordered
FROM order_details;
The result is more than 12,000 items ordered during this time.
Q.3 Which order had the most # of items?
I used COUNT to get a total of items ordered and used GROUP BY to group the items by order.
SELECT
order_id,
COUNT(item_id) AS total_items
FROM order_details
GROUP BY order_id
ORDER BY total_items DESC;
Q.4 Which orders had > 12 items?
Because I'm using GROUP BY, I know I need to use HAVING to filter the orders that have more than 12 items.
SELECT
order_id,
COUNT(item_id) AS total_items
FROM order_details
GROUP BY order_id
HAVING total_items > 12
ORDER BY total_items DESC; -- I know this isn't necessary, I just prefer it.
Part 3: Analyze Customer Behavior
Q.1: Combine the menu_items and order_details tables into a single table.
I used LEFT JOIN to ensure I had all the transaction data and joined the menu_items table to the order_details table. I wanted to ensure all transactions were captured in the data.
SELECT *
FROM order_details
LEFT JOIN menu_items mit ON order_details.item_id = mit.menu_item_id;
Q.2: What were the least and most ordered items? What categories were they in?
To find the least and most ordered items, I used COUNT of item_id and grouped by item_name and category. I also used ORDER BY to sort from highest to lowest number of items ordered.
SELECT
COUNT(item_id) AS num_purchases,
item_name,
category
FROM order_details
LEFT JOIN menu_items mit ON order_details.item_id = mit.menu_item_id
GROUP BY
item_name,
category
ORDER BY num_purchases DESC;
The most ordered item was a hamburger and least was chicken tacos.
Q.3: What were the top 5 orders that spent the most money?
I grouped by order_id to get the amount spent per order. Then I ordered by order_total DESC. To focus on top 5 orders, I used LIMIT.
SELECT
order_id,
SUM(price) AS order_total
FROM order_details
LEFT JOIN menu_items mit ON order_details.item_id = mit.menu_item_id
GROUP BY order_id
ORDER BY order_total DESC
LIMIT 5;
Q.4: View the details of the highest spend order. Which specific items were purchased?
From the previous query, we know this is order_id 440. I used WHERE to filter for that order specifically.
SELECT
item_id,
item_name,
category
FROM order_details
LEFT JOIN menu_items mit ON order_details.item_id = mit.menu_item_id
WHERE order_id = 440;
Q. 5: BONUS: View the details of the top 5 highest spend orders.
Building upon the previous query, I used WHERE IN to filter for multiple orders, the ones listed in Q.3.
SELECT
order_id,
COUNT(item_id),
item_name,
category
FROM order_details
LEFT JOIN menu_items mit ON order_details.item_id = mit.menu_item_id
WHERE order_id IN (440, 2075, 1957, 330, 2675)
GROUP BY
order_id,
item_name,
category;
However, because there's a GROUP BY, you could also use HAVING IN order_id IN (440, 2075, 1957, 330, 2675) to get the same results.
Conclusions: