__STYLES__

Movie Store Data Analysis with SQL

Tools used in this project
Movie Store Data Analysis with SQL

About this project

Business Case Scenario and Objective

I own two movie rental stores and a local business owner wants to purchase them. But before doing so, he wants answers to his very important data-related questions, to make sure its the right investment for him.

My SQL skills came for my big rescue here, when I used:

  • Multi-Table Joins & Bridges
  • Union
  • Aggregate Functions like Count, SUM, AVG
  • with Group By & Order By
  • Case & Count statements

I answered all of his questions using SQL to analyze data from my two movie stores.

The Data:

My data is stored in MavenMovies database. The database has 19 different tables containing information about my customers, my stores, movie actors, actor awards, my staff, movies, movie categories, investors, inventory, rental and customer payments.

Because the data is stored in many different tables, I used Left Joins, Multi-table Joins and Unions to connect data from different tables to answer the investor's business questions.

Key Findings

Below I answer all the questions the investor asked, and I explain how I reached at the answers using SQL queries.

Question 1:

My partner and I want to come by each of the stores in person and meet the managers. Please send over the managers’ names at each store, with the full address of each property (street address, district, city, and country please).

My Answer #1

Because my staff data, store data, and address data is stored in different tables, I used LEFT JOINS to join these tables together to answer this question. I have two managers named Mike & Jon.

SELECT 
    staff.first_name AS manager_first_name, 
    staff.last_name AS manager_last_name, 
    address.address, 
    address.district, 
    city.city, country.country 
    FROM store
LEFT JOIN staff ON 
    store.manager_staff_id = staff.staff_id
LEFT JOIN address ON 
    store.address_id = address.address_id
LEFT JOIN city ON
    address.city_id = city.city_id
LEFT JOIN country ON
    city.country_id = country.country_id

undefined

Question 2:

I would like to get a better understanding of all of the inventory that would come along with the business. Please pull together a list of each inventory item you have stocked, including the store_id number, the inventory_id, the name of the film, the film’s rating, its rental rate and replacement cost.

My Answer #2

I LEFT joined my inventory table with film table, using the common film_id key.

SELECT 
    inventory.store_id, 
    inventory.inventory_id, 
    film.title, 
    film.rating, 
    film.rental_rate, 
    film.replacement_cost
FROM inventory
LEFT JOIN film ON
    inventory.film_id = film.film_id

undefined

Question 3

From the same list of films you just pulled, please roll that data up and provide a summary level overview of your inventory. We would like to know how many inventory items you have with each rating at each store.

My Answer #3

PG-13 movies top my inventory list at 525. In addition to using LEFT JOIN, I had to first use COUNT here to count the number of inventory items per store, and then GROUP the total inventory items(films) based on store_id and film ratings.

SELECT 
    inventory.store_id, 
    COUNT(inventory.inventory_id) AS inventory_total,
    film.rating
FROM 
        inventory
LEFT JOIN film ON
    inventory.film_id = film.film_id
GROUP BY inventory.store_id,film.rating

undefined

Question 4

Similarly, we want to understand how diversified the inventory is in terms of replacement cost. We want to see how big of a hit it would be if a certain category of film became unpopular at a certain store. We would like to see the number of films, as well as the average replacement cost, and total replacement cost, sliced by store and film category.

My Answer #4

Sports category at store#2 has the highest total replacement cost at $3746.19, followed by action and drama at store#1.

I used COUNT, SUM, AVG functions to find the total number of films, the average replacement cost, total replacement cost. Again, I had to use LEFT JOINS to get data from film table, film_category table, category table, and inventory table. I needed to find a BRIDGE here with the BRIDGE key to connect different tables to arrive at the answer. I used GROUP BY and ORDER BY to group the data based on store_id and category.

SELECT 
    store_id, 
    category.name AS category, 
    COUNT(inventory.inventory_id) AS Num_of_films, 
    AVG(film.replacement_cost) AS average_replacement_cost, 
    SUM(film.replacement_cost) as total_replacement_cost
FROM film
LEFT JOIN film_category ON
    film.film_id = film_category.film_id
LEFT JOIN category ON
    category.category_id = film_category.category_id
LEFT JOIN inventory ON
    inventory.film_id = film.film_id
GROUP BY 
    store_id,
    category.name
ORDER BY 
    total_replacement_cost DESC

undefined

Question 5

We want to make sure you folks have a good handle on who your customers are. Please provide a list of all customer names, which store they go to, whether or not they are currently active, and their full addresses – street address, city, and country.

My Answer #5

This query gave me detailed information about my customers and their addresses.

SELECT 
    first_name, 
    last_name, 
    store_id, 
    active, 
    address.address, 
    city.city, 
    country.country 
FROM customer
LEFT JOIN address ON
    customer.address_id = address.address_id
LEFT JOIN city ON 
    address.city_id = city.city_id
LEFT JOIN country ON
    city.country_id = country.country_id

undefined

Question 6

We would like to understand how much your customers are spending with you, and also to know who your most valuable customers are. Please pull together a list of customer names, their total lifetime rentals, and the sum of all payments you have collected from them. It would be great to see this ordered on total lifetime value, with the most valuable customers at the top of the list.

My Answer #6

Karl Seal is my top customer with total rentals at 45 and total payments received from him at $221.55.

As it's important to know where my revenue is coming from, I analyzed who my top paying customers are and who has rented the most films until now. DESC is a very useful function here, to use with ORDER BY, in order to get the top performers.


SELECT 
    customer.customer_id,
    customer.first_name, 
    customer.last_name, 
    COUNT(payment.rental_id) AS Lifetime_rentals, 
    SUM(payment.amount) AS Total_lifetime_value
FROM customer
LEFT JOIN payment ON
    customer.customer_id = payment.customer_id
GROUP BY 
    customer_id
ORDER BY 
    Total_lifetime_value DESC

undefined

Question 7

My partner and I would like to get to know your board of advisors and any current investors. Could you please provide a list of advisor and investor names in one table? Could you please note whether they are an investor or an advisor, and for the investors, it would be good to include which company they work with.

My Answer #7

I have 4 advisors and 3 investors.

This is where I used UNION to get a list of advisors and investors. I had to add NULL as an extra advisor column here, because I have to have same number of columns in both SELECT statements in order to use UNION. I needed company_name of investors as the corresponding column to NULL.


SELECT 
    first_name, 
    last_name, 
        'advisor' AS Type, 
        NULL
FROM 
    advisor 

UNION

SELECT 
    first_name, 
    last_name, 
        'investor' AS Type, 
    company_name
FROM 
    investor

undefined

Question 8

We're interested in how well you have covered the most-awarded actors. Of all the actors with three types of awards, for what % of them do we carry a film? And how about for actors with two types of awards? Same questions. Finally, how about actors with just one award?*

My Answer #8

The investor wants to make sure that we have enough films in our inventory for the top-awarded actors. I used CASE statements to answer this question. Turns out that:

57% actors with 3 awards have at least 1 film in our inventory.

92% actors with 2 awards have at least 1 film in our inventory.

83% actors with 2 awards have at least 1 film in our inventory.

SELECT 
CASE 
  WHEN 
  actor_award.awards = 'Emmy,Oscar, Tony' 
  THEN '3 awards'
   WHEN 
   actor_award.awards IN ('Emmy, Oscar',  'Emmy, Tony', 'Oscar, Tony') 
   THEN '2 awards'
   ELSE 
   '1 award'
END AS 
    number_of_awards,
    
    
AVG(CASE 
  WHEN actor_award.actor_id is NULL 
  THEN 0 
  ELSE 1 END) AS pct_w_one_film
FROM 
   actor_award
GROUP BY 
CASE 
WHEN 
    actor_award.awards = 'Emmy, Oscar, Tony'  
THEN '3 awards'
WHEN 
    actor_award.awards IN ('Emmy, Oscar', 'Emmy, Tony', 'Oscar, Tony') 
THEN '2 awards'
ELSE 
    '1 award'
END 

(Project data & inspiration by: John Pauler from Maven Analytics).





Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.