Netflix TV Shows and Movies Data Analysis Using SQL

About this project

THE DATA

This dataset comes from Kaggle and contains 8,807 movies and TV shows from around the world, described by 15 columns. In this project I use SQL (Structured Query Language) to analyze the dataset, applying a range of SQL functions and clauses to draw out insights.

The dataset covers movies and TV shows released from 1942 through 2021, except for one TV show from 1925.

I analyzed the Netflix movie and TV show data around the following main business questions:

1. How many movies and TV shows are in this dataset?

2. Top 5 longest movies, with name and duration.

3. Top 5 shortest movies, with name and duration.

4. Top 5 TV shows with the most seasons.

5. Top 5 categories with the most TV shows.

6. Top 5 categories with the most movies.

7. Top 5 release years with the most TV shows.

8. Top 5 release years with the most movies.

9. Top 5 directors with the most movies/TV shows.

10. Top 5 most common movie ratings.

Now let's analyze each of the business questions above.

1. How many movies and TV shows are in this dataset?

SELECT
    type AS "Show Type",
    COUNT(*) AS "Total Count",
    -- share of all titles, rounded to a whole percent
    CONCAT(ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM "netflix_titles")), '%') AS Total_percentage
FROM "netflix_titles"
GROUP BY 1;

There are 6,131 movies in the dataset, which is about 70% of all titles.
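As a small illustration of the percentage arithmetic, the Python sketch below runs a similar query against an in-memory SQLite table. The four sample rows are made up, and SQLite stands in for PostgreSQL here, so both are assumptions rather than part of the original analysis.

```python
import sqlite3

# Hypothetical miniature stand-in for netflix_titles (made-up rows, not the Kaggle data).
rows = [("Movie",), ("Movie",), ("Movie",), ("TV Show",)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE netflix_titles (type TEXT)")
conn.executemany("INSERT INTO netflix_titles VALUES (?)", rows)

# Same idea as the query above: count per type, then divide by the overall count.
query = """
SELECT type,
       COUNT(*) AS total_count,
       ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM netflix_titles)) AS pct
FROM netflix_titles
GROUP BY type
ORDER BY total_count DESC
"""
shares = conn.execute(query).fetchall()

for show_type, total, pct in shares:
    print(f"{show_type}: {total} titles, {pct:.0f}%")
```

Multiplying by 100.0 rather than 100 keeps the division from being carried out on integers, which would truncate the share instead of rounding it.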

2. Top 5 longest movies, with name and duration.

SELECT
    title, release_year, duration
FROM "netflix_titles"
WHERE type = 'Movie'
  AND duration IS NOT NULL
-- duration is text such as '90 min'; sort on the number before the space
ORDER BY SUBSTR(duration, 1, STRPOS(duration, ' '))::numeric DESC
LIMIT 5;

Black Mirror: Bandersnatch is the longest movie, with a duration of 5 hours and 12 minutes.
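The reason for casting the leading part of `duration` to a number can be sketched in a few lines of Python. The helper names and the sample values are hypothetical; they only assume that the column holds text of the form '<number> min'.

```python
def minutes_from_duration(duration):
    """Return the leading number of a value such as '312 min'."""
    return int(duration.split(" ", 1)[0])


def as_hours_and_minutes(total_minutes):
    """Split a minute count into (hours, minutes)."""
    return divmod(total_minutes, 60)


# Illustrative values in the assumed '<number> min' format.
sample = ["90 min", "312 min", "3 min"]

# Sorting the raw text compares character by character, so '90 min' ranks first.
text_order = sorted(sample, reverse=True)

# Sorting on the parsed number gives the intended ranking.
numeric_order = sorted(sample, key=minutes_from_duration, reverse=True)

print(text_order)
print(numeric_order)
print(as_hours_and_minutes(minutes_from_duration(numeric_order[0])))
```

Without the cast, a 90-minute movie would be ranked ahead of a 312-minute one, because the text '9' sorts after the text '3'.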

3. Top 5 shortest movies, with name and duration.

SELECT
    title, release_year, duration
FROM "netflix_titles"
WHERE type = 'Movie'
  AND duration IS NOT NULL
-- same numeric sort as for the longest movies, but ascending
ORDER BY SUBSTR(duration, 1, STRPOS(duration, ' '))::numeric
LIMIT 5;

Silent, a 3-minute short released in 2014, is the shortest movie in this dataset.

4. Top 5 TV shows with the most seasons.

SELECT
    title AS "TV Show", duration AS Season, country
FROM "netflix_titles"
WHERE type = 'TV Show'
-- for TV shows, duration is text such as '2 Seasons'; sort on the number
ORDER BY SUBSTR(duration, 1, STRPOS(duration, ' '))::numeric DESC
LIMIT 5;

Grey's Anatomy has the most seasons of any TV show in the dataset: 17.

5. Top 5 categories with the most TV shows.

WITH categories AS (
    -- listed_in holds comma-separated categories; produce one row per category
    SELECT
        UNNEST(STRING_TO_ARRAY(listed_in, ',')) AS category
    FROM "netflix_titles"
    WHERE type = 'TV Show'
)
SELECT
    TRIM(category) AS "Web Series Category",
    COUNT(*) AS "Total Web Series Released"
FROM categories
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;

International TV Shows is the largest TV category, with 1,351 shows, followed by dramas and then comedies.
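The split-then-trim step in the category query can be illustrated with a short Python sketch. The `listed_in` values and the `count_categories` helper are made up for the example; they only assume the comma-separated format used by the column.

```python
from collections import Counter

# Made-up listed_in values in the comma-separated format of the dataset.
listed_in = [
    "International TV Shows, TV Dramas",
    "TV Dramas, TV Comedies",
    "International TV Shows, TV Comedies, TV Dramas",
]


def count_categories(values, strip=True):
    """Count each category after splitting on commas, optionally trimming spaces."""
    counts = Counter()
    for value in values:
        for part in value.split(","):
            counts[part.strip() if strip else part] += 1
    return counts


trimmed = count_categories(listed_in)
untrimmed = count_categories(listed_in, strip=False)

print(trimmed.most_common(3))
print(untrimmed["TV Dramas"], untrimmed[" TV Dramas"])
```

Without trimming, "TV Dramas" and " TV Dramas" (with a leading space) are counted as two different categories, which is why the query applies TRIM before grouping.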

6. Top 5 categories with the most movies.

WITH categories AS (
    -- listed_in holds comma-separated categories; produce one row per category
    SELECT
        UNNEST(STRING_TO_ARRAY(listed_in, ',')) AS category
    FROM "netflix_titles"
    WHERE type = 'Movie'
)
SELECT
    TRIM(category) AS "Movie Category",
    COUNT(*) AS "Total Movie Released"
FROM categories
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;

International Movies is the largest movie category, with 2,751 movies, followed by dramas and then comedies.

7. Top 5 release years with the most TV shows.

SELECT
    release_year, COUNT(*) AS "Total Number of TV Show Released"
FROM "netflix_titles"
WHERE type = 'TV Show'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;

2020 has the most TV show releases, with 436, followed by 2019 with 397.

8. Top 5 release years with the most movies.

SELECT
    release_year, COUNT(*) AS "Total Number of Movie Released"
FROM "netflix_titles"
WHERE type = 'Movie'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;

2017 and 2018 are tied for the most movie releases, with 767 each.

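9. Top 5 directors with the most movies/TV shows.

WITH directors AS (
    -- director can list several names separated by commas; one row per name
    SELECT
        UNNEST(STRING_TO_ARRAY(director, ',')) AS director
    FROM "netflix_titles"
)
SELECT
    -- trim so that 'Name' and ' Name' are counted as the same director
    TRIM(director) AS director,
    COUNT(*) AS "Total Number of Movies Directed"
FROM directors
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;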

Rajiv Chilaka has directed the most titles in this dataset, with 22.

10. Top 5 most common movie ratings.

SELECT
    rating, COUNT(*) AS "Total Number of Movie"
FROM "netflix_titles"
-- restrict to movies, as the question asks about movie ratings
WHERE type = 'Movie'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
