THE DATA
This dataset comes from Kaggle and contains data on 8,807 movies and TV shows from around the world, spread across 15 columns. In this project I use SQL (Structured Query Language) to analyze it, drawing on a range of SQL functions and clauses to extract insights.
The titles were released between 1942 and 2021, with the exception of one TV show from 1925.
My analysis of the Netflix movie and TV show data focuses on the following business questions:
1. How many movies and TV shows are in this dataset?
2. Top 5 longest movies by name and duration.
3. Top 5 shortest movies by name and duration.
4. Top 5 TV shows with the most seasons.
5. Top 5 categories with the most TV shows.
6. Top 5 categories with the most movies.
7. Top 5 release years with the most TV shows.
8. Top 5 release years with the most movies.
9. Top 5 directors with the most movies/TV shows.
10. Top 5 ratings by number of movies.
Now let's analyze each of the business questions above.
1. How many movies and TV shows are in this dataset?
SELECT
type as "Show Type",
COUNT(*) as "Total Count",
CONCAT(ROUND(COUNT(*)*100.0/(SELECT COUNT(*) FROM "netflix_titles")),'%')
as Total_percentage
FROM "netflix_titles"
GROUP BY 1
The dataset contains 6,131 movies, which is about 70% of all titles.
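The same share can also be computed without the scalar subquery by using a window function over the grouped counts. The following is only a sketch in PostgreSQL syntax: the `sample_titles` rows are made up for illustration and stand in for "netflix_titles".

```sql
-- Hypothetical sample rows standing in for "netflix_titles"
WITH sample_titles(type) AS (
    VALUES ('Movie'), ('Movie'), ('Movie'), ('TV Show')
)
SELECT
    type AS "Show Type",
    COUNT(*) AS "Total Count",
    -- SUM(COUNT(*)) OVER () adds up the per-group counts, giving the grand total
    CONCAT(ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER ()), '%') AS "Total Percentage"
FROM sample_titles
GROUP BY type
ORDER BY COUNT(*) DESC;
```

The window version reads the table once, which may matter on larger tables; on a dataset of this size either form should be fine.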
2. Top 5 longest movies by name and duration.
SELECT
title, release_year, duration
FROM "netflix_titles"
WHERE type ='Movie'
AND duration IS NOT NULL
ORDER BY SUBSTR(duration,1,STRPOS(duration,' '))::numeric desc
LIMIT 5
Black Mirror: Bandersnatch is the longest movie, with a duration of 5 hours and 12 minutes.
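The ORDER BY expression in this query matters because duration is stored as text (for example '90 min'), and text sorts character by character rather than by value. A minimal sketch of the idea, assuming PostgreSQL and using made-up rows in a hypothetical `sample_titles` CTE; `split_part` is used here as an alternative to the SUBSTR/STRPOS combination:

```sql
-- Hypothetical sample rows; the titles and durations are invented for illustration
WITH sample_titles(title, duration) AS (
    VALUES
        ('Hypothetical Film A', '90 min'),
        ('Hypothetical Film B', '125 min'),
        ('Hypothetical Film C', '8 min')
)
SELECT
    title,
    duration,
    -- take the text before the first space and cast it to an integer
    split_part(duration, ' ', 1)::int AS minutes
FROM sample_titles
-- sorting by the numeric value puts 125 ahead of 90 and 8;
-- sorting by the raw text would compare '9', '8' and '1' first
ORDER BY minutes DESC;
```

One design note: `split_part` returns the whole string when no space is present, whereas the SUBSTR/STRPOS form would return an empty string in that case and the cast would fail.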
3. Top 5 shortest movies by name and duration.
SELECT
title, release_year, duration
FROM "netflix_titles"
WHERE type ='Movie'
AND duration IS NOT NULL
order by SUBSTR(duration,1,STRPOS(duration,' '))::numeric
LIMIT 5
Silent, a short film released in 2014, is the shortest movie in this dataset at 3 minutes.
4. Top 5 TV shows with the most seasons.
SELECT
title as "TV Show", duration as Season, country
FROM "netflix_titles"
WHERE type='TV Show'
ORDER BY SUBSTR(duration,1,STRPOS(duration,' '))::numeric desc
LIMIT 5
Grey's Anatomy is the TV show with the most seasons, at 17.
5. Top 5 categories with the most TV shows.
WITH CTE AS
(
SELECT
unnest(string_to_array(listed_in,',')) as category
FROM "netflix_titles"
WHERE type='TV Show'
)
SELECT
-- trim removes the space left after each comma before grouping
trim(category) as "Web Series Category",
COUNT(*) as "Total Web Series Released"
FROM CTE
GROUP BY 1
ORDER BY 2 desc
LIMIT 5;
International TV Shows is the largest category with 1,351 TV shows, followed by dramas and then comedies.
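Because listed_in holds several comma-separated categories per title, the query first splits each value into one row per category and then trims the result. The sketch below shows that step in isolation, assuming PostgreSQL; the rows in the hypothetical `sample_titles` CTE are invented for illustration.

```sql
-- Hypothetical sample rows; only the shape of listed_in mirrors the real data
WITH sample_titles(title, listed_in) AS (
    VALUES
        ('Hypothetical Show A', 'Crime TV Shows, TV Dramas'),
        ('Hypothetical Show B', 'TV Dramas, TV Comedies'),
        ('Hypothetical Show C', 'TV Dramas')
)
SELECT
    -- without trim, ' TV Dramas' and 'TV Dramas' would land in separate groups
    trim(u.category) AS category,
    COUNT(*) AS titles
FROM sample_titles
-- one output row per comma-separated element of listed_in
CROSS JOIN LATERAL unnest(string_to_array(listed_in, ',')) AS u(category)
GROUP BY 1
ORDER BY 2 DESC;
```

A title that belongs to several categories is counted once in each of them, so the category totals add up to more than the number of titles.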
6. Top 5 categories with the most movies.
WITH CTE AS
(
SELECT
unnest(string_to_array(listed_in,',')) as category
FROM "netflix_titles"
WHERE type='Movie'
)
SELECT
-- trim removes the space left after each comma before grouping
trim(category) as "Movie Category",
COUNT(*) as "Total Movie Released"
FROM CTE
GROUP BY 1
ORDER BY 2 desc
LIMIT 5;
International Movies is the largest movie category with 2,751 titles, followed by dramas and then comedies.
7. Top 5 release years with the most TV shows.
SELECT
release_year, COUNT(*) as "Total Number of TV Show Released"
FROM "netflix_titles"
WHERE type='TV Show'
GROUP BY 1
ORDER BY 2 desc
LIMIT 5;
2020 had the most TV show releases with 436, followed by 2019 with 397.
8. Top 5 release years with the most movies.
SELECT
release_year,COUNT(*) as "Total Number of Movie Released"
FROM "netflix_titles"
WHERE type='Movie'
GROUP BY 1
ORDER BY 2 desc
LIMIT 5;
2017 and 2018 share the top spot, with 767 movies released in each year.
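When two years share the same count, as 2017 and 2018 do here, a plain LIMIT cuts the list at a fixed number of rows and may drop one of the tied years if the tie falls on the boundary. One way to keep ties together is to rank the counts first. This is only a sketch, assuming PostgreSQL; `sample_counts` and its values are made up for illustration.

```sql
-- Hypothetical per-year counts; the numbers are invented for illustration
WITH sample_counts(release_year, movies) AS (
    VALUES (2001, 12), (2002, 12), (2003, 9), (2004, 7)
)
SELECT release_year, movies
FROM (
    SELECT
        release_year,
        movies,
        -- tied counts receive the same rank
        RANK() OVER (ORDER BY movies DESC) AS rnk
    FROM sample_counts
) ranked
-- rnk = 1 keeps every year tied for first place, not just one of them
WHERE rnk = 1
ORDER BY release_year;
```

Replacing `rnk = 1` with `rnk <= 5` would give a top 5 that includes any years tied on the fifth place.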
9. Top 5 directors with the most movies/TV shows.
WITH CTE AS
(
SELECT
UNNEST(STRING_TO_ARRAY(director,',')) as director
FROM "netflix_titles"
)
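SELECT
-- TRIM removes the space left after each comma so one name is not counted under two spellings
TRIM(director) as director,
COUNT(*) as "Total Number of Titles Directed"
FROM CTE
GROUP BY 1
ORDER BY 2 desc
LIMIT 5;
Rajiv Chilaka has directed the most titles in this dataset, with 22.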
10. Top 5 ratings by number of movies.
SELECT
rating, COUNT(*) as "Total Number of Movie"
FROM "netflix_titles"
WHERE type='Movie'
GROUP BY 1
ORDER BY 2 desc
LIMIT 5;