__STYLES__
日本語訳は以下からご覧いただけます。
Goals: The primary goal of this project was to uncover historical trends and preferences in the video game industry, specifically focusing on two key aspects: preferred gaming consoles and popular video game publishers. The overarching aim was to provide actionable insights that could aid an imaginary gaming company in making informed, data-driven decisions regarding their new game release strategy.
Business Needs: The business context necessitated a data-driven approach to decision-making for our imaginary gaming company. They were faced with the critical decision of determining which gaming platforms and publishers to target for the release of their new game. To address this need, the project sought to deliver insights on the gaming industry's historical landscape, allowing the company to identify opportunities for maximum market reach and potential partnerships with successful publishers.
Discovery of Meaningful Insights: To discover meaningful insights, the project followed a well-defined process:
Data Source Selection: I began by acquiring a dataset on Video Game Sales Data dataset from Kaggle, which covered the period from 1977 to 2020. The dataset was chosen for usability rating, and licensing being a CC0 Public Domain for public use, ensuring its suitability for analysis.
Data Cleaning and Preparation: Data cleaning was performed to address missing values in the dataset, primarily in user and critic scores. This was done using Microsoft Excel it is essential for the data to be cleaned and complete before importing it to MySQL.
Analysis Approach: The dataset was analyzed in MySQL to extract and organize the necessary metrics. The resulting data tables were then imported into Tableau Public for data visualization, which offered a more accessible and comprehensive view of the data.
Key Findings: The analysis uncovered several key findings, such as the dominance of Nintendo in both sales and game ratings. Undeniably, Nintendo holds the highest average market share over the entire time period that was analyzed. However, Nintendo's most popular games that were sold on the platform are made by Nintendo themselves or are Nintendo exclusives. For example: Super Mario, Wii Sports and Fit, Zelda, Mario Kart, and Pokemon. But when we look at the console side according to this dataset, PlayStation, Xbox 360, and PC all outsold Nintendo in terms of games sold by console. This led to the conclusion that for a new game company, it will most likely be a wise decision to pick PlayStation or PC as their first publishers due to the historical market reach those platforms have. Plus, PC has grown rapidly over time as well, which adds more reasons to choose one of them. Nintendo is indeed the king of its own games, but in terms of sales per console, it lags behind PlayStation, PC, and Xbox 360.
Below is the Code that I used in MySQL in order to get all the metrics from the dataset. The Dataset came from kaggle and is a CC0:Public Domain Licensed for public use. The dataset is 2 years old and the data hasn't been updated with it being highly focus on unit sales (Physical copies) so this analysis is meant for educational purposes only. Link to dataset: https://www.kaggle.com/datasets/holmjason2/videogamedata, Link to Tableau Visual: https://public.tableau.com/app/profile/adrian.savic/viz/GoogleAnalyticsCaseStudyVideoGameSalesDashboard/Dashboard1
ゴール: このプロジェクトの主要な目標は、ビデオゲーム業界における歴史的なトレンドと好みを明らかにし、特に好ましいゲーム機および人気のあるビデオゲームパブリッシャーに焦点を当てることでした。全体の目標は、架空のゲーム会社が新しいゲームのリリース戦略に関する情報に基づいた意思決定を行うのに役立つ具体的な洞察を提供することでした。
ビジネスニーズ: ビジネスの文脈では、架空のゲーム会社の意思決定にデータ駆動のアプローチが必要でした。彼らは新しいゲームのリリースにおいてターゲットとすべきゲームプラットフォームとパブリッシャーを決定するという重要な決定に直面していました。このニーズに対処するため、プロジェクトはゲーム業界の歴史的な状況に関する洞察を提供し、最大の市場到達度と成功したパブリッシャーとのパートナーシップの機会を特定するための支援を提供しました。
有意な洞察の発見: 有意な洞察を発見するために、プロジェクトは明確に定義されたプロセスに従いました。
データソースの選択: 最初に、1977年から2020年までの期間をカバーするKaggleの「Video Game Sales Data」データセットを取得しました。このデータセットは使用可能な評価があり、CC0パブリックドメインのライセンスを持っており、公共での使用に適していることが選ばれました。
データのクリーニングと準備: データの欠損値、特にユーザースコアと批評家スコアの欠損値に対処するために、Microsoft Excelを使用したデータクリーニングが実施されました。データをMySQLにインポートする前に、データはクリーンで完全であることが重要です。
分析アプローチ: データセットはMySQLで分析され、必要な指標を抽出および整理しました。その結果得られたデータテーブルは、データの視覚化を提供するTableau Publicにインポートされ、データのよりアクセスしやすく包括的なビューを提供しました。
主要な発見: 分析により、販売およびゲームの評価の両方において任天堂の優越性が明らかになりました。間違いなく、分析された期間全体で、任天堂は最も高い平均市場シェアを保持しています。しかし、プラットフォームで販売された任天堂の最も人気のあるゲームは、任天堂自体によって製作されたか、または任天堂の独占作品です。例えば、スーパーマリオ、Wiiスポーツ&フィット、ゼルダ、マリオカート、ポケモンなどです。しかし、このデータセットによれば、ゲーム別の販売数に関しては、PlayStation、Xbox 360、およびPCがすべて任天堂を上回りました。これからの新しいゲーム会社にとって、これらのプラットフォームが歴史的な市場到達度を持っているため、PlayStationまたはPCを最初のパブリッシャーとして選ぶのはおそらく賢明な決定でしょう。さらに、PCは時間とともに急速に成長しており、それらの中から選択するさらなる理由が追加されています。任天堂は確かに独自のゲームの王者ですが、コンソールごとの販売に関しては、PlayStation、PC、およびXbox 360に遅れをとっています。
以下はMySQLで使用したコードです。データセットはKaggleから提供され、CC0パブリックドメインライセンスで公共で使用可能です。データセットは2年前のものであり、データは物理的なコピーの販売に焦点が当てられており、この分析は教育の目的のために行われました。Link to dataset: https://www.kaggle.com/datasets/holmjason2/videogamedata, Link to Tableau Visual: https://public.tableau.com/app/profile/adrian.savic/viz/GoogleAnalyticsCaseStudyVideoGameSalesDashboard/Dashboard1
SQL Code that I used to get all of my metrics used in the Tableau Dashboard
/* Top 20 games in terms of sales from 1977-2020. */
SELECT
Name, Platform, Publisher, Total_Shipped, Year
FROM
`game_sales_data`
WHERE
Year BETWEEN 1977 AND 2020
ORDER BY
Total_Shipped DESC
LIMIT 20;
-------------------------------------------------------------------------------------------------------------------------
/* Total shipped units for 1977-2020. */
SELECT
SUM(Total_Shipped) AS Total_Shipped_1977_2020
FROM
`game_sales_data`
WHERE
YEAR BETWEEN 1977 AND 2020;
-------------------------------------------------------------------------------------------------------------------------
/* Total shipped units by platform 1977-2020 /*
SELECT
Platform,
SUM(Total_Shipped) AS Total_Shipped_By_Platform
FROM
`game_sales_data`
WHERE
YEAR BETWEEN 1977 AND 2020
GROUP BY
Platform
ORDER BY
Total_Shipped_By_Platform DESC;
-----------------------------------------------------------------------------------------------------------------------
/* Market share for plaforms years 1977-2020 /*
SELECT
A.Platform,
A.Total_Shipped_By_Platform,
A.Total_Shipped_By_Platform / B.Total_Units_Sold_All_Platforms * 100 AS Market_Share
FROM (
SELECT
Platform,
SUM(Total_Shipped) AS Total_Shipped_By_Platform
FROM
`game_sales_data`
WHERE
YEAR BETWEEN 1977 AND 2020
GROUP BY
Platform
) AS A
CROSS JOIN (
SELECT
SUM(Total_Shipped) AS Total_Units_Sold_All_Platforms
FROM
`game_sales_data`
WHERE
YEAR BETWEEN 1977 AND 2020
) AS B
ORDER BY
Market_Share DESC;
---------------------------------------------------------------------------------------------------------------------
/* Market Share for publishers years 1977-2020 /*
SELECT
A.Publisher,
A.Total_Shipped_By_Publisher,
A.Total_Shipped_By_Publisher / B.Total_Units_Sold_All_Publishers * 100 AS Market_Share
FROM (
SELECT
Publisher,
SUM(Total_Shipped) AS Total_Shipped_By_Publisher
FROM
`game_sales_data`
WHERE
YEAR BETWEEN 1977 AND 2020
GROUP BY
Publisher
) AS A
CROSS JOIN (
SELECT
SUM(Total_Shipped) AS Total_Units_Sold_All_Publishers
FROM
`game_sales_data`
WHERE
YEAR BETWEEN 1977 AND 2020
) AS B
ORDER BY
Market_Share DESC;
------------------------------------------------------------------------------------------------------------------------
/* Yearly growth and decline from years 1977-2020 /*
WITH YearlyData AS (
SELECT
YEAR,
SUM(Total_Shipped) AS Total_Shipped_All_Publishers
FROM
`game_sales_data`
WHERE
YEAR BETWEEN 1977 AND 2020
GROUP BY
YEAR
)
SELECT
YEAR,
Total_Shipped_All_Publishers,
COALESCE(
Total_Shipped_All_Publishers - LAG(Total_Shipped_All_Publishers) OVER (ORDER BY YEAR),
0
) AS Yearly_Industry_Growth
FROM
YearlyData
-----------------------------------------------------------------------------------------------------------------------------------
/* Top average critic scores from 1977-2020/*
SELECT
gs.Year,
AVG(gs.Critic_Score) AS Avg_Critic_Score,
ts.Total_Shipped
FROM
`game_sales_data` gs
LEFT JOIN (
SELECT
Year,
SUM(Total_Shipped) AS Total_Shipped
FROM
`game_sales_data`
WHERE
Year BETWEEN 1977 AND 2020
GROUP BY
Year
) ts ON gs.Year = ts.Year
WHERE
gs.Critic_Score <> 0
GROUP BY
gs.Year, ts.Total_Shipped
ORDER BY
Avg_Critic_Score DESC
LIMIT 10;
----------------------------------------------------------------------------------------------------------------------------------------
/* Top average user scores from 1977-2020/*
SELECT
gs.Year,
AVG(gs.User_Score) AS Avg_User_Score,
ts.Total_Shipped
FROM
`game_sales_data` gs
LEFT JOIN (
SELECT
Year,
SUM(Total_Shipped) AS Total_Shipped
FROM
`game_sales_data`
WHERE
Year BETWEEN 1977 AND 2020
GROUP BY
Year
) ts ON gs.Year = ts.Year
WHERE
gs.User_Score <> 0
GROUP BY
gs.Year, ts.Total_Shipped
ORDER BY
Avg_User_Score DESC
LIMIT 10;
-----------------------------------------------------------------------------------------------------------------------------------------------
/* Average critic score by platform /*
SELECT
Platform,
AVG(Critic_Score) AS Avg_Critic_Score
FROM
`game_sales_data`
WHERE
Critic_Score <> 0
GROUP BY
Platform
ORDER BY
Avg_Critic_Score DESC;
--------------------------------------------------------------------------------------------------------------------------
/* Average user score by platform /*
SELECT
Platform,
AVG(User_Score) AS Avg_User_Score
FROM
`game_sales_data`
WHERE
User_Score <> 0
GROUP BY
Platform
ORDER BY
Avg_User_Score DESC;
-------------------------------------------------------------------