France's Fight for the World Cup

About this project

Getting Started

Introduction

In this project, I wanted to tell the story of France's history with the World Cup, along with expectations for the team in the 2022 tournament.


Exploration

The Data

The dataset contains six tables in CSV format:

  • The World Cups table contains information on historical World Cups including the year, host country, winner, 2nd, 3rd, and 4th place finishers, the number of goals scored, the number of qualified teams, and the number of matches played.
  • The 2022 World Cup Groups table contains information on this year's World Cup including the group, team, and FIFA ranking of each country.
  • The 2022 World Cup Squads table contains detailed information about each player including the team country, player's name, position, age, league, club, and statistics including caps, goals, and World Cup goals.
  • The 2022 World Cup Matches table contains information about this year's matches including the date each match was or will be played, the stage of the tournament, whether the team is home or away, and a Boolean field indicating whether one of the teams playing is the host team.
  • The World Cup Matches table contains historical information about past World Cup matches including the date each match took place, the stage of the tournament, the home team and its number of goals, the away team and its number of goals, any win conditions, and a Boolean field indicating whether one of the teams playing is the host team.
  • The International Matches table contains information about past international matches including the tournament, the date each match took place, the home team and its number of goals, the away team and its number of goals, any win conditions, and a Boolean field indicating whether the game took place in the home team's country.

Source: Maven World Cup Challenge

Thought Process

I considered which factors might matter to a team's success in the World Cup, including:

  • Past World Cup wins
  • Current FIFA ranking
  • Past international match outcomes
  • Available players
  • Win status for home and away games

From there, I used BigQuery to analyze the data. Below are a few sample queries.


Analysis

Sample Queries

Finding France's international game stats

-- Aggregate France's home and away results across all international matches.
WITH games AS (
  SELECT
    COUNT(DISTINCT CASE WHEN home_team LIKE 'France%' AND home_goals > away_goals THEN ID ELSE NULL END) AS home_wins
    , COUNT(DISTINCT CASE WHEN home_team LIKE 'France%' THEN ID ELSE NULL END) AS home_games
    -- Goals France scored in home wins only (draws and losses are excluded)
    , SUM(CASE WHEN home_team LIKE 'France%' AND home_goals > away_goals THEN home_goals ELSE NULL END) AS home_goals
    , COUNT(DISTINCT CASE WHEN away_team LIKE 'France%' AND away_goals > home_goals THEN ID ELSE NULL END) AS away_wins
    , COUNT(DISTINCT CASE WHEN away_team LIKE 'France%' THEN ID ELSE NULL END) AS away_games
    -- Goals France scored in away wins only (draws and losses are excluded)
    , SUM(CASE WHEN away_team LIKE 'France%' AND away_goals > home_goals THEN away_goals ELSE NULL END) AS away_goals
    , COUNT(DISTINCT CASE WHEN home_team LIKE 'France%' OR away_team LIKE 'France%' THEN ID ELSE NULL END) AS total_games
  FROM `bright-zodiac-346921.world_cup.international_matches`
)

-- Win rates are fractions between 0 and 1.
-- SAFE_DIVIDE returns NULL instead of raising an error if a denominator is 0.
SELECT
  *
  , SAFE_DIVIDE(home_wins, home_games) AS home_win_percentage
  , SAFE_DIVIDE(away_wins, away_games) AS away_win_percentage
FROM games
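
The conditional counting used in this query can be tried locally without a BigQuery project. The sketch below is only an illustration, not the query I ran: it uses Python's built-in sqlite3 module, a handful of made-up match rows, and a lowercase `id` column, all of which are assumptions. SQLite also divides integers as integers, so the sketch multiplies by 1.0 before dividing, which BigQuery does not require.

```python
import sqlite3

# Hypothetical toy rows: (id, home_team, away_team, home_goals, away_goals).
# These values are made up for illustration and are not from the real dataset.
ROWS = [
    (1, "France", "Brazil", 2, 1),   # France home win
    (2, "France", "Italy", 0, 0),    # France home draw
    (3, "Spain", "France", 1, 3),    # France away win
    (4, "Germany", "France", 2, 0),  # France away loss
    (5, "Spain", "Italy", 1, 1),     # France not involved
]

QUERY = """
WITH games AS (
  SELECT
    COUNT(DISTINCT CASE WHEN home_team LIKE 'France%' AND home_goals > away_goals THEN id END) AS home_wins
    , COUNT(DISTINCT CASE WHEN home_team LIKE 'France%' THEN id END) AS home_games
    , COUNT(DISTINCT CASE WHEN away_team LIKE 'France%' AND away_goals > home_goals THEN id END) AS away_wins
    , COUNT(DISTINCT CASE WHEN away_team LIKE 'France%' THEN id END) AS away_games
  FROM international_matches
)
SELECT
  home_wins, home_games, away_wins, away_games
  -- SQLite performs integer division on integers, so force a float first
  , 1.0 * home_wins / home_games AS home_win_percentage
  , 1.0 * away_wins / away_games AS away_win_percentage
FROM games
"""


def france_stats(rows):
    """Load the toy rows into an in-memory table and return the stats as a dict."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE international_matches ("
        "id INTEGER, home_team TEXT, away_team TEXT, "
        "home_goals INTEGER, away_goals INTEGER)"
    )
    conn.executemany("INSERT INTO international_matches VALUES (?, ?, ?, ?, ?)", rows)
    cur = conn.execute(QUERY)
    names = [col[0] for col in cur.description]
    result = dict(zip(names, cur.fetchone()))
    conn.close()
    return result


if __name__ == "__main__":
    print(france_stats(ROWS))
```

Each CASE expression returns the match id only when its condition holds, so COUNT(DISTINCT ...) skips the NULLs and counts just the matching games.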

Finding France's World Cup rankings

-- Count how many times France finished in each top-four position or hosted the tournament.
WITH ranking AS (
  SELECT
    COUNT(DISTINCT CASE WHEN winner LIKE 'France%' THEN year ELSE NULL END) AS winner
    , COUNT(DISTINCT CASE WHEN runners_up LIKE 'France%' THEN year ELSE NULL END) AS runner_up
    , COUNT(DISTINCT CASE WHEN third LIKE 'France%' THEN year ELSE NULL END) AS third_place
    , COUNT(DISTINCT CASE WHEN fourth LIKE 'France%' THEN year ELSE NULL END) AS fourth
    , COUNT(DISTINCT CASE WHEN host_country LIKE 'France%' THEN year ELSE NULL END) AS host_country
  FROM `bright-zodiac-346921.world_cup.world_cups`
)
SELECT
  *
FROM ranking


Takeaway

In this challenge, I wanted to represent France's past and future in the World Cup. Rather than reporting on KPIs and business metrics, I used information such as how many World Cups France has qualified for and how many games the team has played. This challenge was unique in that it encouraged bringing in data from outside the provided dataset.

I found that France has won the World Cup twice and has hosted it twice. The team is currently ranked fourth in the FIFA rankings and, having won both of its group-stage games so far, will be moving on in the tournament. I also researched players' statistics to better understand the likelihood of France advancing further.
