__STYLES__

Harry Potter and the Maven Magic Challenge

Tools used in this project
Harry Potter and the Maven Magic Challenge

Tableau Dashboard

About this project

Getting Started

Introduction

As a kid, I dragged my mom to every single Barnes & Noble midnight release of the Harry Potter books. I was an avid fan. I distinctly remember spending an entire vacation in Cape Cod reading the most recent release. When Maven Analytics released their Maven Magic Challenge I knew I had to participate.

Exploration

The data

This dataset consisted of 6 tables in CSV format:

  • The Dialogue table contains all of the dialogue from the 8 Harry Potter movie scripts and includes foreign keys for character, chapter, and place.
  • The Chapters table contains the 234 chapters from the movie scripts with chapter name and number, as well as the related movie.
  • The Movies table contains the 8 movies with fields for movie title, release year, runtime, box office revenue, and budget.
  • The Characters table contains the 166 characters that had dialogue in the movie with information about the character's name, gender, species, patronus, wand core and wand wood, and their Hogwarts House.
  • The Places table contains the locations in the movies, both generally (category - Hogwarts, Diagon Alley, etc.) and specific (name - Great Hall, Hagrid's Hut, 4 Privet Drive, etc.)
  • The Spells table contains the spells cast in the movies, with fields containing the incantation, spell name & effect, and the color of the light cast

Thought process

As a self-proclaimed Potterhead, I wanted to better understand the magic of the Harry Potter movies. To do that, I looked at the overarching themes of the movies. I considered where the scenes took place, who had the most dialogue, what spells, charms, and curses Harry used to save the Wizarding World from Voldermort, and more. I wanted to practice not cluttering the space on my dashboard so that was a key focus during my analysis.

Analysis

Getting to know the data

I wanted to gain a better understanding of the data and how each table interacts with the others. To do so, I used some simple queries with joins. I also looked at basic figures, such as total box office revenue and budget, the number of chapters per movie, the gender breakdown of characters, and more. My exploration of the data was to simply grasp what the tables were conveying and how that information might be important.

##Sample Queries

Finding important financial figures.

SELECT
  SUM(Box_Office)/1000000000 ||'B'
  , SUM(ROUND(Box_Office / Runtime,2)) AS rev_per_minute
  , SUM(budget)
  , SUM(Box_Office) - SUM(Budget)
FROM `bright-zodiac-346921.harry_potter.movies`

Finding the number of scenes per specific setting.

SELECT 
  place_name
  , movie_title
  , COUNT(1) AS number_of_scenes
FROM `bright-zodiac-346921.harry_potter.places` AS places
INNER JOIN `bright-zodiac-346921.harry_potter.dialogue` AS dialogue
  ON dialogue.place_id = places.place_id
INNER JOIN `bright-zodiac-346921.harry_potter.chapters` AS chapters
  ON chapters.chapter_id = dialogue.chapter_id
INNER JOIN `bright-zodiac-346921.harry_potter.movies` AS movies
  ON movies.movie_id = chapters.movie_id
GROUP BY place_name, movie_title
ORDER BY 3 DESC

Finding runtime vs. lines of dialogue

SELECT
  movie_title
  , runtime
  , COUNT(dialogue_id) AS number_of_lines
FROM `bright-zodiac-346921.harry_potter.dialogue` AS dialogue
LEFT JOIN `bright-zodiac-346921.harry_potter.chapters` AS chapters 
  ON chapters.chapter_id = dialogue.chapter_id 
LEFT JOIN `bright-zodiac-346921.harry_potter.movies` AS movies
  ON movies.movie_id = chapters.movie_id 
GROUP BY 1,2
ORDER BY 3 DESC

Visuals

As mentioned, I wanted to focus on not cluttering the dashboard with unnecessary charts. I focused on a few different areas including the breakdown of characters per Hogwarts house, the number of lines of dialogue per character, the types of spells Harry uses most often and where, as well as the number of scenes in particular places.

I was particularly interested to see if these things differed per movie. The general consensus is that the Harry Potter books and movies grow progressively darker. I was curious if that would be reflected in the data as well. While not particularly obvious, the number of scenes in place like Hogwarts decreased as the movies progressed.

Takeaway

Based on my analysis and unsurprisingly, Harry Potter was the star of the movie. He had the most lines of dialogue by far (almost double that of any of the other characters), his Hogwarts house is most heavily represented, and he uses the most spells.

I tested my hypothesis that, as the movies progressed, we would see the increasing darkness represented in the data; however, this did not appear to be reflected in the dataset. This could be due to vague categories (i.e. the number of scenes at Hogwarts decreased over the movies but the other settings were too vague to judge if they, in fact, got more dangerous).

Discussion and feedback(0 comments)
2000 characters remaining
Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.