As a self-proclaimed analyst and Excel jockey, machine learning always seemed like a scary, distant concept to me - something best left for "real data scientists". Unsupervised learning? Forget about it, that sounded even more complicated.
Then I had the good fortune of working alongside Chris Bruehl and Alice Zhao as they created Maven's Python for Data Science path, and I learned two things:
I really think it's a branding issue, because unsupervised learning has the same exact purpose as data analytics: finding insights & patterns. The only difference is that you use machine learning algorithms (like clustering) to get there.
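For anyone curious what "clustering" actually does under the hood, here is a minimal sketch of k-means in plain Python. It is not how I ran my analysis (that was all Excel), and the six data points are made-up numbers for illustration, not real film stats. The function name `kmeans` and its parameters are my own choices. The idea is simple: pick k starting centers, assign every point to its nearest center, move each center to the average of its points, and repeat until nothing changes.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Group points (tuples of numbers) into k clusters.

    Returns (centroids, labels), where labels[i] is the cluster index of points[i].
    """
    rng = random.Random(seed)
    # Start from k distinct points chosen at random as the initial centers.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center
        # (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we are done
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Hypothetical data: two made-up scores per item, both on a 0-100 scale.
points = [(90, 85), (92, 88), (88, 90), (60, 40), (62, 45), (58, 42)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

The first three points should land in one cluster and the last three in the other. With real data you would usually put the columns on a comparable scale first, otherwise the column with the biggest numbers dominates the distance calculation.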
Thus, I present the results of my first-ever data science project, all done from the comfort of my own home (Microsoft Excel).
A k-means clustering analysis revealed that, out of the 30 MCU films released between 2008 and 2022, these 6 were the best performing:
The Avengers and Spider-Man movies are clearly among Marvel's best releases, with only one film from each franchise missing the cut (Avengers: Age of Ultron and Spider-Man: Homecoming, respectively). Black Panther joins the list, boasting the highest critics score of all 30 films (96%).
You can see them compared with the rest here, visualized on two axes using the clusters from the analysis:
The most interesting piece, however, was visualizing these clusters by release year and noticing the stages of Marvel's build-up towards a clear climax, and the fall that followed:
To quote a DC movie that would have undoubtedly made the "best" cluster here:
"You either die a hero or you live long enough to see yourself become the villain."