Clustering MLB Pitcher Types

About this project

This was my undergraduate mathematics senior project. The goal was to pull baseball pitch data from public sources into RStudio, use R to tidy and analyze the data, and plot the results. The process was:

  1. Download data on pitches faced by 30 MLB hitters from baseballsavant.mlb.com into RStudio.
  2. Clean the data into a more workable form using the Tidyverse packages.
  3. Score the outcome of each pitch by creating a calculated field.
  4. Group the hitters based on the pitches they were most successful against.
  5. Group the pitchers based on the pitches they threw the most.
  6. Perform a multidimensional linear regression on each of the pitcher types against the hitter types.
  7. Re-group the pitchers using k-means clustering on their success against each of the hitter types.
  8. Plot the results.
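
The re-grouping in step 7 can be illustrated with a minimal k-means sketch. It is written in Python using only the standard library, not the project's R code (in R, the built-in `kmeans()` function would be the usual choice). The pitcher names, the success scores, the two hitter types, and the `kmeans` helper with its first-k-points initialization are all assumptions made for illustration; none of them come from the project's data.

```python
import math


def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    # Assumption: start from the first k points so the result is deterministic.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical data: each pitcher's success score against two hitter types.
pitchers = {
    "Pitcher A": (0.9, 0.2),
    "Pitcher B": (0.1, 0.8),
    "Pitcher C": (0.8, 0.3),
    "Pitcher D": (0.2, 0.9),
}

names = list(pitchers)
points = [pitchers[name] for name in names]
labels, centroids = kmeans(points, k=2)

for name, label in zip(names, labels):
    print(f"{name}: cluster {label}")
```

With these made-up scores, Pitchers A and C end up in one cluster and Pitchers B and D in the other, since each pair is successful against the same hitter type. Real use would need one score per hitter type for every pitcher, and a more careful choice of `k` and of the starting centroids.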
