__STYLES__
Tools used in this project
Halloween Candy Rankings

Dashboard and Slot Machine Demo

About this project

Purpose

The purpose of the project is to find the 3 treats you'll give out on Halloween to guarantee that trick-or-treaters of all tastes find something they'll love.

Data Investigation

The dataset contained 85 types of candy that one might pass out to trick or treaters on Halloween. An online survey was conducted, randomly pitting two candies against each other and allowing the 8300 participants to judge their preferred treat. Each participant judged an average of 32 match-ups. While the methodology wasn’t designed to be scientifically rigorous, the dataset provides a distinctive view into personal candy preferences that can still offer valuable insights. The dataset also had various common characteristics to help group the candy, plus columns labelled as sugar percentile and price percentile.

I initiated the project by thoroughly exploring the dataset and validating its accuracy. During this process, I identified several candies with mislabeled attributes, for example Pixie Sticks were incorrectly classified as non-fruit flavored, and Tootsie Roll Snack Bars were mistakenly listed as "candy bars." I corrected these discrepancies manually.

I noticed a confusing measurement of sugar content based on percentiles, which didn’t reflect the actual amount of sugar in the candies. For instance, Lemonheads and Skittles have the same sugar content per serving, yet Skittles are ranked in the 94th percentile, while Lemonheads are in the 5th percentile. I thought this might be a subjective measure of 'sweetness,' but even identical products showed inconsistencies. Reese's Miniatures were placed in the 3rd percentile, while Reese's Peanut Butter Cups ranked 72nd, despite having the same ingredients. This made the sugar data point unreliable. I also decided to exclude sugar content as a candy characteristic. Candy must be sweet by definition.

Similarly, using percentiles to measure price proved to be flawed. In reality, candy prices vary depending on the region, store, and packaging size, both for individual items (e.g., Snickers minis vs. fun size) and for overall quantities (such as buying in bulk). Once prices were gathered, it seemed nonsensical to stack rank them to determine percentiles rather than just keeping the actual price data.

To personalize the dataset and correct these flaws, I used my local supermarket’s website to find actual pricing and sugar content. Since unit sizes can vary, I prioritized the versions of candy found in the Halloween section. I also avoided variety packs to keep the pricing data as consistent as possible. Lastly, I had to exclude certain candies based on availability (e.g., Boston Baked Beans are not available in my area) and packaging (e.g., candy corn was not offered in pre-packaged servings). This narrowed the dataset to 56 candy options. (The details are in the “Candy Data” worksheet of the attached Excel file.)

Methodology

Since the purpose was to find 3 candies that would guarantee that all trick or treaters would find something they loved, I knew I couldn’t simply select the three highest-rated candies. Instead, I needed to identify the combination that would contain at least one candy with each of the listed characteristics. If multiple combinations existed, the highest-rated candy combination would be preferred.

I decided to use Excel’s Solver add-in to optimize this selection process. Since it allows the creation of multiple constraints while maximizing a particular value, it was the perfect tool for the job. I also performed additional optimizations to find the combination with the worst combined score and the cheapest and most expensive average cost. (The Solver variables are visible in the “Solver Sheet” worksheet of the attached Excel file.)

I decided to look at all valid candy combinations, so I wrote a VBA script that would cycle through all possible combinations to find all that met the criteria. (VBA script is in the “Combo_Finder” module of the attached Excel file.) When designing one method of interactive data presentation, I realized I wanted all possible permutations, not just combinations. Rather than updating the VBA code, I opted for a manual approach to reorganize the combinations. (The output is in the “Valid Permutations” worksheet in the Excel file.)

In order to analyze which candy categories were most important, I created a box and whisker plot. While this does not directly address the project purpose, I thought it would be a good way to view the data. I also created scatterplots to help identify any correlation between the price or grams of sugar and the winningness score (a metric derived from the candy matchups that reflects overall preference).

Findings

Based on the analyses, the best choice is to hand out Gobstoppers, Snickers, and Twix. This combination ticks all the boxes regarding candy characteristics and has a combined winningness score of 205.1. This trio strikes a balance across taste categories, ensuring appeal to a broad spectrum of trick-or-treaters. Other combinations with the minimum and maximum score, sugar, and price are shown in an attached table.

Upon reviewing the scatterplots for sugar vs score and price vs score, we see there isn’t a strong correlation present. While weak mathematical correlations exist between certain factors, these R values were not strong enough to significantly influence the candy selection process. Visually, you can see that there is too much noise and variability to be able to identify clear patterns.

Data Presentation

I created a dashboard that will highlight data on the graphs. Since Excel didn’t really offer an option to have common filters and slicers like PowerBI or Tableau, I had to recreate the functionality with VBA. When you click on one of the candy bars labelled with a candy characteristic, the central table updates to show you to highest scoring candies in that category. In addition, the category is highlighted in the box plot and scatter plots. It is especially valuable with the scatterplots since datapoints may be superimposed on each other for all the candies that have more than one characteristic. By highlighting these datapoints, you are able to compare how different groupings might compare within the chart.

To add an engaging and festive touch, I created a slot machine feature with VBA that selects one of the 396 valid candy combinations at random. (This is why I had to stick with Excel instead of completing this challenge in PowerBI). Each time you click the handle, one of the 396 valid permutations is selected and displayed. If you get a combination that has the highest score or the lowest price, lights flash and you are rewarded with chocolate coins.

Throughout the dashboard are fun images of kids in costume going trick or treating. The slot machine is decorated in fun jack o lanterns to help maintain the Halloween theme.

The full file is available to download here: https://tinyurl.com/4dznc9c8

Additional project images

Discussion and feedback(12 comments)
comment-1947-avatar
Jake Smith
4 days ago
Looks good!

comment-1948-avatar
Dorothy Richey
4 days ago
Interesting analysis

comment-1949-avatar
Louis Michaud
Louis Michaud
4 days ago
I had a blast with the slot machine! Took me a few minutes, but I finally hit the jackpot.

comment-1950-avatar
Valentina  Thompson
Valentina Thompson
4 days ago
Glad to vote for you. See you in class!

comment-1951-avatar
William  Zapata
William Zapata
4 days ago
Looks good, Kevin. See you on Monday

comment-1978-avatar
Marveen Renard Reyes
Marveen Renard Reyes
1 day ago
i love the design!
2000 characters remaining
Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.