Using online votes ranking 85 types of candy, your task is to find the 3 treats you'll give out on Halloween to guarantee that trick-or-treaters of all tastes find something they'll love.
Remark: while looking up product images, I was not able to find a Payday variant without caramel, although according to the data it was supposed to have none. I therefore set "caramel" for Payday in the .csv. It made me wonder about the data quality in general. All results mentioned below are based on Payday with caramel and without chocolate (the special version).
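The same correction could also be made in memory instead of in the .csv, which leaves the downloaded file untouched and keeps the change visible in the script. This is only a minimal sketch: it assumes the column names competitorname and caramel from candy-data.csv, and the two-row frame is made-up stand-in data, not the real file.

```python
import pandas as pd

# Stand-in for pd.read_csv("candy-data.csv"); the values are illustrative only.
candy_in = pd.DataFrame({
    "competitorname": ["Payday", "Starburst"],
    "chocolate": [0, 0],
    "caramel": [0, 0],
})

# Flag caramel for Payday only, leaving every other row as it was.
is_payday = candy_in["competitorname"] == "Payday"
candy_in.loc[is_payday, "caramel"] = 1

print(candy_in)
```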
After giving ChatGPT the column and datatype information of the dataframe (df.info()) and explaining the way to use it, I prompted for the scripts.
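One way to get the df.info() summary as text that can be pasted into a prompt is to write it to a string buffer. This is a sketch rather than what I actually did; the small frame is a hypothetical stand-in for the candy data.

```python
import io

import pandas as pd

# Stand-in for pd.read_csv("candy-data.csv"); the values are illustrative only.
df = pd.DataFrame({
    "competitorname": ["Candy A", "Candy B"],
    "chocolate": [1, 0],
    "fruity": [0, 1],
    "winpercent": [60.0, 40.0],
})

# df.info() prints to stdout by default; passing buf captures the text instead.
buffer = io.StringIO()
df.info(buf=buffer)
info_text = buffer.getvalue()

print(info_text)
```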
Used prompts
I got the following code (#below ChatGPT marks the start of the AI code):
import pandas as pd
from itertools import combinations
candy_in = pd.read_csv("candy-data.csv")
# Remove the items with no 1 in any of the taste or structure columns:
# two are money, three are candies scoring below the .50 percentile in wins
candy_combinations_start = candy_in.copy().drop([2,3,8,48,19],axis=0)
df = candy_combinations_start.drop([ 'sugarpercent','pricepercent','pluribus'], axis = 1)
df.sort_values(by='winpercent', ascending=False,inplace=True)
df.reset_index(drop=True,inplace=True)
#below ChatGPT
# Select the binary columns to be checked (exclude 'competitorname' and 'winpercent'); 'hard' and 'bar' removed
binary_columns = ['fruity', 'chocolate', 'caramel', 'peanutyalmondy', 'nougat', 'crispedricewafer']
# Function to check if a combination of products ticks all boxes exactly once
def ticks_each_box_once(df_subset):
    # Sum the values in each binary column for this combination
    column_sums = df_subset[binary_columns].sum()
    # Check if each binary column is ticked exactly once (sum should be 1 for each column)
    return (column_sums == 1).all()
# Variable to store the best combination and its total winpercent
best_combination = None
highest_winpercent = 0
# Iterate over all combinations of 3 products
for combo in combinations(df.index, 3):
    # Get the subset of the DataFrame for this combination
    df_subset = df.loc[list(combo)]
    # Check if this combination ticks each box exactly once
    if ticks_each_box_once(df_subset):
        # Calculate the total winpercent
        total_winpercent = df_subset['winpercent'].sum()
        # Update if this combination has a higher winpercent
        if total_winpercent > highest_winpercent:
            best_combination = df_subset
            highest_winpercent = total_winpercent
# Output the result
if best_combination is not None:
    print("Best combination of 3 products:")
    print(best_combination[['competitorname', 'winpercent']])
    print(f"Total winpercent: {highest_winpercent}")
else:
    print("No combination of 3 products ticks all the boxes exactly once.")
I roughly checked the code and the results. Do I trust ChatGPT blindly? In this case, almost.
The lists of 3 products in the results are ordered by winpercent.
#Test all flavours/ingredients with "candybar" and "hard candy" required
Total winpercent: 151.174756
#Test all flavours/ingredients with "candybar" required
Total winpercent: 162.858338
#Test all flavours/ingredients with "hard candy" required
Total winpercent: 178.419243
#Test all flavours/ingredients, no "hard candy" or "candybar" required
Total winpercent: 190.102825
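The four tests above differ only in which columns must be ticked exactly once. A sketch of how they could be run from one function is shown here. It assumes that "required" means adding the hard and/or bar column to the list of checked columns; best_trio and the toy frame are hypothetical names and made-up data, so the totals it prints are not the results listed above.

```python
from itertools import combinations

import pandas as pd

FLAVOURS = ["fruity", "chocolate", "caramel", "peanutyalmondy", "nougat", "crispedricewafer"]


def best_trio(df, required_columns):
    """Return (rows, total winpercent) of the best 3-product combination in
    which every required column sums to exactly 1, or (None, 0.0) if none fits."""
    best, best_total = None, 0.0
    for combo in combinations(df.index, 3):
        subset = df.loc[list(combo)]
        if (subset[required_columns].sum() == 1).all():
            total = subset["winpercent"].sum()
            if total > best_total:
                best, best_total = subset, total
    return best, best_total


# Made-up toy data with the same column names as candy-data.csv.
toy = pd.DataFrame({
    "competitorname":   ["A", "B", "C", "D"],
    "fruity":           [0, 1, 0, 1],
    "chocolate":        [1, 0, 0, 0],
    "caramel":          [0, 0, 1, 0],
    "peanutyalmondy":   [0, 0, 1, 0],
    "nougat":           [0, 0, 1, 0],
    "crispedricewafer": [1, 0, 0, 0],
    "hard":             [0, 0, 0, 1],
    "bar":              [1, 0, 0, 0],
    "winpercent":       [70.0, 60.0, 40.0, 30.0],
})

variants = {
    "bar and hard required": FLAVOURS + ["bar", "hard"],
    "bar required": FLAVOURS + ["bar"],
    "hard required": FLAVOURS + ["hard"],
    "flavours only": FLAVOURS,
}

results = {}
for label, columns in variants.items():
    trio, total = best_trio(toy, columns)
    names = None if trio is None else list(trio["competitorname"])
    results[label] = (names, total)
    print(label, names, total)
```

Keeping the column list as a parameter means each test is one line to add, instead of a separate edited copy of the script.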
Based on the test results, and leaving out the optional "hard candy" and "candy bar" requirements, my suggestion for the three products to buy, so that trick-or-treaters of all tastes find something they'll love, is KitKat, Starburst & Payday (without the chocolate!).
Hours used: approx. 6. It would have been 5 if I had not hit the delete button instead of publish.