Taste Breakdown: Leveraging K-Means Algorithm for Customer Segmentation

About this project


The goal is to develop a data-driven strategy that identifies the target audience, optimal product offerings, and pricing for a group of investors launching their first coffee shop. The main business questions are:

  • Target audience: What type of customer should we target, and what are their preferences?
  • Product offering: What types of coffee beans and drinks should we offer?
  • Pricing strategy: How can we align prices with customer value perception?

About The Dataset

The data contains survey responses from ~4,000 Americans after a blind coffee taste test conducted by YouTube coffee expert James Hoffmann and Cometeer. This first-of-its-kind experiment was designed to provide a largely identical tasting experience for people across the country. After the tasting, and once the surveys were submitted, details about each of the 4 coffees they tasted were revealed.

Data Preparation

  • Removed 31 respondents with unknown age.
  • Excluded variables with over 70% blank values from the analysis.
  • Created a TRUE/FALSE column titled “People Who Typically Drink Coffee Outside Home or Office”, where TRUE marks respondents who typically drink coffee “on the go” or “at a cafe”. This flag is useful for anyone planning to invest in a coffee shop.
  • Merged certain categories in variables such as “Gender”, “What is your age?”, “Education level”, “Ethnicity/race”, “Employment Status”, “Number of Children”, “How strong do you like your coffee?”, “What roast level of coffee do you prefer?”, “What is the most you'd ever be willing to pay for a cup of coffee?”, “What is the most you've ever paid for a cup of coffee?”, “How many cups of coffee do you typically drink per day?”, and “What is your favorite coffee drink?” to make the visuals easier to read.
  • Grouped the answers to “What is your favorite coffee drink?” into categories based on AI suggestions: Espresso-based drinks (Americano, Cortado, Espresso), Milk-based drinks (Cappuccino, Latte), Cold coffee drinks (Cold brew, Iced coffee), Flavored drinks (Mocha, Blended drink), Pourover/Regular drip coffee, and Other. A tooltip provides more detail on each of these categories.
  • For multiple-choice questions, each measure was computed as the number of respondents who selected a given category divided by the total number of respondents.
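
The preparation steps above were carried out in Power BI. As a rough illustration only, the same logic could be sketched in plain Python. The column names (`age`, `where_drink`, `notes`), the sample rows, and the order of the steps are assumptions made for this example and are not taken from the actual dataset.

```python
# Hypothetical survey rows; column names and values are illustrative only.
rows = [
    {"age": "25-34", "where_drink": "At home, At a cafe", "notes": ""},
    {"age": "", "where_drink": "At the office", "notes": ""},
    {"age": "35-44", "where_drink": "On the go", "notes": ""},
    {"age": "18-24", "where_drink": "At home", "notes": "prefers decaf"},
    {"age": "45-54", "where_drink": "At home, At the office", "notes": ""},
    {"age": "25-34", "where_drink": "At a cafe, On the go", "notes": ""},
]

# Answers that count as drinking coffee outside home or office.
OUTSIDE = {"On the go", "At a cafe"}


def split_choices(value):
    """Split a comma-separated multiple-choice answer into a set of choices."""
    return {choice.strip() for choice in value.split(",") if choice.strip()}


def drop_unknown_age(rows):
    """Remove respondents whose age is blank."""
    return [r for r in rows if r["age"].strip()]


def drop_sparse_columns(rows, threshold=0.70):
    """Drop columns whose share of blank values exceeds the threshold."""
    keep = [
        col for col in rows[0]
        if sum(1 for r in rows if not r[col].strip()) / len(rows) <= threshold
    ]
    return [{col: r[col] for col in keep} for r in rows]


def add_outside_flag(rows):
    """Add a TRUE/FALSE flag for respondents who drink coffee outside."""
    return [
        {**r, "drinks_outside": bool(split_choices(r["where_drink"]) & OUTSIDE)}
        for r in rows
    ]


def choice_share(rows, column, category):
    """Respondents who selected `category` divided by all respondents."""
    selected = sum(1 for r in rows if category in split_choices(r[column]))
    return selected / len(rows)


clean = add_outside_flag(drop_sparse_columns(drop_unknown_age(rows)))
at_home_share = choice_share(clean, "where_drink", "At home")
```

Because a respondent can select several categories, the shares for one multiple-choice question can add up to more than 100%.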

Customer Segmentation

To identify distinct segments, I ran the K-Means clustering algorithm on respondents' acidity, bitterness, and personal preference ratings for coffees A, B, C, and D. Power BI initially generated 2 clusters automatically, but over 80% of the observations fell into a single cluster, so I manually set the number of clusters to 5. With 5 clusters, the largest one contained around 30% of the observations.
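
The clustering itself was done with Power BI's built-in feature, whose internals are not described here. The sketch below is a stand-in: a basic K-Means (Lloyd's algorithm) in plain Python, run on randomly generated ratings, to show how the share of observations in the largest cluster can be compared across values of k. The data, the 1 to 5 rating scale, and the function names are all assumptions for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, seed=0, max_iter=100):
    """Basic K-Means: returns (cluster index per point, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    assignments = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignments, centroids


def cluster_shares(assignments, k):
    """Fraction of observations that falls into each cluster."""
    n = len(assignments)
    return [assignments.count(c) / n for c in range(k)]


# Synthetic (acidity, bitterness, preference) ratings, not real survey data.
data_rng = random.Random(42)
points = [
    (data_rng.randint(1, 5), data_rng.randint(1, 5), data_rng.randint(1, 5))
    for _ in range(200)
]

for k in (2, 5):
    assignments, _ = k_means(points, k)
    largest = max(cluster_shares(assignments, k))
    print(f"k={k}: largest cluster holds {largest:.0%} of observations")
```

On real data, a very dominant cluster at a low k, as happened here with 2 clusters, is one practical signal that a larger k may separate the respondents more usefully.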

I then labeled the clusters as follows, using AI suggestions based on the resulting dataset, called “Segments”:

  • Segment 1: Standard taste
  • Segment 2: Low acidity, high bitterness
  • Segment 3: High acidity, medium bitterness
  • Segment 4: No preference
  • Segment 5: Balanced taste
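
The labels above came from AI suggestions. One way to sanity-check labels like these, shown here only as a hypothetical sketch, is to compare the average acidity and bitterness ratings per segment. The rows below are made-up values, not figures from the “Segments” dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (segment, acidity, bitterness) rows; values are invented.
rated = [
    (1, 3, 3), (1, 3, 2),
    (2, 1, 5), (2, 2, 4),
    (3, 5, 3), (3, 4, 3),
]


def segment_profiles(rows):
    """Average acidity and bitterness rating for each segment."""
    grouped = defaultdict(list)
    for segment, acidity, bitterness in rows:
        grouped[segment].append((acidity, bitterness))
    return {
        segment: {
            "acidity": mean(a for a, _ in values),
            "bitterness": mean(b for _, b in values),
        }
        for segment, values in sorted(grouped.items())
    }


profiles = segment_profiles(rated)
for segment, profile in profiles.items():
    print(segment, profile)
```

In this toy data, segment 2 has the lowest average acidity and the highest average bitterness, which is the kind of pattern a label such as “Low acidity, high bitterness” would describe.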

Dashboard Overview

I've created a concise single-page Power BI Dashboard to address the main questions from the Maven Coffee Challenge. The dashboard features a static section at the top and three sections below it, accessible via a left-side menu.

In the static section, the customer segments generated by K-Means clustering appear on the top left, while total respondents, gender distribution, and average coffee expertise appear on the top right. Users can also filter for respondents who typically drink coffee outside the home or office.

The menu sections include:

  1. General Profile: Provides insights into the selected target audience's demographics, including number of children, political preference, age, education level, employment status, and ethnicity/race.

  2. Taste & Preferences: Displays preferences of the selected target audience, such as favorite coffee among A, B, C, and D, reasons for drinking coffee, preferred strength and caffeine content, roast level, favorite coffee drink, and coffee additions.

  3. Budget: Allows users to visualize the budget of the selected target audience and adjust coffee drink or equipment prices based on responses. This section is divided into two parts:

     - Coffee Equipment: Provides insights into respondents' 5-year budget and home brewing methods. **Users can also filter for respondents who feel they get good value for money from coffee equipment**.
     - Coffee Drink: Displays information on respondents' monthly budget, daily consumption, and a comparison between maximum willingness to pay and maximum price ever paid for a cup of coffee. **Users can filter for respondents who feel they get good value for money from coffee drinks and/or know the source of their coffee**.

Most charts are also interactive, so users can select elements in them to explore the data and uncover further insights.


In conclusion, this dashboard simplifies the process of understanding customer segments, product offerings, and pricing strategies for launching a coffee shop. Its interface is designed so that users can explore key insights and make informed decisions.

Using the K-Means algorithm for customer segmentation gives the dashboard clear, actionable segments based on acidity, bitterness, and personal preference. Users can then tailor their strategies and offerings to the needs and preferences of each segment.

With this dashboard, users can explore their target audience's demographics, taste preferences, and budget considerations, and use them to answer the main business questions: identifying the ideal customer profile, refining product offerings, and setting prices.
