The Task...
is to share an explanatory report that provides a data-driven strategy for opening their first coffee shop and addresses the target audience, product offering, and pricing strategy.
The Data...
I cleaned, transformed, and analyzed the data in Excel.
To gain insight into the potential customers' coffee drinking and spending habits, I split the observations into 5 groups based on (1) the price they are willing to pay for a cup of coffee compared with the average and (2) the number of cups of coffee they usually drink per day compared with the average. The results revealed that GR0 (people who did not provide information on one or both of the segmentation criteria) also had most or all answers missing on the rest of the questions, which is why I did not include them in the analysis.
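The segmentation above was done in Excel; the following is only an illustrative Python sketch of the same idea. The field names `max_price` and `cups_per_day`, the labels GR1 to GR4, the strict "greater than the average" rule, and computing each average over all non-missing answers are assumptions made for the example. Only GR0 is named in the report.

```python
from statistics import mean

# Hypothetical labels: the report only names GR0 (missing answers).
SEGMENT_NAMES = {
    (False, False): "GR1",  # pays at or below average, drinks at or below average
    (False, True): "GR2",   # pays at or below average, drinks above average
    (True, False): "GR3",   # pays above average, drinks at or below average
    (True, True): "GR4",    # pays above average, drinks above average
}

def assign_segments(rows):
    """Label each row; rows are dicts with 'max_price' and 'cups_per_day' (None if missing)."""
    # Assumption: each average uses every non-missing answer for that question.
    avg_price = mean(r["max_price"] for r in rows if r["max_price"] is not None)
    avg_cups = mean(r["cups_per_day"] for r in rows if r["cups_per_day"] is not None)
    labels = []
    for r in rows:
        price, cups = r["max_price"], r["cups_per_day"]
        if price is None or cups is None:
            labels.append("GR0")  # missing one or both segmentation criteria
        else:
            labels.append(SEGMENT_NAMES[(price > avg_price, cups > avg_cups)])
    return labels

# Made-up rows, for illustration only.
example_rows = [
    {"max_price": 3, "cups_per_day": 1},
    {"max_price": 9, "cups_per_day": 3},
    {"max_price": None, "cups_per_day": 2},
    {"max_price": 3, "cups_per_day": 4},
]
example_labels = assign_segments(example_rows)
```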
II. The Report - I divided the report into 5 segments, each addressing a different aspect of the task. In the header of each segment I provided some interesting facts about the participants in general that relate to that segment, for example, "64.97% | White/Caucasians".
- "What type?" - describing the 4 different customer segments and comparing and contrasting them on several items to point out their potential.
1.2. Data transformation - turned the following categories into numbers by taking the midpoint of the range, for example, "$2-$4" became $3. Used the F-test and the t-test (assuming equal or unequal variances) with a Bonferroni adjustment for multiple comparisons.
- "What is the most you've ever paid for a cup of coffee?"
- "What is the most you'd ever be willing to pay for a cup of coffee?"
- "Approximately how much have you spent on coffee equipment in the past 5 years?"
- "How many cups of coffee do you typically drink per day?"
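The midpoint conversion described above can be sketched in Python as follows. This is only an illustration: the function name `range_midpoint` is hypothetical, and returning `None` for open-ended labels (anything that does not contain exactly two numbers) is an assumption, since the report does not say how such answers were handled.

```python
import re

def range_midpoint(label):
    """Return the midpoint of a closed range label such as "$2-$4", else None."""
    # Pull out every number, allowing thousands separators and decimals.
    raw_numbers = re.findall(r"\d[\d,]*(?:\.\d+)?", label)
    numbers = [float(n.replace(",", "")) for n in raw_numbers]
    if len(numbers) == 2:
        return (numbers[0] + numbers[1]) / 2
    # Assumption: open-ended or unparseable labels are left for manual handling.
    return None
```

For the example given in the text, `range_midpoint("$2-$4")` returns `3.0`.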
- "Who?" - presented the findings on the demographic characteristics of the different groups and tested for differences across Race, Age, and Gender. Here I also tested the statistical significance of the differences using the t-test (with the F-test and Bonferroni adjustment).
2.1. Data Transformation - because the data is skewed and some of the categories have very few observations, I recoded the categories to merge the underrepresented ones into the other groups.
- Age - "<24 years old", "25-34 years old", "35-54 years old", and ">55 years old"
- Race - "White/Caucasian", "Asian/Pacific Islander", "Hispanic/Latino", and "Other"
- Gender - "Male", "Female", and "Other"
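Two of the steps in this segment lend themselves to a short Python illustration: folding underrepresented categories into "Other" and adjusting p-values for multiple comparisons. This is a sketch under stated assumptions, not the Excel procedure itself. The `min_count` threshold is hypothetical (the report does not state a cut-off), and folding into "Other" matches the Race and Gender recoding only; merging adjacent age brackets would need an explicit mapping instead.

```python
from collections import Counter

def collapse_rare(values, min_count, other_label="Other"):
    """Replace categories seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]

def bonferroni_adjust(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

An adjusted p-value is then compared with the usual significance level, which is equivalent to comparing the raw p-value with the level divided by the number of comparisons.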
- "Why?" - presented the findings about the participants' motivation for drinking coffee and how it impacts their drinking and spending habits. To do that I made the following transformations:
- Recoded "Why do you drink coffee?" so that each combination of the given options appears only once, regardless of the order in which the options are listed; for example, "It tastes good, I need the caffeine" was recoded as "Taste | Caffeine". I used the combinations of motives instead of the single motives because I wanted to compare the differing motivations: a participant who selects only "It tastes good" is different from one who selects "It tastes good, I need the caffeine".
- Coded the open-ended question "Other reason for drinking coffee", where the participants could provide their own motivation for drinking coffee if it differed from the provided ones. I read and reread the answers and 3 themes began to emerge: (1) "bonding/socializing" (connect, social, friends, family, etc.), (2) "hobby" (hobby, explore, learning, etc.), and (3) "sensations" (smell, comfort, etc.); the remaining answers were coded as "other". Then I coded the answers to be able to compare across the groups and used the chi-square test to test for association.
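The two recoding steps above can be approximated in Python as shown below. This is a rough sketch with several assumptions: answers are comma-separated, no option contains a comma itself, and only the two options quoted in the text are mapped to short labels (any other option is kept verbatim). The open-ended answers were coded by reading them, so the keyword matcher is only a crude stand-in that uses the keywords listed above and takes the first theme that matches.

```python
# Only the two options quoted in the report are mapped; others pass through unchanged.
SHORT_LABELS = {
    "It tastes good": "Taste",
    "I need the caffeine": "Caffeine",
}

def canonical_combo(answer):
    """Return one order-independent label per combination of selected options."""
    options = {part.strip() for part in answer.split(",") if part.strip()}
    order = list(SHORT_LABELS)
    known = sorted((o for o in options if o in SHORT_LABELS), key=order.index)
    unknown = sorted(o for o in options if o not in SHORT_LABELS)
    return " | ".join([SHORT_LABELS[o] for o in known] + unknown)

# Keywords taken from the themes listed in the report.
THEME_KEYWORDS = {
    "bonding/socializing": ["connect", "social", "friends", "family"],
    "hobby": ["hobby", "explore", "learning"],
    "sensations": ["smell", "comfort"],
}

def code_theme(answer):
    """Return the first theme whose keyword appears in the answer, else "other"."""
    text = answer.lower()
    for theme, keywords in THEME_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return theme
    return "other"
```

Both orderings of the same pair of options map to the same label, which is what makes the combinations comparable across participants.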
"What?" - used "How strong do you like your coffee?", "What roast level of coffee do you prefer?", "What is the most you've ever paid for a cup of coffee?", "Lastly, what was your favorite overall coffee?" to single out the preferences of the different groups. Used chi-square to test for association on the "How strong do you like your coffee?", "What roast level of coffee do you prefer?" and the customer segments.
"Where?" - presented the finding about where the customers usually drink their coffee.
- Recoded "Where do you typically drink coffee?" to make sure I only get unique combinations of the given locations, and merged "At a cafe" and "On the go" into one category named "Out".
- Tested for association between buying coffee from a cafe and the participants' satisfaction using the chi-square test.
- Created a new variable, "specialty_coffee_shop", to flag the participants who selected "On the go, where do you typically purchase coffee? (Local cafe)" or "On the go, where do you typically purchase coffee? (Specialty coffee shop)".
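For readers who want to see what the chi-square test of association used in several segments computes, here is a minimal Python sketch of the Pearson statistic for a contingency table of counts. It is an illustration only: the function name is hypothetical, the counts in the example are made up, and it returns the statistic and degrees of freedom without a p-value, which would come from the chi-square distribution (for instance via a statistics package or Excel).

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof

# Made-up counts: rows = buys from a cafe (yes/no), columns = satisfied (yes/no).
example_stat, example_dof = chi_square_statistic([[10, 20], [20, 10]])
```

A statistic of zero means the observed counts match the independence expectation exactly; larger values indicate a stronger departure from independence.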