For this project, I analyzed customer data for five different marketing campaigns for a company that sells products to customers in eight different countries.
My goals were:
If you feel so inclined, you can view the original data set here: data playground
Tools:
Before the analysis, I looked through the data dictionary and explored the dataset to understand what each column represented. For example, recency referred to the number of days since the customer's last purchase.
Each customer also had a unique ID, along with records of whether they accepted various marketing campaign offers and which products they purchased. There was also basic demographic information such as marital status, country, number of children, and year of birth.
Right away, I noticed a problem. In the year of birth column, some customers had reported being born in 1893 and 1899, and one in 1900. This was just not possible. And while we might naturally assume a person who entered '1893' meant to put '1993,' it's still unclear what someone who entered '1900' meant.
I knew these outliers could skew the data. Since it was only three records (0.1% of the data), I felt comfortable removing them.
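The cleanup itself was done by hand, but the same filtering step can be sketched in Python. This is only an illustration: the sample rows, the `year_birth` field name, and the 1901 cutoff are assumptions, not values taken from the dataset.

```python
# Hypothetical sample rows; the real dataset has many more columns and records.
customers = [
    {"id": 1, "year_birth": 1970},
    {"id": 2, "year_birth": 1893},
    {"id": 3, "year_birth": 1985},
    {"id": 4, "year_birth": 1900},
]

# Assumed cutoff: treat any birth year of 1900 or earlier as a data-entry error.
MIN_PLAUSIBLE_YEAR = 1901


def drop_implausible_birth_years(rows, min_year=MIN_PLAUSIBLE_YEAR):
    """Return only the rows whose birth year is at or after min_year."""
    return [row for row in rows if row["year_birth"] >= min_year]


cleaned = drop_implausible_birth_years(customers)

# Share of records removed, to confirm the cleanup only touches a small slice.
removed_share = (len(customers) - len(cleaned)) / len(customers)
print(len(cleaned), f"{removed_share:.1%}")
```

Checking the removed share before dropping rows is a quick way to confirm the deletion is as small as expected.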
Unfortunately, there was little insight to gain from "Education." Responses included "graduation", "2n cycle", "PHD," "Basic," and "Master." However, there was no standardization or explanation of what these mean.
For example, what are we defining as a basic education? Does "graduation" mean they completed high school or graduated from university? It's anyone's guess.
I used Google Sheets pivot tables to slice and dice the data.
I also used the CORREL function to determine whether there was a relationship between various factors and web purchases (products, marital status, campaigns, recency, etc.).
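CORREL in Google Sheets returns the Pearson correlation coefficient between two ranges. For readers who prefer code, here is a minimal Python sketch of the same calculation. The `pearson` helper and the sample numbers are hypothetical and are not results from this analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only: days since last purchase vs. web purchases.
recency_days = [5, 20, 33, 47, 60, 82]
web_purchases = [9, 7, 6, 6, 3, 2]

r = pearson(recency_days, web_purchases)
print(round(r, 3))
```

A value near 1 or -1 suggests a strong linear relationship, and a value near 0 suggests little or none. A categorical column such as marital status has to be encoded as numbers (for example, 0 or 1 per category) before it can be correlated this way.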
To determine the typical income, I opted to use the median rather than the mean. Salaries can have a wide range. In this example, the dataset had a salary as high as $666,666 and the lowest salary was around $1,700. Extremes like these could heavily skew a mean. Using the median, I determined the typical customer's income.
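To see why this choice matters, here is a small Python sketch comparing the two measures. Only the highest and lowest figures echo the values mentioned above. The other incomes are invented for the example.

```python
from statistics import mean, median

# Illustrative incomes: the extremes echo the figures above, the rest are made up.
incomes = [1700, 35000, 52000, 61000, 666666]

mean_income = mean(incomes)      # pulled upward by the single very large value
median_income = median(incomes)  # the middle value, unaffected by the extremes

print(mean_income, median_income)
```

In this toy sample the mean lands well above what four of the five customers earn, while the median stays at the middle value.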
I decided to combine "together" and "married" into one demographic. These are customers in a relationship, which could include marriage, a civil union, or simply a long-term romantic partner with no official legal status. I grouped "YOLO" and "alone" (which could be someone being silly or something else undetermined) into "other" in my pivot table.
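The grouping was done inside the pivot table, but the same recoding can be sketched in Python. The `STATUS_GROUPS` mapping, the "In a relationship" label, and the sample responses are assumptions made for this example.

```python
from collections import Counter

# Assumed mapping: the labels on the right are illustrative names for the groups.
STATUS_GROUPS = {
    "married": "In a relationship",
    "together": "In a relationship",
    "yolo": "Other",
    "alone": "Other",
}


def group_status(status):
    """Map a raw marital status to its group; unmapped values are kept as-is."""
    cleaned = status.strip()
    return STATUS_GROUPS.get(cleaned.lower(), cleaned)


# Made-up responses for illustration only.
responses = ["Married", "Together", "Single", "YOLO", "Alone", "Divorced", "Married"]

counts = Counter(group_status(s) for s in responses)
print(counts)
```

Statuses that are not in the mapping pass through unchanged, so they still appear as their own rows in the summary.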
Finally, I used pivot tables to create charts and design the dashboard. I designed the dashboard with busy marketing managers and CMOs in mind. You can view the dashboards here:
Campaigns and Products Dashboard