The customer dataset was obtained from Kaggle (download here). It contains customer information organized into categorical variables (sex, marital status, occupation, education, and settlement size) and numerical variables (customer ID, age, and income).
No data cleaning was needed. I only transformed the original dataset in Excel, replacing each category's numerical code with its descriptive label.
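The recoding itself was done in Excel, but the same step can be sketched in R for anyone who prefers to keep everything in one script. The snippet below is only an illustration: the `raw` data frame is hypothetical, and the 0/1 coding of each category is an assumption, not taken from the dataset's documentation.

```r
# Hypothetical raw columns; the 0/1 coding is an assumption for illustration
raw <- data.frame(
  sex            = c(0, 1, 1, 0),
  marital_status = c(0, 1, 0, 1)
)

# Replace each numeric code with its descriptive label
labeled <- data.frame(
  sex = factor(raw$sex, levels = c(0, 1), labels = c("male", "female")),
  marital_status = factor(raw$marital_status, levels = c(0, 1),
                          labels = c("single", "non-single"))
)

print(labeled)
```

Storing the recoded columns as factors also tells R to treat them as categorical rather than numeric in later steps.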
The Gower distance method from the “cluster” R package was used to build the dissimilarity matrix because the data mixes numerical and categorical variables. The dendrogram was then built with Ward’s agglomeration method and showed seven distinguishable clusters. After that, Excel pivot tables and Power BI were used to quickly analyze the seven clusters and to explore and compare the male and female age group distributions and the differences in age and income.
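As a rough illustration of the clustering step, here is a minimal sketch in R. It is not the project's script: the `customers` data frame is toy data, the choice of `"ward.D2"` over `"ward.D"` is an assumption, and the tree is cut into two groups only because the toy data is tiny (the project itself identified seven clusters).

```r
# Illustrative sketch only: toy data, not the Kaggle dataset
library(cluster)  # provides daisy()

# Hypothetical customers with mixed numerical and categorical columns
customers <- data.frame(
  age    = c(25, 47, 33, 61, 29, 52),
  income = c(48000, 120000, 75000, 98000, 51000, 134000),
  sex    = factor(c("male", "female", "female", "male", "male", "female")),
  marital_status = factor(c("single", "non-single", "non-single",
                            "single", "single", "non-single"))
)

# Gower dissimilarity handles numeric and factor columns together
gower_dist <- daisy(customers, metric = "gower")

# Ward agglomeration; "ward.D2" is an assumption, the original may use "ward.D"
tree <- hclust(gower_dist, method = "ward.D2")

# Cut the dendrogram into k groups (k = 2 for the toy data)
clusters <- cutree(tree, k = 2)
print(table(clusters))

# Draw the dendrogram
plot(tree)
```

The vector returned by `cutree()` assigns one cluster label per row, so it can be attached to the data frame and exported for analysis in a tool such as Excel or Power BI.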
Check out the complete R script here, in one of my public GitHub repositories. You will also find the resulting dendrogram there (original and colored).
The dashboard uses a vertical orientation with a customized dendrogram as the central element, dividing the visualization into male and female sides. As an extra detail, the dashboard can be filtered by clicking a male and/or female cluster.
The next step is to carry out a customer behavior analysis and match its findings to these seven clusters. That way, products, promotions, and ads can be made more relevant, especially considering the customer population of non-single females and single males.