__STYLES__
End of course project for Google Data Analytics Professional Certificate
Date Created: Jul 29, 2023
Goal: find the reasons behind users abandoning carts.
Tools used: Tableau, Power BI, and R language.
This is fictitious data created by me that could represent a real-world situation.
I approached this project by following the data analytics phases outlined in the certificate program:
The data is fictitious data created by me and it could represent a real-world case, it is very small data that consists of 19 rows and 8 columns, here is a sample of the data:
I analyzed the data to find possible reasons, here are some pictures of the process using Tableau:
We can see the average cart value for those who completed order is significantly higher than who did not complete the order:
Here we can see that 75% of users who did not complete the order have added their payment info, which means there is no problem in the payment page:
Here we can see that 83.33% of users who did not complete the order have NOT added their address info, which makes me think about a possible problem in the shipping prices, as the shipping rates only show up AFTER the payment info is added:
Here is the last chart to show, which shows that most of the completed orders are above $40 in cart value, which makes my last point true:
We should investigate our shipping rates to solve the problem, my suggestion is to make the shipping free or $5 max and slightly increase the prices of all products like 5% for example to cover the shipping costs, this needs further studying by sales team to determine how much exactly the products prices should be increased.
Done, I also worked on this project using R language too, here is the code and the output:
library(readr)
library(dplyr)
library(skimr)
library(janitor)
library(ggplot2)
library(tidyverse)
library(here)
install.packages("egg")
library(egg)
abandoned_carts <- read.csv("C:/Users/s2ahm/Desktop/Data Analytics/Google Data Analytics Professional Certificate/Case Study/Abandoned carts.csv")
view(abandoned_carts)
str(abandoned_carts)
count_completed_order <- table(abandoned_carts$completed_order)
count_completed_order
y = prop.table(stat(count))
label = scales::percent(prop.table(after_stat(count)))
value_to_count <- 19
count <- length(which(abandoned_carts$cart_value == value_to_count))
completed_order_sum <- sum(abandoned_carts$completed_order)
completed_order_count <- summary(abandoned_carts$completed_order)
mutate(abandoned_carts, completed_order_sum)
completed_order_count
sum(abandoned_carts$completed_order, na.rm = TRUE)
count_user <- table(abandoned_carts$user)
count_user2 <- summary(abandoned_carts$user)
count_user2
plot1 <-
ggplot(data = abandoned_carts,
mapping = aes(x = (group = cut(cart_value, breaks = seq(0, 120, 20))), y = prop.table(stat(count)),
fill = completed_order)) +
geom_col() +
geom_text(aes(label=count_user), vjust=3) +
labs(title = "First R Plot", caption = "abandoned carts", x = "Cart Value", y = "Number of Users")
seq_cart_value <- abandoned_carts %>%
group_by(group = cut(cart_value, breaks = seq(0, 120, 20))) %>%
summarize(n = n())
scale_x_continuous("",breaks=c(2.5,6,9),labels=c("Group1","Group2","Group3"))
ann_text <- data.frame(mpg = 15,wt = 5,lab = "Text",
cyl = factor(8,levels = c("4","6","8")))
p + geom_text(data = ann_text,label = "Text")
p <- ggplot(abandoned_carts, aes(x = (group = cut(cart_value, breaks = seq(0, 120, 20),
labels=c("$0-19", "$20-39", "$40-59", "$60-79", "$80-99", "$100-119"))), fill = completed_order)) +
geom_bar(stat = "count") +
scale_fill_manual(name="Abandoned Carts",
labels=c("Abandoned Cart", "Completed Order"),
values=c("#990012", "#77BFC7")) +
labs(title = "Abandoned Carts", x = "Cart Value", y = "Count of Users") +
theme(plot.title = element_text(hjust = 0.5)) +
geom_text(aes(label=after_stat(count)), stat="count", vjust=1.5) +
#annotate("text", x = 5, y = 6, label = "order completion rate
#for\ncart_value<40 is 16.66%\norder completion rate for cart_value>40 is 71.43%", color = "", size = 0.8) +
geom_vline(xintercept = 2.5, linetype="solid", color = "navy") +
facet_wrap(~completed_order, nrow=2)
my_tag = c("order completion rate for\ncart_value<40 is 16.66%", "order completion rate for\ncart_value>40 is 71.43%")
tag_facet(p,
x = -Inf, y = -Inf,
vjust = -4, hjust = -1.25,
open = "", close = "",
fontface = 1,
size = 4,
tag_pool = my_tag)
output:
and here is a Power BI sheet I made too:
Done.
Thanks for reading!