__STYLES__

Brazilian E-Commerce Public Dataset by Olist

Brazilian E-Commerce Public Dataset by Olist

About this project

Olist operates an online e-commerce site for sellers, that connects merchants and their products to the main marketplaces of Brazil. Olist’s technology enables small merchants to gain market share across the country through a SaaS licensing model for small brick-and-mortar businesses.

The dataset contains information on 100,000 orders placed between 2016 and 2018 on several Brazilian marketplaces. It enables viewing orders from a variety of angles, including customer location, product attributes, order status, pricing, payment, and freight performance.

Data Analysis Process

  1. The Problem
  2. Data Cleaning
  3. Analysis
  4. Data Visualization
  5. Communicating Insight and Recommendation

THE PROBLEM

The company needs insight and recommendations on its sales and how the available data could help it improve its sales and visibility.

DATA CLEANING

The dataset was explored with Microsoft Excel and found the following

  • Empty and Null
  • Unwanted special characters
  • The product category name is written in the Brazilian language (there is a separate table that contains the English version).
  • The Table and Columns title contains an underscore.
  • Spelling errors

The following was done to ensure data quality are up to standard so that the analysis will be accurate , organized and corrected. This is necessary so that the stakeholders can be confident in the insights found so that action can be taken place :

Renamed Tables and columns (Power Bi Query)

Example :

= Table.RenameColumns(#"Changed Type",{{"customer_id", "Customer Id"}, {"customer_unique_id", "Unique Id"}, {"customer_zip_code_prefix", "Zip Code"}, {"customer_city", "City"}, {"customer_state", "State"}})

Replacing Nulls and Empty records (Excel)

By filling Up/Down the empty date records in the order table, such as carrier date and delivered date, there are 610 null and empty records in the products table. The columns include category name, name length, description length, and photo Qty. The missing values were replaced in category with “Not Available” (N/A) and "0” for the remaining columns that are numerics.

The empty values were not removed/deleted because the rows contained information that may be valuable to the analysis.

Importing and Replacing Product Category Column Records (Excel)

The Brazilian language of the product category names were replaced by the English version through Vlookup by matching the Product Id attached to it . The Brazilian language was deleted after since the audience will be reading in English.

Spelling Errors (Excel)

A few column names and values in the product category were misspelt. All were manually corrected to ensure there is consistency and accuracy with the data across the board.

ANALYSIS

To find the best insights for Olist to determine how they are doing, The following questions were asked :

How many orders were placed on Olist, and how does this vary by month ?

What are the most popular product categories on Olist, and how are their sales volume?

What is the distribution of order ratings on Olist, and how does this impact sales performance?

What are the top-selling products on Olist, and how have their sales trends changed over time?

Which payment methods are most commonly used by Olist customers?

Which City has a high seller and customer density?

The following metrics were used for retail industry KPIs for the organization, which include:

  • Total Revenue
  • Total orders
  • Average Order value
  • Number of sellers
  • Average order per customer
  • Percentage of canceled orders
  • Average order seller

Power Bi DAX was used to perform the calculations

Total Revenue

Total Revenue = SUM('Order Payments'[Payment Value])

TR = $16.01 million

Total Orders

Totalorders = COUNT('Order Items'[Item Id])

Total Orders (2016-2020) = 113,000

Average Order Value (AOV)

AOV = [Total Revenue] /[No of orders]

The average order value (AOV) is the average dollar amount customers spend each time an order is placed on your e-commerce website.

AOV = $161

Number of sellers on Olist

No of sellers = DISTINCTCOUNT('Order Items'[Seller Id])

NoS = 3095

Average order per customer

Basket = [Totalorders] / COUNT('Order Payments'[Order Id])

The average order per customer helps business owners and organizations know the average order that was placed by a single customer. This is the opposite of order value, which shows the amount spent on a single order. This is also called a basket.

Basket = 1.08

Percentage of canceled orders

The amount of canceled order were found first

Canceled = CALCULATE(COUNTROWS(olist_orders_dataset), olist_orders_dataset[Order Status] = "canceled")

Canceled = 625

The % canceled order was found after

Percentage of Cancellation = Orders[Canceled] / Orders[No of orders]

This particular metric is very important to retail businesses as it shows how much people like their service, their products, their price, and their shipping fee.

% = 0.006

Average orders per seller

AVG order per sellers = [Totalorders] / [No of sellers]

AVG = 36

This metric helps business owners make informed decisions. It lets them know if they need more sellers or need to improve their sales and marketing.

DATA VISUALIZATION

Power Bi was used to visualize our analysis.

See the attached photos above

CARD displayed all important single metrics. The card is useful to show a single numerical value or a metric.

A pie chart was used to display the most commonly used payment methods.

Line charts, stacked bar charts, clustered bar charts, clustered column bar charts, and line charts were all used for other analysis.

INSIGHTS

  1. The review with the highest rating (5) generated the most revenue and orders
  2. The average order cancellation rate is 0.006%, which shows that most buyers are satisfied with the products. Additionally, it demonstrates that customer orders were fulfilled on time, preventing cancellation.
  3. Credit cards are the most popular method of payment, followed by the country’s most popular payment option, boleto.
  4. Sao Paulo has the most sellers and customers however we must keep in mind that it is because it is the most populous city in Brazil. Followed up is the City of Curitiba.
  5. Bed Bath Table has the highest order volume.
  6. Health Beauty is the product category that brought in the highest revenue in 2017 ($1,45,959) and the Bed Bath Table Category is the product with the highest revenue in 2018 (784,289).
  7. The number of orders increased between June - August Every year

RECOMMENDATION

  1. Olist should offer free shipping for customers with more than two orders, which could increase the average order value.
  2. Olist needs to target the right audience with their most expensive products and consider delivery costs. For instance, paying the delivery cost when purchasing a computer could be difficult.
  3. Olist should provide discounts between May and June to boost order volume, as sales during those times went up.
  4. The average order per customer and average order value means that there is a slow rate of orders . The business should investigate the shipping fee and the product pricing if it is excessively high or pricey.

LIMITATION

  • The dataset does not contain the date when a new customer registered which limited the customer analysis
  • To make a more sensitive analysis, a longer period dataset with every information included such as customer and seller registration date, customer and seller’s name is needed.
Discussion and feedback(0 comments)
2000 characters remaining
Cookie SettingsWe use cookies to enhance your experience, analyze site traffic and deliver personalized content. Read our Privacy Policy.