
Customer Review Analysis using Python


About this project

Reviews help brands improve their marketing by showing what customers need and want. Reading through them manually to uncover the stories they imply, however, is inefficient. Machine learning helps here, making it possible to use large volumes of review data to understand customers. In this project, I use natural language processing (NLP) in Python, with the Natural Language Toolkit (NLTK) as the main tool, to analyze the stories behind customer reviews of Japanese whiskies.

Link to the dataset: https://www.kaggle.com/datasets/koki25ando/japanese-whisky-review.

In this research, I use the data to answer these questions:

1) Among these 4 brands, which one receives the most positive/negative reviews?

2) What is customers' attitude toward these types of whisky?

3) Which negative and positive words are commonly used in these reviews?

4) How accurately can machine learning predict positive/negative sentiment from review text?

Of the 50 kinds of whisky, the five that received the most reviews are (1) Nikka Whisky from The Barrel with 150 reviews, (2) Yamazaki 12 Year Old with 126 reviews, (3) Yamazaki Sherry Cask 2016 with 123 reviews, (4) Yamazaki 18 Year Old with 84 reviews, and (5) Hibiki Japanese Harmony with 58 reviews.

I applied vaderSentiment to calculate the VADER scores: the negative score (neg), the positive score (pos), the neutral score (neu), and the normalized, weighted composite score (compound). Here are the answers to the research questions:

1. For question 1, Hibiki receives the most positive reviews, based on its highest average positive score, and Yamazaki receives the most negative reviews, based on its highest average negative score.

2. For question 2, most of the whisky types receive a positive attitude from customers. The lowest-rated is #20, Nikka Coffey Malt 1998, with an average compound score of -0.4335, followed by #43, Yamazaki Bourbon Barrel 2013 (48%), with an average compound score of -0.1172.

3. For question 3, most of the words carry positive sentiment. The analysis compares the number of positive and negative words in the collected reviews and presents these words as word clouds to illustrate how often each is used to review the products.

4. For question 4, the random forest model misclassifies 86 of 339 observations, for an overall accuracy of nearly 75%.
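The accuracy figure follows directly from the two counts reported for the random forest. A quick check of the arithmetic, using only those numbers:

```python
# Counts reported for the random forest evaluation.
total_observations = 339
misclassified = 86

correct = total_observations - misclassified      # 253 correct predictions
accuracy = correct / total_observations           # share classified correctly
error_rate = misclassified / total_observations   # share misclassified

print(f"accuracy:   {accuracy:.1%}")    # 74.6%
print(f"error rate: {error_rate:.1%}")  # 25.4%
```

So "nearly 75%" corresponds to 253 correct predictions out of 339.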


Recurrent neural networks (RNNs) are widely used in natural language processing thanks to their ability to recognize the sequential characteristics of data and use patterns to predict the next likely scenario. At epoch 5, the RNN reaches an accuracy of 95% before it begins to overfit. The RNN is therefore the best model for predicting the outcome on this particular dataset.
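Stopping "before overfitting" usually means keeping the epoch at which validation performance peaks. The sketch below shows that selection rule only. The `val_accuracy` values are made-up placeholders rather than this project's training log, and `best_epoch` is a hypothetical helper, not part of any library.

```python
def best_epoch(val_accuracy):
    """Return the 1-based epoch with the highest validation accuracy.

    Ties go to the earliest epoch, so training is not extended needlessly.
    """
    if not val_accuracy:
        raise ValueError("val_accuracy must contain at least one epoch")
    # Sort key: higher accuracy wins; on ties, the smaller index wins.
    best_index = max(range(len(val_accuracy)),
                     key=lambda i: (val_accuracy[i], -i))
    return best_index + 1


# Placeholder per-epoch validation accuracies (illustrative only).
val_accuracy = [0.80, 0.86, 0.90, 0.93, 0.95, 0.94, 0.92]
print(best_epoch(val_accuracy))
```

In a deep learning framework the same idea is typically handled by an early-stopping callback that monitors a validation metric and restores the best weights.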

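For reference, the per-whisky averages behind answers 1 and 2 can be sketched as follows. In vaderSentiment, `SentimentIntensityAnalyzer().polarity_scores(text)` returns the `neg`, `neu`, `pos`, and `compound` values for one review. The compound score maps a summed valence x onto [-1, 1] as x / sqrt(x² + α), with α = 15 by default. The code below reproduces that normalization in plain Python and averages compound scores per group. The sample rows and whisky names are invented for illustration and do not come from the dataset, and `mean_compound_by_group` is a hypothetical helper.

```python
import math
from collections import defaultdict


def normalize(valence_sum, alpha=15):
    """Map an unbounded summed valence onto [-1, 1], as VADER's compound does."""
    return valence_sum / math.sqrt(valence_sum ** 2 + alpha)


def mean_compound_by_group(rows):
    """Average compound scores per group; rows are (group, compound) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [running sum, count]
    for group, compound in rows:
        totals[group][0] += compound
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Invented (whisky, compound) pairs, for illustration only.
rows = [("Whisky A", 0.8), ("Whisky A", 0.4),
        ("Whisky B", -0.5), ("Whisky B", 0.1)]

averages = mean_compound_by_group(rows)
lowest = min(averages, key=averages.get)  # whisky with the lowest average
print(averages, lowest)
```

Grouping by brand instead of by whisky gives the brand-level comparison used for question 1, and the same averaging applies to the `pos` and `neg` scores.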

Marketing Recommendations and Future Work:

  • Consider using positive words in digital marketing campaigns, since these words become search terms when consumers look for a specific whisky.

  • Study the market carefully and focus on the special qualities of the products rather than their quantity, since customers cannot identify a standout benefit or quality when a brand offers too many options.

  • Focus on one or two core products rather than spreading the marketing effort equally across all product types.

  • Recognize the market leader and create a specific marketing strategy to compete with it.

It is recommended to keep gathering reviews and data over different periods of time to compare and evaluate the results of marketing strategies, so that the company can better prepare for and adjust to changes in the market and in customers' perspectives.
