A sentiment analysis of TATA Hexa.
I have always been curious about the collaboration between TATA Motors and the blogger community in India. Earlier there were very few such activities, but bloggers were seriously involved when the Zica (later renamed TATA Tiago) was launched. Bloggers & media were invited separately to Goa for the showcasing of the fantastico TATA Tiago. In this post, I would like to look quantitatively at how bloggers & media communicate to people, in terms of sentiment rather than reach. I am sure the influence of bloggers is not limited to an initial buzz; it has longer-lasting effects as well.
A potential car buyer will go through a couple of media reviews before making the buying decision, depending on his reach & preferences. Later he will ask the people around him for information. Media mainly plays two roles:
- Bring the true, technical & transparent details about the car
- Give comparisons in terms of price & features
Bloggers have a good personal touch with their readers, and they bring a more personalized flavor of the product, beyond the technicalities & commercials.
I had been contemplating machine learning, natural language processing & sentiment analysis for a while. One day I got a trigger: why can’t I apply sentiment analysis to car reviews? TATA Hexa reviews are ideal for it, since Tata Motors had generated a vast amount of content on the internet by showcasing the Hexa to media (as usual) & to bloggers. There is a vast amount of interesting data available for my analysis.
I approached the problem keeping simplicity in mind. I web scraped most of the TATA Hexa media reviews as well as blog posts: around 38 media reviews & 58 blogger reviews. Both data sets were considerably heavy. I decided to analyze them separately & understand the sentiments of both media & bloggers.
The process of sentiment analysis was as follows: web scraping from relevant sources (http://hexa.tatamotors.com/reviews), data cleaning in RStudio, sentiment analysis using the R library “syuzhet”, and summarizing the sentiments in a percentage distribution plot. In addition, a word cloud was prepared with the R “wordcloud” library. I will write up a detailed sentiment analysis with RStudio in an upcoming post.
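To make the scoring and summarizing steps concrete, here is a minimal sketch in Python rather than R. It is not the syuzhet code used for this post: the tiny lexicon, the sample sentences and the function names are all made up for illustration. The idea is only to show how a lexicon-based approach counts emotion words and turns the counts into a percentage distribution.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (word -> emotion categories). A real analysis
# would use a full emotion lexicon; these entries are illustrative only.
TOY_LEXICON = {
    "good": ["positive", "trust", "joy"],
    "comfortable": ["positive", "joy"],
    "reliable": ["positive", "trust"],
    "aggressive": ["negative", "anger"],
    "noisy": ["negative"],
    "unexpected": ["surprise"],
}


def emotion_counts(texts, lexicon):
    """Count how often each emotion category is triggered across all texts."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            counts.update(lexicon.get(word, []))
    return counts


def percentage_distribution(counts):
    """Convert raw category counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: 100.0 * n / total for cat, n in counts.items()}


# Made-up sample sentences standing in for the scraped reviews.
media = ["The engine is noisy but the cabin is comfortable."]
bloggers = ["A good, reliable car with an aggressive look."]

for label, texts in (("media", media), ("bloggers", bloggers)):
    dist = percentage_distribution(emotion_counts(texts, TOY_LEXICON))
    for cat, pct in sorted(dist.items()):
        print(f"{label:8s} {cat:10s} {pct:5.1f}%")
```

Note that a plain word lookup like this counts "aggressive" towards anger even when the reviewer is only describing the styling of the car, which is one known weakness of lexicon-based scoring.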
Inference [Sentiment Analysis]:
- Positive: A very small 0.7% difference. Both bloggers & media are positive about TATA Hexa. Positive accounts for about a quarter of the sentiment in the reviews, relative to the other sentiments.
- Negative: Bloggers expressed less negative sentiment towards TATA Hexa, but as usual media brought out critical observations, which are technically in depth.
- Trust: As with the positive sentiment, media & bloggers are on the same page. Trust carries good weightage in the TATA Hexa reviews.
- Surprise: A difference of 1%, which could be a cumulative effect of the product & the product introduction activities. Of course, bloggers had a lot of surprises at the event.
- Sadness: As with the negative sentiment, bloggers & media have different observations, which could lean more towards the technical aspects of the car.
- Joy: Yes, for the bloggers it is more of a joyful ride, & for media it is serious business.
- Fear: Almost the same sentiments & very minimal overall.
- Disgust: Overall negligible.
- Anticipation: A mixed response to the car, the drive & the event.
- Anger: This is weighted by words describing the TATA Hexa’s aggressive look, pricing, etc., in addition to any real anger.
TATA Hexa Review – Word Cloud [Media & Bloggers]
- Both media & bloggers predominantly used “THE TATA HEXA” [Obviously ;)]
- Bloggers are excited about:
  - The Hexa experience: how they drove the car on various roads, in addition to the off-road experience
  - Secondly, bloggers talked about the manual & automatic transmissions of the Hexa, modes, features & the Hyderabad location
- Media focused on:
  - Media talked systematically about the design of the vehicle, engine, seats, transmissions & drive modes
  - Media also talked about the TATA Aria in the reviews
  - Media is positive, with words such as quality, well, gets, new, etc.
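A word cloud is essentially a picture of word frequencies: the more often a word appears, the bigger it is drawn. As a rough sketch of the counting behind it (again in Python, with made-up sample sentences, a small hypothetical stop-word list and an illustrative function name, not the R “wordcloud” code used for this post):

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "in", "to", "with", "it", "on"}


def word_frequencies(texts, stop_words=STOP_WORDS, top_n=5):
    """Return the top_n most frequent words, ignoring stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts.most_common(top_n)


# Made-up sample sentences standing in for the scraped reviews.
sample = [
    "The Hexa is a new Tata with a new engine.",
    "Hexa drive modes and the Hexa transmission.",
]
print(word_frequencies(sample, top_n=3))
```

Running the same count separately on the media texts and the blogger texts is what makes the two clouds comparable.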
Overall the sentiment leans positive. After positivity, Trust takes the second position. Since it is TATA, there is no doubt about trust. Negativity & anticipation are at almost the same level. Media talked about some glitches seen at the media review stage & expressed their expectation that these would be fixed in the final product. Joy & surprise are there in the product & in the reviews as well. Sadness, anger, fear & disgust are at a negligible level.
Note: This is an adaptation of a natural language processing model & sentiment analysis. The application, analysis, views & opinions in this post are personal.
- Problems/limitations of syuzhet: https://annieswafford.wordpress.com/2015/03/02/syuzhet/
- GitHub: https://github.com/mjockers/syuzhet
- Introduction to the Syuzhet Package: ftp://cran.r-project.org/pub/R/web/packages/syuzhet/vignettes/syuzhet-vignette.html