Latent Aspect Mining for Short and Unrated Review

  • 李 冠霖

Student thesis: Master's Thesis

Abstract

With the growth of the Internet, more and more review websites, such as TripAdvisor and Amazon, have appeared. However, the sheer volume of text makes it hard for users to grasp an author's opinion in a short time. Ratings have therefore become essential information on most of these websites, and some even provide scores for individual aspects. Rating scores let people understand a review at a glance, but not all sites provide complete aspect ratings. Latent Aspect Rating Analysis (e.g., LARAM and SACM) has been proposed to infer aspects and aspect ratings from reviews. In recent years, with the rapid growth of social media, user habits have been changing with the trend, and the proportion of short texts among reviews is increasing. Accurately predicting aspect ratings on sparse data has become a major issue, because aspects identified by a topic model from short texts with sparse information are difficult to match to the ground truth. Consequently, there are few successful cases of Latent Aspect Rating Analysis on short text; one of them is Aspect Identification and Rating (AIR). AIR assumes that reviews with high scores are more likely to contain positive-polarity words, whereas reviews with low scores are more likely to contain negative-polarity words. Under this assumption, AIR incorporates a sentiment distribution into the topic model and then uses the sampled proportion of word sentiment to infer aspect ratings. However, if the gap between an aspect rating and the overall rating is too large, or if the overall rating is missing, the accuracy of AIR suffers, since AIR relies too heavily on the overall rating. In this paper, we propose a unified generative model named RAIR, based on the structure of AIR and two methods for predicting the overall rating. Our method generates a rating distribution from the training data and predicts the overall rating of unrated data. We then sample words into different aspects and sentiments to infer the latent aspect ratings. Experimental results on a real-world dataset without overall ratings demonstrate that our method outperforms AIR when overall ratings must be predicted.
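
The idea of inferring an aspect rating from the proportion of word sentiment can be illustrated with a minimal sketch. This is not the inference procedure of AIR or RAIR: the function name `aspect_ratings`, the `(aspect, sentiment)` input format, the 1 to 5 rating scale, and the linear mapping from positive-word proportion to a rating are all assumptions made for illustration only.

```python
from collections import Counter


def aspect_ratings(assignments, min_rating=1.0, max_rating=5.0):
    """Map per-aspect positive-word proportions onto a rating scale.

    `assignments` is an iterable of (aspect, sentiment) pairs, one per
    sampled word, with sentiment being "pos" or "neg". The linear mapping
    and the 1-5 scale are illustrative assumptions, not the thesis's model.
    """
    positive = Counter()
    total = Counter()
    for aspect, sentiment in assignments:
        total[aspect] += 1
        if sentiment == "pos":
            positive[aspect] += 1
    span = max_rating - min_rating
    # Proportion of positive words, scaled linearly into [min_rating, max_rating].
    return {a: min_rating + span * positive[a] / total[a] for a in total}


# Hypothetical word assignments for one short review.
sampled = [("room", "pos"), ("room", "pos"), ("room", "neg"), ("service", "neg")]
ratings = aspect_ratings(sampled)
```

In this sketch an aspect whose sampled words are all negative receives the lowest rating, and one whose words are all positive receives the highest; a generative model would instead estimate these proportions jointly with the aspect assignments.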
Date of Award: 2017 Aug 30
Original language: English
Supervisor: Hung-Yu Kao (Supervisor)
