2012 | OriginalPaper | Book chapter
Sentiment Classification with Supervised Sequence Embedding
Authors: Dmitriy Bespalov, Yanjun Qi, Bing Bai, Ali Shokoufandeh
Published in: Machine Learning and Knowledge Discovery in Databases
Publisher: Springer Berlin Heidelberg
In this paper, we introduce a novel approach for modeling n-grams in a latent space learned from supervised signals. The proposed procedure uses only unigram features to model short phrases (n-grams) in the latent space. The phrases are then combined to form a document-level latent representation for a given text, where the position of an n-gram in the document is used to compute the corresponding combining weight. The resulting two-stage supervised embedding is then coupled with a classifier to form an end-to-end system that we apply to the large-scale sentiment classification task. The proposed model does not require feature selection to retain effective features during pre-processing, and its parameter space grows linearly with the size of the n-gram. We present comparative evaluations of this method using two large-scale datasets for sentiment classification in online reviews (Amazon and TripAdvisor). The proposed method outperforms standard baselines that rely on a bag-of-words representation populated with n-gram features.
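
The two-stage embedding described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the abstract does not give the architecture, so the tanh projection of concatenated unigram vectors, the softmax over position-bin weights, the logistic classifier, and every name and dimension below are assumptions chosen for illustration. Training (learning all parameters jointly from sentiment labels) is omitted.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random matrix stored as a list of rows (stand-in for learned weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def embed_document(token_ids, word_emb, proj, bin_weights, n):
    """Map unigram ids to n-gram vectors, then to one document vector."""
    num_phrases = len(token_ids) - n + 1
    if num_phrases < 1:
        raise ValueError("document is shorter than n")
    phrases, raw = [], []
    for start in range(num_phrases):
        # Stage 1 (assumed form): concatenate the n unigram vectors in the
        # sliding window and project them into the phrase space with tanh.
        window = []
        for tok in token_ids[start:start + n]:
            window.extend(word_emb[tok])
        phrases.append(
            [math.tanh(sum(w * x for w, x in zip(row, window))) for row in proj]
        )
        # Assumed position scheme: one weight per relative-position bin.
        b = min(int(start / num_phrases * len(bin_weights)), len(bin_weights) - 1)
        raw.append(math.exp(bin_weights[b]))
    # Stage 2: position-weighted average of the phrase vectors.
    total = sum(raw)
    doc = [0.0] * len(proj)
    for phrase, r in zip(phrases, raw):
        for j, value in enumerate(phrase):
            doc[j] += (r / total) * value
    return doc


def predict_positive(doc_vec, clf_w, clf_b):
    """Logistic classifier on the document vector (assumed classifier)."""
    z = sum(w * x for w, x in zip(clf_w, doc_vec)) + clf_b
    return 1.0 / (1.0 + math.exp(-z))


def count_parameters(vocab_size, word_dim, phrase_dim, n, num_bins):
    """Parameter count of this sketch; only the projection term depends on n."""
    return (vocab_size * word_dim          # unigram embedding table
            + phrase_dim * n * word_dim    # n-gram projection
            + num_bins                     # position-bin weights
            + phrase_dim + 1)              # classifier weights and bias


# Toy usage with hypothetical sizes and a made-up sequence of token ids.
rng = random.Random(0)
VOCAB, WORD_DIM, PHRASE_DIM, N, BINS = 50, 8, 6, 3, 4
word_emb = make_matrix(VOCAB, WORD_DIM, rng)
proj = make_matrix(PHRASE_DIM, N * WORD_DIM, rng)
bin_weights = [0.0] * BINS
clf_w = [rng.uniform(-0.1, 0.1) for _ in range(PHRASE_DIM)]
review = [3, 17, 42, 8, 8, 21, 5]
doc_vec = embed_document(review, word_emb, proj, bin_weights, N)
score = predict_positive(doc_vec, clf_w, 0.0)
print(len(doc_vec), round(score, 3))
```

In this sketch only the projection matrix depends on n, and its size is proportional to n, which matches the linear growth claimed in the abstract. A bag-of-words model populated with n-gram features instead needs one feature per distinct n-gram, and the number of possible n-grams grows exponentially with n.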