Sentiment analysis in a character-level model

Learning to Generate Reviews and Discovering Sentiment. Alec Radford, Rafal Jozefowicz, Ilya Sutskever.

In this paper (apparently only published through arXiv, so not carefully reviewed by anyone just yet) the authors present an intriguing result. They build a neural network model (an LSTM, a fairly standard one) which predicts the next byte in a text, given the bytes it has already seen. They train the model on product reviews, and then use its learned representation as the input to a simple classifier. The model, in spite of being trained on characters, does very well at classifying the sentiment of product reviews, better than many standard lexical models, for example! The authors even find (to their own delight) an indicator cell specifically for sentiment, and show how it tracks sentiment as the text progresses. This may seem strange, but there is a fairly reasonable hypothesis to explain the result: there is more to sentiment than lexical resources can model. This model appears to capture signal which is encoded in something more than the sequence of words.
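To make the data flow concrete, here is a minimal sketch in plain Python of the pipeline described above: a recurrent model reads a text byte by byte, its final hidden state becomes a feature vector for a simple classifier, and the value of one hidden unit is recorded at every byte. Everything here is an assumption for illustration and not the authors' code: the names (`init_lstm`, `lstm_step`, `encode`, `classify`), the tiny hidden size, the choice of a plain LSTM cell with a logistic classifier on top, and above all the weights, which are random and untrained. The numbers it produces therefore carry no sentiment information.

```python
import math
import random

HIDDEN = 8  # hypothetical toy size, chosen only to keep the sketch fast


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(hidden, seed=0):
    """Random, untrained weights for the four gates (input, forget, output, candidate)."""
    rng = random.Random(seed)
    rows = 4 * hidden
    return {
        "W_in": [[rng.uniform(-0.1, 0.1) for _ in range(256)] for _ in range(rows)],
        "W_h": [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(rows)],
        "b": [0.0] * rows,
    }


def lstm_step(params, byte, h, c):
    """One LSTM step on a single byte (0..255), given hidden state h and cell state c."""
    hidden = len(h)
    # The input is a one-hot byte, so its contribution is just column `byte` of W_in.
    pre = [
        params["W_in"][j][byte]
        + sum(w * hv for w, hv in zip(params["W_h"][j], h))
        + params["b"][j]
        for j in range(4 * hidden)
    ]
    i = [sigmoid(v) for v in pre[0:hidden]]
    f = [sigmoid(v) for v in pre[hidden:2 * hidden]]
    o = [sigmoid(v) for v in pre[2 * hidden:3 * hidden]]
    g = [math.tanh(v) for v in pre[3 * hidden:4 * hidden]]
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def encode(params, text, unit=0):
    """Read the text's UTF-8 bytes; return the final hidden state and a trace of one unit."""
    hidden = len(params["W_h"][0])
    h = [0.0] * hidden
    c = [0.0] * hidden
    trace = []
    for byte in text.encode("utf-8"):
        h, c = lstm_step(params, byte, h, c)
        trace.append(h[unit])  # value of the chosen unit after each byte
    return h, trace


def classify(features, weights, bias):
    """A simple logistic classifier on top of the features (weights here are placeholders)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


params = init_lstm(HIDDEN)
features, trace = encode(params, "Great product, works well!")
score = classify(features, [0.5] * HIDDEN, 0.0)
```

In a real setup the recurrent weights would first be trained on next-byte prediction over the review corpus, and the classifier weights would then be fit on labelled examples. The `trace` list is what one would plot to follow a single cell along the progression of the text.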

In general, coercing almost everything about language into lexical models (as much recent work has done) fixes the representation at a single level of analysis, one which happens to be accessible because of the nature of our writing system. Breaking this strong binding is probably a good idea.