Bounds of gold standards for sentiment analysis experiments

I keep running experiments on sentiment analysis, the process where texts or sentences
are categorised as positive, negative, or neutral. Those who know me know that I like to
hold forth about how simplistic and impractical many of the starting points for that whole endeavour are. It is still both fun and informative to try.

In practical applications, sentiment analysis is most often based on purely lexical features: the presence or absence of attitudinally loaded terms. Such models are improved either by improving the lexicon itself or by improving how its features are handled. In the first case one adds terms to (or removes them from) the lexicon, which of course has predictable effects on coverage, recall, and precision; in the second, one models the way those terms are used better, for instance by introducing negation handling or something else constructionally relevant.
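
To make the two kinds of improvement concrete, here is a minimal sketch of a purely lexical classifier with an optional negation rule. Everything in it is an assumption made for illustration: the toy `LEXICON`, the `NEGATORS` list, and the two-token `NEGATION_WINDOW` are hypothetical and not taken from any of the resources discussed here.

```python
import re

# Hypothetical toy lexicon: term -> polarity (+1 positive, -1 negative).
LEXICON = {"good": 1, "great": 1, "awesome": 1, "bad": -1, "sick": -1}
# Hypothetical negator list and scope; real systems tune both.
NEGATORS = {"not", "no", "never"}
NEGATION_WINDOW = 2  # a negator flips lexicon terms among the next two tokens


def tokenize(text):
    """Lowercase and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def score(text, handle_negation=True):
    """Sum the polarities of lexicon terms, flipping those in negation scope."""
    total = 0
    flip_left = 0  # tokens still inside the current negation scope
    for token in tokenize(text):
        if handle_negation and token in NEGATORS:
            flip_left = NEGATION_WINDOW
            continue
        polarity = LEXICON.get(token, 0)
        if flip_left > 0:
            polarity = -polarity
            flip_left -= 1
        total += polarity
    return total


def label(text, handle_negation=True):
    """Map the summed score to one of the three categories."""
    s = score(text, handle_negation)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

Editing `LEXICON` corresponds to the first kind of improvement; switching `handle_negation` on or off corresponds to the second. For example, `label("This is not good")` comes out negative with the negation rule and positive without it.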

I found that experiments in this direction tended to give less than satisfactory results, and this working report on bounds of lexical sentiment analysis is part of an effort to understand why. I show that the results one can get from experimentation on a gold standard are very much bounded by the lexical resource; that the gold standards differ considerably; and that potential experimental gains are relatively small.
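
One way to see why the lexical resource bounds the result is to count how many polar items in a gold standard contain any lexicon term at all. The sketch below is my own illustration of that idea, not the procedure used in the report: the function name `lexical_recall_bound`, the toy `GOLD` items with their labels, and the toy `LEXICON` are all invented for the example.

```python
import re


def tokens(text):
    """Return the set of lowercased word tokens in a text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def lexical_recall_bound(gold, lexicon):
    """Fraction of polar gold items that contain at least one lexicon term.

    A classifier that only reacts to lexicon terms has nothing to go on for
    the remaining polar items, so this fraction caps its recall on them.
    """
    polar = [text for text, gold_label in gold if gold_label != "neutral"]
    if not polar:
        return 0.0
    covered = sum(1 for text in polar if not tokens(text).isdisjoint(lexicon))
    return covered / len(polar)


# Invented toy gold standard: (text, label) pairs.
GOLD = [
    ("The service was great", "positive"),
    ("I felt sick afterwards", "negative"),
    ("Would order again in a heartbeat", "positive"),
    ("The shop opens at nine", "neutral"),
]
# Invented toy lexicon: term -> polarity.
LEXICON = {"great": 1, "sick": -1}

bound = lexical_recall_bound(GOLD, LEXICON)
```

With these toy data, two of the three polar items contain a lexicon term, so no amount of clever handling of those terms can recover the third item; only a change to the lexicon can.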

Decadence

The adjective decadent is included in several lists of negative terms compiled for sentiment analysis systems, but how is the term actually used? I looked at real-life data from customer reviews and news.

  • … Eating one is good. Eating two is great. Eating three is decadent, and awesome on an
    empty stomach. Eat four and you start to feel sick. …
  • Love this amazing fusion dessert. Sounds exotic and looks soooo decadent. …
  • A tailored take on the label’s romantic aesthetic, Elie Saab’s crepe jumpsuit is a
    decadent choice for evening events
  • So let’s all raise a glass and hope we get many more decades of this decadent local staple
  • Dense and wickedly chocolatey, the decadent dessert is best shared for greater enjoyment.
  • I’m surrounded by soft, glowing candles while enjoying a glass of rich red wine and a box
    of decadent dark chocolate
  • If you’re feeling decadent, put a pinch of crumbled bacon or a couple of sun-dried
    tomatoes in an egg white omelet
  • Come join us for some more decadent daytime disco and house partying
  • … and many more examples

Almost none were negative. The most frequent topic was chocolate.

Talk at NLP lunch

Today, I gave a hastily put together talk on my current experimentation on attitudinal adjectival expressions at the NLP group lunch. Fun, but somewhat rhapsodic, since the experiments are as yet incomplete! (I will update when I have more to tell.)


|---large--+------everyday-----+-----small---|
|-----not small----------------|
           |-------------not large-----------|
|-----------ill------------------------------O
                                      healthy^
                                      not ill^
|-----------not healthy----------------------O

One of the things I wanted to stress is to think about the utility of introducing linguistic sophistication for practical information system purposes. I posited three levels of use cases for large-scale text analysis:

  1. what are they talking about?
  2. how are they talking about it?
  3. what are they saying about it?

Slides are here.

A digital bookshelf: original work on recommender systems

I spent the better part of the 1989-90 academic year at Columbia University as a visiting graduate student in the NLP group headed by Kathy McKeown. I had recently begun my graduate studies, and my idea was to work on statistical models to improve human-computer interaction. I had heard of neural networks, read the recently published PDP book and worked through its examples (there was a 5 1/4″ floppy disk included!), and went to a summer school on connectionist models and neural architectures organised by Boston University in Nashua NH, taught by Stephen Grossberg, Robert Hecht-Nielsen and some others (the school was very focussed on the ART architecture). I built a connectionist crossword puzzle generator in Prolog which, given a lexicon, almost managed to build a crossword puzzle.

That year was fruitful in several ways. The most important task I set myself was setting up experiments on recommender systems. I collected data through a questionnaire, and I ran statistical experiments on .newsrc files on the systems I had access to. I visited Bellcore in Morristown, where my mentor Don Walker invited me to give a talk to his lab, which I believe included Will Hill, who later worked on this sort of thing. I had by then written a paper on the “Digital Bookshelf”, which was promptly rejected by the 1990 INTERACT reviewers because they held that building a recommender system would interfere with users’ privacy and integrity.

When I came back to Stockholm I wrote a tech report to describe the idea. Later, when I worked at SICS, I wrote a more complete report. I only published it in 1994: I brought it to that year’s SIGCHI and distributed it to several friends there. Martin Svensson, one of my colleagues at SICS, later picked up similar thoughts and wrote his dissertation on Social Recommendation Systems, and by then I had started to regret that I had not developed the ideas further! I blame those reviewers for the 1990 INTERACT! (I probably should not have opened that discussion: I included a section on privacy aspects in the paper.)


  • The 1990 Tech Report: Jussi Karlgren. 1990. An Algebra for Recommendations. The Systems Development and Artificial Intelligence Laboratory. Working Paper No 179. Department of Computer and Systems Sciences. KTH Royal Institute of Technology and Stockholm University.
  • The 1994 Tech Report: Jussi Karlgren. 1994. Document Behaviour Based on User Behavior—A Recommendation Algebra. Tech report T94:04. Swedish Institute of Computer Science. (Or here, if that link breaks.)

Dolphins at VIHAR

On this day our position paper “A proposal to use distributional models to analyse dolphin vocalization”, which outlines our plans to study dolphin vocalisation using distributional semantics, was presented at the 1st International Workshop on Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR). The paper was presented by Mats Amundin; the co-authors were Robert Eklund, Henrik Hållsten, and Lars Molinder, who came up with the original idea.

Transaction transparency and information imbalance

Today a number of friends and I (“seven IT and media commentators”) publish a piece on DN Debatt about transaction transparency, something we see as worth striving for. It stands in contrast to the information imbalance we see arising between information-intensive organisations and the individuals who deal with them as customers or as citizens. We observe that individuals do not see the value of the data they share, but neither are they able to assess that value without access to tools and quantities of data similar to those the organisation itself has. That will not happen. As a counterweight, we propose that companies and organisations which value information about their customers, or about other individuals they interact with, state the value they consider that information to have as part of their financial reporting. This gives customers a way to assess the value of the data they have shared.

Visiting Scholar at Stanford

This coming academic year of 2017-18 I will be at Stanford University, at its Department of Linguistics. I am looking forward to tugging at some of the most interesting loose ends from the past few years of technology development at Gavagai in the hope of finding interesting seams to work!
Professor Martin Kay, who hosts my visit, took me in for an internship at Xerox PARC in 1991. Now he will again be pointing out the best directions to develop.

Stanley Greenstein defends his Ph D dissertation on predictive modelling

Today I had the pleasure to witness the public defense of Stanley Greenstein’s Ph D dissertation on legal implications of predictive modelling: “Our humanity exposed — Predictive modelling in a legal context” for which I was a co-supervisor on technical matters.

In his dissertation, Stanley gives an inventory of several legal frameworks which might be relevant for the effects predictive modelling might have on an individual. He discusses the risk of “potential harm” — harms which an individual might not even be aware have occurred, such as a somewhat higher interest rate or insurance payment, or not being selected for a job. He examines how European regulations on data protection and human rights are applicable to understanding such harms, and focusses on the target notion of “empowerment” as a legal concept to address the information imbalance between large organisations and individuals.

Lecture for humanities students

I was invited to give a lecture, on more or less anything I might want to talk about, for humanities students at the Humanities Day (humanistdagen) at Stockholm University, at the department where I took both my undergraduate and doctoral degrees. I tried to put on a stern face and talk about the things humanists ought to do instead of what they actually do when they run into a computer. Some slides here!