Generic Opinion Mining (English)

A generic sentiment analysis pipeline for English text. The pipeline identifies sentences containing basic positive and negative opinions, and includes basic entity detection. It creates an annotation for every opinionated sentence, with features denoting:

  • the polarity of the opinion (positive or negative)
  • a score denoting the opinion strength
  • a broad classification of the emotion expressed by the sentence
  • an optional reference to the entity acting as the target of the opinion
  • a feature denoting presence of sarcasm
  • linguistic features such as whether the sentence is a question, conditional, imperative etc.

It also averages the sentiment over the whole document and reports the standard deviation of the individual sentence scores.

The pipeline is designed for use on good-quality, longer texts such as news articles or reviews.

Default annotations
Sentiment:Sentence
Sentences, with the following additional features:
  • question with value "direct", for sentences which contain direct questions.
  • directive for directive sentences, the value being the kind of directive, e.g. imperative, prohibitive, deliberative.
  • sentiment (positive or negative) and emotion for opinionated sentences; see SentenceSentiment below for more detailed features on these.
  • conditional with value "yes", for conditional sentences.
  • firstPerson or secondPerson, with value "yes", for sentences with a first or second person pronoun.
  • kind gives more detailed information on the kind of sentence; possible values include "negative" for negative directive sentences, "wh-question", "directive-question" or "general-question" for different kinds of question, or "invitation" for sentences that include an invitation expression.
Sentiment:SentenceSentiment
Opinionated sentences, with features polarity (positive or negative) and score (from +1 to -1), plus sarcasm with value "yes" or "no" indicating whether the sentence was identified as sarcastic. There is also a feature emotion giving a broad classification of the emotion expressed by the sentence; possible values are "cute", "happy", "bad", "anger", "disgust", "fear" or "sadness".
Sentiment:SentenceSet
An average sentiment score across the set of sentences in the document, with features polarity and score for the mean score, and score_std_dev for the standard deviation of the individual sentence scores. A low standard deviation indicates that the opinionated sentences in the document all express similar opinions; a higher value indicates a wider variety of sentiments across the document.
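As a rough illustration of how the document-level figures relate to the individual sentence scores, the sketch below computes a mean and a standard deviation from a list of scores. The function name summarise_scores, the use of the population standard deviation, and the rule that derives polarity from the sign of the mean are all assumptions made for illustration; the pipeline's exact formula is not stated here.

```python
from statistics import mean, pstdev


def summarise_scores(scores):
    """Build a SentenceSet-like summary from sentence scores in [-1, +1].

    Hypothetical helper: the real pipeline may use a different formula.
    """
    if not scores:
        return None  # no opinionated sentences, so nothing to average
    avg = mean(scores)
    return {
        # Assumption: polarity follows the sign of the mean score.
        "polarity": "positive" if avg >= 0 else "negative",
        "score": avg,
        # Assumption: population (not sample) standard deviation.
        "score_std_dev": pstdev(scores),
    }


# Sentences that all agree give a standard deviation of zero.
consistent = summarise_scores([0.5, 0.5, 0.5])
# Sentences that disagree give a larger standard deviation.
mixed = summarise_scores([0.75, -0.25])
```

Comparing the score_std_dev of the two summaries shows the behaviour described above: the first document expresses one uniform opinion, while the second mixes positive and negative sentences.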
Additional annotations available if selected
Sentiment:Person
Named entities as detected by ANNIE.
Sentiment:SentimentTarget
The target of a sentiment expression, with polarity, score and sarcasm features as described above.
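To show how these features might be consumed downstream, here is a minimal sketch that picks out strongly negative sentences that were not flagged as sarcastic. It assumes each annotation has already been loaded into a plain Python dict with "type" and "features" keys. That layout, the strong_negatives helper and the 0.5 threshold are hypothetical and do not describe the service's documented response format.

```python
def strong_negatives(annotations, threshold=0.5):
    """Return SentenceSentiment annotations that are negative, non-sarcastic
    and at least `threshold` in absolute score.

    Assumes a hypothetical layout: {"type": str, "features": dict}.
    """
    selected = []
    for ann in annotations:
        if ann.get("type") != "SentenceSentiment":
            continue
        feats = ann.get("features", {})
        if (
            feats.get("polarity") == "negative"
            and feats.get("sarcasm") != "yes"
            and abs(feats.get("score", 0.0)) >= threshold
        ):
            selected.append(ann)
    return selected


# Made-up sample data using the feature names described above.
sample = [
    {"type": "SentenceSentiment",
     "features": {"polarity": "negative", "score": -0.8, "sarcasm": "no"}},
    {"type": "SentenceSentiment",
     "features": {"polarity": "negative", "score": -0.9, "sarcasm": "yes"}},
    {"type": "SentenceSentiment",
     "features": {"polarity": "positive", "score": 0.7, "sarcasm": "no"}},
    {"type": "Person", "features": {}},
]
hits = strong_negatives(sample)
```

Only the first sample annotation is selected: the second is marked sarcastic, the third is positive, and the fourth is not a SentenceSentiment annotation.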

Use this pipeline

Single documents

You can process up to 1,200 documents per day free of charge using the REST API, at an average rate of 2 documents/sec. Higher quotas are available for research users by arrangement; contact us for details.
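A client that loops over many documents may want to pace itself to stay inside these limits. The following is a minimal client-side sketch, not part of the service: the QuotaThrottle class and its parameters are hypothetical, and it spaces calls evenly at 0.5 seconds, which is stricter than the stated average rate. It does not reset the daily counter, so a fresh instance would be needed each day.

```python
import time


class QuotaThrottle:
    """Space calls at least 1/per_second apart and stop at a daily cap.

    Hypothetical helper; the limits default to the free-tier figures above.
    """

    def __init__(self, per_second=2.0, per_day=1200,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second
        self.per_day = per_day
        self.clock = clock    # injectable so the pacing can be tested
        self.sleep = sleep
        self.used = 0
        self.last_call = None

    def acquire(self):
        """Block until the next call is allowed, or raise if the cap is hit."""
        if self.used >= self.per_day:
            raise RuntimeError("daily free quota exhausted")
        now = self.clock()
        if self.last_call is not None:
            wait = self.min_interval - (now - self.last_call)
            if wait > 0:
                self.sleep(wait)
                now = self.clock()
        self.last_call = now
        self.used += 1


# Usage: call throttle.acquire() immediately before each request.
throttle = QuotaThrottle()
```

Injecting the clock and sleep functions keeps the pacing logic separate from real time, so it can be exercised without waiting.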

The API endpoint for this pipeline is:


Batches of documents

You can process any amount of data with this pipeline on a pay-as-you-go basis, for GBP0.80 per CPU hour. This can be data you upload yourself, data you have collected from Twitter, or the results of a previous job.
