The Political Futures Pipeline

A pipeline that detects political topics, UK politician names (as valid at the 2017 General Election), abusive terms and sentiment, along with Twitter-specific data such as hashtags, user names and, where possible, location (as NUTS regions). It works best on tweets in the original Twitter JSON format. Upload your own tweets or harvest some with our Twitter Collector.

Default annotations
:Topic Mentions of topics relevant to UK politics, based largely on the topic classification used on gov.uk.
:AbuseLookup Abusive, insulting and swear words.
:Hashtag, :UserID, :URL Standard Twitter entity types.
:Politician Recognised UK politicians such as MPs, parliamentary candidates at the 2017 General Election, and other significant individuals, such as party leaders, who are not MPs.
:Party Political parties

Additional annotations available if selected
:Sentence Sentences
:NounChunk Noun phrase chunks

When the input is Twitter JSON and the output is saved as GATE XML or sent to Mímir, the following additional information is extracted from the tweet metadata and made available as document-level features:

author
The screen name of the Tweet author
tweet_id
The ID of this tweet
tweet_uri
URL of the tweet in the form https://twitter.com/{author}/status/{id}
tweet_kind
original, retweet or reply
orig_id
If this tweet is a retweet, the ID of the original retweeted_status
timestamp
ISO8601-formatted representation of the tweet "created_at" timestamp
hour_timestamp and minute_timestamp
Numeric representation of the timestamp to hour (YYYYMMDDHH) or minute (YYYYMMDDHHmm) granularity
NUTS1 and NUTS2
The user's location mapped to NUTS region codes. This only applies to locations within the UK; non-UK users are given the value "NON".
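
To make the feature descriptions above concrete, the following Python sketch derives most of them from a single tweet. It is an illustration only, not the pipeline's own code. It assumes the classic Twitter JSON field names (`user.screen_name`, `id_str`, `created_at`, `retweeted_status`, `in_reply_to_status_id_str`), assumes that a non-empty reply field is what marks a reply, and keeps the timestamp in the time zone given by `created_at`. The function name `document_features` is hypothetical. NUTS1 and NUTS2 are left out because mapping a location to a region needs a gazetteer.

```python
from datetime import datetime


def document_features(tweet):
    """Approximate the document-level features for one tweet (a dict)."""
    author = tweet["user"]["screen_name"]
    tweet_id = tweet["id_str"]

    # Classic Twitter format, e.g. "Wed Oct 10 20:19:24 +0000 2018".
    created = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")

    # Assumption: a retweet carries "retweeted_status", a reply carries a
    # non-empty "in_reply_to_status_id_str", anything else is original.
    retweeted = tweet.get("retweeted_status")
    if retweeted is not None:
        kind = "retweet"
    elif tweet.get("in_reply_to_status_id_str"):
        kind = "reply"
    else:
        kind = "original"

    features = {
        "author": author,
        "tweet_id": tweet_id,
        "tweet_uri": f"https://twitter.com/{author}/status/{tweet_id}",
        "tweet_kind": kind,
        "timestamp": created.isoformat(),
        "hour_timestamp": created.strftime("%Y%m%d%H"),
        "minute_timestamp": created.strftime("%Y%m%d%H%M"),
    }
    if retweeted is not None:
        features["orig_id"] = retweeted["id_str"]
    return features
```

Features read from GATE XML or Mímir should be treated as authoritative; this sketch is only useful for understanding what each value represents.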

1,200 free requests per day
Larger batches: £0.80 per CPU hour

Use this pipeline

Single documents

You can process up to 1,200 documents per day free of charge using the REST API, at an average rate of 2 documents per second. Higher quotas are available for research users by arrangement; contact us for details.
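
A client can stay within these limits by pacing its own calls. The sketch below is one simple way to do that in Python, assuming the limits are exactly 1,200 documents per day and 2 documents per second. The `Pacer` class and its parameters are hypothetical names, not part of any GATE library. It spaces calls evenly, which is stricter than the "average rate" wording requires, and it does not model the daily quota reset.

```python
import time


class Pacer:
    """Space out calls evenly and stop once a fixed quota is used up."""

    def __init__(self, per_second=2.0, daily_quota=1200,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / per_second   # 0.5 s between calls at 2/sec
        self.remaining = daily_quota
        self._clock = clock
        self._sleep = sleep
        self._next_slot = None             # earliest time of the next call

    def wait(self):
        """Block until the next call is allowed, or raise if out of quota."""
        if self.remaining <= 0:
            raise RuntimeError("daily quota used up")
        now = self._clock()
        if self._next_slot is not None and now < self._next_slot:
            self._sleep(self._next_slot - now)
            now = self._next_slot
        self._next_slot = now + self.interval
        self.remaining -= 1
```

Call `pacer.wait()` immediately before each request. At 2 documents per second, the full free quota of 1,200 documents takes about ten minutes to submit.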

The API endpoint for this pipeline is:

https://cloud-api.gate.ac.uk/process/sobigdata-politics
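
As a rough illustration of calling this endpoint, the Python sketch below posts one tweet using only the standard library. Several details are assumptions rather than facts stated on this page: that the API key consists of an ID and a password sent with HTTP Basic authentication, that Twitter JSON is submitted as `application/json`, and that annotations can be requested as JSON through the `Accept` header. The helper names `build_request` and `annotate` are hypothetical.

```python
import base64
import json
import urllib.request

ENDPOINT = "https://cloud-api.gate.ac.uk/process/sobigdata-politics"


def build_request(tweet, key_id, password):
    """Build (but do not send) a POST request carrying one tweet."""
    body = json.dumps(tweet).encode("utf-8")
    # Assumption: the API key ID and password are used as Basic credentials.
    token = base64.b64encode(f"{key_id}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",  # assumed type for Twitter JSON
            "Accept": "application/json",        # assumed response format
            "Authorization": f"Basic {token}",
        },
    )


def annotate(tweet, key_id, password):
    """Send the tweet and return the parsed JSON response."""
    request = build_request(tweet, key_id, password)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `annotate(tweet, key_id, password)` returns whatever JSON the service sends back. The structure of that response is not described here, so inspect it before relying on particular keys.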


Batches of documents

You can process any amount of data with this pipeline on a pay-as-you-go basis, for £0.80 per CPU hour. This can be data you upload yourself, data you have collected from Twitter, or the results of a previous job.
