A pipeline designed to detect political topics, UK politician names (as valid at the 2017 General Election), abusive terms and sentiment, in addition to Twitter-specific data such as hashtags, user names and, where possible, location (as NUTS regions). It works best on tweets in the original Twitter JSON input format. Upload your own or harvest some with our Twitter Collector.

Default annotations
:Topic Mentions of topics relevant to UK politics, based largely on the topic classification used on
:AbuseLookup Abusive, insulting and swear words.
:Hashtag Standard Twitter entity types.
:Politician Recognised UK politicians such as MPs, parliamentary candidates at the 2017 General Election, and other significant individuals such as party leaders who are not MPs.
:Party Political parties
Additional annotations available if selected
:Sentence Sentences
:NounChunk Noun phrase chunks

When the input is Twitter JSON and the output is saved as GATE XML or sent to Mímir, the following additional information is extracted from the tweet metadata and made available as document-level features:

The screen name of the Tweet author
The ID of this tweet
URL of the tweet in the form {author}/status/{id}
original, retweet or reply
If this tweet is a retweet, the ID of the original retweeted_status
ISO8601-formatted representation of the tweet "created_at" timestamp
hour_timestamp and minute_timestamp
Numeric representation of the timestamp to hour (YYYYMMDDHH) or minute (YYYYMMDDHHmm) granularity
The user's location mapped to NUTS region codes. This applies only to locations within the UK; non-UK users return "NON".
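
As a rough illustration of the timestamp and tweet-type features listed above, the following Python sketch derives similar values from a tweet's JSON. It is not the pipeline's own code. It assumes the classic Twitter "created_at" layout and the standard "retweeted_status" and "in_reply_to_status_id" fields, and it keeps whatever UTC offset the timestamp carries; the function names and the "iso8601" key are hypothetical, and only hour_timestamp and minute_timestamp are names taken from the list above.

```python
from datetime import datetime

# Assumed layout of Twitter's "created_at" field,
# e.g. "Wed Jun 07 21:15:42 +0000 2017".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def timestamp_features(created_at: str) -> dict:
    """Derive ISO 8601 and numeric hour/minute timestamps from created_at."""
    parsed = datetime.strptime(created_at, TWITTER_TIME_FORMAT)
    return {
        # "iso8601" is a hypothetical key name used only for this sketch.
        "iso8601": parsed.isoformat(),
        # YYYYMMDDHH, as described for hour_timestamp.
        "hour_timestamp": int(parsed.strftime("%Y%m%d%H")),
        # YYYYMMDDHHmm, as described for minute_timestamp.
        "minute_timestamp": int(parsed.strftime("%Y%m%d%H%M")),
    }


def tweet_kind(tweet: dict) -> str:
    """Classify a tweet as 'retweet', 'reply' or 'original' (assumed rule)."""
    if "retweeted_status" in tweet:
        return "retweet"
    if tweet.get("in_reply_to_status_id") is not None:
        return "reply"
    return "original"


if __name__ == "__main__":
    example = {"created_at": "Wed Jun 07 21:15:42 +0000 2017"}
    print(timestamp_features(example["created_at"]))
    print(tweet_kind(example))
```

The numeric forms are convenient for grouping tweets by hour or minute, since they sort in chronological order as plain integers.
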
You can process up to 1,200 documents per day free of charge using the REST API, at an average rate of 2 documents/sec.
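
A client that wants to stay within the free allowance could pace its own requests. The sketch below is one possible approach, not part of the service: it stops after 1,200 documents and spaces calls at least half a second apart, matching the figures quoted above. The `paced` helper and the `send` callback are hypothetical; `send` stands in for whatever function actually posts a document to the API endpoint, which is not shown here.

```python
import time

DAILY_QUOTA = 1200      # free documents per day, as quoted above
MAX_RATE_PER_SEC = 2.0  # average documents per second, as quoted above


def paced(documents, send, quota=DAILY_QUOTA, rate=MAX_RATE_PER_SEC,
          sleep=time.sleep, clock=time.monotonic):
    """Yield send(doc) for at most `quota` documents, at most `rate` per second.

    `send` is a caller-supplied (hypothetical) function that submits one
    document; `sleep` and `clock` are injectable to make testing easy.
    """
    interval = 1.0 / rate
    next_slot = clock()
    for count, doc in enumerate(documents):
        if count >= quota:
            break  # stop once the daily allowance is used up
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)  # wait until the next permitted send time
        yield send(doc)
        # Schedule the next call; never schedule it in the past.
        next_slot = max(next_slot + interval, clock())


if __name__ == "__main__":
    # Stand-in sender that just reports the document length.
    for result in paced(["tweet one", "tweet two"], send=len):
        print(result)
```

This only limits the client side; the quota is still counted by the service, so documents sent from other clients on the same account are not accounted for by this helper.
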

The API endpoint for this pipeline is:

You can process any amount of data with this pipeline on a pay-as-you-go basis, for £0.80 per hour. This can be data you upload yourself, data you collected from Twitter, or the results of a previous job.

