The Political Futures Pipeline
A pipeline designed to detect political topics, UK politician names (as valid at the 2017 General Election), abusive terms and sentiment, together with Twitter-specific data such as hashtags, user names and, where possible, location (as NUTS regions). It works best on tweets in the original Twitter JSON format. Upload your own or harvest some with our Twitter Collector.
Default annotations | |
:Topic | Mentions of topics relevant to UK politics, based largely on the topic classification used on gov.uk. |
:AbuseLookup | Abusive, insulting and swear words. |
:Hashtag | Standard Twitter entity types. |
:UserID | Standard Twitter entity types. |
:URL | Standard Twitter entity types. |
:Politician | Recognised UK politicians such as MPs, parliamentary candidates at the 2017 election, and other significant individuals such as party leaders who are not MPs. |
:Party | Political parties. |
Additional annotations available if selected | |
:Sentence | Sentences |
:NounChunk | Noun phrase chunks |
When the input is Twitter JSON and the output is saved as GATE XML or sent to Mímir, the following additional information is extracted from the tweet metadata and made available as document-level features:
- author: The screen name of the tweet author.
- tweet_id: The ID of this tweet.
- tweet_uri: The URL of the tweet, in the form https://twitter.com/{author}/status/{id}
- tweet_kind: One of original, retweet or reply.
- orig_id: If this tweet is a retweet, the ID of the original tweet (the retweeted_status).
- timestamp: ISO 8601 representation of the tweet's "created_at" timestamp.
- hour_timestamp and minute_timestamp: Numeric representation of the timestamp at hour (YYYYMMDDHH) or minute (YYYYMMDDHHmm) granularity.
- NUTS1 and NUTS2: The user's location mapped to NUTS region codes. This applies only to locations within the UK; non-UK users are given the value "NON".
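To make the relationship between the tweet metadata and these features concrete, here is a minimal Python sketch that derives most of them from a classic Twitter JSON object. It is an illustration only, not the pipeline's own code: the function name document_features is hypothetical, the rule for deciding tweet_kind (presence of retweeted_status, then a non-empty in_reply_to_status_id_str) is an assumption, and so is the choice to compute the numeric timestamps in the timezone given in "created_at". The NUTS features are left out because they depend on a location lookup that is not described here.

```python
from datetime import datetime

# Layout of the classic Twitter "created_at" field,
# e.g. "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def document_features(tweet: dict) -> dict:
    """Approximate the document-level features listed above (hypothetical helper)."""
    author = tweet["user"]["screen_name"]
    tweet_id = tweet["id_str"]
    created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)

    # Assumed classification rule; the pipeline's real logic may differ.
    if "retweeted_status" in tweet:
        kind = "retweet"
    elif tweet.get("in_reply_to_status_id_str"):
        kind = "reply"
    else:
        kind = "original"

    features = {
        "author": author,
        "tweet_id": tweet_id,
        "tweet_uri": f"https://twitter.com/{author}/status/{tweet_id}",
        "tweet_kind": kind,
        "timestamp": created.isoformat(),
        # Numeric timestamps at hour and minute granularity.
        "hour_timestamp": int(created.strftime("%Y%m%d%H")),
        "minute_timestamp": int(created.strftime("%Y%m%d%H%M")),
    }
    if kind == "retweet":
        features["orig_id"] = tweet["retweeted_status"]["id_str"]
    return features
```

Storing the hour and minute timestamps as plain integers means they sort chronologically and can be used for simple range queries or time-bucketed counts.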
Use this pipeline
You can process up to 1,200 documents per day free of charge using the REST API, at an average rate of 2 documents/sec. Higher quotas are available for research users by arrangement; contact us for details.
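A client that wants to stay inside the free allowance can pace its own requests. The sketch below is one possible approach in Python, not part of the service: the generator name throttled and its parameters are hypothetical, it only counts documents within a single run, and it makes no assumption about when the daily quota resets.

```python
import time
from itertools import islice
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")

DAILY_QUOTA = 1200       # free documents per day, as stated above
MAX_RATE_PER_SEC = 2.0   # average documents per second, as stated above


def throttled(
    documents: Iterable[T],
    daily_quota: int = DAILY_QUOTA,
    rate: float = MAX_RATE_PER_SEC,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield at most daily_quota documents, no faster than rate per second."""
    interval = 1.0 / rate
    next_slot = clock()
    # islice stops after the quota without consuming any further documents.
    for doc in islice(documents, daily_quota):
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)  # wait until the next permitted send time
        next_slot = max(next_slot, clock()) + interval
        yield doc
```

Wrapping the input in throttled(...) and sending each yielded document to the API keeps the client at or below the stated rate. At 2 documents/sec the full 1,200-document allowance takes about 10 minutes to submit.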
The API endpoint for this pipeline is:
You can process any amount of data with this pipeline on a pay-as-you-go basis, for £0.80 per hour. This can be data you upload yourself, data you collected from Twitter, or the results of a previous job.