ANNIE+Measurements

ANNIE is a named entity recognition pipeline that identifies basic entity types such as Person, Location and Organization, along with monetary amounts and time and date expressions.

This pipeline combines the basic ANNIE named entity system with taggers to recognise numeric expressions (digits and words) and to annotate and normalise measurement expressions with features giving their value in SI units.

Default annotations
:Person Standard named entity types
:Location
:Organization
:Date
:Address Includes email and IP addresses as well as street addresses
:Measurement Measurement expressions, with features giving the value and unit of the measurement, both in the original form specified in the document and in a form normalized to SI units
Additional annotations available if selected
:Money Monetary amounts
:Percent Expressions representing percentages
:Token The individual tokens of the text, with "category" feature for POS
:SpaceToken The spaces between tokens
:Sentence Sentences detected by the sentence splitter
:Ratio Expressions denoting a ratio rather than a simple measurement, typically percentages but also expressions like "300 parts per million"

Use this pipeline

Single documents

You can process up to 1,200 documents per day free of charge using the REST API, at an average rate of 2 documents/sec. Higher quotas are available for research users by arrangement; contact us for details.
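If you call the API from a script, you may want to pace your requests so they stay within the free quota. The following is a minimal sketch, not part of any official client: the class name `RequestPacer` and its parameters are hypothetical, it treats the "average rate" as a fixed minimum gap between requests, and it counts requests per process run rather than resetting at a day boundary.

```python
import time


class RequestPacer:
    """Spaces calls evenly and stops once a per-run request budget is spent.

    Hypothetical helper: the defaults mirror the free quota quoted above
    (2 documents/sec, 1,200 documents/day). The counter never resets, so
    create a new pacer for each day's run.
    """

    def __init__(self, per_second=2.0, per_day=1200,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second
        self.per_day = per_day
        self.clock = clock    # injectable so the pacer can be tested
        self.sleep = sleep
        self.sent = 0
        self.last = None      # time at which the previous request was released

    def wait(self):
        """Block until the next request may be sent, or raise if the budget is spent."""
        if self.sent >= self.per_day:
            raise RuntimeError("free daily quota used up")
        now = self.clock()
        if self.last is not None:
            delay = self.min_interval - (now - self.last)
            if delay > 0:
                self.sleep(delay)
                now += delay
        self.last = now
        self.sent += 1
```

Call `pacer.wait()` immediately before each request; the first call returns at once and later calls sleep only as long as needed to keep the gap.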

The API endpoint for this pipeline is:

https://cloud-api.gate.ac.uk/process/annie-measurements
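As an illustration, a call to this endpoint from Python might look like the sketch below. Only the URL comes from this page; everything else is an assumption you should check against the API documentation: that the document is sent as the body of a POST request with a `text/plain` content type, that the API key is supplied as HTTP Basic credentials, that annotation types are selected with repeated `annotations` query parameters, and that the JSON response has a `text` field plus an `entities` map whose entries carry `indices` offsets. The function names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://cloud-api.gate.ac.uk/process/annie-measurements"


def build_request(text, key_id=None, password=None, annotations=()):
    """Build (but do not send) a POST request carrying one plain-text document."""
    url = ENDPOINT
    if annotations:
        # Assumption: selectors such as ":Measurement" are passed as
        # repeated "annotations" query parameters.
        query = urllib.parse.urlencode([("annotations", a) for a in annotations])
        url = f"{url}?{query}"
    headers = {
        "Content-Type": "text/plain; charset=UTF-8",
        "Accept": "application/json",
    }
    if key_id and password:
        # Assumption: the API key ID and password are sent as HTTP Basic auth.
        raw = f"{key_id}:{password}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(
        url, data=text.encode("utf-8"), headers=headers, method="POST"
    )


def annotate(text, **kwargs):
    """Send the document and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(text, **kwargs), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def spans(response, annotation_type):
    """Return the text covered by each annotation of one type.

    Assumption: the response looks like
    {"text": "...", "entities": {"Measurement": [{"indices": [start, end]}]}}.
    """
    text = response["text"]
    found = []
    for ann in response.get("entities", {}).get(annotation_type, []):
        start, end = ann["indices"]
        found.append(text[start:end])
    return found
```

With a valid key, `spans(annotate("The bridge is 3 km long.", key_id=..., password=..., annotations=[":Measurement"]), "Measurement")` would list the measurement expressions found, provided the response has the assumed shape.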


Batches of documents

You can process any amount of data with this pipeline on a pay-as-you-go basis, for £0.80 per CPU hour. This can be data you upload yourself, data you have collected from Twitter, or the results of a previous job.
