spaCy English Named Entity Recognition

An application that annotates English documents using the spaCy Named Entity Recognition component with the en_core_web_sm model.

Default annotations
:PERSON People, including fictional.
:GPE Countries, cities, states.
:LOC Non-GPE locations, mountain ranges, bodies of water.
:ORG Companies, agencies, institutions, etc.
:NORP Nationalities or religious or political groups.
:DATE Absolute or relative dates or periods.
:TIME Times smaller than a day.
:FAC Buildings, airports, highways, bridges, etc.
Additional annotations available if selected
:PRODUCT Objects, vehicles, foods, etc. (Not services.)
:EVENT Named hurricanes, battles, wars, sports events, etc.
:WORK_OF_ART Titles of books, songs, etc.
:LAW Named documents made into laws.
:LANGUAGE Any named language.
:PERCENT Percentage, including "%".
:MONEY Monetary values, including unit.
:QUANTITY Measurements, as of weight or distance.
:ORDINAL "first", "second", etc.
:CARDINAL Numerals that do not fall under another type.
:Token The individual tokens of the text, with a "category" feature holding the part-of-speech tag.
:SpaceToken The spaces between tokens.
:Sentence Sentences detected by the sentence splitter.
:NounChunk Noun chunks detected by the noun chunker.
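
The split between default and optional entity types can be sketched locally. This is a minimal illustration, not the service's implementation: the type names come from the lists above, while `select_entities`, `spacy_entities`, and the `extra_types` parameter are hypothetical names introduced here. It assumes entities arrive as (text, label) pairs, which is what spaCy's `doc.ents` yields via `ent.text` and `ent.label_` when spaCy and the en_core_web_sm model are installed. The structural types (Token, SpaceToken, Sentence, NounChunk) are left out because they are not entity labels.

```python
# Entity types taken from the lists above.
DEFAULT_TYPES = frozenset(
    {"PERSON", "GPE", "LOC", "ORG", "NORP", "DATE", "TIME", "FAC"}
)
OPTIONAL_ENTITY_TYPES = frozenset(
    {"PRODUCT", "EVENT", "WORK_OF_ART", "LAW", "LANGUAGE",
     "PERCENT", "MONEY", "QUANTITY", "ORDINAL", "CARDINAL"}
)


def select_entities(entities, extra_types=()):
    """Keep default entity types plus any explicitly selected optional ones.

    `entities` is an iterable of (text, label) pairs. Hypothetical helper.
    """
    extra = set(extra_types)
    unknown = extra - OPTIONAL_ENTITY_TYPES
    if unknown:
        raise ValueError(f"not an optional entity type: {sorted(unknown)}")
    wanted = DEFAULT_TYPES | extra
    return [(text, label) for text, label in entities if label in wanted]


def spacy_entities(text):
    """Read (text, label) pairs from spaCy; needs spaCy and en_core_web_sm."""
    import spacy  # imported lazily so the rest of the sketch runs without it

    nlp = spacy.load("en_core_web_sm")
    return [(ent.text, ent.label_) for ent in nlp(text).ents]


# Hand-written sample so the sketch runs without spaCy installed.
sample = [
    ("Ada Lovelace", "PERSON"),
    ("London", "GPE"),
    ("1843", "DATE"),
    ("first", "ORDINAL"),
]
print(select_entities(sample))
print(select_entities(sample, extra_types=["ORDINAL"]))
```

With spaCy installed, `select_entities(spacy_entities("some text"))` would apply the same filter to real model output.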
1,200 free requests / day
Larger batches GBP0.80 / CPU hour

Use this pipeline

Single documents

You can process up to 1,200 documents per day free of charge using the REST API, at an average rate of 2 documents/sec. Higher quotas are available for research users by arrangement; contact us for details.
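
A client can stay within these limits by pacing its own requests. The sketch below is one possible approach, under stated assumptions: `Throttle`, `QuotaExceeded`, and `process_all` are hypothetical names, the daily count covers only requests made through one `Throttle` object and does not reset at midnight, and the actual HTTP call is left to a caller-supplied `send` function because the endpoint and authentication details are not given here.

```python
import time


class QuotaExceeded(RuntimeError):
    """Raised when the locally tracked daily allowance is used up."""


class Throttle:
    """Space requests evenly and stop at a daily cap (hypothetical helper)."""

    def __init__(self, per_second=2.0, per_day=1200,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second  # 0.5 s at 2 documents/sec
        self.per_day = per_day
        self.clock = clock
        self.sleep = sleep
        self.sent = 0
        self.next_slot = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed, then count it."""
        if self.sent >= self.per_day:
            raise QuotaExceeded(f"{self.per_day} requests already sent")
        now = self.clock()
        if self.next_slot is not None and now < self.next_slot:
            self.sleep(self.next_slot - now)
            now = self.next_slot
        self.next_slot = now + self.min_interval
        self.sent += 1


def process_all(documents, send, throttle):
    """Call send(doc) for each document, pacing calls with the throttle."""
    results = []
    for doc in documents:
        throttle.wait()
        results.append(send(doc))
    return results


# Usage with a stand-in for the real request function.
print(process_all(["first text", "second text"], send=len, throttle=Throttle()))
```

Passing `clock` and `sleep` in as parameters keeps the pacing logic testable without real delays.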

The API endpoint for this pipeline is:


Batches of documents

You can process any amount of data with this pipeline on a pay-as-you-go basis, for GBP0.80 per CPU hour. This can be data you upload yourself, data you collected from Twitter, or the results of a previous job.
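
As a rough guide to batch pricing, the arithmetic can be sketched as follows. Only the GBP0.80 rate comes from this page; the assumptions that usage is charged pro rata by CPU time and rounded to the nearest penny are illustrative, and `estimate_cost_gbp` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_GBP_PER_CPU_HOUR = Decimal("0.80")  # rate quoted on this page


def estimate_cost_gbp(cpu_seconds):
    """Estimate the cost of a batch job from its CPU time in seconds.

    Assumes pro-rata charging and rounding to the nearest penny.
    """
    if cpu_seconds < 0:
        raise ValueError("cpu_seconds must be non-negative")
    cpu_hours = Decimal(cpu_seconds) / Decimal(3600)
    cost = cpu_hours * RATE_GBP_PER_CPU_HOUR
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(estimate_cost_gbp(3600))  # one CPU hour
print(estimate_cost_gbp(9000))  # two and a half CPU hours
```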
