WeVerify OCR Service

An application to identify URLs in documents and carry out optical character recognition (OCR) on those which are images.

Default annotations
:URL annotations mark URLs and carry the following additional features produced by the OCR service:
  • ocr_ok: true or false, indicating whether the service succeeded,
  • ocr_text: the output text (empty if unsuccessful),
  • ocr_error: the server error message (empty if successful),
  • ocr_response: the HTTP response code,
  • string: the URL itself.
1,200 free requests / day
Larger batches GBP0.80 / CPU hour

Use this pipeline

Single documents

You can process up to 1,200 documents per day free of charge using the REST API, at an average rate of 2 documents/sec. Higher quotas are available for research users by arrangement; contact us for details.
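
As a rough illustration of staying within these limits on the client side, the sketch below paces a stream of documents to the quoted rate and stops at the free daily allowance. The helper name `paced` and its parameters are hypothetical, and treating the quoted average rate as a pacing target is an assumption; how the service itself enforces the quota is not described here.

```python
import time

DOCS_PER_SECOND = 2       # average rate quoted above
DAILY_FREE_QUOTA = 1200   # free documents per day quoted above


def paced(documents, rate=DOCS_PER_SECOND, quota=DAILY_FREE_QUOTA,
          clock=time.monotonic, sleep=time.sleep):
    """Yield at most `quota` documents, no faster than `rate` per second.

    `clock` and `sleep` are injectable so the pacing can be tested
    without waiting in real time.
    """
    interval = 1.0 / rate
    next_slot = clock()
    for count, doc in enumerate(documents):
        if count >= quota:
            break  # stop before exceeding the free daily allowance
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)  # wait until the next send slot opens
        next_slot = max(next_slot, clock()) + interval
        yield doc
```

Each document yielded by `paced(...)` can then be sent to the service one at a time; anything beyond the quota is left unprocessed rather than submitted.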

The API endpoint for this pipeline is:

https://cloud-api.gate.ac.uk/process-document/ocr-service
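
A minimal sketch of calling this endpoint from Python, using only the standard library. Several details are assumptions rather than facts stated on this page: that the document is sent as a plain-text POST body, that the API key is supplied as an ID and password via HTTP Basic authentication, and that the JSON response groups annotations under an `entities` object keyed by annotation type, with features stored directly on each annotation. The function names are illustrative only.

```python
import base64
import json
import urllib.request

ENDPOINT = "https://cloud-api.gate.ac.uk/process-document/ocr-service"


def build_request(text, key_id, key_password):
    """Build a POST request for one document.

    Assumption: plain-text body, JSON response, HTTP Basic auth
    with the API key ID and password.
    """
    credentials = f"{key_id}:{key_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        ENDPOINT,
        data=text.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def ocr_results(response):
    """Collect the OCR features from each URL annotation.

    Assumption: response looks like {"entities": {"URL": [{...}, ...]}}.
    """
    results = []
    for ann in response.get("entities", {}).get("URL", []):
        results.append({
            "url": ann.get("string"),
            "ok": ann.get("ocr_ok"),
            "text": ann.get("ocr_text", ""),
            "error": ann.get("ocr_error", ""),
            "status": ann.get("ocr_response"),
        })
    return results


def process(text, key_id, key_password):
    """Send one document to the service and return its OCR results."""
    request = build_request(text, key_id, key_password)
    with urllib.request.urlopen(request) as resp:
        return ocr_results(json.load(resp))
```

With a valid key, `process("See https://example.com/scan.png", key_id, key_password)` would return one entry per URL found, each carrying the features listed under the default annotations.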


Batches of documents

You can process any amount of data with this pipeline on a pay-as-you-go basis, for GBP0.80 per hour. This can be data you upload yourself, data you collected from Twitter, or the results of a previous job.
