WeVerify OCR Service
This application uses optical character recognition (OCR) to identify text contained within images. There are two ways the service can be used.
The first, which is the more typical GATE Cloud approach, scans a text document for URLs and passes each URL found through the OCR system. A URL annotation is then created in the output, with features that indicate either success and the text identified, or failure and the nature of the error (such as an invalid or unreachable URL, or a valid URL that refers to something other than an image).
The alternative to processing plain text is to pass a single image file to the service. This is done by Base64-encoding the image and posting the resulting data as plain text. The output from this approach is slightly different: the service simply passes back any extracted text, rather than adding annotations over the possibly large input data.
You can experiment with this second approach using the following input field to select a local file. The file will be loaded and Base64-encoded, and the result placed in the standard form below.
The Base64 approach is mostly intended for use via the REST API. If you just want to test the service with images, we have a dedicated OCR demo page which is easier to use.
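As a rough illustration of the Base64 approach over the REST API, the sketch below encodes an image and builds a plain-text POST request using only the Python standard library. The `ENDPOINT` value is a placeholder rather than the real address, the function names are hypothetical, and authentication and any `Accept` header are left out; decoding the response as UTF-8 text is also an assumption.

```python
import base64
import urllib.request

# Placeholder only: substitute the API endpoint listed for this pipeline.
ENDPOINT = "https://example.invalid/ocr-endpoint"


def encode_image(image_bytes: bytes) -> bytes:
    """Base64-encode raw image bytes so they can be posted as plain text."""
    return base64.b64encode(image_bytes)


def build_request(image_bytes: bytes, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the encoded image."""
    return urllib.request.Request(
        endpoint,
        data=encode_image(image_bytes),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def ocr_image(path: str, endpoint: str = ENDPOINT) -> str:
    """Send a local image file and return the response body as text.

    Assumes the service replies with UTF-8 text; credentials, if your
    account needs them, would have to be added to the request.
    """
    with open(path, "rb") as f:
        request = build_request(f.read(), endpoint)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Keeping `build_request` separate from `ocr_image` makes it possible to inspect the encoded payload without contacting the service.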
:URL: URLs, with the following additional features produced by the OCR service:
Use this pipeline
You can process up to 150 documents per day free of charge using the REST API, at an average rate of 2 documents/sec. Higher quotas are available for research users by arrangement; contact us for details.
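A client can stay within these limits by pacing its own requests. The following is a minimal sketch, assuming the limits are applied per client and that spacing calls evenly is acceptable (an average rate would also permit short bursts, so this is stricter than necessary). `Throttle`, `process_batch` and the `send` callback are hypothetical names, not part of the service.

```python
import time


class Throttle:
    """Space calls so that at most `max_per_sec` happen per second."""

    def __init__(self, max_per_sec=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_sec
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the next slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


def process_batch(documents, send, throttle=None, daily_quota=150):
    """Send at most `daily_quota` documents, pausing between calls.

    `send` is whatever function actually posts one document to the service.
    """
    throttle = throttle or Throttle()
    results = []
    for doc in documents[:daily_quota]:
        throttle.wait()
        results.append(send(doc))
    return results
```

This does not track usage across runs, so a script started more than once in a day would need its own persistent count.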
The API endpoint for this pipeline is: