GATE Cloud REST APIs

There are two ways to run GATE Cloud pipelines over your documents. To process individual documents one at a time, use the on-line processing API; to process large batches of documents more efficiently, reserve an annotation job and configure, monitor and control it using the batch-mode API. In addition, the data management API lets you manage your persistent data bundles (such as web search results and the output of completed annotation jobs).

All API calls are authenticated using HTTP basic authentication with an API key consisting of a randomly-generated username and password. You can manage your API keys from this page, and you can have several different API keys active at the same time.
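
As a minimal sketch of what HTTP basic authentication involves, the following Python snippet builds the Authorization header from the two halves of an API key. The function name and the credential values are placeholders chosen for illustration; substitute the username and password of one of your own keys.

```python
import base64


def basic_auth_header(key_id: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header from an API key pair."""
    # Basic auth is the base64 encoding of "username:password".
    raw = f"{key_id}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials for illustration only, not a real API key.
headers = basic_auth_header("my-key-id", "my-key-password")
```

Most HTTP client libraries can build this header for you when given a username and password, so writing it by hand is rarely necessary.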

The on-line processing API

The on-line processing API is for making individual processing requests that return an immediate response. Most services present the same API: a single HTTPS endpoint that accepts POST requests containing the document to process and returns the annotated document in either GATE XML or JSON format. This API is explained in detail on its own page. A few services (for example, image OCR services) have different requirements and present APIs tailored to their particular functionality; these are documented in detail on their respective description pages.
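
The request described above might be assembled as follows, using only the Python standard library. This is a sketch under stated assumptions: the endpoint URL is a made-up placeholder, and the Content-Type and Accept values are plausible guesses rather than values confirmed here, so check the description page of the service you are calling.

```python
import base64
import urllib.request


def build_process_request(endpoint, text, key_id, password):
    """Prepare (but do not send) a POST request carrying one document."""
    token = base64.b64encode(f"{key_id}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        endpoint,
        data=text.encode("utf-8"),  # the document content is the request body
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            # Assumed media types; the real service may expect others.
            "Content-Type": "text/plain; charset=UTF-8",
            "Accept": "application/json",
        },
    )


# Hypothetical endpoint and placeholder credentials.
req = build_process_request(
    "https://cloud.example.org/process/hypothetical-pipeline",
    "Some text to annotate.",
    "my-key-id",
    "my-key-password",
)
# To send it: urllib.request.urlopen(req) and read the response body.
```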

The batch-mode APIs

The batch-mode APIs use a RESTful design with the URI structure representing jobs and their components and HTTP verbs representing operations on those URIs. Operations that require input parameters can accept these as JSON or XML (with an appropriate request Content-Type) and all operations return data in one of these formats (based on the Accept HTTP header in the request).
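
To illustrate how the HTTP verb, the request Content-Type and the Accept header fit together, here is a small Python sketch that prepares a batch-mode request with JSON parameters. The job URL and the parameter name are hypothetical, and the media types application/json and application/xml are assumptions; the real URIs and parameters are given on the individual API pages. Authentication is omitted to keep the example short.

```python
import json
import urllib.request


def build_batch_request(method, url, params=None, response_format="json"):
    """Prepare a request whose verb names the operation on the given URI."""
    # The Accept header selects the response format (assumed media types).
    headers = {"Accept": f"application/{response_format}"}
    body = None
    if params is not None:
        # Input parameters are sent as JSON with a matching Content-Type.
        body = json.dumps(params).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=body, method=method, headers=headers)


# Hypothetical job URI and parameter, for illustration only.
job_url = "https://cloud.example.org/api/job/123"
read_req = build_batch_request("GET", job_url, response_format="xml")
update_req = build_batch_request("POST", job_url, params={"name": "My job"})
```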

Any operation may fail, and will return an error response with an appropriate HTTP status code (4xx for problems with the client request, 5xx for problems on the server) and details of the error:

XML:

<error>
  <message>The error message</message>
</error>

JSON:

{
  "message": "The error message"
}
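
A client can extract the message from either form of the error response. The sketch below assumes the response format can be recognised from the response Content-Type header; the function name is our own and is not part of any GATE Cloud library.

```python
import json
import xml.etree.ElementTree as ET


def error_message(body: str, content_type: str) -> str:
    """Return the text of the error message from an XML or JSON error body."""
    if "json" in content_type.lower():
        return json.loads(body)["message"]
    # Otherwise treat the body as XML: <error><message>...</message></error>
    root = ET.fromstring(body)
    return root.findtext("message")


xml_body = "<error><message>The error message</message></error>"
json_body = '{"message": "The error message"}'

print(error_message(xml_body, "application/xml"))
print(error_message(json_body, "application/json"))
```

Both calls print the same message text, so the rest of the client does not need to care which format was requested.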

The easiest way to get started with the API is to follow the usage guide for the command line tool, which uses the Java Client Library.

The various APIs are described in detail on separate pages:

  • the shop - to list the available pipelines and reserve jobs based on a particular pipeline;
  • job management - to configure and control a reserved annotation job;
  • data management - to manage persistent data bundles, upload data for processing and download results;
  • cloud machine management - to start and stop cloud machines that you have reserved.

Using the APIs

We provide various options to help you get started with the GATE Cloud APIs. For annotation jobs and data bundles we provide