Domain Credibility

An application that annotates URLs within a document and attaches to each one information on how credible its domain is. This makes the most sense for URLs pointing towards known news sources.

Default annotations
:DomainCredibility Annotations spanning URLs for which we have credibility information about the responsible domain. A URL may have more than one of these annotations spanning it if we have credibility information from multiple sources. These have additional optional features:
  • credibility-resolved-url, the full URL that the URL in the document resolved to. This matters when the document uses URL shorteners or other redirects.
  • credibility-domain, the domain used to look up the details in each of the sources we use. Credibility information is domain-based rather than URL-based.
  • credibility-source, the source from which the credibility information was collected.
  • credibility-timestamp, the date on which the credibility information was collected or last updated.
  • credibility-description, if the source contains a description of the domain it will be placed here.
  • credibility-labels, a comma-separated list of classification labels for the domain (e.g. bias, conspiracy, clickbait).
  • credibility-more-info, if the source provides a link to a more info page this will be included here.
:URL The features from the DomainCredibility annotation with the lowest credibility-score are attached to a single URL annotation.
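The selection rule for the URL annotation can be sketched as follows. This is only an illustration, assuming each DomainCredibility annotation is available as a dictionary of its features and that credibility-score holds a numeric value; the function name lowest_score_features is hypothetical and not part of the pipeline.

```python
def lowest_score_features(annotations):
    """Return the feature dict with the smallest credibility-score, or None.

    `annotations` is assumed to be a list of dicts, one per DomainCredibility
    annotation spanning the same URL. Annotations without a score are skipped.
    """
    scored = [a for a in annotations if "credibility-score" in a]
    if not scored:
        return None
    # Assumption: the score is numeric (or a numeric string), lower = less credible.
    return min(scored, key=lambda a: float(a["credibility-score"]))


# Hypothetical example data: two sources rated the same domain.
example = [
    {"credibility-source": "source-a", "credibility-score": 0.8},
    {"credibility-source": "source-b", "credibility-score": 0.2},
]
chosen = lowest_score_features(example)
```

Here chosen would carry the features from source-b, mirroring how the URL annotation takes its features from the least credible rating.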
1,200 free requests / day
Larger batches GBP0.80 / CPU hour

Use this pipeline

Single documents

You can process up to 1,200 documents per day free of charge using the REST API, at an average rate of 2 documents/sec. Higher quotas are available for research users by arrangement; contact us for details.
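A client that submits many documents may want to pace itself to stay within the average rate quoted above. The following is a minimal sketch of one way to do that on the client side; the helper name paced and its parameters are illustrative assumptions, and the service may enforce its limits differently.

```python
import time


def paced(items, per_second=2.0, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than `per_second` on average.

    `clock` and `sleep` are injectable so the pacing can be tested
    without actually waiting.
    """
    interval = 1.0 / per_second
    next_slot = clock()  # earliest time the next item may be released
    for item in items:
        now = clock()
        if now < next_slot:
            sleep(next_slot - now)
        # Schedule the following slot one interval after this one.
        next_slot = max(next_slot, now) + interval
        yield item
```

Wrapping a list of documents as `for doc in paced(docs): ...` would then release roughly two documents per second.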

The API endpoint for this pipeline is:

https://cloud-api.gate.ac.uk/process-document/domain-credibility
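A call to this endpoint might look like the sketch below. Only the endpoint URL comes from this page; the use of a plain-text POST body, HTTP Basic authentication with an API key ID and password, and a JSON response with annotations grouped under an "entities" key are assumptions that should be checked against the API documentation. The function names are illustrative.

```python
import base64
import json
import urllib.request

ENDPOINT = "https://cloud-api.gate.ac.uk/process-document/domain-credibility"


def build_request(text, key_id=None, password=None):
    """Build (but do not send) a POST request carrying the document text."""
    headers = {
        "Content-Type": "text/plain; charset=UTF-8",
        "Accept": "application/json",  # assumption: JSON output is available
    }
    if key_id and password:
        # Assumption: the API key is supplied via HTTP Basic authentication.
        token = base64.b64encode(f"{key_id}:{password}".encode("utf-8"))
        headers["Authorization"] = "Basic " + token.decode("ascii")
    return urllib.request.Request(
        ENDPOINT, data=text.encode("utf-8"), headers=headers, method="POST"
    )


def credibility_annotations(response_body):
    """Extract DomainCredibility annotations from a JSON response body.

    Assumption: the response has the shape
    {"text": ..., "entities": {"DomainCredibility": [ {...}, ... ]}}.
    """
    doc = json.loads(response_body)
    return doc.get("entities", {}).get("DomainCredibility", [])
```

To actually send a request, pass the result of build_request to urllib.request.urlopen and hand the decoded response body to credibility_annotations.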

Batches of documents

You can process any amount of data with this pipeline on a pay-as-you-go basis, for GBP0.80 per CPU hour. This can be data you upload yourself, data you collected from Twitter, or the results of a previous job.
