Domain Credibility

An application that annotates URLs within a document and assigns to each one information on the credibility of the news articles hosted by its site. Domains are currently labelled using information from OpenSources, which focuses on domains with known credibility issues, so mainstream news sites are less likely to be flagged by this service. We currently have data on 1320 unique domains, as well as 724 Twitter and 918 Facebook accounts linked to these domains.


Default annotations
:DomainCredibility Annotations spanning URLs for which we have credibility information about the responsible domain. A URL may have more than one of these annotations spanning it if we have credibility information from multiple sources. These have additional optional features:
  • credibility-resolved-url, the full URL that the URL in the document resolved to. This is important in the case of URL shorteners etc.
  • credibility-domain, the domain used to look up the details in each of the sources we use. Credibility information is domain based rather than URL based.
  • credibility-source, the source from which the credibility information was collected.
  • credibility-timestamp, the date on which the credibility information was collected or last updated.
  • credibility-description, if the source contains a description of the domain it will be placed here.
  • credibility-labels, a comma-separated list of classification labels for the domain (e.g. bias, conspiracy, clickbait).
  • credibility-more-info, if the source provides a link to a page with more information, it will be included here.
:URL The features from the DomainCredibility annotation with the lowest credibility-score are attached to a single URL annotation.
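
The selection rule for the URL annotation can be sketched as follows. This is only an illustration, not the service's own code: it assumes each DomainCredibility annotation is available as a plain dictionary of features and that credibility-score is numeric, neither of which is specified above. The function name pick_lowest_score and the sample values are hypothetical.

```python
def pick_lowest_score(annotations):
    """Return the feature dict with the lowest credibility-score, or None.

    Assumption: each annotation is a dict of features and
    "credibility-score" holds a number (or a numeric string).
    """
    scored = [a for a in annotations if "credibility-score" in a]
    if not scored:
        return None
    return min(scored, key=lambda a: float(a["credibility-score"]))


# Hypothetical example: two sources report on the same URL.
example = [
    {"credibility-source": "source-a", "credibility-score": 0.7},
    {"credibility-source": "source-b", "credibility-score": 0.2},
]
lowest = pick_lowest_score(example)
```

Here lowest holds the features reported by source-b, which are the ones that would be copied onto the single URL annotation under these assumptions.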
1,200 free requests / day
Larger batches GBP0.80 / CPU hour

Use this pipeline

Single documents

You can process up to 1,200 documents per day free of charge using the REST API, at an average rate of 2 documents/sec. Higher quotas are available for research users by arrangement; contact us for details.
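
One way to stay within the stated rate on the client side is to pace submissions. The sketch below is a conservative assumption rather than a documented requirement: it spaces documents at least 0.5 seconds apart, whereas the limit above is described only as an average. The helper name paced and its parameters are hypothetical.

```python
import time


def paced(items, per_second=2.0, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than `per_second`, sleeping between them.

    `clock` and `sleep` are injectable so the pacing can be tested
    without real delays.
    """
    interval = 1.0 / per_second
    next_slot = clock()
    for item in items:
        now = clock()
        if now < next_slot:
            # Too early for the next document: wait for the slot.
            sleep(next_slot - now)
        next_slot = max(now, next_slot) + interval
        yield item
```

A caller would loop over paced(documents) and submit each document inside the loop; the daily limit of 1,200 documents still has to be tracked separately.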

The API endpoint for this pipeline is:

https://cloud-api.gate.ac.uk/process-document/domain-credibility
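
As a minimal sketch of calling this endpoint from Python, the code below builds a POST request with the standard library. Only the URL comes from this page. The use of HTTP Basic authentication with an API key ID and password, the text/plain request body, and the JSON response format are assumptions that should be checked against the API documentation; build_request and the placeholder credentials are hypothetical.

```python
import base64
import urllib.request

ENDPOINT = "https://cloud-api.gate.ac.uk/process-document/domain-credibility"


def build_request(text, key_id, key_password):
    """Build (but do not send) a POST request carrying one document.

    Assumptions: Basic auth with an API key ID/password pair,
    plain-text input, and a JSON response.
    """
    credentials = f"{key_id}:{key_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        ENDPOINT,
        data=text.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "text/plain; charset=UTF-8",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder credentials; replace with a real API key before sending.
req = build_request("See https://example.com/story", "my-key-id", "my-key-password")

# To send the request:
# import json
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```

Building the request separately from sending it keeps the example free of network calls and makes the headers easy to inspect before any quota is used.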


Batches of documents

You can process any amount of data with this pipeline on a pay-as-you-go basis, for GBP0.80 per hour. This can be data you upload yourself, data you collected from Twitter, or the results of a previous job.
