Search for a Dataset - the Datahub

Deutsche Nationalbibliografie (DNB)

The Linked Data Service of the German National Library (Deutsche Nationalbibliothek, DNB) has been expanded and has included bibliographic data since January 2012. As a first step, the...
- RDF
- text/turtle
- application/ld+json
Gemeinsame Normdatei (GND)

GND stands for "Gemeinsame Normdatei" (Integrated Authority File) and offers a broad range of elements to describe authorities. The GND originates from the German library...
- RDF
- example/rdf+xml
- text/turtle
- application/ld+json
DBpedia

DBpedia.org is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated...
- HTML
- api/sparql
- application/x-ntriples
- meta/rdf-schema
- meta/void
- linked data
- RDF
- text/turtle
- meta/sitemap
U.S. Securities and Exchange Commission Corporate Ownership RDF Data (rdfabout)

Data exposed: corporate ownership. Size of dump and data set: 1.8 million triples. Notes: also found in the list of SPARQL Endpoints.
- text/turtle
- api/sparql
- example/rdf+xml
Ontos News Portal

The Ontos News Portal extracts facts (objects such as persons or organizations, as well as relations between them, e.g. a person is working for an organization or living at a...
- text/turtle
- RDF
TCMGeneDIT Dataset

Data exposed: Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons. Size of dump...
- RDF
- text/turtle
- api/sparql
- example/rdf+xml
ChEMBL RDF

ChEMBL is a database of bioactive drug-like small molecules. It contains 2-D structures, calculated properties (e.g. logP, Molecular Weight, Lipinski Parameters) and...
- api/sparql
- text/turtle
- meta/void
BioSamples RDF

The BioSamples database aggregates sample information for reference samples (e.g. Coriell Cell lines) and samples for which data exist in one of the EBI's assay databases such...
- api/sparql
- meta/void
- text/turtle
Global airports in RDF

This corpus contains an RDF conversion of the Global airports dataset, which was retrieved from openflights.org. The dataset contains information about airport names, their locations,...
- text/turtle
Statbel - Unemployment and additional indicators dataset

This corpus contains an RDF conversion of datasets from 'Statistics Belgium' (also known as Statbel), which aims at collecting, processing and disseminating relevant, reliable...
- text/turtle
- HTML
Statbel - Belgian house price index dataset

This corpus contains an RDF conversion of datasets from 'Statistics Belgium' (also known as Statbel), which aims at collecting, processing and disseminating relevant, reliable...
- text/turtle
- HTML
Statbel - Employment, unemployment, labour market structure dataset

Employment, unemployment, labour market structure dataset: data on employment, unemployment and the labour market from the labour force survey conducted among Belgian households.
- text/turtle
- HTML
CopyrightTermBank

Terminology on copyright and related concepts
- Linked Data
- text/turtle
- application/n-triples
- example/turtle
- PDF
KORE 50 NIF NER Corpus

KORE 50[1] (AIDA) is a subset of the larger AIDA corpus, which is based on the dataset of the CoNLL 2003 NER task. The dataset aims to capture hard-to-disambiguate mentions of...
- text/turtle
- PDF
ORCID

ORCID (Open Researcher and Contributor ID) is a nonproprietary alphanumeric code to uniquely identify scientific and other academic authors. This dataset contains an RDF conversion...
- text/turtle
- GZ
Statbel Corpus

This corpus contains an RDF conversion of datasets from "Statistics Belgium" (also known as Statbel), which aims at collecting, processing and disseminating relevant, reliable...
- text/turtle
Global airports in RDF

This corpus contains an RDF conversion of the Global airports dataset, which was retrieved from openflights.org. The dataset contains information about airport names, their locations,...
- text/turtle
Brown Corpus in RDF/NIF

RDF version of the Brown Corpus (W. N. Francis, H. Kucera; Brown University; 1979). 1,014,312 words in 500 documents, taken from newspaper texts on diverse topics, non-fiction...
- text/turtle
- example/turtle
News-100 NIF NER Corpus

This corpus comprises 100 German news articles from the online news platform news.de. All of the articles were published in 2010 and contain the word Golf. This word...
- text/turtle
- PDF
RSS-500 NIF NER Corpus

This corpus has been created using a dataset comprising a list of 1,457 RSS feeds as compiled in (Goldhahn et al. 2012). The list includes all major worldwide newspapers and a...
- text/turtle
- PDF
Reuters-128 NIF NER Corpus

This English corpus is based on the well-known Reuters-21578 corpus, which contains economic news articles. In particular, we chose 128 articles containing at least one NE....
- text/turtle
- PDF
Chat Game corpus

A corpus resulting from an object arrangement game using a computer-mediated setting.
- text/turtle
MExiCo

MExiCo (short for "Multimodal Experiment Corpora") is a data model for data collections containing multimodal linguistic and interaction annotations.
- text/turtle
- example/turtle
FiESTA

FiESTA (short for "Format for extensive spatiotemporal annotations") is a generic format for linguistic and behavioral annotations.
- text/turtle
MeSH pairs

Data exposed: NLM 2007 MeSH descriptor/qualifier pairs. Size of dump and data set: 13 MB. Openness: OPEN. See http://www.nlm.nih.gov/mesh/termscon.html (basically attribution with...
- text/turtle
- example/rdf+xml

You can also access this registry using the API (see API Docs).

26 datasets found
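
The listing above can also be retrieved programmatically through the registry API. The sketch below is a minimal illustration that assumes the registry exposes a CKAN-style Action API; the listing itself does not confirm this, so check the API Docs first. The base URL `registry.example.org`, the helper names `build_search_url` and `dataset_titles`, and the sample response are all hypothetical. The example builds a search URL filtered by resource format and parses a hand-written response offline, so it makes no network request.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL; replace it with the registry's real address from its API docs.
BASE_URL = "https://registry.example.org"


def build_search_url(base_url, query="", res_format=None, rows=10):
    """Build a package_search URL, assuming a CKAN-style Action API layout."""
    params = {"q": query, "rows": rows}
    if res_format is not None:
        # Assumption: resource formats are filterable through the res_format field.
        params["fq"] = f'res_format:"{res_format}"'
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


def dataset_titles(response_text):
    """Extract dataset titles from a CKAN-style JSON response body."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("registry reported an unsuccessful request")
    return [pkg["title"] for pkg in payload["result"]["results"]]


# Hand-written sample response for an offline demonstration (not real registry output).
sample = json.dumps({
    "success": True,
    "result": {"count": 2, "results": [{"title": "DBpedia"}, {"title": "ORCID"}]},
})

url = build_search_url(BASE_URL, res_format="text/turtle")
titles = dataset_titles(sample)
print(url)
print(titles)
```

To use this against a live registry, fetch `url` with any HTTP client and pass the response body to `dataset_titles`; the parsing step is kept separate from the request so it can be tested without network access.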