Search for a Dataset - the Datahub

OpenStreetMap (Water)

OpenStreetMap (OSM) is a free collaborative mapping project that creates a free editable vector map of the world. These crowdsourced data, meanwhile created by over 1.5...
- RDF
- rdf, xml
RISM Authority data

Authority data used in the RISM catalogue (see the dataset description). It contains information about persons, organisations and literary works.
- RDF
- api/sparql
- rdf/turtle
Répertoire International des Sources Musicales - RISM

Répertoire International des Sources Musicales (RISM) is an international, non-profit organization with the aim of comprehensively documenting extant musical sources anywhere in...
- RDF
- api/sparql
- rdf/turtle
News-100 NIF NER Corpus

This corpus comprises 100 German news articles from the online news platform news.de. All of the articles were published in 2010 and contain the word Golf. This word...
- text/turtle
- PDF
RSS-500 NIF NER CORPUS

This corpus has been created using a dataset comprising a list of 1,457 RSS feeds as compiled in (Goldhahn et al. 2012). The list includes all major worldwide newspapers and a...
- text/turtle
- PDF
Reuters-128 NIF NER Corpus

This English corpus is based on the well-known Reuters-21578 corpus, which contains economic news articles. In particular, we chose 128 articles containing at least one NE....
- text/turtle
- PDF
2001 Spanish Census to RDF

This site offers information on the conversion of a 5% sample of the 2001 Spanish census from a plain useless format to RDF, a semantic representation supported by...
- api/sparql
- RDF
- example/turtle
- meta/void
OpenLink Software LOD Cache

Mirror of various RDF and Linked Data from around the Web, as well as data extracted using Virtuoso Sponger from web pages. Primarily GoodRelations data.
- api/sparql
Chat Game corpus

A corpus resulting from an object arrangement game using a computer-mediated setting.
- text/turtle
MExiCo

MExiCo (short for "Multimodal Experiment Corpora") is a data model for data collections containing multimodal linguistic and interaction annotations.
- text/turtle
- example/turtle
FiESTA

FiESTA (short for "Format for extensive spatiotemporal annotations") is a generic format for linguistic and behavioral annotations.
- text/turtle
CODE Endpoint

The CODE Endpoint contains mindmaps and data cubes. Mindmeister (http://www.mindmeister.com) provides an online mindmapping service. All public mindmaps are stored in the CODE...
- api/sparql
- rdf/n3
LinkedSpending: OpenSpending becomes Linked Open Data

A transformation of the openspending.org datasets to RDF using the RDF DataCube vocabulary. Available as browsable OntoWiki, SPARQL endpoint, and tar.gz compressed RDF/ntriple...
- .tar.gz
- api/ontowiki
- api/sparql
COD inventory

Lists all data records in the Crystallography Open Database release. Releases are as a rule updated quarterly.
- application/x-ntriples
- application/x-xz
- HTML
- api/sparql
Yale Senselab

Data exposed: Yale Senselab. Size of dump and data set: 216 KB. Notes: released without contract. The Semantic Web development of SenseLab involves exporting data from...
- api/sparql
- RDF
- MS Access
- example/rdf+xml
WebDataCommons

More and more websites have started to embed structured data describing products, people, organizations, places, events into their HTML pages using markup standards such as...
- list
- gz:nq
- HTML
U.S. Census data

Duplicate of package:2000-us-census-rdf
UNESCO Thesaurus

The UNESCO Thesaurus is a controlled and structured list of terms used in subject analysis and retrieval of documents and publications in the fields of education, culture,...
- api/sparql
- HTML
- rdf/turtle
- RDF
- example/n3
- example/rdf+xml
- example/turtle
- example/json
UK National Gallery Data

Various RDF datasets from the National Gallery. Some appear to be under an original-BSD-style licence, which I believe amounts to an attribution licence.
UK Government Art Collection

A screen-scraped RDF/SPARQL database of artworks in various UK government collections. See here for details of how the data was assembled:...
Ghent University Bibliography

The Ghent University Academic Bibliography, containing all research papers of the university, as well as information on people and departments, in several formats including...
- TXT
- HTML
- RDF
TWC Data-gov

Duplicate of package:twc-logd
- rdf (gzipped)
Texai Lexicon

Data exposed: machine readable dictionary derived from WordNet 2.1, Wiktionary, the CMU Pronouncing Dictionary and the OpenCyc lexicon. Each lexicon word sense entry contains...
SwetoDblp

Data exposed: ontology focused on bibliography data of publications from DBLP with additions that include affiliations, universities, and publishers. Size of dump and data set:...
- meta/rdf-schema
- RDF
SPQR

SPQR addresses the integration of heterogeneous datasets in the humanities, specifically data relating to classical antiquity, using a linked data approach and based on the...
- RDF
- SPARQL

You can also access this registry using the API (see API Docs).
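
As a rough illustration of what programmatic access could look like, here is a minimal Python sketch. It rests on assumptions the page does not confirm: that the registry exposes a CKAN-style action API with a `package_search` endpoint under `/api/3/action/`, and that responses follow CKAN's usual `success` / `result` layout. The base URL `https://registry.example.org` is a placeholder and the helper names are hypothetical. Check the API Docs for the actual endpoint and parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: placeholder address, not the registry's real URL.
BASE_URL = "https://registry.example.org"


def build_search_url(base_url, query, rows=20, start=0):
    """Build a CKAN-style package_search URL (assumed API shape)."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def summarize(payload):
    """Return (total_count, [(title, [formats, ...]), ...]) from a response dict."""
    if not payload.get("success"):
        raise ValueError("search request was not successful")
    result = payload["result"]
    datasets = []
    for pkg in result["results"]:
        # Each dataset carries a list of resources, each with a format tag.
        formats = [res.get("format", "") for res in pkg.get("resources", [])]
        datasets.append((pkg.get("title") or pkg.get("name"), formats))
    return result["count"], datasets


def search(base_url, query, rows=20, start=0):
    """Fetch and summarize one page of results (performs a network request)."""
    with urlopen(build_search_url(base_url, query, rows, start)) as resp:
        return summarize(json.load(resp))


# Hand-written sample response (ASSUMED layout) mirroring two entries
# from the listing above, so the parsing step can be tried offline.
SAMPLE = {
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "mexico", "title": "MExiCo",
             "resources": [{"format": "text/turtle"},
                           {"format": "example/turtle"}]},
            {"name": "fiesta", "title": "FiESTA",
             "resources": [{"format": "text/turtle"}]},
        ],
    },
}

if __name__ == "__main__":
    print(build_search_url(BASE_URL, "turtle", rows=5))
    total, found = summarize(SAMPLE)
    for title, formats in found:
        print(title, formats)
```

Calling `search(BASE_URL, "turtle")` would issue a real request, so it only works once `BASE_URL` points at an actual registry that implements this API shape. Raising `start` in steps of `rows` pages through larger result sets.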

202 datasets found