Datasets
-
GeoWordNet
GeoWordNet is a semantic resource built from the full integration of WordNet, GeoNames and the Italian part of MultiWordNet. GeoWordNet Public Dataset contains 3,698,238... -
PDEV-Lemon
PDEV is a dictionary which provides insight into how verbs collocate with nouns and other words using an empirically well-founded apparatus of syntactic and semantic categories.... -
TDS
Typological Database System ontology -
Linguistic Metadata (LIME) vocabulary
LIME (LInguistic MEtadata) is a vocabulary for expressing linguistic metadata about linguistic resources and linguistically grounded datasets. The metadata vocabulary has been... -
EuroSentiment
Gabriela Vulcu, Raul Lario Monje, Mario Munoz, Paul Buitelaar and Carlos A. Iglesias (2014), Linked-Data based Domain-Specific Sentiment Lexicons, In: Proceedings of the 3rd... -
KAIST silver standard corpus
KAIST silver standard corpus Availability: Freely Available Usage: Named Entity Recognition Status: Newly created-finished Description: We propose a novel method to... -
ConceptNet
WordNet-like concept network developed at MIT. ConceptNet aims to give computers access to common-sense knowledge, the kind of information that ordinary people know but usually... -
xLiD-Lexica
Our xLiD-Lexica dataset in RDF (http://km.aifb.kit.edu/resources/xLiD-lexica.nt) contains about 300 million triples of cross-lingual groundings. It is extracted from Wikipedia... -
Terminesp Linked Data
Lexicon Terminesp LD Spanish (spa), English (eng), German (deu), French (fra), Swedish (swe), Latin, Italian Availability: Freely Available Usage: Machine Translation,... -
Syntactic Reference Corpus of Medieval French (SRCMF)
The SRCMF contains 15 Old French texts totaling about 280,000 words. It has high-quality manual annotation based on a linguistically adequate dependency grammar. Annotation... -
OLiA Discourse
OLiA Discourse Extensions -
MetaShare metadata model
Ontology Metadata as LOD Availability: Freely Available Usage: Status: Newly created-in progress Description: LOD preliminary version of the MetaShare metadata model.... -
linked hypernyms
This Linked Hypernym dataset links entity articles in English, German and Dutch Wikipedia to a DBpedia resource or a DBpedia ontology concept as their type. The types are... -
ISOcat-metadata
The linguistics community is building a metadata-based infrastructure for the description of its research data and tools. At its core is the ISOcat registry (ISOcat.org), a... -
French TimeBank
The French TimeBank consists of a set of 109 journalistic articles from 7 different sub-genres annotated according to the ISO-TimeML standard, adapted for the French language.... -
Diachronic Ontologies from People's Daily
Diachronic Ontologies from People's Daily Ontology Availability: Freely Available Usage: Word Sense Disambiguation Status: Newly created-finished Description: 1.... -
AcadOnto
An academic domain ontology populated using IIT Bombay organization corpus, web and the linked open data. Usage: Information Extraction, Information Retrieval Availability:... -
Manually Annotated Sub-Corpus (MASC) of the Open American National Corpus
The Manually Annotated Sub-Corpus (MASC) consists of approximately 500,000 words of contemporary American English written and spoken data drawn from the OPEN AMERICAN NATIONAL... -
Linked Old Germanic Dictionaries
Lexical resources (word lists, etymological dictionaries) for Germanic languages in different historical stages: pre-1100 (incl. Gothic, Old High German, Old English),...