-
Sanskrit English Lexicon
A Lexicon of Sanskrit to English -
SALDO
SALDO (Swedish Associative Thesaurus version 2) is an extensive electronic lexicon resource for modern Swedish written language. It is created for the purpose of language... -
The Rosetta Project
From the about page: The Rosetta Project is a global collaboration of language specialists and native speakers working to build a publicly accessible digital library of... -
Pali English Lexicon
A lexicon from Pali to English. -
OPUS - an open source parallel corpus
OPUS is a growing collection of translated texts from the web. In the OPUS project we try to convert and align free online data, to add linguistic annotation, and to provide the... -
OLiA Discourse
OLiA Discourse Extensions -
The DGT Multilingual Translation Memory of the Acquis Communautaire
As of November 2007, the European Commission's Directorate-General for Translation (DGT) made publicly accessible its multilingual Translation Memory for the Acquis... -
MetaShare metadata model
Ontology Metadata as LOD Availability: Freely Available Usage: Status: Newly created-in progress Description: LOD preliminary version of the MetaShare metadata model.... -
linked hypernyms
This Linked Hypernym dataset links entity articles in the English, German and Dutch Wikipedias to a DBpedia resource or a DBpedia ontology concept as their type. The types are... -
Leipzig Corpora Collection (LCC)
Deutscher Wortschatz contains data generated from newspapers and web resources that are publicly available. The data were collected per language and encompass statistics about... -
ISOcat-metadata
The linguistics community is building a metadata-based infrastructure for the description of its research data and tools. At its core is the ISOcat registry ISOcat.org, a... -
French TimeBank
The French TimeBank consists of a set of 109 journalistic articles from 7 different sub-genres annotated according to the ISO-TimeML standard, adapted for the French language.... -
FrameNet
From the website: The Berkeley FrameNet project is creating an on-line lexical resource for English, based on frame semantics and supported by corpus evidence. The aim is to... -
EU Directorate-General for Translation (DGT) - Acquis Communautaire
From the website: As of November 2007, the European Commission's Directorate-General for Translation (DGT) made publicly accessible its multilingual Translation Memory for... -
Diachronic Ontologies from People's Daily
Diachronic Ontologies from People's Daily Ontology Availability: Freely Available Usage: Word Sense Disambiguation Status: Newly created-finished Description: 1.... -
de-gaap-ontology-lexicon
The dataset is a German-English financial ontology-lexicon with 728 bilingual annotated phrases with Constituent and Part-Of-Speech Tags in lexinfo/lemon format. The ontology... -
Automated Similarity Judgment Program lexical data
ASJP collects 40 words from 5500 languages in a simplified phonetic representation. More background can be found at http://email.eva.mpg.de/~wichmann/ASJPHomePage.htm -
Analysis of the blog http://www.beppegrillo.it/
Analysis of the blog http://www.beppegrillo.it/. The data run from January 2005 to February 2013. The package is divided into four datasets: - Data on individual posts - Data on the... -
AcadOnto
An academic domain ontology populated using IIT Bombay organization corpus, web and the linked open data. Usage: Information Extraction, Information Retrieval Availability:... -
Manually Annotated Sub-Corpus (MASC) of the Open American National Corpus
The Manually Annotated Sub-Corpus (MASC) consists of approximately 500,000 words of contemporary American English written and spoken data drawn from the OPEN AMERICAN NATIONAL...