-
The SPECIALIST Lexicon
The SPECIALIST lexicon is a large syntactic lexicon of biomedical and general English. Coverage includes both commonly occurring English words and biomedical vocabulary. The... -
The IBL Corpus
The IBL Corpus was collected by the University of Plymouth and the University of Edinburgh as part of the EPSRC-funded project IBL, Instruction-based Learning for Mobile... -
WikiWord
Overview: WikiWord is a system for building a multilingual thesaurus by extracting lexical and semantic information from Wikipedia. It was originally developed for a... -
The Speech Accent Archive
From website: The speech accent archive uniformly presents a large set of speech samples from a variety of language backgrounds. Native and non-native speakers of English read... -
Spanish Linguistic Datasets
Spanish Linguistic Datasets (SLD) is an open initiative to expose, as Linked Data, the available Spanish linguistic resources maintained at OEG. It is worth noting that we host... -
MOCHA-TIMIT
Authors: Alan Wrench, Queen Margaret University College. Funded by: Engineering and Physical Sciences Research Council. When created: November 1999. Purpose:... -
Language Commons
This dataset has no description
-
Hungarian Language Corpora and Analyzers
Resources, including corpora and software, for processing the Hungarian language. Language resources: The Hunglish Corpus is a sentence-aligned Hungarian-English parallel corpus... -
Europarl Parallel Corpus
Overview from home page: The Europarl parallel corpus is extracted from the proceedings of the European Parliament. It includes versions in 11 European languages:... -
english-gigaword
This is a recipe to train word n-gram language models using the newswire text provided in the English Gigaword corpus (1200M words of NYT, APW, AFE, XIE). It also prepares... -
Corpus de Textes Linguistiques Fondamentaux (CTLF)
This database contains more than 3,000 records on major linguistic books on grammar, from antiquity to the present. Major books will progressively be digitized and made available... -
Catalan WordNet
This dataset has no description