- JRC-Names-MLODE
  From the website: JRC-Names is a highly multilingual named entity resource for person and organisation names (called 'entities'). It consists of large lists of names and...
- Multext-East
  From the website: Version 4 of the MULTEXT-East resources, a multilingual dataset for language engineering research and development. This dataset contains, for Bulgarian,...
- Leipzig Corpora Collection (LCC)
  Deutscher Wortschatz contains data generated from newspapers and web resources that are publicly available. The data were collected per language and encompass statistics about...
- The Speech Accent Archive
  From the website: The speech accent archive uniformly presents a large set of speech samples from a variety of language backgrounds. Native and non-native speakers of English read...
- MOCHA-TIMIT
  About Authors: Alan Wrench, Queen Margaret University College. Funded by: Engineering and Physical Sciences Research Council. When created: November 1999. Purpose:...
