5 datasets found

Licenses: Other (Not Open)
Tags: linguistics

  • JRC-Names-MLODE

    From their web site: JRC-Names is a highly multilingual named entity resource for person and organisation names (called 'entities'). It consists of large lists of names and...
  • Multext-East

    From the web site: Version 4 of the MULTEXT-East resources, a multilingual dataset for language engineering research and development. This dataset contains, for Bulgarian,...
  • Leipzig Corpora Collection (LCC)

    Deutscher Wortschatz contains data generated from newspapers and web resources that are publicly available. The data were collected per language and encompass statistics about...
  • The Speech Accent Archive

    From the website: The speech accent archive uniformly presents a large set of speech samples from a variety of language backgrounds. Native and non-native speakers of English read...
  • MOCHA-TIMIT

    About Authors: Alan Wrench, Queen Margaret University College. Funded by: Engineering and Physical Sciences Research Council. When created: November 1999. Purpose:...
You can also access this registry using the API (see API Docs).