Description
"Apertium is a toolbox to build open-source shallow-transfer machine translation systems, especially suitable for related language pairs: it includes the engine, maintenance tools, and open linguistic data for several language pairs."
Language-pair data includes:
- Spanish ⇆ Catalan (apertium-es-ca)
- Spanish ← Romanian (apertium-es-ro)
- French ⇆ Catalan (apertium-fr-ca)
- Occitan ⇆ Catalan (apertium-oc-ca)
- English ⇆ Galician (apertium-en-gl)
- Swedish → Danish (apertium-sv-da)
- Occitan ⇆ Spanish (apertium-oc-es)
- Spanish ⇆ Portuguese (apertium-es-pt)
- English ⇆ Catalan (apertium-en-ca)
- English ⇆ Spanish (apertium-en-es)
- English ⇆ Esperanto (apertium-en-eo)
- Spanish ⇆ Galician (apertium-es-gl)
- French ⇆ Spanish (apertium-fr-es)
- Esperanto ← Spanish (apertium-eo-es)
- Welsh → English (apertium-cy-en)
- Breton → French (apertium-br-fr)
- Esperanto ← Catalan (apertium-eo-ca)
- Portuguese ⇆ Catalan (apertium-pt-ca)
- Portuguese ⇆ Galician (apertium-pt-gl)
- Basque → Spanish (apertium-eu-es)
- Norwegian Nynorsk ⇆ Norwegian Bokmål (apertium-nn-nb)
The pairs listed above are the "released" language pairs. Their data includes:
- dictionaries for morphological analysis and generation
- disambiguation (statistical models, rules, in some cases Constraint Grammars)
- bilingual (transfer) dictionaries
- structural transfer rules
Substantial data of the same kinds also exists for unreleased language pairs, e.g. Icelandic → English and North Sámi → Lule Sámi, along with tools to maintain such data.
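As a rough illustration of how the four kinds of data listed above fit together in a shallow-transfer system, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the toy dictionaries, the tag names, the single disambiguation rule, the single reordering rule, and the function names are all hypothetical. It does not use Apertium's file formats or its engine.

```python
# Toy shallow-transfer pipeline (Spanish -> English).
# All data and rules below are invented for illustration only.

# Morphological dictionary: surface form -> candidate (lemma, tag) analyses.
ANALYSES = {
    "la": [("el", "det")],
    "casa": [("casar", "vb"), ("casa", "n")],  # ambiguous: verb or noun
    "blanca": [("blanco", "adj")],
}

# Bilingual (transfer) dictionary: source lemma -> target lemma.
BILINGUAL = {"el": "the", "casa": "house", "casar": "marry", "blanco": "white"}

# Generation dictionary: (target lemma, tag) -> surface form.
GENERATION = {("the", "det"): "the", ("house", "n"): "house", ("white", "adj"): "white"}


def disambiguate(candidates, previous_tag):
    """Toy rule: after a determiner prefer a noun reading, else take the first."""
    if previous_tag == "det":
        for lemma, tag in candidates:
            if tag == "n":
                return lemma, tag
    return candidates[0]


def transfer(units):
    """Look up each lemma in the bilingual dictionary, then reorder noun + adjective."""
    translated = [(BILINGUAL[lemma], tag) for lemma, tag in units]
    result = []
    i = 0
    while i < len(translated):
        is_noun_adj = (
            i + 1 < len(translated)
            and translated[i][1] == "n"
            and translated[i + 1][1] == "adj"
        )
        if is_noun_adj:
            # Structural transfer rule: "noun adjective" becomes "adjective noun".
            result.extend([translated[i + 1], translated[i]])
            i += 2
        else:
            result.append(translated[i])
            i += 1
    return result


def generate(units):
    """Turn (lemma, tag) pairs back into surface forms."""
    return " ".join(GENERATION.get(unit, unit[0]) for unit in units)


def translate(sentence):
    units = []
    previous_tag = None
    for word in sentence.split():
        lemma, tag = disambiguate(ANALYSES[word], previous_tag)
        units.append((lemma, tag))
        previous_tag = tag
    return generate(transfer(units))


print(translate("la casa blanca"))
```

In this sketch the ambiguous word "casa" is resolved to its noun reading because it follows a determiner, and the reordering rule places the adjective before the noun on the English side.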
License
The COPYING file in the language-pair data archive contains a copy of the GPL.