Universal Catalogue  
Catalog Reference : ELRA-U-S 0010
Multext-East Speech Corpus
This is a small parallel corpus of spoken texts taken from the EUROM-1 speech corpus. Forty short passages have been translated from English into Romanian, Slovene, Estonian, Hungarian, Czech and Bulgarian.
For four of these languages (Romanian, Slovene, Estonian and Hungarian), recordings of the texts are also provided, with links between the texts and the spoken passages.

It is part of a multilingual dataset containing multiple resources for Central and Eastern European languages:
- MULTEXT-East morphosyntactic specifications,
- MULTEXT-East morphosyntactic lexicons,
- MULTEXT-East morphosyntactically annotated "1984" corpus,
- MULTEXT-East comparable corpus,
- MULTEXT-East "1984" parallel corpus,
- and associated documentation.
The central component of the MULTEXT-East corpus is the novel "1984" by G. Orwell.

The dataset is compliant with the EAGLES and TEI P4 recommendations.
It is a valuable resource for language engineering research and development on Central and Eastern European languages.
Identification
Period of coverage :
Version : v3, 2004
Version history : v1: 1998 ('East meets West' CD-ROM); v2: 2002
Production
Project : TELRI, CONCEDE, Multext-East Projects
Creation date : 2004
Applications
Application area : Research
Contents : written corpus, speech corpus

Joint Copyright © 2008 ELRA & ELDA