Universal Catalogue  
Catalog Reference: ELRA-U-S0222
Speech corpus for Amharic
This speech corpus contains 20 hours of speech. Amharic is one of the official languages of Ethiopia and the second most-spoken Semitic language after Arabic.

This corpus was built for the development of an automatic speech recognizer and is divided into several parts:

- the training speech corpus, which contains 10,850 different sentences read by 100 speakers (56 male and 44 female). Of these, 80 are from the Addis Ababa dialect area, while the other 20 speak one of the four other dialects (Gojjam, Gonder, Wollo and Menz).

- the development and evaluation set, which contains 38 different sentences read by 24 speakers (20 speakers of the Addis Ababa dialect and 4 speakers of the other four dialects).

- the adaptation set, which contains 53 adaptation sentences that consist of all Amharic CV syllables (for all of the readers).
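
The partition described above can be written down as a small data structure, which makes the speaker counts easy to check. The following is a minimal sketch only: the class name `CorpusPart`, its field names, and the constants are hypothetical and are not part of the corpus distribution; the numbers are the ones stated in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorpusPart:
    """One part of the corpus, as summarized in the catalogue entry (hypothetical type)."""
    name: str
    sentences: int
    addis_ababa_speakers: int
    other_dialect_speakers: int

    @property
    def speakers(self) -> int:
        # Total readers = Addis Ababa dialect speakers + speakers of the other dialects.
        return self.addis_ababa_speakers + self.other_dialect_speakers


# Figures copied from the description; nothing here is read from the corpus itself.
TRAINING = CorpusPart("training", 10850, 80, 20)
DEV_EVAL = CorpusPart("development and evaluation", 38, 20, 4)
ADAPTATION_SENTENCES = 53
OTHER_DIALECTS = ("Gojjam", "Gonder", "Wollo", "Menz")

if __name__ == "__main__":
    # The gender split (56 male + 44 female) and the dialect split (80 + 20)
    # should both add up to the same number of training speakers.
    assert TRAINING.speakers == 56 + 44
    for part in (TRAINING, DEV_EVAL):
        print(f"{part.name}: {part.sentences} sentences, {part.speakers} speakers")
```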

Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4