Catalog Reference : ELRA-U-S0222
Speech corpus for Amharic
This is a 20-hour speech corpus of Amharic, one of the official languages of Ethiopia and the second most widely spoken Semitic language after Arabic.
This corpus was built for the development of an automatic speech recognizer and is divided into several parts:
- the training speech corpus, which contains 10,850 different sentences read by 100 speakers (56 male and 44 female). Of these, 80 are from the Addis Ababa dialect area, while the other 20 speak one of the four other dialects (Gojjam, Gonder, Wollo and Menz).
- the development and evaluation set, which contains 38 different sentences read by 24 speakers (20 speakers of the Addis Ababa dialect and 4 speakers of the other four dialects).
- the adaptation set, which contains 53 adaptation sentences covering all Amharic CV (consonant-vowel) syllables (recorded for all of the readers).
Contents
speech corpus
Language(s) : Amharic
Duration : 20 hours
Source Channel : Microphone
Friday 22 November, 2024
Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4