Catalog Reference: ELRA-U-S0201
Corpus of Estonian Dialects
The Corpus of Estonian Dialects (CED) is a speech database of interviews on a range of topics. Speakers are distributed among the nine main dialects of Estonian: the Mid, Eastern and Western dialects (North Estonian dialect group); the Võru, Mulgi, Tartu and Seto dialects (South Estonian dialect group); and the North-Eastern (Alutaguse) and Coastal dialects (North-Eastern Coastal dialect group). The recordings were made on tape, mainly during the 1960s and 1970s.
Recordings are provided with phonetic and text transcriptions that mark features of spoken language such as pause fillers, discourse particles, word repetitions, corrections, unfinished words and speaker turns.
The CED contains about 1,000,000 transcribed words and 500,000 morphologically tagged words (26 word classes according to morphological inflections, syntactic characteristics and semantics), as well as information about speakers and recordings.
Production
Project: Corpus of Estonian Dialects
Applications
Application Area: Research
Contents
Speech corpus
Language(s): Estonian
Source Channel: Microphone
Sound Type Annotation: Mispronunciation, Truncation
Transcription Entries: Orthographic, Phonetic, Transliteration
Lexical Unit Information: Notes
Annotation language: XML
Friday 22 November, 2024
Joint Copyright © 2008 ELRA & ELDA