Catalog Reference : ELRA-U-S0268
Chinese Speech Corpus 4
This corpus consists of read speech from 100 native Mandarin speakers (50 males, 50 females), who read Chinese names, command words for cell phones and 11-digit telephone numbers.
It was recorded with a Sennheiser E835S microphone at a sampling rate of 16000 Hz and stored as 16-bit audio in Windows wave format.
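As an illustration of the stated data format, the following minimal sketch checks whether a wave file matches 16000 Hz, 16-bit audio, using Python's standard `wave` module. The function name `check_wav_format` and any file path passed to it are hypothetical; the catalogue entry does not describe the corpus directory layout or file naming.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_bits=16):
    """Return True if the wave file at `path` has the expected
    sampling rate (Hz) and quantisation (bits per sample).

    `path` is a hypothetical file location; the corpus layout is
    not specified in the catalogue entry.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()        # samples per second
        bits = wav.getsampwidth() * 8    # bytes per sample -> bits
    return rate == expected_rate and bits == expected_bits
```

A call such as `check_wav_format("some_file.wav")` would then report whether that file conforms to the format given above.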
Each speaker uttered a set of 250 tokens. Each set contains:
- 150 Chinese names for training data (50 selected to reflect the distribution of Chinese names and 100 selected as syllable-balanced words), and 120 Chinese names for testing data (80 selected to reflect the distribution of Chinese names and 40 selected as syllable-balanced words)
- 57 command words (divided into 36 crucial words and 21 non-crucial words)
- 11-digit telephone numbers: Chinese telephone numbers including cell phone numbers generated by random sampling.
Production
Creation date : 2004
Applications
Existing applications : Speech recognition
Contents
speech corpus
Language(s) : Chinese
Quantisation : 16-bit
Source Channel : Microphone
Recording Environment : Office
Joint Copyright © 2008 ELRA & ELDA