Universal Catalogue  
Catalog Reference : ELRA-U-S0268
Chinese Speech Corpus 4
This corpus consists of read speech from 100 native Mandarin speakers (50 males, 50 females), who read Chinese names, command words for cell phones and 11-digit telephone numbers.

It was recorded with a Sennheiser E835S microphone. The recordings are sampled at 16000 Hz with 16-bit samples and stored in Windows wave format.
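
As a minimal sketch of how a file could be checked against the stated recording format, the following uses Python's standard `wave` module. The function name `matches_corpus_format` and the constant names are hypothetical and not part of the corpus distribution. The channel count is not stated in this entry, so it is not checked.

```python
import wave

EXPECTED_RATE_HZ = 16000    # sampling rate stated in the catalogue entry
EXPECTED_SAMPLE_BYTES = 2   # 16-bit samples = 2 bytes per sample


def matches_corpus_format(path):
    """Return True if the WAV file at `path` is 16000 Hz with 16-bit samples.

    Hypothetical helper for illustration; the number of channels is not
    checked because the catalogue entry does not state it.
    """
    with wave.open(path, "rb") as wav:
        return (wav.getframerate() == EXPECTED_RATE_HZ
                and wav.getsampwidth() == EXPECTED_SAMPLE_BYTES)
```

Calling `matches_corpus_format` on each file after download would flag any recording that was resampled or converted along the way.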

Each speaker uttered a set of 250 tokens. Each set contains:
- Chinese names: 150 items for training data (50 selected according to the distribution of Chinese names and 100 selected as syllable-balanced words) and 120 items for testing data (80 selected according to the distribution of Chinese names and 40 selected as syllable-balanced words)
- 57 command words (divided into 36 crucial words and 21 non-crucial words)
- 11-digit telephone numbers: Chinese telephone numbers including cell phone numbers generated by random sampling.
Production
Creation date : 2004
Applications
Existing applications : Speech recognition
Contents : speech corpus

Joint Copyright © 2008 ELRA & ELDA