Universal Catalogue  
Catalog Reference : ELRA-U-S0267
Chinese Speech Corpus 2
This corpus consists of read speech from 300 native Mandarin speakers (150 male, 150 female), each reading a set of 150 words and 100 sentences. All are native Beijing (Pekinese) speakers.

Recordings were made with a Sennheiser E835S microphone at a sampling rate of 16,000 Hz and stored as 16-bit Windows WAVE files.
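As an illustration only, a file can be checked against the stated format with Python's standard `wave` module. This is a minimal sketch, not part of the corpus distribution: the function name `matches_corpus_format` is hypothetical, and the channel count is not checked because the catalogue entry does not state it.

```python
import wave

# Values taken from the catalogue entry above.
EXPECTED_RATE_HZ = 16000      # sampling rate
EXPECTED_SAMPLE_WIDTH = 2     # bytes per sample, i.e. 16-bit

def matches_corpus_format(path):
    """Return True if the WAV header reports 16000 Hz, 16-bit samples.

    Hypothetical helper: the number of channels is not verified,
    since the catalogue entry does not specify it.
    """
    with wave.open(path, "rb") as wav:
        return (wav.getframerate() == EXPECTED_RATE_HZ
                and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH)
```

Calling `matches_corpus_format("some_file.wav")` on any local WAV file (the file name here is a placeholder) returns a boolean, which makes the check easy to run over a whole directory before training or evaluation.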

Texts for prompts are extracted from:
- People's Daily (Ren Min Ri Bao) 1993/1994/1996/1997
- Economic Daily (Jing Ji Ri Bao) 1992/1994
- Market (Shi Chang Bao) 1994
- Xinhua News (Xinhuashe Wengao) 1994/1995/1996

Words and sentences were selected from a pool of 685,982 sentences, each 10-20 characters long, sampled from the population corpus. Frequencies of Di-IFs (IF: Chinese initial, Chinese final/rhyme) in Chinese syllables were also taken into account during selection.

Prompts consist of 150 sets of words (a total of 22,500 words) and 150 sets of sentences (a total of 14,872 sentences).
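The figures above can be cross-checked with simple arithmetic. The sketch below assumes (this is not stated in the entry) that each prompt set corresponds to one speaker's reading list of 150 words and 100 sentences; all variable names are illustrative.

```python
# Figures stated in the catalogue entry.
SPEAKERS = 300
WORDS_PER_SPEAKER = 150
SENTENCES_PER_SPEAKER = 100
PROMPT_SETS = 150
STATED_WORD_PROMPTS = 22_500
STATED_SENTENCE_PROMPTS = 14_872

# Assumption: one prompt set = one reading list (150 words, 100 sentences).
nominal_word_prompts = PROMPT_SETS * WORDS_PER_SPEAKER
nominal_sentence_prompts = PROMPT_SETS * SENTENCES_PER_SPEAKER

# Nominal number of recorded items if every speaker read a full list.
nominal_utterances = SPEAKERS * (WORDS_PER_SPEAKER + SENTENCES_PER_SPEAKER)

# Difference between the nominal and the stated sentence-prompt totals.
sentence_gap = nominal_sentence_prompts - STATED_SENTENCE_PROMPTS

print(nominal_word_prompts == STATED_WORD_PROMPTS)
print(nominal_sentence_prompts, sentence_gap, nominal_utterances)
```

Under that assumption the word total matches the stated 22,500 exactly, while the stated sentence total is slightly below 150 x 100. The entry does not explain the difference, so the nominal utterance count should be read as an upper-bound estimate rather than a documented figure.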
Production
Creation date : 2003
Applications
Existing applications : Speech recognition
Contents : speech corpus
 

Joint Copyright © 2008 ELRA & ELDA