Language Resources
Displaying 261 to 280 (of 423 products)
This collection of lecture speech was recorded in China in 2007 at the National Conference on Man-Machine Speech Communication (NCMMSC).
Language(s) : Chinese
This expressive speech corpus contains longitudinal data of natural conversational speech in Japanese. It comprises recordings of 50 speakers aged from 3 months to 50 years, for a total of 1,500 hours of manually transcribed speech (recorded over 5 years).
Language(s) : Japanese
This Japanese speech corpus contains camera and microphone recordings of multi-person business and social dialogues that were collected over 3 years.
The corpus is transcribed and annotated for discourse moves.
Language(s) : Japanese
The RWCP-SP96 contains object-oriented face-to-face dialogues in Japanese between a questioner (customer) and an answerer (professional). The dialogues focus on two tasks: purchasing a car (24 dialogues) and planning overseas travel (24 dialogues).
Language(s) : Japanese
The RWCP-SP97 contains object-oriented face-to-face dialogues in Japanese between a questioner (customer) and an answerer (professional).
The task is overseas trip planning.
Language(s) : Japanese
The RWCP-SP99 contains Japanese broadcast news articles read by professional announcers.
Content: 40 news articles (30 independent articles + 10 articles common to all speakers).
Language(s) : Japanese
The RWCP-SP01 contains speech data from a staged meeting among more than three participants speaking Japanese. The meeting focuses on the participants' schedules. An MPEG video recording of the meeting was also made from three directions.
Language(s) : Japanese
The RWCP-SSD is a database of sounds in real acoustical environments.
It is divided into three parts:
- Sound Scene Database in Real Acoustical Environments.
- Speech data of fixed sound sources measured with a microphone array.
- Speech data of moving sound sources measured with a microphone array.
Language(s) : English
The PASL is a Japanese database divided into two parts: (1) monosyllables and isolated words, and (2) continuous speech.
Speakers: 6 males and 6 females.
Language(s) : Japanese
The UME-ERJ corpus is a database of English read by Japanese students.
It contains two sets, one for phonemic pronunciation and the other for prosody.
Number of speakers: 202 (100 males and 102 females).
Each sentence is read by 12 speakers and each word is read by 20 speakers.
Language(s) : English (Japan)
This corpus is a database of Japanese speech read by foreign students. It contains:
- 503 ATR phonetically-balanced sentences
- 108 sentences that are difficult to pronounce
- 42 sentences for prosodic evaluation
- 115 minimal-pair words
Number of speakers: 141 (72 male and 69 female), including native speakers of 26 languages. Their Japanese proficiency ranges from intermediate to advanced.
Language(s) : Japanese
This corpus contains speech and transcriptions of 93 dialogues in Japanese.
Language(s) : Japanese
The WSJCAM0 is the UK English equivalent of a subset of the US English WSJ0 database. The recorded material was taken from the Wall Street Journal (WSJ) text corpus.
It consists of speaker-independent read material, split into training, development test and evaluation test sets.
Audio files come with orthographic transcriptions and automatically generated phone and word alignments.
Language(s) : English (United Kingdom)
The CORAL corpus is a spoken dialogue corpus, with several levels of labelling. It contains recordings of 32 speakers, amounting to 64 dialogues. All dialogues have been annotated orthographically and only a small subset has been annotated at other levels.
Language(s) : Portuguese (Portugal)
This labelled Portuguese speech database was developed under the ANTIGONA Project (Program IC-PME).
The text material was obtained from newspaper articles. The database contains 100 minutes of high-quality speech produced by a professional speaker. It is annotated at the phoneme, word and phrase levels (F0 information is also included).
Language(s) : Portuguese (Portugal)
This is a Portuguese database of read speech; 10 speakers (5 male and 5 female) were recorded in a sound-proof room.
It contains 6 types of material:
- 4,600 isolated words
- 350 sentences for prosodic studies
- 18 phonetically-complete paragraphs
- 60 read paragraphs extracted from television debates
- 3,000 logatoms
- 600 phonetically rich sentences
A subset of the corpus is also read by two young speakers (1 male and 1 female, aged 12-14).
An orthographic transcription is provided. The phonemic-level labels are in SAMPA.
Language(s) : Portuguese (Portugal)
BD-PUBLICO is a database of read speech; 120 speakers were recorded in a sound-proof room with a high-quality microphone (at a sampling frequency of 16 kHz).
Language(s) : Portuguese (Portugal)
The EUROM1 database contains read speech from 60 speakers, recorded in an anechoic room. It was collected in the framework of the SAM-A European project, an extension of the SAM (Speech Assessment Methods) project.
Language(s) : Portuguese (Portugal)
The project aims to collect 1,000 hours of recordings of Mandarin Chinese spoken in China. 650 hours of audio and 150 hours of video recordings have already been collected.
The corpus is transcribed and annotated, with segmented audio/video chunks linked to the corresponding transcripts.
Language(s) : Chinese
This is a Hungarian database containing spoken material recorded from children. The material contains Hungarian phonemes in isolated form, in sound connections, in words and in sentences.
72 children aged 5 to 10 were recorded. The group included children with good and average pronunciation as well as children with speech impairments.
Recordings were made in an anechoic room with a Monacor ECM-100 electret microphone and a Sony TCD-D7 DAT recorder, at a sampling rate of 48 kHz and 16-bit resolution. The total recording time per speaker was approximately 10-15 minutes.
Language(s) : Hungarian