Language Resources
Displaying 281 to 300 (of 423 products)
Result Pages: 15
The KAPD is a detailed and comprehensive database that shows the articulatory mechanism of Arabic sounds.
The database contains more than 46,000 files (the results of 9 experiments on 8 native Arabic subjects) providing articulatory, acoustic and perceptual information. The files comprise images taken with a laryngoscope and video cameras (JPEG format), files in NSP format, audio files in .wav format, and a table of the perceptual responses to all uttered tokens.
Language(s) : Arabic
|
|
|
|
This Hungarian spoken language corpus was compiled from Version 2 of the Budapest Sociolinguistic Interview (BSI). Version 2 of the BSI is composed of interviews with 50 informants (ten teachers, ten sales clerks, ten blue-collar workers, ten university students, and ten vocational trainees). The interviews were transcribed and coded electronically.
Language(s) : Hungarian
|
|
|
|
The BASE corpus consists of 160 lectures and 39 seminars recorded in different university departments, for a total of 1,644,942 tokens. It also comprises video data.
Recordings were transcribed and tagged according to the TEI guidelines.
Language(s) : English (United Kingdom) -
|
|
|
|
This Italian ASR database contains the speech of 103 speakers recorded in-car with microphones. The total number of utterances is 35,875.
Each speaker recorded 1 or 2 sessions:
- Session 1 in a parked vehicle with the engine running
- Session 2 in a vehicle travelling at 60 mph (100 km/h).
Data includes 175 prompts per session (Digits, Street names, Generic Command and Control items, Phonetically rich Sentences and Words).
Language(s) : Italian
|
|
|
|
Interfra is a corpus of French spoken by Swedish students. It contains interviews and retellings of comics and films, totalling approximately 400,000 words.
It is orthographically transcribed and POS tagged.
Language(s) : French (Sweden)
|
|
|
|
This corpus is a non-native English speech database containing 15,000 utterances from 96 speakers (6.4 hours). The speakers' native languages are German, French, Japanese, Indonesian and Chinese.
Language(s) : English
|
|
|
|
This English corpus was collected using a Wizard of Oz system. The aim was to collect naturalistic data from native and non-native speakers in a semi-quiet office.
Language(s) : English
|
|
|
|
This English corpus is a small collection of speech uttered by 10 non-native English speakers (1,200 utterances).
Language(s) : English
|
|
|
|
This is a small collection of 1,600 English utterances by 20 Chinese speakers.
Language(s) : English
|
|
|
|
This is an English speech database containing 7,500 utterances by 62 non-native speakers (Japanese, Chinese).
Language(s) : English
|
|
|
|
This is a small collection of 452 English utterances by 64 native German speakers.
Language(s) : English
|
|
|
|
The Cross Towns corpus is a non-native speech corpus that covers 24 language directions. For each direction, the recordings contain 45 city names per speaker in two conditions: read from a prompt, and repeated after listening to them over headphones.
Language(s) : Dutch - English - French - German - Italian - Czech
|
|
|
|
This is a non-native English speech corpus containing 2,200 utterances by 93 speakers from 15 different countries.
Language(s) : English
|
|
|
|
This corpus is a small collection of 700 utterances by 10 native English speakers. It is composed of city names in French, Italian, Norwegian and Greek.
Language(s) : French - Italian - Norwegian - Greek
|
|
|
|
This speech corpus is a small collection of 2,148 English utterances by 19 native German speakers.
Language(s) : English
|
|
|
|
This corpus is a collection of 2,000 English utterances by 40 non-native speakers. The data consist of digits.
Language(s) : English
|
|
|
|
The Hiwire database is a non-native English database containing two kinds of speech material: a set of utterances (pilot orders) recorded in a quiet room with a close-talking microphone (clean data), and a second set obtained by adding noise recorded in a real plane cockpit to the clean data.
It contains 8,099 English utterances recorded from 81 non-native speakers from France, Greece, Italy and Spain.
Language(s) : English
|
|
|
|
The M-ATC is a non-native English database containing pilot-controller communications.
Language(s) : English
|
|
|
|
The N4 is a non-native speech database containing recordings from naval communication training sessions in the Netherlands, Germany, the United Kingdom and Canada.
It is distributed through the ELRA catalogue http://catalog.elra.info/ under the reference ELRA-S0239.
Language(s) : English - Dutch - German - Canadian US English
|
|
|
|
This multilingual resource contains native and non-native, read and spontaneous children's speech.
Duration: 63 hours.
Number of children: 611.
Languages: British English, German, Swedish.
Native languages: British English, German, Swedish and Italian.
Language(s) : English - German - Swedish
|
|
|
|