Universal Catalogue

Desktop/microphone

Displaying 81 to 100 (of 423 products)

ELRA-SD168

Audio Book “Mein Leben”

This corpus is extracted from the autobiographical audio book “Mein Leben” by Marcel Reich-Ranicki, consisting of 2 CDs with extracts of the corresponding book read aloud by the author.
Language(s) : German

ELRA-SD169

Polish Speech Database

The database contains parliamentary statements read by one male speaker. It consists of a selection of 2,150 annotated and manually verified sentences, including 100 rare phonemes in words.

It is distributed through the ELRA catalogue http://catalog.elra.info under the reference ELRA-S0339.
Language(s) : Polish

ELRA-SD17

SPEECON Mandarin

(Available since 14/06/2005)

The Mandarin Chinese Speecon database comprises the recordings of 600 Mandarin Chinese speakers.
Language(s) : Mandarin Chinese

ELRA-SD170

MICASE

This corpus contains 190 hours of academic speech, recorded and transcribed (1.7 million words). It covers several domains: humanities and arts, social sciences and education, biological and health sciences, physical sciences and engineering, and others.
Language(s) : English

ELRA-SD171

COLT (Corpus of London Teenage Language)

The corpus is part of the British National Corpus and consists of 472,000 words of transcribed text. It is a large English corpus focusing on spontaneous conversations of teenagers. It was collected in 1993 and consists of the spoken language of 13- to 17-year-old teenagers from different boroughs of London and with different social backgrounds. The complete corpus, about half a million words, has been orthographically transcribed and word-class tagged.
Language(s) : English

ELRA-SD172

Buckeye Corpus of Conversational Speech

This corpus is composed of about 300,000 words of unmonitored casual speech. It gathers interviews of 40 speakers from Columbus (USA). The recordings have been orthographically transcribed and phonetically labeled.
Language(s) : English (USA)

ELRA-SD173

DanPASS

This is a Danish phonetically annotated spontaneous speech corpus. It consists of monologues, dialogues and word lists, containing about 70,000 words corresponding to 10 hours of speech, recorded by 27 speakers.
Language(s) : Danish

ELRA-SD174

CIAIR In-Car Spoken Dialogue Corpus

This is a multi-modal corpus consisting of audio, videos, driving information and transcripts. Dialogues between a driver and a navigator were recorded in car with around 800 subjects. It contains about 1.03 million morphemes. 35,000 utterance units of the corpus have been manually tagged.
Language(s) : Japanese

ELRA-SD175

French Non-Native Speech Corpus

This is a 6-hour non-native speech corpus. 15 non-native French speakers were recorded: 7 native Chinese speakers from China and 8 native Vietnamese speakers from Vietnam. The corpus consists of two parts: dialog phrases and read articles in the tourism domain.
Language(s) : French

ELRA-SD176

Polish English Literacy Tutor (PELT) Corpus

116 native Polish speakers read English prompts with rich phonetic contexts. The corpus contains sentences in 6,032 files, amounting to 3.5 GB and 14 h 37 min 37 s of running speech.
Language(s) : English (Poland)

ELRA-SD177

British English Diphthong Corpus

This corpus represents 8 British English diphthongs in 12 different contexts. 30 speakers were recorded reading 61 sentences (each read 3 times by each subject).
Language(s) : English (United Kingdom)

ELRA-SD178

Spanish and Valencian Speech Corpus

This is a bilingual parallel corpus of two phonetically similar languages. Valencian is a dialect of Catalan spoken in the Comunitat Valenciana. 20 speakers recorded 120 sentences, 60 per language, amounting to about one hour of speech for each language.
Language(s) : Catalan, Valencian - Spanish (Spain)

ELRA-SD179

SmartWeb Motorbike Corpus (SMC)

This speech corpus was recorded on a motorbike by two speakers using helmet and throat microphones and a Bluetooth transceiver. It consists of 38 sessions of 12 query blocks, with 6 queries per query block. In total, the corpus contains 2,835 queries with about 31,900 running words.
Language(s) : German

ELRA-SD18

SPEECON Polish

(Available since 02/08/2005)

The Polish Speecon database comprises the recordings of 550 adult Polish speakers and 50 child Polish speakers, who uttered over 290 items and 210 items respectively (read and spontaneous).
Language(s) : Polish

ELRA-SD180

Czech Lombard Speech Database (CLSD'05)

This corpus contains speech data and their transcriptions. 26 speakers (12 female, 14 male) recorded sessions with both neutral and simulated noisy scenarios. The sessions consist of 205 utterances per speaker and scenario, that is to say 10-12 minutes of continuous speech. Each session contains 30 phonetically rich sentences and 470 repeated and isolated digits.
Language(s) : Czech

ELRA-SD181

Corpus of Dutch Aphasic Speech (CoDAS)

This corpus aims to be a tool for linguistic research on aphasia. It will include speech representing different types of aphasia (Broca, Wernicke, global, transcortical, anomic, etc.) and various communication settings. For a pilot study for the CoDAS Corpus, speech material from six aphasic patients has been collected. Their average age was 54 and the time post onset was between three and four years. Each patient had to answer questions on five standard topics. Each patient produced at least 300 words and discussed at least three of the five topics. There were also a repetition task, a writing task, a naming task and a comprehension task. The data has been orthographically transcribed, phonetically transcribed and part-of-speech tagged.
Language(s) : Dutch (The Netherlands)

ELRA-SD182

Corpus Gesproken Nederlands (CGN, Spoken Dutch Corpus)

This is a database of contemporary standard Dutch spoken by adults in the Netherlands and Flanders. It consists of about 10 million words, that is to say 1,000 hours of speech data which have been recorded in different communicational settings.
Language(s) : Dutch

ELRA-SD183

SpIt Corpus

This corpus consists of 3 dialogues from the CLIPS project and the Ipar project, with 13 annotation levels (orthographic, morphosyntactic, syntactic, lexical, rhythmic levels, etc.). Each annotation level selects its own base unit, depending on linguistic factors.
Language(s) : Italian (Italy)

ELRA-SD184

TRAINS Spoken Dialog Corpus

This is a corpus of task-oriented spoken dialogs. The aim of the dialogs is to create, discuss and evaluate various plans involving freight shipments by train. 34 speakers recorded 98 dialogs, comprising 5,900 turns and involving 20 different tasks. In total, it contains six and a half hours of recorded speech with 55,000 transcribed words.
Language(s) : English

ELRA-SD185

Maptask Corpus

This is a corpus of task-oriented spoken dialogs containing 128 dialogs for giving driving directions.
Language(s) : English
