Universal Catalogue  
Catalog Reference : ELRA-U-S 0190
SI-TAL ADAM Corpus
The SI-TAL ADAM corpus contains 450 transcribed dialogues between travel agents and clients, comprising both human-human and human-machine interactions. Each dialogue is annotated at five levels of linguistic information: prosody, morphosyntax, syntax, semantics and pragmatics.

The 200 human-human interactions are simulated telephone conversations, recorded on digital tape as signed 16-bit linear PCM at 16 kHz with two microphones (one directional and one "close-talk"). They amount to more than 7 hours of recorded speech and 58,377 words.
The 250 human-machine dialogues contain 1,250 utterances recorded at 8 kHz and stored as 8-bit µ-law PCM.
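
For readers who want to work with the 8-bit µ-law recordings, the following is a minimal sketch of how such samples can be expanded to signed 16-bit linear PCM. It assumes standard G.711 µ-law coding and raw, headerless sample data; the catalogue entry does not state the container format, so this may not match the distributed files. The function names are illustrative and are not part of the corpus distribution.

```python
# Sketch only: assumes standard G.711 mu-law coding and raw, headerless data.
# Function names are hypothetical, chosen for this illustration.

BIAS = 0x84  # bias added by the mu-law encoder before companding


def ulaw_to_linear(byte: int) -> int:
    """Expand one 8-bit mu-law code to a signed 16-bit linear sample."""
    u = ~byte & 0xFF                # mu-law codes are stored bit-inverted
    sign = u & 0x80                 # top bit: sign
    exponent = (u >> 4) & 0x07      # next three bits: segment number
    mantissa = u & 0x0F             # low four bits: step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def decode_ulaw(data: bytes) -> list[int]:
    """Decode a buffer of mu-law bytes to a list of linear samples."""
    return [ulaw_to_linear(b) for b in data]


def duration_seconds(data: bytes, sample_rate: int = 8000) -> float:
    """One byte per sample, so duration is byte count over sample rate."""
    return len(data) / sample_rate
```

At 8 kHz with one byte per sample, each second of speech occupies 8,000 bytes, so `duration_seconds` gives a quick sanity check on a file's length.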

Each dialogue has been orthographically transcribed (EAGLES), and the transcription is linked to the audio signal file. Each transcription file is also linked to five XML annotation files, one for each annotation level.

SI-TAL stands for 'Integrated System for the Automatic treatment of Language'.
ADAM stands for 'Architecture for Dialogue Annotation on Multiple Levels'.
Production
Project : SI-TAL Project
Applications
Possible applications : Speech recognition, Spoken dialogue systems
Application area : Research
Contents : speech corpus
 

Joint Copyright © 2008 ELRA & ELDA