ELRA - ELRA-U-S0217 : Columbia/SRI/Colorado Deception Corpus

Catalog Reference : ELRA-U-S0217

Columbia/SRI/Colorado Deception Corpus

The CSC Deception Corpus contains deceptive and non-deceptive speech. It was built by Columbia, SRI and Colorado researchers in order to find automatic methods to detect lies.

The recordings are interviews (25 to 50 minutes each) with 32 native English speakers. They were recorded on two channels with a close-talking microphone, at a sampling rate of 16 kHz.

Subjects were asked to answer questions and to perform a series of tasks, and were told that their performance would be compared to a target profile of one of the ‘top entrepreneurs of America’. They were then given manipulated results and asked to play again to obtain performances closer to the target. In four of the six tasks, they had to deceive the interviewer, in order to show that they could convince others even when not telling the truth.

Transcriptions of these recordings contain 79,488 words, aligned with the audio signal. The corpus is divided into a "truth" part and a "lie" part.

Production

Creation date : 2005

Contents

speech corpus
Language(s) : English
Recording Channels : 2
Source Channel : Microphone
Transcription Entries : Orthographic
Transcription Segmentation : Breath group, Speaker turn
Transcription Segmentation Level : Full