Catalog Reference : ELRA-U-W0304
Diachronic Corpus of Present-Day Spoken English
The DCPSE is a corpus of spoken British English covering the period between 1960 and 2000. It contains 885,436 words, fully parsed and annotated (87,000 trees).
This corpus contains orthographic transcriptions of conversations, discussions, interviews and speeches. It consists of approximately:
- 400,000 words from the International Corpus of English (ICE-GB), collected in the early 1990s (the ICE-GB is distributed through the ELRA catalogue under reference W0021)
- 400,000 words from the London-Lund Corpus, collected between the late 1960s and the early 1980s (see SD153 in the Universal Catalogue for a full description).
The ICE-GB was used as a gold standard for the parsing of DCPSE.
Identification
Period of coverage :
Version : Release 1
Version history :
Production
Creation date : 2004
Applications
Application area : Research
Technical Information
Distribution medium : CD-ROM
Contents
written corpus
Number of languages : Monolingual
Language(s) : English (United Kingdom)
Number of tokens : 885,436 words
Annotation Coverage : Full
Annotation Granularity : Word
Annotation level : Syntactic
Annotation Mode : Automatic
Joint Copyright © 2008 ELRA & ELDA