Catalog Reference: ELRA-WC0157
The GNOME Corpus
The GNOME Corpus includes texts from three genres (museum labels, pharmaceutical leaflets, and tutorial dialogues) in which different types of discourse and semantic information have been annotated. The corpus was created to study the aspects of discourse that affect generation, particularly salience. It has been used to study Centering from both a generation and an interpretation perspective; to study many subtasks of generation, including text planning, aggregation, and sentence planning; and, more recently, to study the interpretation of anaphoric expressions, particularly bridging references.
The corpus has also been used to develop and evaluate anaphora resolution systems. Each subcorpus contains about 6,000 NPs; about 3,000 NPs were annotated in each domain. In terms of utterances, the corpus includes about 500 sentences.
written corpus
Number of languages: Monolingual
Language(s): English
Joint Copyright © 2008 ELRA & ELDA