Catalog Reference : ELRA-U-W0317
NP4E corpus
This is a coreferentially annotated corpus of newswire texts extracted from the Reuters corpus (see WC0163). Five clusters of texts on the topic of terrorism were selected (55,000 words): the Bukavu bombing in Zaire, the Peru hostages, the Tajikistan hostages, the Israel suicide bomb and the China-Taiwan hijack.
The NP4E corpus is annotated for NP (noun phrase) coreference following specific guidelines. It is available in XML and in the MMAX format.
A small part of this corpus (12,500 words) is also annotated for event coreference. Events and the arguments associated with each event are labelled in XML.
This corpus can be used for further studies on topics such as coreference annotation, cross-document coreference and anaphora resolution.
Production
Creation date : 2006
Contents
written corpus
Number of languages : Monolingual
Language(s) : English
Document source : Internet
Annotation Coverage : Full
Annotation level : Morphological
Lexical Unit Information : Single word lemma
Annotation Mode : Manual
Annotation language : XML
Joint Copyright © 2008 ELRA & ELDA