Catalog Reference : ELRA-U-W 0149
Multisix Corpus
The Multisix corpus is a collection of 200 English questions with their answers retrieved manually from the Los Angeles Times corpus (year 1994). The 200 questions have been translated into five languages: Dutch, French, German, Italian and Spanish.
The corpus is in XML; each entry is structured in tags, with attributes and values defining the language, the type of question, the category of the answer (person, location, etc.), the answer, etc.
It constitutes a test set used for the cross-language tasks at CLEF QA-2003.
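As an illustration of how an XML corpus organised this way might be read, the following is a minimal Python sketch. The element and attribute names (`entry`, `question`, `answer`, `lang`, `type`, `category`) and the sample content are hypothetical placeholders, not the actual Multisix schema or data; they would need to be adapted to the names used in the distributed files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: tag names, attribute names and texts are assumptions
# made for illustration only. They are NOT taken from the Multisix corpus.
SAMPLE = """\
<corpus>
  <entry id="1" type="factoid" category="LOCATION">
    <question lang="EN">placeholder English question?</question>
    <question lang="FR">placeholder French question?</question>
    <answer>placeholder answer</answer>
  </entry>
  <entry id="2" type="factoid" category="PERSON">
    <question lang="EN">another placeholder English question?</question>
    <question lang="DE">another placeholder German question?</question>
    <answer>another placeholder answer</answer>
  </entry>
</corpus>
"""


def load_entries(xml_text):
    """Parse the XML text into a list of plain dictionaries, one per entry."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.iter("entry"):
        # Map each language code to the question text in that language.
        questions = {
            q.get("lang"): (q.text or "").strip()
            for q in entry.findall("question")
        }
        entries.append({
            "id": entry.get("id"),
            "type": entry.get("type"),
            "category": entry.get("category"),
            "questions": questions,
            "answer": (entry.findtext("answer") or "").strip(),
        })
    return entries


def questions_in(entries, lang):
    """Return (question, answer) pairs for entries that have a question in `lang`."""
    return [
        (e["questions"][lang], e["answer"])
        for e in entries
        if lang in e["questions"]
    ]


entries = load_entries(SAMPLE)
french_pairs = questions_in(entries, "FR")
```

Keeping the per-language questions in a dictionary keyed by language code makes it straightforward to pair a question in one language with the English answer, which is the shape a cross-language question answering test set would typically need.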
Production
Project : CLEF
Creation date : 2003
Applications
Possible applications : Information retrieval
Application area : Research
Contents
written corpus
Number of languages : Multilingual
Language(s) : English ; French ; Spanish ; Dutch ; German ; Italian
Annotation Coverage : Full
Annotation Granularity : Sentence
Annotation Mode : Manual
Annotation language : XML
Joint Copyright © 2008 ELRA & ELDA