Universal Catalogue  
Catalog Reference : ELRA-U-W 0148
DISEQuA Corpus
The DISEQuA corpus consists of 450 questions, each formulated in four languages: Dutch, Italian, Spanish and English. The answers were manually retrieved from three document collections, none of them in English: La Stampa and SDA newspaper/wire articles (year 1994) for Italian, EFE (year 1994) for Spanish, and Algemeen Dagblad and NRC Handelsblad (years 1994 and 1995) for Dutch.

The corpus is in XML. Each entry is marked up with tags whose attributes and values specify the language, the type of question, the category of the answer (person, location, etc.), the answer itself, and so on.
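As an illustration of how such entries might be read, here is a minimal Python sketch using the standard library XML parser. The catalogue entry does not give the actual schema, so the element and attribute names (`qa`, `question`, `answer`, `lang`, `type`, `category`), the sample text and the `load_entries` helper are all hypothetical and would need to be adapted to the real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the tag names, attribute names and text below are
# invented for illustration and are NOT taken from the DISEQuA distribution.
SAMPLE = """\
<corpus>
  <qa id="1" type="factoid" category="PERSON">
    <question lang="EN">Example question text</question>
    <question lang="IT">Testo di esempio</question>
    <answer lang="IT">Example answer</answer>
  </qa>
</corpus>
"""


def load_entries(xml_text):
    """Parse the (assumed) layout into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    entries = []
    for qa in root.iter("qa"):
        # Map language code -> text for every question and answer variant.
        questions = {q.get("lang"): (q.text or "").strip()
                     for q in qa.findall("question")}
        answers = {a.get("lang"): (a.text or "").strip()
                   for a in qa.findall("answer")}
        entries.append({
            "id": qa.get("id"),
            "type": qa.get("type"),
            "category": qa.get("category"),
            "questions": questions,
            "answers": answers,
        })
    return entries


entries = load_entries(SAMPLE)
```

Keying questions and answers by language code makes it easy to pick a question in one language and the answer found in another collection.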

This question/answer set can be used to test or train cross-language QA systems in twelve different combinations.
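The entry does not say how the twelve combinations are counted. One reading consistent with the description is to pair each of the four question languages with each of the three document collections. The sketch below enumerates the pairs under that assumption; the list names and the `language_pairs` helper are illustrative only. Note that this reading includes the three same-language pairs.

```python
from itertools import product

# Assumption: a "combination" is (question language, document collection
# language). English has questions but no document collection.
QUESTION_LANGUAGES = ["Dutch", "Italian", "Spanish", "English"]
DOCUMENT_LANGUAGES = ["Dutch", "Italian", "Spanish"]


def language_pairs():
    """Return every (question language, document language) pair."""
    return list(product(QUESTION_LANGUAGES, DOCUMENT_LANGUAGES))


pairs = language_pairs()
print(len(pairs))  # 4 question languages x 3 collections = 12
```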

DISEQuA stands for Dutch, Italian, Spanish and English collection of Questions and Answers. It gathers resources created for CLEF 2003.
Identification
Period of coverage :
Version : v1.1
Version history :
Production
Project : CLEF
Creation date : 2003
Applications
Possible applications : Information retrieval
Application area : Research
Contents
written corpus

Joint Copyright © 2008 ELRA & ELDA