Universal Catalogue  
Languages
English
Catalog Reference: ELRA-U-W0284
KRYS I Corpus
The KRYS I Corpus contains about 6,300 PDF documents that have been manually classified into 70 genres.

Documents were labelled independently by two groups of annotators (students and secretaries) so that their results could be compared. The 70 genre labels fall into 10 genre groups:

- Book
- Article
- Short Composition
- Serial
- Correspondence
- Treatise
- Information Structure
- Evidential Document
- Visually Dominant Document
- Other Functional Document

The aim of this project was to improve automated metadata extraction from text documents.
Production
Creation date: 2008
Contents
written corpus

Joint Copyright © 2008 ELRA & ELDA