Catalog Reference: ELRA-U-W0284
KRYS I Corpus
The KRYS I Corpus contains about 6,300 PDF documents that have been manually classified into 70 genres.
Documents were labelled independently by two groups of annotators (students and secretaries) so that their results could be compared. A set of 70 labels was defined, and these fall into 10 genre groups:
- Book
- Article
- Short Composition
- Serial
- Correspondence
- Treatise
- Information Structure
- Evidential Document
- Visually Dominant Document
- Other Functional Document
The aim of this project was to improve automated metadata extraction from text documents.
Production
Creation date: 2008
Contents
written corpus
Number of languages: Monolingual
Language(s): English
Document source: Internet
Annotation Granularity: Document
Joint Copyright © 2008 ELRA & ELDA