Catalog Reference: ELRA-U-W 0035
CESS-CAT Catalan Corpus
This corpus contains 492,846 words of Catalan that were syntactically annotated within the framework of the CESS-ECE project (Syntactically & Semantically Annotated Corpora, Spanish, Catalan, Basque). The project produced several resources:
- CESS-CAT: the syntactically annotated version.
- AnCora-CAT: the semantically annotated version.
- AnCora-LEX-CAT: a verbal lexicon (1,869 entries).
- AnCora-DEP-CAT: annotated with dependencies.
CESS-CAT, the resource described in this entry, was annotated with constituents and functions using AGTK (University of Pennsylvania).
Production
Project: Syntactically & Semantically Annotated Corpora (Spanish, Catalan, Basque)
Creation date: 2007
Applications
Possible applications: Discourse analysis, Information retrieval
Application area: Research
Contents
Written corpus
Number of languages: Monolingual
Language(s): Catalan (Spain)
Annotation Coverage: Full
Annotation Granularity: Word
Annotation Level: Syntactic
Annotation Mode: Automatic
Joint Copyright © 2008 ELRA & ELDA