Catalog Reference : ELRA-WC353
Sensem Corpus
This is a lexical database consisting of sentences extracted from the electronic version of the newspaper El Periodico de Catalunya. It illustrates the semantic and syntactic behavior of the 250 most frequent Spanish verbs. The corpus comprises one million words, with 100 examples of each verb. A total of 25,000 sentences (800,000 words) have been semantically and syntactically annotated, and about 400,000 words have been manually checked. For the annotation, a sense was assigned to each verb and a semantic role to each of its arguments, using a verb lexicon created specifically for these tasks. A category and a syntactic function were then automatically pre-selected. The corpus is distributed in XML format.
Production
Project : SENtence SEMantics Project
Applications
Application Area : Research
Contents
written corpus
Number of languages : Monolingual
Language(s) : Spanish (Spain)
Number of tokens : 1 million words
Annotation Granularity : Word
Annotation level : Semantic
Annotation Mode : Automatic, Manual
Annotation language : XML
Joint Copyright © 2008 ELRA & ELDA