Universal Catalogue  
Catalog Reference : ELRA-U-W 0097
The SALSA Corpus
The SALSA corpus is based on the TIGER corpus, a syntactically annotated German newspaper corpus of 1.5 million words. Word senses and semantic roles were added to TIGER using the frames of FrameNet 1.2. In addition, predicate-specific frames were developed to handle predicate instances not covered by FrameNet. The corpus was annotated by hand. In total, the annotation covers about 20,000 verbal instances and 17,000 nominal instances.

The corpus is a valuable resource for NLP research, including the automatic acquisition of lexical semantic information, the training of statistical parsers on combined syntactic and semantic role information, and the improvement of techniques for information access and extraction.

The SALSA corpus was developed within the framework of the Saarbrücken Lexical Semantics Annotation and Analysis Project.
Identification
Period of coverage :
Version : Release 2.0
Version history :
Production
Project : Saarbrücken Lexical Semantics Annotation and Analysis Project
Applications
Application area : Research
Contents
Written corpus
 

Joint Copyright © 2008 ELRA & ELDA