ELRA - ELRA-U-W 0015 : AMALGAM Multi-Treebank

Catalog Reference : ELRA-U-W 0015

AMALGAM Multi-Treebank

This corpus is based on the IPSM raw text (60 sentences). Each sentence has been parsed according to several rival parsing schemes. The result is a collection of parse trees of two kinds:

- raw output from Alice, DESPAR, ENGCG, Principar, Link, RANLP, Carroll/Briscoe Shallow Parser, WordPerfect's Grammatik, Tosca, Sextant.

- parses that were either 'hand-crafted' or post-edited to conform to the English corpus parsing schemes UPenn, ICE, POW Bracketed, POW Numerical, SEC, BNC.

This resource was compiled in the framework of the AMALGAM project to study methods of mapping between tagsets and grammar schemes.

AMALGAM stands for Automatic Mapping Among Lexico-Grammatical Annotation Models.

Production

Project : The AMALGAM Project

Applications


Application Area : Research

Contents

Written corpus
Number of languages : Monolingual
Language(s) : English
Annotation Coverage : Full
Annotation Granularity : Word
Annotation level : Syntactic
Part of Speech : Nouns, Verbs, Adverbs, Adjectives, Pronouns, Determiners, Articles, Prepositions, Postpositions, Conjunctions