Universal Catalogue  
Languages
English
Catalog Reference: ELRA-U-W 0014
AMALGAM Multi-Tagged Corpus
This multi-tagged corpus contains 180 sentences taken from the following texts:
- the Industrial Parsing of Software Manuals (IPSM) text (60 sentences),
- the Lancaster/IBM Spoken English Corpus (SEC) text (60 sentences),
- the Corpus of London Teenage Language (COLT) text (60 sentences).

The texts were tagged with the AMALGAM tagger using the Brown, ICE, LLC, LOB, UNIX Parts, POW, SEC and UPenn tagging schemes. The tagger's output was then proofread and corrected by human experts to remove errors.

This resource was compiled to study methods of mapping between one tag set and another (the AMALGAM project).

AMALGAM stands for Automatic Mapping Among Lexico-Grammatical Annotation Models.
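
As a rough illustration of what mapping between tag sets involves, the sketch below relabels a Brown-tagged sentence with UPenn tags through a simple lookup table. It is not the AMALGAM project's method: the table is a small hand-picked fragment, the names `BROWN_TO_UPENN` and `map_tags` are hypothetical, and real mappings between schemes are often one-to-many and depend on context.

```python
# Illustrative fragment of a Brown -> UPenn tag mapping (assumption: a
# one-to-one lookup, which only holds for a subset of tags in practice).
BROWN_TO_UPENN = {
    "AT": "DT",    # Brown article -> UPenn determiner
    "NN": "NN",    # singular common noun
    "NNS": "NNS",  # plural common noun
    "JJ": "JJ",    # adjective
    "VBD": "VBD",  # verb, past tense
    "IN": "IN",    # preposition
}


def map_tags(tagged_sentence, mapping, unknown="UNK"):
    """Relabel (word, tag) pairs; tags missing from the mapping become `unknown`."""
    return [(word, mapping.get(tag, unknown)) for word, tag in tagged_sentence]


# Hypothetical example sentence, not taken from the corpus.
brown_sentence = [
    ("the", "AT"),
    ("tagger", "NN"),
    ("labelled", "VBD"),
    ("sentences", "NNS"),
]
upenn_sentence = map_tags(brown_sentence, BROWN_TO_UPENN)
print(upenn_sentence)
```

Tags that the table does not cover are marked `UNK` rather than guessed, which makes the gaps in a partial mapping easy to spot.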
Production
Project: The AMALGAM Project
Applications
Application Area: Research
Contents
written corpus

Joint Copyright © 2008 ELRA & ELDA