Universal Catalogue  
Languages
English
Catalog Reference: ELRA-U-W 0162
English-Vietnamese Corpus
This parallel corpus contains 5 million words of English and Vietnamese text and covers a range of fields, including science, technology, and daily conversation.
It has been automatically word-aligned and POS-tagged. It includes the Susanne Corpus, a gold-standard corpus manually annotated with lemmas, POS tags, chunking tags, syntactic trees, and other information. The Susanne Corpus was translated into Vietnamese by English teachers.

It was compiled to train systems for Vietnamese-related NLP tasks: segmentation, POS tagging, word sense disambiguation (WSD), and machine translation (MT).

The quality of the English-Vietnamese Corpus (EVC) is currently being improved through manual correction of its linguistic annotations.
Applications
Application Area: Research
Contents
Written corpus

Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4