Universal Catalogue  
Catalog Reference: ELRA-U-W 0265
Brazilian Portuguese-English Parallel Corpora
A bilingual Brazilian Portuguese-English collection of parallel texts from three domains: scientific, legal, and journalistic. It contains the following sub-corpora:

- CorpusPE: 65 pairs of parallel academic texts (abstracts) on Computer Science. They are included in two versions: an authentic (non-revised) version of 21,432 words and a version revised by a human translator (pre-edited corpus) of 21,492 words.

- CorpusALCA: 4 pairs of parallel official documents of the Free Trade Area of the Americas (FTAA). It contains 22,069 words.

- CorpusNYT: 7 pairs of parallel articles from "The New York Times". It contains 10,595 words.

These corpora are divided into three classes: test corpora, POS-tagged corpora, and reference corpora.

The resource was developed to support the PESA project, which aims to investigate, implement, and evaluate sentence alignment methods for Brazilian Portuguese and English parallel texts.
Applications
Application area: Research
Contents: written corpus

Joint Copyright © 2008 ELRA & ELDA