Catalog Reference: ELRA-U-W 0198
Lacio-Ref Corpus
Lacio-Ref is a reference corpus of newspaper articles in Brazilian Portuguese. It was developed in the Lacio-Web Project alongside the following corpora:
- Mac-Morpho, a 1.1-million-word gold-standard subset of Lacio-Ref, morpho-syntactically annotated (PALAVRAS, E. Bick) and manually validated.
- A part of Lacio-Ref automatically annotated with lemmas, POS tags, and syntactic tags (Curupira parser).
- Lacio-Dev, a deviation corpus composed of non-revised texts (516,840 tokens).
- Par-C, a Portuguese-English parallel corpus.
- Comp_C, a Portuguese-English comparable corpus (300,000 words in each language).
Production
Project: Lacio-Web (LW) project
Applications
Application area: Research
Contents
Written corpus
Number of languages: Monolingual
Language(s): Portuguese (Brazil)
Joint Copyright © 2008 ELRA & ELDA