Universal Catalogue  
Catalog Reference : ELRA-U-W0379
Bulgarian Polish Lithuanian corpus
This is a trilingual corpus of 3 million words in Bulgarian, Polish and Lithuanian. The BG-PL-LT corpus includes a parallel and a comparable corpus.

The parallel corpus contains more than 1 million words, aligned at the paragraph level. It consists of fiction translated from another language (such as English) into Bulgarian, Polish and Lithuanian. It also contains existing translations between these three languages (such as official documents of the European Union).

The comparable corpus contains about 2 million words in Bulgarian, Polish and Lithuanian. It consists of fiction texts and excerpts from online newspapers covering the same topics.

The BG-PL-LT corpus is still under construction. Annotation with part-of-speech (POS) tags and lemmas is also planned.
Production
Creation date : 2010
Applications
Application area : Research
Contents
 written corpus #18833
 written corpus #28833
 

Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4