Universal Catalogue  
Catalog Reference: ELRA-U-W0359
Bangla News Corpus
This is a corpus of news text in Bangla (Bengali). It is also known as the Prothom-Alo corpus because the texts were collected from the electronic edition of Prothom-Alo, the most widely read newspaper in Bangladesh. The texts are encoded in Unicode.

The corpus contains 18,100,378 word tokens and 384,048 distinct word types.
Production
Project: Pan Localization
Contents
Written corpus

Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4