Catalog Reference: ELRA-U-W0336
Corpus of Contemporary Sinhala
This corpus of 10,000,000 words documents the modern usage of Sinhala (also called Sinhalese), a language spoken in Sri Lanka.
The Corpus of Contemporary Sinhala contains texts of various genres (from technical writing to news reportage). It is encoded in Unicode.
This corpus is still growing.
Contents
Written corpus
Number of languages: Monolingual
Language(s): Sinhalese
Character set: Unicode
Number of tokens: 10,000,000 words
Saturday 23 November, 2024
Joint Copyright © 2008 ELRA & ELDA