Catalog Reference: ELRA-WC337
Computer Corpus of Russian Newspapers Texts of the End of the XX-th Century
The corpus comprises full issues of 13 newspapers published in 1994-1997. The selection covers daily and weekly, central and regional, and rightist, centrist and leftist newspapers. In total, the corpus contains 11,401,479 running words and 15,004 different lexemes in 23,109 texts of varying length. The data are annotated with morphological and semantic tags, lemmas, morpheme types, etc. Each text is tagged with the newspaper name, the date of issue, the issue number, the author (if present), the newspaper rubric (if present) and the volume of the text. Each text also carries genre tags: the most detailed classification distinguishes 90 genre categories, and a more generalised classification groups those categories into 9 genre types: pure informative texts, pure publicistic texts, informative-publicistic texts, imaginative prose, imaginative-publicistic texts, written-colloquial texts, advertisement materials, official documents, and varia.
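As a rough illustration of the per-text metadata described above, the sketch below models one text record in Python. The catalogue entry does not specify the corpus file format, so the class name `TextRecord`, every field name, the helper `count_by_genre_type` and the sample values are assumptions made for illustration only; just the nine genre type names come from the description.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class GenreType(Enum):
    """The nine generalised genre types named in the description."""
    PURE_INFORMATIVE = "pure informative"
    PURE_PUBLICISTIC = "pure publicistic"
    INFORMATIVE_PUBLICISTIC = "informative-publicistic"
    IMAGINATIVE_PROSE = "imaginative prose"
    IMAGINATIVE_PUBLICISTIC = "imaginative-publicistic"
    WRITTEN_COLLOQUIAL = "written-colloquial"
    ADVERTISEMENT = "advertisement materials"
    OFFICIAL_DOCUMENTS = "official documents"
    VARIA = "varia"


@dataclass(frozen=True)
class TextRecord:
    """Hypothetical per-text metadata; field names are illustrative only."""
    newspaper: str
    issue_date: date
    issue_number: int
    volume: int                    # text length; the unit is not stated in the source
    detailed_genre: str            # one of the 90 detailed categories (not listed in the source)
    genre_type: GenreType
    author: Optional[str] = None   # present only for some texts
    rubric: Optional[str] = None   # present only for some texts


def count_by_genre_type(records):
    """Count how many records fall under each generalised genre type."""
    counts = {genre: 0 for genre in GenreType}
    for record in records:
        counts[record.genre_type] += 1
    return counts


# Placeholder values, invented purely for illustration.
example = TextRecord(
    newspaper="Example Gazette",
    issue_date=date(1995, 3, 14),
    issue_number=42,
    volume=350,
    detailed_genre="news brief",
    genre_type=GenreType.PURE_INFORMATIVE,
)
counts = count_by_genre_type([example])
```

Keeping the author and rubric optional mirrors the "if present" wording of the description; the two-level genre tagging is kept as a free-text detailed category plus a fixed enumeration, since only the nine generalised types are listed.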
Contents
written corpus
Number of languages: Monolingual
Language(s): Russian
Joint Copyright © 2008 ELRA & ELDA