Universal Catalogue  
Catalog Reference : ELRA-U-W0364
Nova beseda
This is a large collection of 4,158 Slovenian texts from various categories: newspapers, magazines, formal speech, fiction, non-fiction, and scientific and technical texts. It contains about 162 million words and is marked up at the sentence level.

The corpus consists of six main parts:
- 2,310 texts collected from the Delo daily newspaper between 1998 and 2005 (120 million words),
- 711 texts of formal speech from Slovenian National Assembly session transcripts, between 1996 and 2004 (20 million words),
- 778 texts of fiction in Slovenian, including the complete works of the famous writers Drago Jančar, Ciril Kosmač and Ivan Cankar (12 million words),
- 78 texts from the Monitor computer magazine (between 1999 and 2004) and the Viva healthy-living magazine (6 million words),
- 251 texts of non-fiction in Slovenian (2 million words),
- 26 scientific and technical publications (2 million words).

Before 2000, the corpus was called CORTES (CORpus of TExts in Slovenian).
Production
Creation date : 1999-2005
Applications
Application Area : Education, Research
Contents
Written corpus

Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4