Universal Catalogue  
Catalog Reference : ELRA-WC342
Croatian National Corpus
The Croatian National Corpus (HNK) is a collection of selected texts covering different media, genres, styles, fields and topics. It comprises two sub-corpora: one for contemporary Croatian and one called HETA (Croatian Electronic Textual Archive).

Compilation of the corpus is ongoing. The objective is a balanced corpus of 200 million words, with full POS/MSD tagging and (partial) syntactic and semantic annotation.
The HNK currently contains 101.3 million tokens and is in XML format (XCES).
Identification
Period of coverage :
Version : v2.0
Version history : v1.0, v2.5 (announced)
Production
Creation date : 2005
Applications
Application area : Research
Contents
 written corpus 
 

Joint Copyright © 2008 ELRA & ELDA