Universal Catalogue  
Catalog Reference : ELRA-U-W 0095
Szeged Corpus for Hungarian
The Szeged Corpus is a morpho-syntactically annotated and POS-tagged Hungarian natural language database. It contains 1.2 million words from texts of various genres: fiction, short essays by 14- to 16-year-old students, newspaper articles, texts related to computer science, legal texts, and economic and financial news.

The corpus was tagged using the Morpho-Syntactic Description tagging system and then manually disambiguated by linguists.
Corpus files are available in XML format (compliant with the TEIxLite DTD).
Identification
Period of coverage :
Version :
Version history : v1.0: 2002
Applications
Application area : Research
Contents
 written corpus 
 

Joint Copyright © 2008 ELRA & ELDA