Catalog Reference: ELRA-U-W 0210
Reader's Digest Corpus (Czech/English)
The Reader's Digest corpus is a parallel corpus of articles from Reader's Digest (1993-1996). The Czech part is a translation of the English part.
Number of articles: 450
Number of parallel sentences: 53,117
Number of tokens in the English part: 1,010,346 (after tokenization and normalization)
Number of tokens in the Czech part: 877,658 (after tokenization and normalization)
Applications
Application Area: Research
Contents
written corpus
Number of languages: Bilingual
Language(s): English (United Kingdom), Czech
Alignment: Sentence
Annotation Coverage: Full
Annotation Granularity: Word
Annotation Level: Morphological
Joint Copyright © 2008 ELRA & ELDA