Catalog Reference : ELRA-U-W 0057
The Croco Corpus (German-English Parallel Corpus)
This is a German-English parallel corpus of one million words.
Texts are comparable in terms of register; both translation directions are represented for each register.
Processing: tokenisation, tagging for POS, phrasal categories and grammatical functions.
Alignment: words, clauses, sentences, with a mapping of the grammatical functions.
Format: XML, conforming to the XCES standard.
Meta-information on each text is also provided in headers.
The CroCo project was funded by the German Research Foundation (2005-2009). Its aim was to build 'Cross-linguistic Corpora' for the investigation of specific properties of translated texts (explicitation, for example).
Production
Project : The Croco Project
Applications
Application Area : Education, Research
Contents
written corpus
Number of languages : Bilingual
Language(s) : German (Germany), English (United Kingdom)
Alignment : Word
Annotation Coverage : Full
Annotation Granularity : Morpheme
Annotation level : Syntactic
Annotation language : XML
Friday 01 November, 2024
Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4