Catalog Reference : ELRA-U-W 0100
German-English Parallel Corpus de-news
The German-English parallel corpus is adapted from the de-news web site. German radio broadcast news was translated into English by volunteers, and the overall quality is very good. Sentence alignment is based on the Gale and Church algorithm.
Version 0.9 covers August 1996 to January 2000. It includes 9,756 news items, 66,317 German sentences (1,017,064 tokens), 62,475 English sentences (1,175,526 tokens) and 59,014 aligned sentences.
It is available in three formats: raw text, preprocessed text and sentence-aligned text.
It is designed for machine translation research.
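Length-based sentence alignment of the kind mentioned above can be sketched as follows. This is a simplified illustration, not the procedure actually used to build the corpus: the function names (match_cost, align), the prior probabilities and the variance constant are assumptions chosen for the example, and only 1-1, 1-0, 0-1, 2-1 and 1-2 alignments are considered. The idea is that a sentence and its translation tend to have similar character lengths, so dynamic programming can pick the sequence of pairings with the lowest total length mismatch.

```python
import math

# Assumed, illustrative prior probabilities for each alignment type
# (source sentences, target sentences). Not taken from the corpus documentation.
PRIORS = {(1, 1): 0.89, (1, 0): 0.0099, (0, 1): 0.0099,
          (2, 1): 0.089, (1, 2): 0.089}
C = 1.0    # assumed expected target/source character-length ratio
S2 = 6.8   # assumed variance of that ratio per character


def match_cost(len_src, len_tgt, bead):
    """Negative log-probability of pairing spans with these character lengths."""
    mean = max((len_src + len_tgt / C) / 2.0, 1.0)  # guard against zero length
    delta = (len_tgt - len_src * C) / math.sqrt(mean * S2)
    # Two-sided tail probability of a standard normal at |delta|.
    tail = max(math.erfc(abs(delta) / math.sqrt(2.0)), 1e-300)
    return -math.log(tail) - math.log(PRIORS[bead])


def align(src, tgt):
    """Return a list of (source_sentences, target_sentences) pairs."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in PRIORS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                len_src = sum(len(s) for s in src[i:ni])
                len_tgt = sum(len(t) for t in tgt[j:nj])
                total = cost[i][j] + match_cost(len_src, len_tgt, (di, dj))
                if total < cost[ni][nj]:
                    cost[ni][nj] = total
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the chosen pairings.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    beads.reverse()
    return beads


if __name__ == "__main__":
    german = ["Guten Morgen.", "Das Wetter ist heute gut.", "Danke."]
    english = ["Good morning.", "The weather is good today.", "Thank you."]
    for de, en in align(german, english):
        print(de, "<->", en)
```

A real aligner would estimate the constants from data and typically also work paragraph by paragraph, which keeps the search small and prevents errors from spreading across the whole document.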
Identification
Period of coverage :
Version :
v0.91
Version history :
Production
Creation date :
2000
Applications
Application area :
Research
Contents
written corpus
Number of languages :
Bilingual
Language(s) :
German, English
Alignment :
Sentence
Number of tokens :
1,017,064 (German), 1,175,526 (English)
Joint Copyright © 2008 ELRA & ELDA