Universal Catalogue  
Catalog Reference: ELRA-U-W 0146
The LOGON parallel tourist corpus of Norwegian-English texts
The LOGON corpus is a collection of Norwegian-English parallel texts from the domain of tourism. It is composed of several subcorpora:

- one subcorpus of general tourist texts (180,000 words in each language); the quality of the translation varies considerably from one text to another.
- three subcorpora based on published books in the hiking domain: the Jotunheimen texts (30,000 words in Norwegian), the Turglede texts (40,000 words) and the Preikestolen texts (4,000 words). The translations of these texts are of high quality.

The texts have been aligned using the IMS Corpus Workbench and tagged with the Oslo-Bergen Tagger (for Norwegian) and TreeTagger (for English).

The corpus was designed to serve as training and testing material for the LOGON machine translation project.
Production
Project: LOGON project
Applications
Application area: Research, Tourism
Contents
Written corpus

Joint Copyright © 2008 ELRA & ELDA