ELRA - ELRA-U-W 0055 : IJS-ELAN Slovene-English Parallel Corpus

Catalog Reference : ELRA-U-W 0055

IJS-ELAN Slovene-English Parallel Corpus

This Slovene-English parallel corpus comprises 15 texts and contains 500,000 words per language. It is tokenised, sentence-segmented and aligned, and is encoded in XML (TEI P4). Lemmatisation and tagging were done with the MULTEXT-East tools.

Identification

Period of coverage :

Version : v2.0
Version history :

Production

Creation date : 2002

Applications


Application area : Research

Technical Information

Bytesize : 58 MB
Compression : Zip

Contents

written corpus
Number of languages : Bilingual
Language(s) : Slovenian (Slovenia), English (United Kingdom)
Alignment : Sentence
Annotation Coverage : Full
Annotation Granularity : Word
Annotation level : Morphological
Annotation Mode : Automatic
Annotation Scheme : TEI
Annotation language : XML
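
Because the record describes sentence-aligned XML with word-level morphological annotation, a minimal sketch of reading such data may be useful. The element and attribute names used here (`tu`, `seg`, `lang`, `w`, `lemma`, `ana`) and the sample content are assumptions made for illustration only; they are not taken from the corpus documentation and should be checked against the distributed files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element names, attribute names, and content are
# illustrative assumptions, not the documented corpus layout.
SAMPLE = """
<body>
  <tu id="t1">
    <seg lang="sl"><w lemma="dober" ana="A">Dober</w> <w lemma="dan" ana="N">dan</w></seg>
    <seg lang="en"><w lemma="good" ana="A">Good</w> <w lemma="day" ana="N">day</w></seg>
  </tu>
</body>
"""

def read_aligned_pairs(xml_text):
    """Return a list of (unit id, {language: [(token, lemma, tag), ...]})."""
    root = ET.fromstring(xml_text)
    pairs = []
    for unit in root.iter("tu"):
        by_lang = {}
        for seg in unit.findall("seg"):
            # Collect each word together with its lemma and morphological tag.
            tokens = [(w.text, w.get("lemma"), w.get("ana")) for w in seg.iter("w")]
            by_lang[seg.get("lang")] = tokens
        pairs.append((unit.get("id"), by_lang))
    return pairs

pairs = read_aligned_pairs(SAMPLE)
for unit_id, by_lang in pairs:
    for lang, tokens in by_lang.items():
        print(unit_id, lang, " ".join(token for token, _, _ in tokens))
```

Keeping both language sides under one alignment unit makes it straightforward to iterate over sentence pairs, while the per-word tuples retain the lemma and tag for each token.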