Catalog Reference : ELRA-U-W0358
ANTARA Corpus
This corpus contains 250,000 sentences aligned between English and Indonesian (about 2.5 million words), drawn from articles published by the ANTARA News Agency between 2000 and 2007. It covers political, economic, international, sports, science, entertainment and national news.
It is annotated in TEI P4.
Contents
written corpus
Number of languages : Bilingual
Language(s) : English <<< >>> Indonesian
Alignment : Sentence
Number of tokens : 2.5 million words
Annotation Scheme : TEI
Annotation language : SGML
Joint Copyright © 2008 ELRA & ELDA