Universal Catalogue  
Catalog Reference: ELRA-U-W0306
SAWA Corpus
The SAWA Corpus is a parallel English-Swahili corpus containing about one million words per language.

The corpus consists of parallel texts collected from various bilingual documents:
- extracts from the Bible (New Testament),
- extracts from the Quran,
- the UN Declaration of Human Rights,
- movie subtitles,
- example sentences from a bilingual English-Swahili dictionary,
- bilingual investment reports,
- texts from a local Kenyan translator.

The corpus has been tokenized, converted to UTF-8, and word-aligned.
Production
Project: SAWA BOF UA-2007
Creation date: 2009
Contents: written corpus

Joint Copyright © 2008 ELRA & ELDA