Universal Catalogue  
Catalog Reference : ELRA-U-W 0054
Carmel Corpus
The Carmel Corpus is a multilingual aligned corpus of literary texts in four languages: English, French, Italian, and Spanish. It contains 10,000,000 words drawn from 36 classic travel narratives of the 19th and early 20th centuries.

Processing : sentence segmentation and tokenization, part-of-speech (POS) tagging and lemmatization, word sense disambiguation (WSD), thematic identification.
Format and standards : XML, TEI, cesAlign, TMX.

The Carmel project was part of the French program 'Technolangue' (2002-2006); the corpus was published in 2005.
Identification
Period of coverage : 19th century
Version :
Version history :
Production
Project : Carmel Project
Creation date : 2005
Applications
Application area : Research
Contents : written corpus

Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4