Universal Catalogue  
Written Corpora
Displaying 581 to 600 (of 730 products)

ELRA-WC058
ILC Italian Reference Corpus (IRC)


The IRC is a 12,750,000-word corpus drawn from texts of various types.
Language(s) : Italian



ELRA-WC059
COMPARA-Portuguese-English Parallel Translation Corpus 


COMPARA is a bi-directional parallel corpus based on an open-ended collection of Portuguese-English and English-Portuguese source-texts and translations (so far covering published fiction data).
It can be used to study translation and automatically compare and contrast English and Portuguese.
Language(s) : Portuguese - English



ELRA-WC060
IWN-Lemmatised corpus 


This corpus is divided into two subsets: the "balanced corpus", made up of various types of texts, and the "financial corpus", which contains only texts from the economic-financial domain.
Language(s) : Italian



ELRA-WC061
A Logical Approach for Semantic Representation 


Language(s) : Arabic



ELRA-WC062
A Computational Lexeme-Based Treatment of Arabic Morphology 


Language(s) : Arabic



ELRA-WC063
Sarfiyya corpus 


This is a small corpus including journalistic and literary texts.
Language(s) : Arabic



ELRA-WC064
BulTreeBank Text Archive 


It is a collection of Bulgarian texts from the Internet (more than 90,000,000 running words).
Language(s) : Bulgarian (Bulgaria)



ELRA-WC065
Corpus of spelling mistakes 


It is a collection of 27 translations (each 1-3 pages long) from the German, English, Italian or Spanish press, totalling 15,000 words or 637 sentences.
Language(s) : French



ELRA-WC066
Japanese Relevance-tagged Corpus 


It's an annotated corpus from newspapers. 1,300 sentences have been tagged.
Language(s) : Japanese



ELRA-WC067
Arabic and English newspapers Corpus 


It contains texts from English and Arabic newspapers.
Language(s) : Arabic - English



ELRA-WC068
A Formal Grammar in Modern Standard Arabic 


The goal of this grammar is the description of sentence structure.
Language(s) : Modern Standard Arabic



ELRA-WC069
Mémoire de traduction 


It consists of Canadian parliamentary debates, with 100 million words for each language.
Language(s) : French - English



ELRA-WC071
Aarhus Corpus of Tagged Old Danish Texts (ACOD) 


The Aarhus Corpus of Tagged Old Danish Texts (ACOD) is a morphologically tagged corpus of approximately 36,000 words of Old Danish texts from the period 1174-ca. 1400.
Language(s) : Danish



ELRA-WC072
Senseval corpus 


This is the data used for the Senseval-2 "lexical sample" task for Danish.
Language(s) : Danish



ELRA-WC073
VISL's mixed free Danish Corpus (DFK) 


It contains mainly transcribed Danish parliamentary discussions.
Language(s) : Danish



ELRA-WC074
Bergenholtz' corpus 


It consists of collections from books, newspapers and magazines and contains approximately 3 million words.
Language(s) : Danish



ELRA-WC075
Arboretum Treebank 


This is a hybrid treebank for Danish.
Language(s) : Danish



ELRA-WC076
The Reference Corpus 


This tokenized and tagged corpus is composed of 13,746 articles taken from a Canadian newspaper.
Language(s) : English



ELRA-WC077
MED-TYP Database 


This corpus is a typological database for Mediterranean languages.
Language(s) : Catalan - Spanish - French - Provençal - Italian - Sardinian - Friulan - Slovene - - Albanian - Modern Greek - Turkish - Maltese - Modern Hebrew - Arabic



ELRA-WC079
Italian legal texts with Semantic mark-up 


This is an Italian legal text corpus with semantic role mark-up annotation.
Language(s) : Italian




Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4