Universal Catalogue  
Written Corpora
Displaying 561 to 580 (of 730 products)

ELRA-WC037
SEMCOR (Semantic ConCordance) 


It contains 250,000 words of sense-tagged text.
Language(s) : English



ELRA-WC038
ETAP-Scania corpus 


A collection of truck maintenance manuals: 80 documents in eight languages, totalling 1.6 million words.
Language(s) : Swedish - Dutch - English - Finnish - French - German - Italian - Spanish



ELRA-WC039
ETAP-Swedish Statement of Government Policy Corpus 


A multilingual collection of Government Statements containing 26,709 running words (371 kB).
Language(s) : Swedish - English - French - German - Spanish



ELRA-WC040
ETAP-Invandrartidningen I and II corpora 


This corpus consists of a few texts and is growing.
Language(s) : Swedish - Arabic - English - Finnish - Persian - Polish - Sardinian - Spanish



ELRA-WC041
GRACE-Corpus 


The training corpus contains around 10 million words, evenly distributed between literary works and newspaper articles.
Language(s) : French



ELRA-WC042
Greek corpus 


It consists of articles from financial newspapers, magazines and portals, with 12 million words in total.
Language(s) : Greek



ELRA-WC043
Croatian-English parallel corpus 


It contains Croatian newspaper texts in several domains, such as politics, economy, finance, ecology and tourism.
Number of words: 1.6 million for Croatian, 1.9 million for English.
Language(s) : Croatian - English



ELRA-WC044
DoRo-El Mundo 


It contains newspaper texts from several domains (culture, sports, economy, international news and national news): 129,393 tokens and 7,067 types.
Language(s) : Spanish



ELRA-WC045
UAM Spanish Treebank 


It draws on an online newspaper edition and a consumer association magazine, and consists of 1,500 annotated sentences with a total of 22,695 words.
Language(s) : Spanish



ELRA-WC046
Parallel Corpus of Italian/German Legal Texts 


It consists of Italian and German legal documents, with 5 million words.
Language(s) : Italian - German



ELRA-WC047
Juridical Corpus 


Language(s) : Spanish



ELRA-WC048
Secondary school textual Corpus 


Language(s) : Spanish



ELRA-WC049
Technical and scientific textual Corpus 


This corpus contains about 20 million scientific and technological terms, divided into about 50 areas of knowledge.
Language(s) : Spanish



ELRA-WC050
CREA (Modern Spanish Reference Corpus) 


It consists of spoken (10%) and written (90%) texts about science and technology, social science, arts, etc., with 125 million words.
Language(s) : Spanish



ELRA-WC051
CORDE (Diachronic Corpus of Spanish) 


It contains 125 million words of fiction and non-fiction texts.
Language(s) : Spanish



ELRA-WC052
Turin University Treebank (TUT) 


The corpus contains texts from newspapers, magazines, novels and press news. Its current size is 1,500 annotated sentences (33,868 words).
Language(s) : Italian



ELRA-WC054
Corpus of Modern Greek 


It consists of 240 press texts of 560,000 words in total.
Language(s) : Greek



ELRA-WC055
KAIST Corpus 


It contains literature, newspapers, academic theses, ..., for a total of 100 million word units.
Language(s) : Korean



ELRA-WC056
TEC (Translational English Corpus) 


It consists of written texts (biography, fiction, newspapers and in-flight magazines) translated into English from a variety of source languages. It contains 6 million tokens.
Language(s) : English



ELRA-WC057
Deutscher Wortschatz-Large monolingual corpora 


For German, this corpus contains more than 300 million words, with approx. 6 million different word types, in 13.4 million sentences; for English, 250 million words, 850,000 word types and 13 million sentences; and for Dutch, 22 million words, 600,000 word types and 1.5 million sentences.
Language(s) : English - German - Dutch




Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4