Universal Catalogue  
Written Corpora
Displaying 501 to 520 (of 730 products)

ELRA-WC0207
English-Norwegian Parallel Corpus (ENPC) 


The corpus consists of text excerpts of approximately 10,000 to 15,000 words each from fictional and non-fictional Norwegian and English original texts and their translations, amounting to a total of 200 texts, or 2.6 million words.
Language(s) : English - Norwegian



ELRA-WC0208
Oslo Multilingual Corpus (OMC) 


This is an extension of the 2.6-million-word English-Norwegian Parallel Corpus (ENPC). German, Dutch and Portuguese translations were added for some of the texts. It contains fictional and non-fictional texts.
Language(s) : Dutch - English - German - Norwegian - Portuguese



ELRA-WC0209
Oslo Corpus of Bosnian Texts 


It consists of approximately 1.5 million words and comprises several different genres: fiction (novels and short stories), essays, children's stories, folklore, Islamic texts, legal texts, and newspapers and journals.
Language(s) : Bosnian



ELRA-WC0210
Tycho Brahe Parsed Corpus of Historical Portuguese 


This electronic annotated corpus consists of texts written between 1500 and 1900.
Language(s) : Portuguese



ELRA-WC0211
Susanne Corpus 


It contains annotations of a 130,000-word cross-section of written American English.
Language(s) : English (USA)



ELRA-WC0212
Penn-Helsinki Parsed Corpus of Middle English (PPCME2) 


It includes a total of roughly 1.2 million words of running text. It comprises 55 text samples, each of which is given in three forms: a text file, a part-of-speech tagged file and a parsed file. In addition, there is a file with philological and bibliographical information about each text.
Language(s) : English



ELRA-WC0213
Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME) 


It consists of nearly 1.8 million words. Each of the texts in the corpus is available in parsed, POS-tagged, and unannotated form. The corpus is divided into three subcorpora: the Helsinki directories (roughly 573,000 words), the Penn1 directories (roughly 615,000 words) and the Penn2 directories (roughly 606,000 words).
Language(s) : Modern English



ELRA-WC0214
York-Helsinki Parsed Corpus of Old English Poetry 


It contains 71,490 words of Old English poetic texts that are syntactically and morphologically annotated.
Language(s) : Old English



ELRA-WC0215
York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE) 


This is a 1.5-million-word, syntactically annotated corpus of Old English prose texts.
Language(s) : Old English



ELRA-WC0216
Brooklyn-Geneva-Amsterdam-Helsinki Parsed Corpus of Old English 


It contains 106,210 words of Old English texts that are syntactically and morphologically annotated.
Language(s) : Old English



ELRA-WC0217
Lancaster/Oslo-Bergen Corpus (LOB) 


It contains approximately one million words of written British English dating from 1960, spread across 15 genre categories.
Language(s) : English (United Kingdom)



ELRA-WC0218
Lancaster-Leeds Treebank 


This is a manually parsed subsample of the LOB corpus showing the surface phrase structure of each sentence. It consists of approximately 45,000 words taken from all the genre categories of the LOB corpus.
Language(s) : English



ELRA-WC0219
Brown Corpus of Standard American English 


It consists of 500 American English texts printed in 1961, each of 2,000 words, for a total of one million words.
Language(s) : English (USA)



ELRA-WC0220
Corpus des Oeuvres de Philosophie en Langue Française 


It contains 229 works (including 651 images).
Language(s) : French



ELRA-WC0221
Miscellaneous French Texts 


It consists of 38 titles of French texts.
Language(s) : French



ELRA-WC0222
Corpus Médical de la Faculté de Médecine de Grenoble 


It contains 284 questions related to medical pathologies of 31 subjects.
Language(s) : French



ELRA-WC0223
Lancaster Parsed Corpus (LPC) 


This is a subsample of the LOB corpus, parsed by computer and manually corrected by several researchers. It contains approximately 140,000 words with samples from each of the 15 categories in the LOB corpus.
Language(s) : English (United Kingdom)



ELRA-WC0224
American Printing House for the Blind Treebank (APHB) 


This is a 200,000-word skeleton-parsed corpus of a wide range of English texts.
Language(s) : English (USA)



ELRA-WC0225
Associated Press Treebank (AP) 


This is a skeleton-parsed corpus of American newswire reports containing 1,000,000 words.
Language(s) : English (USA)



ELRA-WC0226
Canadian Hansard Treebank 


This is a 750,000-word skeleton-parsed corpus of proceedings in the Canadian Parliament.
Language(s) : English (Canada) - French (Canada)




Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4