Universal Catalogue  
Written Corpora
Displaying 481 to 500 (of 730 products)

ELRA-WC0183
Louvain Corpus of Native English Essays (LOCNESS) 


LOCNESS is a corpus of native English essays containing 324,304 words: British pupils' A-level essays (60,209 words), British university students' essays (95,695 words) and American university students' essays (168,400 words).
Language(s) : English



ELRA-WC0184
CLUVI Parallel Corpora 


It contains over 23 million words in five language combinations related to Galician: English-Galician, Galician-Spanish, French-Galician, English-Galician-French-Spanish and Spanish-Galician-Catalan-Basque.

The parallel texts are aligned in an XML adaptation of the TMX (Translation Memory eXchange) format.
Language(s) : Galician - Spanish - English - French - Portuguese - Basque - Catalan



ELRA-WC0186
Web Pages Corpus 


This is a corpus of web pages and email messages (440 documents), where each document is assigned one of four category labels: conferences, jobs, resources and trash.
Language(s) : English



ELRA-WC0187
Syntactically Annotated Corpus of Tibetan 


Syntactically annotated corpus of spoken and written Tibetan from different regions and time periods.
Language(s) : Tibetan



ELRA-WC0190
Sejong Morph Tagged Corpus 


This corpus consists of 10 million morphologically annotated Korean words.
Language(s) : Korean



ELRA-WC0191
Sejong Morph Sense Tagged Corpus 


This corpus consists of 5.5 million semantically annotated Korean words. It is TEI-compliant.
Language(s) : Korean



ELRA-WC0192
Sejong Korean Treebank 


This corpus contains syntactically parsed sentences (150,000 in 2003).
Language(s) : Korean



ELRA-WC0193
Cross-document Structure Theory (CST) Bank 


It consists of a collection of documents that have been annotated for cross-document structure theory relationships.
Language(s) : English



ELRA-WC0194
Terminal Device Oriented Comparable Corpora 


It contains 88,000 pairs of aligned sentences and a hundred Web newspaper articles.
Language(s) : Japanese



ELRA-WC0195
Named Organization Corpus 


This is a corpus of 13,665 organization names.
Language(s) : Chinese



ELRA-WC0196
GENIA corpus 


This is a part-of-speech tagged corpus in the biomedical domain.
Language(s) : English



ELRA-WC0197
Domain-Specific Corpora 


The first corpus contains general-information articles about cancer and about specific types of cancer; its size is about 430,000 words. The other corpus (CHEM) contains about 350,000 words from various articles on chemistry for beginners.
Language(s) :



ELRA-WC0198
International Corpus of English - India (ICE-IND)


It consists of one million words of spoken and written English from India and contains 500 texts of approximately 2,000 words each.
Language(s) : English



ELRA-WC0199
Azra child corpus 


Language(s) : Turkish



ELRA-WC0201
Mine child corpus 


It contains 1683 sentences.
Language(s) : Turkish



ELRA-WC0202
Deniz child corpus 


It contains 7000 sentences.
Language(s) : Turkish



ELRA-WC0203
METU text corpus 


This is a 2-million-word corpus compiled from newspapers.
Language(s) : Turkish



ELRA-WC0204
New York Times Corpus 


It is a component of the American National Corpus First Release and consists of over 4,000 articles from the New York Times newswire, covering each of the odd-numbered days in July 2002.
Language(s) : English



ELRA-WC0205
Slate Magazine Corpus 


It contains 4,694 articles from the Slate archives, published between 1996 and 2000, on topics such as News and Politics, Arts, Business, Sports, Technology, Travel and Food.
Language(s) : English



ELRA-WC0206
ECI/MCI (Available since 01/09/1996)


The European Corpus Initiative Multilingual Corpus contains over 98 million words, covering most of the major European languages. Its primary focus is on textual material of all kinds, including transcriptions of spoken material.
Language(s) : Albanian - Bulgarian - Chinese - Czech - Danish - Dutch - English - Estonian - French - Gaelic - German - Greek - Italian - Japanese - Latin - Lithuanian - Malay - Norwegian - Portuguese - Russian - Serbian - Spanish - Swedish - Turkish - Uzbek




Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4