|
Language Resources |
Displaying 401 to 420 (of 730 products) |
Result Pages: 21 |
The AQUAINT TimeBank contains 73 news report documents (31,000 tokens) annotated following the TimeML 1.2.1 specification, a language for the annotation and normalization of temporal information.
Language(s) : English
|
|
|
|
This is a parallel dependency treebank of 95,000 words. It consists of the English translation of the Danish Dependency Treebank, aligned at the word level.
Language(s) : Danish >>>> English
|
|
|
|
This is a corpus of 10,350 parallel sentences collected from comparable news corpora.
Language(s) : English <<< >>> other
|
|
|
|
It consists of word-aligned corpora in German / English, German / French and English / French.
Language(s) : German <<< >>> English - German <<< >>> French - English <<< >>> French
|
|
|
|
This is a collection of about 20,000 newsgroup documents, partitioned evenly across 20 different newsgroups. It contains 15-year-old Usenet messages.
Language(s) : English
|
|
|
|
This corpus contains 296 private email bodies, resulting in 31,469 tokens.
Language(s) : Luxembourgish, Letzeburgesch
|
|
|
|
This is a Twitter data set designed to comprise a complete set of tweets for a specific news-driven event. COP15 refers to the 2009 United Nations Climate Change Conference, which took place in Copenhagen, Denmark, between December 7 and December 18. The conference included the 15th Conference of the Parties (COP 15) to the United Nations Framework Convention on Climate Change.
A total of 207,782 tweets were downloaded during the month of December 2009 by querying the Twitter Search API with the term 'cop15'.
Language(s) : English
|
|
|
|
The Rovereto Twitter N-Gram Corpus (RTC) is an n-gram dataset of Twitter messages with gender labels of the authors and time of posting. The corpus is based on 75 million English tweets collected from the public stream of Twitter, between December 2010 and July 2011.
Language(s) : English
|
|
|
|
It consists of 18,057 words.
Language(s) : Thai
|
|
|
|
It consists of 16,384 words.
Language(s) : Thai
|
|
|
|
These bilingual texts were useful for Thai students of English and for foreign students of Thai.
This resource is no longer accessible.
Language(s) : Thai - English
|
|
|
|
It consists of 25,000 sentences from 859 students.
Language(s) : English
|
|
|
|
It contains 1.7 million words.
Language(s) : Malay
|
|
|
|
It contains 37,589 verses.
Language(s) : Malay
|
|
|
|
This is a corpus of about 20 million words of written Modern Greek, drawn from several media (books, periodicals, newspapers, etc.), genres (articles, essays, literary works, reports, biographies, etc.) and topics (economy, medicine, leisure, art, human sciences, etc.).
Language(s) : Greek
|
|
|
|