Universal Catalogue  
Written Corpora
Displaying 161 to 180 (of 730 products)

ELRA-U-W 0153
Question/Answer Resources for Spanish 


These Q/A resources comprise several sets of questions and answers written in Spanish or translated into Spanish. These sets were used in CLEF campaigns.
Language(s) : Spanish - English


ELRA-U-W 0154
Question/Answer Resources for Finnish 


These resources comprise sets of Q/A used in CLEF campaigns and translated from English into Finnish.
Language(s) : English - Finnish


ELRA-U-W 0155
Meteo Corpus (Report and Warning) 


This resource contains two English-French parallel texts from the domain of meteorology: a report and a warning bitext.
Language(s) : English - French


ELRA-U-W 0156
TREC Question/Answer Resources 


The TREC QA resources gather the sets of English data used in each TREC QA evaluation exercise.
Language(s) : English


ELRA-U-W 0157
Hakka Written Texts 


This Hakka corpus comprises 59 articles totalling 42,337 syllables.
Language(s) : Chinese


ELRA-U-W 0158
Mongolian Written Corpus 


This is a Mongolian corpus of 5 million words covering various domains: law, literature and newspapers. It is POS-tagged and syntactically annotated.
Language(s) : Mongolian


ELRA-U-W 0159
GyanNidhi Multilingual Parallel Corpus 


The GyanNidhi corpus contains parallel texts for English and eleven Indian languages: Hindi, Punjabi, Marathi, Bengali, Oriya, Gujarati, Telugu, Tamil, Kannada, Malayalam and Assamese (50,000 pages per language).
Language(s) : English - Assamese - Kannada - Hindi - Panjabi, Punjabi - Marathi - Bengali - Oriya - Gujarati - Telugu - Tamil - Malayalam


ELRA-U-W 0160
Bahasa Indonesia Newspapers Collection 


These texts are extracts from two Indonesian newspapers, Kompas and Tempo. They contain more than 9 million words.
Language(s) : Indonesian


ELRA-U-W 0161
Thai National Corpus 


This is a text corpus for the Thai language. It contains 14 million words collected from various genres.
Language(s) : Thai


ELRA-U-W 0162
English-Vietnamese Corpus (EVC)


This parallel corpus contains 5 million words of English and Vietnamese and covers various fields, including science, technology and daily conversation.
It has been automatically word-aligned and POS-tagged.
Language(s) : English - Vietnamese


ELRA-U-W 0163
Uyghur Text Corpus 


Uyghur is a language spoken in the Xinjiang Uyghur Autonomous Region of China.
The corpus contains 2,000,000 words extracted from the Xinjiang Daily, Uyghur webpages, Uyghur novels and other sources.
Language(s) : Uighur (China)


ELRA-U-W 0164
Uyghur-Chinese Parallel Corpus 


This Uyghur-Chinese parallel corpus was designed for Uyghur-Chinese and Chinese-Uyghur machine translation.
Language(s) : Uighur (China) - Chinese (China)


ELRA-U-W 0165
Mainichi Shimbun Japanese Newspaper Corpus 


Mainichi Shimbun is a Japanese newspaper. This database contains the raw text of its articles for the period 1991-2001 (approximately 100,000 articles per year).
Language(s) : Japanese


ELRA-U-W 0166
Nihon Keizai Shimbun Japanese Newspaper Corpus 


This resource contains the raw text of Nihon Keizai Shimbun newspaper articles for 1990-2000.
Language(s) : Japanese


ELRA-U-W 0167
Nihon Keizai Sangyo, Kin'yu, Ryutsu Shimbun Japanese Newspaper Corpus 


This resource contains the raw text of Nihon Keizai Sangyo, Kin'yu, Ryutsu Shimbun newspaper articles for 1994-2000.
Language(s) : Japanese


ELRA-U-W 0168
Yomiuri Shimbun Japanese Newspaper Corpus 


This resource contains the raw text of Japanese-language Yomiuri Shimbun newspaper articles for 1987-2001.
Language(s) : Japanese


ELRA-U-W 0169
Yomiuri Shimbun Newspaper Corpus (English articles) 


This resource contains the raw text of English-language Yomiuri Shimbun newspaper articles for 1989-2001.
Language(s) : English (Japan)


ELRA-U-W 0170
Asahi Shimbun Japanese Newspaper Corpus 


This resource contains the raw text of Asahi Shimbun newspaper articles for 1984-2005.
Language(s) : Japanese


ELRA-U-W 0171
RWC-DB-TEXT-96-2 


This resource consists of morphologically analyzed data from the Iwanami Japanese Dictionary (5th edition) with index tags. It was manually post-edited.
Language(s) : Japanese


ELRA-U-W 0172
RWC-DB-TEXT-97-1 


This resource contains the differential data of the results of morphological analysis of the Mainichi Shimbun Newspaper Corpus (all articles from 1991-1995).
Language(s) : Japanese



Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4