Universal Catalogue

Written Corpora

Displaying 541 to 560 (of 730 products)

ELRA-WC0247

CELT (Corpus of ELectronic Texts)

CELT is a resource for contemporary and historical Irish documents in literature, history and politics.
Language(s) : Irish

ELRA-WC0248

Ogamica

It consists of romanization, transcription, transliteration, pictures, bibliography, description, and historical and philological notes.
Language(s) : Irish

ELRA-WC0249

CWBC (Corpus of Written British Creole)

It consists of a diverse collection of annotated texts of around 12,000 words. The tags mark differences in spelling, lexis, discoursal and grammatical structure between Standard English and the language of the Corpus texts.
Language(s) :

ELRA-WC0250

Croatian Language Corpus

The Croatian Language Corpus covers various domains and genres. It includes literature and other written sources from the second half of the 19th century onward.
Language(s) : Croatian

ELRA-WC0251

International Corpus of English - Hong Kong (ICE-HK)

It consists of one million words of spoken and written English from Hong Kong and contains 500 texts of approximately 2,000 words each.
Language(s) : English

ELRA-WC0252

International Corpus of English - East Africa (ICE-EA)

It consists of one million words of spoken and written English from Kenya and Tanzania, and contains 500 texts of approximately 2,000 words each.
Language(s) : English

ELRA-WC0253

International Corpus of English - Great Britain (ICE-GB)

(Available since 06/03/2000)

It consists of one million words of spoken and written English from Great Britain and contains 500 texts of approximately 2,000 words each. The corpus is POS-tagged and parsed.
Language(s) : English

ELRA-WC0254

International Corpus of English - Singapore (ICE-SIN)

It consists of one million words of spoken and written English from Singapore and contains 500 texts of approximately 2,000 words each.
Language(s) : English

ELRA-WC0255

International Corpus of English - Philippines (ICE-PHI)

It consists of one million words of spoken and written English from the Philippines and contains 500 texts of approximately 2,000 words each.
Language(s) : English

ELRA-WC0256

International Corpus of English - New Zealand (ICE-NZ)

It consists of one million words of spoken and written English from New Zealand and contains 500 texts of approximately 2,000 words each.
Language(s) : English (New Zealand)

ELRA-WC0257

Corpus of Contemporary Arabic (CCA)

It contains around one million words, covering 4 categories of spoken data and 43 categories of written data.
Language(s) : Arabic

ELRA-WC0258

The Lancaster Speech, Writing and Thought Presentation Written Corpus

It contains approximately 260,000 words of prose fiction, newspaper news reports and (auto)biography, tagged with the Leech and Short (1981) category set.
Language(s) : English

ELRA-WC028

JEIDA-English-Japanese Bilingual Corpus

White papers from Japanese ministries - 1992-1996.
Language(s) : English - Japanese

ELRA-WC029

RWC-DB-TEXT-95-2

Manually post-edited morphological analysis of 3,000 articles published in the Mainichi Shimbun in 1994.
Language(s) : Japanese

ELRA-WC030

RWC-DB-TEXT-94-2

Morphologically analysed data from JEIDA's annual report.
Language(s) : Japanese

ELRA-WC031

RWC-DB-TEXT-95-3

The same set of articles as in RWC-DB-TEXT-95-2, tagged with the Universal Decimal Classification (UDC).
Language(s) : Japanese

ELRA-WC032

RWC-DB-TEXT-95-1

A 100-million-word corpus of newspaper articles.
Language(s) : Japanese

ELRA-WC033

RWC-DB-TEXT-94-1

It contains MITI (Ministry of International Trade and Industry) white papers from 1993 to 1995.
Language(s) : Japanese

ELRA-WC034

Parallel English and Japanese Text

Text corpus whose source is "Heisei Highs and Lows".
Language(s) : English - Japanese

ELRA-WC036

SENSEVAL-SEMCOR

It contains 100 sense-tagged instances and 83 words.
Language(s) : English
