Universal Catalogue  
Written Corpora
Displaying 621 to 640 (of 730 products)

ELRA-WC0001
BulTreeBank 


This is a treebank for Bulgarian annotated with detailed syntactic information.
Language(s) : Bulgarian



ELRA-WC2
BLIS Parallel Text (Hong Kong Hansards) 


Hong Kong Hansards contains excerpts from the Official Record of Proceedings (hansards) of the Legislative Council of Hong Kong.
Language(s) : Chinese - English



ELRA-WC3
EFE News Text 


Written data from the EFE news agency.
Language(s) : Spanish



ELRA-WC300
Monolingual Web Corpus 


3TB of data downloaded from the web and filtered
Language(s) : English



ELRA-WC301
MUSE Corpus 


It amounts to 300k and consists of annotated data in the domain of news politics.
Language(s) : Greek



ELRA-WC302
TimeBank Corpus 


The corpus contains 186 news report documents, with a total of 68.5K words.
Language(s) : English



ELRA-WC304
Italian Newspaper Corpus 


This corpus contains 17 articles for a total of 10,000 words, from the Italian newspaper "il Sole-24 Ore".
Language(s) : Italian



ELRA-WC305
MEDLEX Corpus 


This is a medical corpus of 50,000 documents, for a total of 20 million tokens.
Language(s) : Swedish (Sweden)



ELRA-WC306
Reference Corpus of Written Dutch 


The project is still ongoing and the corpus has not been constructed yet. The aim is to produce a 500-million-word reference corpus of written Dutch.
Language(s) : Dutch



ELRA-WC307
DiaCORIS Corpus 


The aim of the project is to extend the CORIS/CODIS Corpus. Thus, the DiaCORIS Corpus will include Italian texts produced between 1861 and 1945. The total size will be 15 million words.
Language(s) : Italian



ELRA-WC308
CORIS/CODIS Corpus 


The CORIS/CODIS corpus is a reference corpus for modern Italian. It contains texts from the last two decades of the 20th century, for a total of 100 million words.
Language(s) : Italian



ELRA-WC309
OVI (Opera del Vocabolario Italiano) Database 


It contains about 19 million words of literary and non-literary texts in prose and poetry, written in early/old Italian from the beginning of the XIII century to 1375.
Language(s) : Italian



ELRA-WC310
BIVIO (Biblioteca Virtuale On-Line) Corpus 


It consists of texts in the domain of the history of Italian Renaissance fine arts: about 200 literary and essayistic works by about 60 authors of the XV-XVII centuries.
Language(s) : Italian



ELRA-WC311
LIZ (Letteratura italiana Zanichelli) Corpus 


The corpus contains literary texts: 1000 works in poetry or prose from the XIII to the XX century.
Language(s) : Italian



ELRA-WC312
The Italian Section of the Biblioteca Digitale IntraText 


It consists of 2575 texts, mainly in the domains of religion, theology and morals.
Language(s) : Italian



ELRA-WC313
The Progetto Manuzio Corpus 


It contains about 1200 literary and non-literary texts.
Language(s) : Italian



ELRA-WC314
Annotated Czech-English Aligned Corpus 


"515 sentences from the Prague Czech-English Dependency Treebank were manually annotated."
Language(s) : Czech - English



ELRA-WC315
Hebrew Cantillation Tree Bank 


In the Masoretic text of the Hebrew Bible, the cantillation marks the division and subdivision of each verse. This structural information of every verse has been represented as a tree in XML format, constituting a cantillation tree bank.
Language(s) : Hebrew



ELRA-WC316
Hunglish Corpus Written Resources 


This is a sentence-aligned English–Hungarian parallel corpus. It contains 23.7 million English and 29.4 million Hungarian words in 2.07 million sentence pairs from 5 genres of text.
Language(s) : Hungarian - English



ELRA-WC317
Hungarian Webcorpus 


This corpus contains 1.48 billion words (589 million were fully filtered) extracted from 18 million pages downloaded from the .hu domain.
Language(s) : Hungarian




Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4