Language Resources
Displaying 41 to 60 (of 730 products)
Result Pages: 3
This Estonian corpus (mostly newspapers and fiction) is divided by decade, from 1890 to 1999. The texts come in two versions: annotated according to the TEI, or unannotated.
Language(s) : Estonian (Estonia)
The FIDA corpus is a reference corpus of the Slovene language containing 100,000,000 words. It gathers contemporary written texts and transcripts of speech data across various genres, from literary to scientific texts.
Language(s) : Slovenian (Slovenia)
This corpus contains 188,650 words of Spanish which have been syntactically annotated within the framework of the CESS-ECE project.
Language(s) : Spanish (Spain)
This corpus contains 492,846 words of Catalan which have been syntactically annotated within the framework of the CESS-ECE project.
Language(s) : Catalan (Spain)
AnCora-ESP is a Spanish corpus of 188,513 words which has been semantically annotated (still under development, aim: 500,000 words).
Language(s) : Spanish (Spain)
AnCora-CAT is a Catalan corpus of 395,379 words which has been semantically annotated (still under development, aim: 500,000 words).
Language(s) : Catalan (Spain)
This corpus contains 350,000 words of Basque which have been syntactically annotated within the framework of the CESS-ECE project (still under development).
Language(s) : Basque (Spain)
The Europarl Corpus is a multilingual collection of texts extracted from the proceedings of the European Parliament. It covers 11 languages: Danish, German, Greek, English, Spanish, Finnish, French, Italian, Dutch, Portuguese, and Swedish. Each language has close to 55 million words.
Language(s) : Danish (Denmark) - German (Germany) - Greek (Greece) - English (United Kingdom) - Spanish (Spain) - Finnish (Finland) - French (France) - Italian (Italy) - Dutch (Netherlands) - Swedish (Sweden) - Portuguese (Portugal)
This Danish-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 1,684,664 aligned sentences, with 43,692,760 words in Danish (L1) and 46,282,519 words in English (L2).
Language(s) : Danish (Denmark) - English (United Kingdom)
This Greek-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 960,356 aligned sentences.
Language(s) : Greek (Greece) - English (United Kingdom)
This Spanish-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 1,689,850 aligned sentences, with 48,860,242 words in Spanish (L1) and 46,843,295 words in English (L2).
Language(s) : Spanish (Spain) - English (United Kingdom)
This Finnish-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 1,646,143 aligned sentences, with 32,355,142 words in Finnish (L1) and 45,136,552 words in English (L2).
Language(s) : Finnish (Finland) - English (United Kingdom)
This French-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 1,723,705 aligned sentences, with 51,708,806 words in French (L1) and 47,915,991 words in English (L2).
Language(s) : French (France) - English (United Kingdom)
This Italian-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 1,635,140 aligned sentences, with 46,380,851 words in Italian (L1) and 47,236,441 words in English (L2).
Language(s) : Italian (Italy) - English (United Kingdom)
This Dutch-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 1,715,710 aligned sentences, with 47,477,378 words in Dutch (L1) and 47,166,762 words in English (L2).
Language(s) : Dutch (Netherlands) - English (United Kingdom)
This Portuguese-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 1,681,991 aligned sentences, with 47,621,552 words in Portuguese (L1) and 47,000,805 words in English (L2).
Language(s) : Portuguese (Portugal) - English (United Kingdom)
This Swedish-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 1,570,411 aligned sentences, with 38,537,243 words in Swedish (L1) and 42,810,628 words in English (L2).
Language(s) : Swedish (Sweden) - English (United Kingdom)
This German-English parallel corpus is extracted from the proceedings of the European Parliament (04/1996-10/2009). It contains 1,581,107 aligned sentences, with 41,587,670 words in German (L1) and 43,848,958 words in English (L2).
Language(s) : German (Germany) - English (United Kingdom)
CAST3LB is a Spanish treebank of 100,000 words corresponding to 4,000 sentences. The annotation covers part-of-speech (POS) tags for morphosyntactic information, and constituents and functions for syntactic information.
Language(s) : Spanish (Spain)
CAT3LB is a Catalan treebank of 100,000 words corresponding to 2,600 sentences. The annotation covers part-of-speech (POS) tags for morphosyntactic information, and constituents and functions for syntactic information.
Language(s) : Catalan (Spain)