|
Language Resources |
Displaying 61 to 80 (of 97 products) |
Result Pages: 4 |
This is a lexicon of approximately 40,000 lemmas; the description includes a subset of derived words. It allows the generation of more than 1,200,000 tokens, each accompanied by an attribute-value description.
Language(s) : Romanian
|
|
|
|
This explanatory dictionary contains the vocabulary of the Hungarian literary and common language. The compiled data spans the period from the beginning of the 19th century to the present.
It is based on the electronic Hungarian Historical Corpus, containing 27 million running words, on the archive of 6 million dictionary notes, and on other texts from CD-ROMs.
Language(s) : Hungarian
|
|
|
|
This collection of lexicons covers many languages:
- Arabic (Iraqi, Egyptian, ...), Bahasa Indonesia, Basque, Catalan, Croatian, Czech, Dari, Dutch, English (Australian, Indian, ...), French (Canadian, Belgian, ...), German, Hebrew, Hindi, Italian, Japanese, Korean, Mandarin, Pashto, Persian/Farsi, Polish, Portuguese (Brazilian, European), Russian, Serbian, Sorani (Kurdish), Spanish (European, American, ...), Swedish, Thai, Urdu.
It also covers many types of content: common words (with morphological and pronunciation information), family names, proper names, numbers, etc.
Language(s) :
|
|
|
|
This is an electronic lexicon for Mandarin Chinese containing 80,000 entries.
Language(s) : Chinese
|
|
|
|
This is an electronic lexicon for Mandarin Chinese containing 88,000 entries.
Language(s) : Chinese
|
|
|
|
This database was compiled from the Sinica Corpus. It is composed of high-frequency initial and final morphemes (4,025 in total).
Language(s) : Chinese
|
|
|
|
This Chinese lexicon contains 42,138 entries, each with frequency information.
Language(s) : Chinese
|
|
|
|
This Chinese word list was extracted from the Sinica Corpus (version 3.0). Each word is associated with a part of speech and a frequency.
Language(s) : Chinese
|
|
|
|
This Swedish dictionary contains approximately 60,000 lemmas illustrated with several thousand phrases.
Language(s) : Swedish
|
|
|
|
MiniDir-Cat is a Catalan lexicon designed specifically as a resource for word sense disambiguation (WSD) tasks.
Language(s) : Catalan
|
|
|
|
MiniDir is a Spanish lexicon designed specifically for automatic WSD.
Language(s) : Spanish
|
|
|
|
This is a monolingual dictionary for German (20th century). It contains more than 100 million words.
Language(s) : German
|
|
|
|
The Texai lexicon was created by merging WordNet, OpenCyc mappings, the CMU Pronouncing Dictionary and an extract from Wiktionary. It is a broad-coverage English lexicon with word forms, pronunciations, word senses, glosses, and sample phrases.
Language(s) : English
|
|
|
|
This Polish resource consists of three parts: common words, proper names, and special application words. It contains approximately 50,000 common words extracted from large text corpora, 45,000 proper names (PN lexicon), and 5,000 special application words relevant for voice-driven applications (SAP lexicon).
Each lexical entry is transcribed phonetically.
Language(s) : Polish
|
|
|
|
VerbNet is a class-based verb lexicon containing 3,769 lemmas. It is a hierarchical, domain-independent lexicon that organizes verbs into classes sharing common syntax and semantic linking.
It is organized into 274 verb classes, described by thematic roles, restrictions on the arguments, and frames made of a syntactic description as well as semantic predicates with a temporal indication.
Language(s) : English
|
|
|
|
The eXtended WordNet is a lexical-semantic network based on Princeton WordNet 2.0 (a lexical reference system). Each definitional gloss is transformed into a specific format that makes derivation and logical relations available to automatic knowledge processing systems.
Language(s) : English
|
|
|
|
The "Tesouro informatizado da lingua galega" is a lexicographic database with lemmatized forms and grammatical categories.
It offers information about language use in more than 1,400 texts dating from 1850 to 2002.
Language(s) : Galician
|
|
|
|
The "Cronfa Electroneg o Gymraeg" is a lexical database in Welsh. It contains 1 million words from a corpus of 500 texts covering different fields in Welsh prose writing. Each word is associated with part-of-speech, lemma and frequency in the corpus.
Language(s) : Welsh
|
|
|
|
This is a lexical database of phonological similarities between French words. It contains 105,464 words of 2 to 8 phonemes from the lexical database "Lexique 3", as well as 50,893 trisyllabic words and 8,624 quadrisyllabic words.
Language(s) : French
|
|
|
|
NorNet is a monolingual WordNet of modern Norwegian which contains approximately 80,000 lexical relations.
This is a lexical-semantic network that is structured along the same lines as the Princeton WordNet (lexical reference system).
Language(s) : Norwegian
|
|
|
|