Catalog Reference : ELRA-U-W0383
FALKO corpus of learner German
This corpus contains texts produced by learners of German with 49 different native languages. The best-represented learner groups are from Denmark, England, France, Poland and Russia.
It comprises several sub-corpora:
- the summary corpus (text summaries written by advanced learners of German),
- the essay corpus (essays written by advanced learners),
- a baseline corpus of native-speaker data (available for the essay corpus),
- the longitudinal corpus (data collected over several semesters from learners with different proficiency levels).
The corpus was automatically annotated for part of speech and lemma. Dedicated annotation levels are provided for labelling learner errors.
The learner part contains 132,187 tokens and the native-speaker part contains 88,730 tokens. Work to enlarge the database is still in progress.
Identification
Period of coverage :
Version : 2009
Version history :
Contents
written corpus
Number of languages : Monolingual
Language(s) : German
Annotation Granularity : Word
Annotation level : Morphological
Lexical Unit Information : Single word lemma
Annotation Mode : Automatic
Friday 01 November, 2024
Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4