Universal Catalogue  
Catalog Reference : ELRA-U-W0383
FALKO corpus of learner German
This corpus contains texts produced by learners of German with 49 different native languages. The best-represented learner groups are from Denmark, England, France, Poland and Russia.

It comprises several sub-corpora:
- the summary corpus (text summaries written by advanced learners of German),
- the essay corpus (essays written by advanced learners),
- a baseline corpus of native speaker data (available for the essay corpus),
- the longitudinal corpus (data collected over several semesters from learners with different proficiency levels).

It was automatically annotated for part of speech and lemma. Dedicated annotation levels are provided for labelling learner errors.

The learner part contains 132,187 tokens and the native speaker part contains 88,730 tokens. Work to enlarge the database is still in progress.
Identification
Period of coverage :
Version : 2009
Version history :
Contents : written corpus

Joint Copyright © 2008 ELRA & ELDA
Universal Catalogue 1.0.4