DK87-90 is a text corpus with samples from 100 new novels, 50 magazines and 50 newspapers, from the years 1987-90. It consists of 4 million running words.
The Corpus BySoc consists of transcriptions of approximately 80 long conversations with ordinary Danes. It contains 1.3 million running words of free-style speech.
It contains 5,000 sentences from 400 Japanese newspaper articles, annotated with predicate-argument relations, coreference relations, and relations between nouns.