Catalog Reference: ELRA-U-W 0193
Repentino
Repentino is a collection of textual named entity instances: proper nouns that each denote a specific entity, classified by the kind of entity they denote (company, book title, place name, etc.). It currently gathers more than 450,000 instances, stored in XML, extracted mainly from a large document collection and also from several thematic Web sites.
It has been manually validated to ensure its quality. It is organised into 11 major categories, which are in turn subdivided into 97 subcategories:
- Location: Terrestrial, Hydrographic, Address, ...
- Organisations: Company, Government/Administration, ...
- Beings: Human, Human-Collective, Non-Human, ...
- Event: Ephemerid, Cyclic, ...
- Products: Brands, Consumables, Electronics/Appliances, ...
- Art/Media/Communication: Books, Movies, TV/Radio/Theatre, ...
- Paperwork: Laws, Certificates, Documents, ...
- Substance: Group, Ore, ...
- Abstraction: Disciplines/Crafts, Period/Movement/Trend, ...
- Nature: Animal, Physiology, Micro-organisms, ...
- Miscellanea.
It is intended to help the development of named entity recognition systems for Portuguese.
REPENTINO stands for REPositório para reconhecimento de ENTIdades com NOme (repository for the recognition of named entities).
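As a rough illustration of how a categorised list of named entity instances can support a named entity recognition system, the sketch below loads instances from XML into a gazetteer and tags a sentence by longest-match lookup. The XML layout (the `entities` and `entity` elements and the `category` and `subcategory` attributes) is an assumption made for this example and is not the actual Repentino schema; the sample instances are illustrative and were not taken from the resource. The function names `load_gazetteer` and `tag_entities` are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: element and attribute names are assumptions made for
# illustration, not the actual Repentino schema. The instances are examples
# only and were not taken from the resource.
SAMPLE_XML = """
<entities>
  <entity category="Location" subcategory="Terrestrial">Lisboa</entity>
  <entity category="Location" subcategory="Hydrographic">Rio Tejo</entity>
  <entity category="Organisations" subcategory="Company">Galp Energia</entity>
</entities>
"""


def load_gazetteer(xml_text):
    """Map each lower-cased instance to its (category, subcategory) pair."""
    root = ET.fromstring(xml_text)
    gazetteer = {}
    for node in root.iter("entity"):
        name = (node.text or "").strip()
        if name:
            gazetteer[name.lower()] = (node.get("category"), node.get("subcategory"))
    return gazetteer


def tag_entities(text, gazetteer, max_len=4):
    """Greedy longest-match lookup over whitespace-separated tokens."""
    tokens = text.split()
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            label = gazetteer.get(candidate.lower())
            if label:
                found.append((candidate, label))
                i += n
                break
        else:
            # No span starting at this token matched; move on.
            i += 1
    return found


gazetteer = load_gazetteer(SAMPLE_XML)
matches = tag_entities("O Rio Tejo passa por Lisboa", gazetteer)
```

This lookup ignores punctuation and ambiguity (the same string may belong to several categories), so in practice a gazetteer like this would more plausibly serve as one feature or seed list within a larger recognition system.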
Applications
Application area: Research
Contents
Written corpus
Number of languages: Monolingual
Language(s): Portuguese
Joint Copyright © 2008 ELRA & ELDA