Universal Catalogue  
Catalog Reference: ELRA-U-W 0193
Repentino
Repentino is a collection of textual named entity instances: proper nouns that denote a specific entity, classified by the kind of entity they denote (company, book title, place name, etc.). Repentino currently gathers more than 450,000 instances, stored in XML, extracted mainly from a large document collection and also from several thematic Web sites.
It is organised into 11 major categories, which are in turn subdivided into 97 subcategories. It has been manually validated to ensure its quality.

- Location: Terrestrial, Hydrographic, Address, ...
- Organisations: Company, Government/Administration, ...
- Beings: Human, Human-Collective, Non-Human, ...
- Event: Ephemerid, Cyclic, ...
- Products: Brands, Consumables, Electronics/Appliances, ...
- Art/Media/Communication: Books, Movies, TV/Radio/Theatre, ...
- Paperwork: Laws, Certificates, Documents, ...
- Substance: Group, Ore, ...
- Abstraction: Disciplines/Crafts, Period/Movement/Trend, ...
- Nature: Animal, Physiology, Micro-organisms, ...
- Miscellanea.

It is intended to help the development of named entity recognition systems for Portuguese.
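
As an illustration of how a categorised list of entity instances can support named entity recognition, the sketch below performs a greedy longest-match lookup of token spans against a small in-memory gazetteer. It is only a minimal example under stated assumptions: the function `tag_entities`, the `GAZETTEER` dictionary, and the three sample entries with their category assignments are hypothetical and are not taken from the Repentino data, and the actual XML layout of the resource is not described here, so no parsing step is shown.

```python
from typing import Dict, List, Tuple

# Hypothetical sample entries for illustration only; they are NOT taken
# from Repentino. Each name maps to a (category, subcategory) pair that
# follows the naming of the category list above.
GAZETTEER: Dict[str, Tuple[str, str]] = {
    "Lisboa": ("Location", "Terrestrial"),
    "Rio Tejo": ("Location", "Hydrographic"),
    "Os Lusíadas": ("Art/Media/Communication", "Books"),
}


def tag_entities(
    tokens: List[str],
    gazetteer: Dict[str, Tuple[str, str]],
    max_len: int = 4,
) -> List[Tuple[str, str, str]]:
    """Return (name, category, subcategory) for each gazetteer match.

    At every position the longest span of up to `max_len` tokens is
    tried first, so multi-word names win over their shorter prefixes.
    """
    matches: List[Tuple[str, str, str]] = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in gazetteer:
                category, subcategory = gazetteer[candidate]
                matches.append((candidate, category, subcategory))
                i += n  # skip past the matched span
                break
        else:
            i += 1  # no match starts at this token
    return matches


if __name__ == "__main__":
    sentence = "O Rio Tejo passa por Lisboa".split()
    for name, category, subcategory in tag_entities(sentence, GAZETTEER):
        print(f"{name}: {category} / {subcategory}")
```

A pure lookup like this ignores context and ambiguity (the same name may belong to several categories), so in practice such a list is more likely to serve as one feature or resource among several in a recognition system.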

REPENTINO stands for REPositório para reconhecimento de ENTIdades com NOme.
Applications
Application area: Research
Contents
Written corpus
 

Joint Copyright © 2008 ELRA & ELDA