Catalog Reference : ELRA-U-S 0132
BD-PUBLICO
BD-PUBLICO is a database of read speech; 120 speakers were recorded in a sound-proof room with a high-quality microphone (at a sampling frequency of 16 kHz).
Six months of news text was collected from the newspaper PUBLICO, for a total of 10 million words. The corpus was divided into three subsets: training, development and evaluation.
The speakers were between 19 and 28 years old, and a wide variety of accents is represented.
A pronunciation lexicon with citation phonemic transcriptions for each word was also produced.
The aim was to create a corpus equivalent in size to the WSJ0 database.
BD-PUBLICO stands for Base de Dados em Portugues eUropeu, vocaBulario Largo, Independente do orador e fala COntinua.
Applications
Possible applications : Speech recognition
Application area : Research
Contents
speech corpus
Language(s) : Portuguese (Portugal)
Source Channel : Microphone
Speech Acquisition Mode : Acoustic
Transcription Entries : Orthographic
Friday 22 November, 2024
Joint Copyright © 2008 ELRA & ELDA