The development of a speaker-independent "general purpose"
phonetic recognizer for Italian is described. The CSLU Toolkit
was used to develop and implement the system. The recognizer,
based on a frame-based hybrid HMM/ANN architecture trained
on context-dependent categories to account for coarticulatory
variation, recognizes 38 different phonemes (not including
silence or closures), and can distinguish between stressed and
unstressed vowels as well as open and closed vowels. The
APASCI corpus, containing nearly 2500 sentences read by 100
speakers, where the sentences have been designed to maximize
the number of phonemes occurring in different contexts, was
used for training and testing. As of the time of this writing, a
phoneme-level accuracy of 82.90% on the development set and
of 80.53% on the test set has been obtained. This level of
accuracy is much higher than that reported on a similar
English-language corpus (where state-of-the-art performance is
slightly better than 70%), and it represents the best performance
obtained so far on the APASCI corpus.
High Performance "General Purpose" Phonetic Recognition for Italian
Publication type:
Conference proceedings paper
Publisher:
China Military Friendship Publish, Beijing, CHN
Source:
ICSLP 2000 - 6th International Conference on Spoken Language Processing, pp. 527–530, Beijing, China, 16-20 October 2000
Date:
2000
Resource Identifier:
http://www.cnr.it/prodotto/i/240945
http://www2.pd.istc.cnr.it/Papers/PieroCosi/cp-ICSLP2000-02.pdf
urn:isbn:7-80150-114-4
Language:
Eng