A speaker-independent bimodal phonetic classification experiment
on the Italian plosive consonants is described. The phonetic
classification scheme is based on a feed-forward recurrent
back-propagation neural network operating on audio and visual
information. The speech signal is processed by an auditory model
that produces spectral-like parameters, while the visual signal is
processed by specialized hardware, called ELITE, that computes lip
and jaw kinematic parameters.
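As a rough illustration of the bimodal scheme described in the abstract, the sketch below concatenates per-frame audio and visual parameters and passes them through a small recurrent network. It is not the authors' implementation: the class set (/p t k b d g/), the layer sizes, the Elman-style recurrence, and every name in the code are assumptions made for the example, and the weights are random rather than trained with back-propagation.

```python
import math
import random

# Assumed class set: the six Italian plosives. The record does not list the classes.
PLOSIVES = ["p", "t", "k", "b", "d", "g"]


def fuse(audio_frame, visual_frame):
    """Concatenate one frame of audio parameters with one frame of visual parameters."""
    return list(audio_frame) + list(visual_frame)


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyRecurrentClassifier:
    """Hypothetical Elman-style recurrent network with untrained random weights."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.W_in = random_matrix(n_hidden, n_in, rng)
        self.W_rec = random_matrix(n_hidden, n_hidden, rng)
        self.W_out = random_matrix(n_out, n_hidden, rng)
        self.n_hidden = n_hidden

    def forward(self, frames):
        # The hidden state carries context from earlier frames to later ones.
        h = [0.0] * self.n_hidden
        for x in frames:
            a = matvec(self.W_in, x)
            r = matvec(self.W_rec, h)
            h = [math.tanh(ai + ri) for ai, ri in zip(a, r)]
        # Class probabilities are read out after the last frame.
        return softmax(matvec(self.W_out, h))


def classify(model, audio_frames, visual_frames):
    """Fuse time-aligned audio and visual frames, then pick the most probable class."""
    frames = [fuse(a, v) for a, v in zip(audio_frames, visual_frames)]
    probs = model.forward(frames)
    best = max(range(len(probs)), key=probs.__getitem__)
    return PLOSIVES[best], probs
```

With, say, three audio parameters and two visual parameters per frame, `TinyRecurrentClassifier(n_in=5, n_hidden=4, n_out=6)` accepts the fused frames; the predicted label is meaningless until the weights are trained.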
Publication type:
Conference proceedings contribution
Publisher:
Citation Delaware, New Castle, Delaware, USA
Source:
ICSLP-1996, International Conference on Spoken Language Processing, pp. 54–57, Philadelphia, PA, USA, October 3–6, 1996
Date:
1996
Resource Identifier:
http://www.cnr.it/prodotto/i/241415
https://dx.doi.org/10.1109/ICSLP.1996.607023
info:doi:10.1109/ICSLP.1996.607023
http://www2.pd.istc.cnr.it/Papers/PieroCosi/cp-ICSLP96.pdf
urn:isbn:0-7803-3555-4
Language:
English