This research focuses on the spatio-temporal characteristics of lip and
jaw movements and on their relevance for lip-reading, bimodal communication
theory and bimodal recognition applications. 3D visible articulatory targets for
vowels and consonants are proposed. Relevant modifications of the spatio-temporal
consonant targets due to coarticulatory phenomena are exemplified.
When visual parameters are added to acoustic ones as inputs to a Recurrent Neural
Network system, high recognition rates are obtained in plosive classification
experiments.
Publication type:
Contribution in volume (book chapter)
Publisher:
Springer-Verlag, Berlin/Heidelberg, DEU
Source:
Speechreading by Humans and Machines: Models, Systems, and Applications, edited by D.G. Stork and M.E. Hennecke, pp. 291–313. Berlin/Heidelberg: Springer-Verlag, 1996
Date:
1996
Resource Identifier:
http://www.cnr.it/prodotto/i/238198
http://www.pd.istc.cnr.it/Papers/PieroCosi/cp-NATO95.pdf
urn:isbn:3-540-61264-5
Language:
English