The need for D, DD, DDD, DDDD, ... measures (delta, delta-delta, and higher-order dynamic features) is a clear sign that classical frame-based analysis techniques lose representational capability. In particular, coarticulation effects in fluent speech are obscured by classical short-time analysis. In fact, almost every acceptable ASR system is forced to introduce this kind of post-processing in order to compensate for that loss.
Following previous work on Auditory Modeling (AM) techniques as speech analysis front-ends for automatic speech segmentation (ASS) and automatic speech recognition (ASR), this paper presents and exploits evidence against frame-based analysis techniques, and thus against the need for D, DD, ... measures.
Various examples, mostly of plosives and other non-stationary consonants, are illustrated with the aim of underlining the superiority of "sampling after processing" over "framing before processing" in speech analysis tasks.
D, DD, DDD, DDDD ...... Evidences Against Frame-Based Analysis Techniques.
Publication type:
Conference proceedings contribution
Source:
NATO Advance Institute on Computational Hearing, pp. 163–168, Il Ciocco (Tuscany), Italy, 1-12 July, 1998
Date:
1998
Resource Identifier:
http://www.cnr.it/prodotto/i/241054
http://www.pd.istc.cnr.it/Papers/PieroCosi/cp-NATO98.pdf
Language:
Eng