Voice quality is recognized to play an important role in the
rendering of emotions in verbal communication. In this paper
we explore the effectiveness of a sinusoidal modeling framework for voice transformations aimed at the analysis and synthesis of emotive speech. A set of acoustic cues
is selected to compare the voice quality characteristics of the
speech signals on a voice corpus in which different emotions
are reproduced. The sinusoidal signal processing tool is used
to convert a neutral utterance into emotive utterances. Two different procedures are applied and compared: in the first one,
only the alignment of phoneme duration and of pitch contour
is performed; the second procedure refines the transformations
by using a spectral conversion function. This refinement improves the reproduction of the different voice qualities of the
target emotive utterances. The acoustic cues extracted from the
transformed utterances are compared with those of the original emotive utterances, and the properties and quality of the transformation
method are discussed.
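
The first procedure (alignment of duration and pitch contour) can be illustrated with a minimal sketch. It is not the authors' tool: it assumes a purely harmonic sinusoidal model with frame-wise fundamental frequency and harmonic amplitudes, and the function names `warp_amplitudes` and `synthesize`, the hop size, and the sampling rate are all hypothetical choices made for illustration. The spectral conversion function of the second procedure is not covered.

```python
import numpy as np

def warp_amplitudes(amps_src, n_frames_tgt):
    """Linearly resample harmonic amplitude tracks (frames x harmonics)
    so the source utterance takes the target number of frames."""
    n_src, n_harm = amps_src.shape
    src_pos = np.linspace(0.0, n_src - 1, n_frames_tgt)
    idx = np.arange(n_src)
    return np.stack(
        [np.interp(src_pos, idx, amps_src[:, k]) for k in range(n_harm)],
        axis=1,
    )

def synthesize(f0, amps, hop, sr):
    """Sum-of-harmonics synthesis: harmonic k+1 follows (k+1) times the
    running phase of the fundamental, with per-sample interpolated tracks."""
    n_frames, n_harm = amps.shape
    n_samples = (n_frames - 1) * hop + 1
    frame_t = np.arange(n_frames) * hop
    t = np.arange(n_samples)
    f0_s = np.interp(t, frame_t, f0)               # per-sample pitch (Hz)
    phase = 2.0 * np.pi * np.cumsum(f0_s) / sr     # integrated fundamental phase
    out = np.zeros(n_samples)
    for k in range(n_harm):
        a = np.interp(t, frame_t, amps[:, k])
        a = np.where((k + 1) * f0_s < sr / 2, a, 0.0)  # drop partials above Nyquist
        out += a * np.cos((k + 1) * phase)
    return out

# Hypothetical "neutral" source: 50 frames, 3 harmonics with fixed amplitudes.
amps_neutral = np.tile([0.5, 0.25, 0.125], (50, 1))
# Hypothetical "emotive" target: 80 frames with a rising pitch contour.
f0_target = np.linspace(180.0, 220.0, 80)

amps_aligned = warp_amplitudes(amps_neutral, len(f0_target))
signal = synthesize(f0_target, amps_aligned, hop=80, sr=16000)
```

Here the neutral amplitude tracks are stretched to the target duration and resynthesized on the target pitch contour. Because the spectral envelope is carried over unchanged, a sketch like this transfers only prosody, which is the limitation the second procedure in the abstract is meant to address.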
Emotions and Voice Quality: Experiments with Sinusoidal Modeling
Publication type:
Conference proceedings paper
ISCA, International speech communication association, Baixas, FRA
VOQUAL 2003, Voice Quality: Functions, Analysis and Synthesis, ISCA Workshop, pp. 127–132, Geneva, Switzerland, August 27-29, 2003