Estimating the Spectral Envelope of Voiced Speech Using Multi-frame Analysis

Yoshi Shiga

In this talk, I will be presenting a novel approach for estimating the spectral envelope of voiced speech independently of its harmonic structure. Because of the quasi-periodicity of voiced speech, its spectrum indicates harmonic structure and only has energy at frequencies corresponding to integral multiples of F0. It is hence impossible to identify transfer characteristics between the adjacent harmonics. In order to resolve this problem, Multi-frame Analysis (MFA) is introduced. The MFA estimates a spectral envelope using many portions of speech which are vocalised using the same vocal- tract shape. Since each of the portions usually has a different F0 and ensuing different harmonic structure, a number of harmonics can be obtained at various frequencies to form a spectral envelope. The method is thereby expected to give a closer approximation to the vocal-tract transfer function.