Log-spectral distance

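The log-spectral distance (LSD), also referred to as log-spectral distortion or root mean square log-spectral distance, is a distance measure (expressed in dB) between two spectra.[1] The log-spectral distance between spectra $P(\omega)$ and $\hat{P}(\omega)$ is defined as:

$$D_{LS} = \sqrt{\frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ 10 \log_{10} \frac{P(\omega)}{\hat{P}(\omega)} \right]^2 \, d\omega}$$

where $P(\omega)$ and $\hat{P}(\omega)$ are power spectra. Unlike the Itakura–Saito distance, the log-spectral distance is symmetric.[2]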
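For spectra sampled on a uniform frequency grid, the distance is commonly approximated by the root mean square, taken over frequency bins, of 10 log10 of the ratio of the two power spectra. The following is a minimal sketch under that assumption; the function name `log_spectral_distance` and the floor parameter `eps` are illustrative choices, not part of any standard library.

```python
import numpy as np

def log_spectral_distance(p, p_hat, eps=1e-12):
    """Approximate the log-spectral distance (in dB) between two power spectra.

    Assumes both spectra are sampled on the same uniform frequency grid, so
    the normalized integral over frequency becomes a mean over bins.
    `eps` is a hypothetical floor that guards against log(0).
    """
    p = np.maximum(np.asarray(p, dtype=float), eps)
    p_hat = np.maximum(np.asarray(p_hat, dtype=float), eps)
    if p.shape != p_hat.shape:
        raise ValueError("spectra must have the same shape")
    # Per-bin difference of the log power spectra, in dB.
    diff_db = 10.0 * np.log10(p / p_hat)
    # Root mean square over frequency bins.
    return float(np.sqrt(np.mean(diff_db ** 2)))
```

Because the log ratio is squared, swapping the two arguments only flips the sign of each per-bin difference and leaves the result unchanged, which is the symmetry property noted above.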

In speech coding, the log-spectral distortion for a given frame is defined as the root mean square difference between the original LPC log power spectrum and the quantized or interpolated LPC log power spectrum. The spectral distortion is usually averaged over a large number of frames, and this average is used as the measure of performance of the quantization or interpolation.
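The per-frame computation and the averaging step can be sketched as follows. This is an illustrative example, not a reference implementation: it assumes each frame's LPC filter is given as a coefficient vector [1, a_1, ..., a_p] of the polynomial A(z), that the LPC power spectrum is gain² / |A(e^{jω})|² evaluated on a uniform FFT grid from 0 to π, and that A has no zeros exactly on the unit circle. The function names, `n_fft`, and `gain` are hypothetical.

```python
import numpy as np

def lpc_power_spectrum(a, n_fft=512, gain=1.0):
    """Power spectrum gain^2 / |A(e^{jw})|^2 of an all-pole LPC model.

    `a` holds the coefficients [1, a_1, ..., a_p] of A(z) (assumed convention).
    """
    a = np.asarray(a, dtype=float)
    # Zero-padded FFT evaluates A(z) on n_fft // 2 + 1 points from 0 to pi.
    A = np.fft.rfft(a, n=n_fft)
    return gain ** 2 / np.abs(A) ** 2

def mean_spectral_distortion(lpc_orig, lpc_quant, n_fft=512):
    """Return (average distortion in dB, per-frame distortions in dB)."""
    per_frame = []
    for a, a_q in zip(lpc_orig, lpc_quant):
        p = lpc_power_spectrum(a, n_fft)
        p_q = lpc_power_spectrum(a_q, n_fft)
        # Difference of the two log power spectra, in dB, per frequency bin.
        diff_db = 10.0 * np.log10(p / p_q)
        # RMS over frequency gives the distortion for this frame.
        per_frame.append(np.sqrt(np.mean(diff_db ** 2)))
    per_frame = np.array(per_frame)
    return float(np.mean(per_frame)), per_frame

# Two toy first-order frames and slightly perturbed ("quantized") versions.
original = [[1.0, -0.9], [1.0, 0.5]]
quantized = [[1.0, -0.85], [1.0, 0.45]]
avg_sd, frame_sd = mean_spectral_distortion(original, quantized)
print(f"average spectral distortion: {avg_sd:.3f} dB")
```

In practice the average would be taken over many frames of real speech; the two toy frames here only illustrate the mechanics.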

References

  1. Rabiner, Lawrence R.; Juang, Biing-Hwang (1993). Fundamentals of Speech Recognition. PTR Prentice Hall.
  2. Enqvist, Per; Karlsson, Johan (2008). "Minimal Itakura-Saito Distance and Covariance Interpolation". 2008 47th IEEE Conference on Decision and Control: 137–142. doi:10.1109/CDC.2008.4739312. ISBN 978-1-4244-3123-6.


This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.