Rate Distortion Modeling of Spherical Vector Quantization Performance in Transform Audio Coding

Wisarn Patchoo
Thomas R. Fischer

Abstract

A block-based Gaussian mixture model (GMM) is used to model the distribution of transform audio data encoded using spherical vector quantization and lossless coding. The expectation-maximization algorithm is used to design the GMM, which models the marginal density of the transform coefficients and the block energy density. A GMM-based rate-distortion function is derived and shown to closely match the observed spherical VQ performance.
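
As an illustration of the expectation-maximization step mentioned in the abstract, the following is a minimal sketch of EM fitting for a scalar Gaussian mixture. It is not the authors' block-based design: the function names (`gaussian_pdf`, `fit_gmm_em`), the sorted-chunk initialisation, the fixed iteration count, and the variance floor are all assumptions made for this example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, num_components, num_iters=50, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances).

    Hypothetical helper for illustration. Assumes len(data) >= num_components.
    """
    n = len(data)

    # Initialisation (an assumption of this sketch): split the sorted data
    # into equal-sized chunks and use each chunk's mean and variance.
    ordered = sorted(data)
    bounds = [round(k * n / num_components) for k in range(num_components + 1)]
    weights, means, variances = [], [], []
    for k in range(num_components):
        part = ordered[bounds[k]:bounds[k + 1]]
        mean = sum(part) / len(part)
        var = sum((x - mean) ** 2 for x in part) / len(part)
        weights.append(len(part) / n)
        means.append(mean)
        variances.append(max(var, var_floor))

    for _ in range(num_iters):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            resp.append([j / total for j in joint])

        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(num_components):
            mass = sum(r[k] for r in resp)
            if mass <= 0.0:
                continue  # leave an empty component unchanged
            weights[k] = mass / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / mass
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / mass
            variances[k] = max(var, var_floor)

    return weights, means, variances


if __name__ == "__main__":
    # Synthetic two-component data, standing in for real transform coefficients.
    rng = random.Random(0)
    samples = ([rng.gauss(-3.0, 1.0) for _ in range(1000)]
               + [rng.gauss(3.0, 1.0) for _ in range(1000)])
    print(fit_gmm_em(samples, num_components=2))
```

The sketch uses only the standard library so that each E-step and M-step is visible; a practical design would more likely rely on an established mixture-model implementation and a convergence test on the log-likelihood rather than a fixed number of iterations.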

How to Cite
Patchoo, W., & Fischer, T. R. (2010). Rate Distortion Modeling of Spherical Vector Quantization Performance in Transform Audio Coding. ECTI Transactions on Electrical Engineering, Electronics, and Communications, 9(1), 49–55. https://doi.org/10.37936/ecti-eec.201191.172265
Section
Research Article
