

Audio Watermarking through Deterministic plus Stochastic Signal Decomposition



This paper describes an audio watermarking scheme based on sinusoidal signal modeling. To embed a watermark in an original signal (referred to as a cover signal hereafter), the following steps are taken. (a) A short-time Fourier transform is applied to the cover signal. (b) Prominent spectral peaks are identified and removed. (c) Their frequencies are subjected to quantization index modulation. (d) Quantized spectral peaks are added back to the spectrum. (e) Inverse Fourier transform and overlap-adding produce a watermarked signal. To decode the watermark, frequencies of prominent spectral peaks are estimated by quadratic interpolation on the magnitude spectrum. Afterwards, a maximum-likelihood procedure determines the binary value embedded in each frame. Results of testing against lossy compression, low- and highpass filtering, reverberation, and stereo-to-mono reduction are reported. A Hamming code is adopted to reduce the bit error rate (BER), and ways to improve sound quality are suggested as future research directions.
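As a rough illustration of steps (c) and the decoder's peak estimation, the sketch below applies quantization index modulation to a single peak frequency and recovers it by quadratic interpolation on the log-magnitude spectrum. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the uniform step size `delta` (in Hz), the two lattices offset by `delta / 2`, the Hann window, and the minimum-distance decoder (used here in place of the maximum-likelihood procedure the abstract mentions) are all hypothetical choices made for illustration.

```python
import numpy as np

def qim_embed(freq_hz, bit, delta=10.0):
    """Quantize freq_hz onto the lattice for `bit` (assumed: two lattices offset by delta/2)."""
    offset = bit * delta / 2.0
    return np.round((freq_hz - offset) / delta) * delta + offset

def qim_decode(freq_hz, delta=10.0):
    """Return the bit whose lattice point lies closest to freq_hz (minimum-distance rule)."""
    distances = [abs(freq_hz - qim_embed(freq_hz, b, delta)) for b in (0, 1)]
    return int(np.argmin(distances))

def peak_frequency(frame, sample_rate):
    """Estimate the strongest peak's frequency by fitting a parabola to log magnitude."""
    n = len(frame)
    mag = np.abs(np.fft.rfft(frame * np.hanning(n)))
    k = int(np.argmax(mag[1:-1])) + 1           # skip the edge bins so k-1 and k+1 exist
    a, b, c = np.log(mag[k - 1:k + 2] + 1e-12)  # small constant avoids log(0)
    shift = 0.5 * (a - c) / (a - 2.0 * b + c)   # vertex of the parabola, in bins
    return (k + shift) * sample_rate / n

# Usage: move a 1003 Hz partial onto the lattice for bit 1, synthesize it, decode it.
sample_rate, n = 44100, 4096
marked_hz = qim_embed(1003.0, bit=1)
t = np.arange(n) / sample_rate
frame = np.sin(2.0 * np.pi * marked_hz * t)
estimated_hz = peak_frequency(frame, sample_rate)
decoded_bit = qim_decode(estimated_hz)
```

With these assumed parameters, the decoder tolerates a frequency estimation error of up to a quarter of `delta` before the nearest lattice point changes, so a larger step trades audibility of the frequency shift for robustness.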




Author information

Correspondence to Yi-Wen Liu.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article


  • Index Modulation
  • Inverse Fourier Transform
  • Cover Signal
  • Watermark Scheme
  • Sinusoidal Signal