Voice Acoustics: an introduction

Speech science has a long history. Speech and voice acoustics are an active area of research in many labs, including our own, which studies the singing and speaking voice. This document gives an introduction and overview. This is followed by a more detailed account, sometimes using experimental data to illustrate the main points. Throughout, a number of simple experiments are suggested to the reader.
The table compares some pairs of phonemes that are pronounced with (nearly) the same articulation but with vocal fold vibration (voiced) and without vibration (unvoiced).
In fricatives, the tract is so constricted (by tongue, palate, teeth, lips or a combination) that sustained turbulent flow contributes broad-band sound to the spectrum. Plosives involve opening and/or closing of the tract with the lips (p, b) or the tongue (t, d; k, g) at different places of articulation. The sudden opening or closing and associated turbulence briefly produce broad band sound in plosives.
The sound of the ‘source’ interacts with the ‘filter’ (and also, as we'll see later, vice versa). Depending on how you position your tongue and the shape of your mouth opening, different frequencies will be radiated out of the mouth more or less well. Another experiment: sing a sustained note at constant pitch and loudness, while varying the opening of your mouth and the position of the tongue. This will allow you to produce most of the vowels of English and some other phonemes, such as the ‘ll’ in ‘all’ or the ‘r’ in ‘or’, as pronounced in some accents.
How you position your velum (soft palate) also makes a difference. In the normal (high) position, all of the air and sound goes through the mouth. Lower it and you connect the nasal pathway to the mouth and lower vocal tract. Lower it further and you seal the mouth off from the pathway from nose to larynx. For the next experiment, observe the differences between a nasal sound (‘ng’) and a non-nasal one (‘ah’), then try sealing and unsealing your nose with your fingers, and also opening and closing your mouth, which will tell you how completely your velum seals one of the pathways.
To a large extent, vowels in English are determined by how much the mouth is opened, and where the tongue constricts the passage through the mouth: front, back or in between. One can ‘map’ the vowels in terms of these articulatory details, or in terms of acoustic parameters that are closely related to them. Here are ‘maps’ for two different accents of English.
The frequencies on the axes correspond to bands of frequencies that are efficiently radiated, about which more later. The vertical axis on these graphs roughly corresponds to the jaw position (high or low) or the size of the lip opening. The horizontal axis corresponds to the position of the tongue constriction. We’ll return later to explain more about such maps, and how they may be obtained.
Vowel planes for two accents of English (Ghonim et al., 2007). These data were gathered in a large, automated survey in which respondents from the US (left) and Australia (right) identified synthesised words of the form h[vowel]d: a form in which most examples are real words. ‘short’ and ‘long’ indicate that more than 75% of the choices fell in these categories. You can map your own accent in this way on this web site.
Vowels and some other phonemes may be sustained over time: for them, the position of the articulators (and so the values of the well-radiated frequency bands) is relatively constant.
For other phonemes (such as the ‘p, b, t, d’ discussed above), the change in articulation is important. Consequently, so are the variations with time of the associated frequency bands, as is the broad band sound associated with the opening or closing (Smits et al., 1996; Clark et al., 2007). In the examples ‘p, b, t, d’, the mouth opening is obviously changing during the consonant. Experiment: try slowing down (a lot) the motion of opening and closing and see if you notice what seems like a change in the vowel.
Like vowels, liquids (r and l) and nasal consonants (m, n, ng) are voiced and have a characteristic set of spectral peaks. For these, the tongue provides a narrower constriction than it does for vowels.
In speech, vowels are in a sense less important than consonants: you can often understand a phrase even –f –ll v–w–l –nf–rm–t–n –s –bs–nt. On the other hand, vowels are more important in singing, because the vowel is sustained to produce a note.
The separation of parts of the voice function into ‘source’ and ‘filter’ is practical, but one should remember that the distinction is incomplete. For instance, the geometry of the vocal folds affects not only the operation of the folds and thus the source, but also affects the acoustic properties of the ‘filter’. The geometry of the vocal and nasal tracts determines how they filter the sound, but the acoustical properties resulting from this geometry are thought to affect the operation of the vocal folds. We talk about these complications below.
Contrasting the voice with wind instruments
If we neglect the influence of the articulators on the larynx, we have the Source-Filter model. Superficially, it may seem obvious to a singer that the larynx and the articulators are independent: to many singers, particularly men in the low range, it seems that we can vary the pitch (~ the source) and the vowel (~ the resonator/filter) independently.
In contrast, an analogous assumption would seem a very odd approximation to someone who plays a brass instrument. A trombonist knows that the resonances in the bore of the instrument (~ the resonator/filter) do indeed affect the motion of the player’s lips (~ the source). In fact, a brass player’s lips generally tend to oscillate at one of the frequencies of the resonances in the bore (see Acoustics of Brass Instruments). We shall return to this below, but let’s first note the following important quantitative difference between the two.
A trombone has a range that overlaps that of a man’s voice. However, the trombone is longer (a few metres) than a man’s vocal tract (0.2 m). The range of fundamental frequencies of the trombone lies within the range of the bore resonances. The range of the voice, especially of a man’s voice, usually lies below the frequencies of the vocal tract resonances. You are probably thinking that this difference – and therefore the approximation that the resonator doesn't affect the source – is most questionable for high pitches, when the fundamental of the voice enters the range of vocal tract resonances. You're right, and we’ll come back to this.
The Source-Filter model
In the Source-Filter model (Fant 1960), interactions between sound waves in the mouth and the source of sound are neglected. Although oversimplified, this model explains many important characteristics of voice production.
Let’s apply it to two of the sources that we have discussed already. In a whispered voice, turbulent flow between the vocal folds produces very many frequencies: the spectrum looks like a continuous line. For normal voiced speech, however, the motion of the vocal folds is a periodic vibration that modulates the flow of air, which produces a harmonic spectrum. (More details and graphs are given below.) The cartoon below uses these to illustrate the Source-Filter model.
A schematic of the source-filter model, from Wolfe et al. (2009). The periodic spectrum corresponds to normal speech and singing. (See What is a sound spectrum?) The vertical lines indicate the harmonics, and the lowest of these, at about 140 Hz, is the fundamental frequency at which the vocal folds vibrate. The continuous spectrum corresponds to whispering. One or other signal is input to the vocal tract, which we treat as a filter whose gain shows peaks at two frequencies in the range sketched. At the mouth, high frequencies are better radiated, as indicated in the next graph. The last pair of graphs sketch the spectra of the output sound. The vertical axes of all graphs are logarithmic, so the gains add; for linear axes, we would replace + with ×. On another site, we give some practical examples of the source-filter model, with sound files.
The spectrum of the output sound depends on the spectrum of the laryngeal source, on the frequency dependent ‘gain’ of the vocal tract, on the efficiency of radiation from the mouth and nose and on interactions among these. We shall discuss these in the more detailed sections below.
In the example sketched above, there are maxima in gain near 500 and 1,500 Hz (corresponding to the vowel /ɜː/, as in ‘heard’). For a fundamental frequency of 140 Hz (as sketched), the harmonics falling nearest these maxima (the third and fourth, and the tenth and eleventh) are more efficiently radiated than are other harmonics. Peaks in the radiation of the whispered sound occur at similar frequencies. In speech, these high power frequency bands – these broad peaks in the spectral envelope – are very important. The frequencies at which they occur are close to (but not exactly equal to) those of the peaks in the gain function of the tract.
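To make the arithmetic concrete, here is a minimal numerical sketch of the Source-Filter idea in Python. The two-peak ‘gain’, its bandwidth, and the −12 dB/octave source tilt are illustrative assumptions, not measured data; only the peak frequencies (500 and 1,500 Hz) and the 140 Hz fundamental come from the example above.

```python
import numpy as np

def tract_gain_db(f, peaks=(500.0, 1500.0), bw=120.0):
    """Toy vocal-tract 'gain': two resonant peaks, in dB.
    Peak shape and bandwidth are illustrative assumptions only."""
    g = sum(1.0 / (1.0 + ((np.asarray(f) - p) / bw) ** 2) for p in peaks)
    return 20.0 * np.log10(g + 1e-3)

f0 = 140.0                                   # fundamental, as sketched
harmonics = f0 * np.arange(1, 15)            # 140, 280, ..., 1960 Hz
source_db = -12.0 * np.log2(harmonics / f0)  # an assumed -12 dB/octave tilt
output_db = source_db + tract_gain_db(harmonics)  # adding dB = multiplying gains

# The harmonics nearest the gain peaks are the most efficiently radiated:
favoured = harmonics[np.argsort(tract_gain_db(harmonics))[::-1][:4]]
```

With these numbers, `favoured` picks out the third, fourth, tenth and eleventh harmonics (420, 560, 1400 and 1540 Hz), the ones straddling the two gain maxima.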
There is a good reason why the various spectra in the preceding figure are sketched: most of them cannot be measured directly. So, in the figures below, where we illustrate the Source-Filter model experimentally, we have to resort to indirect measurements. We can’t measure the flow spectrum through the larynx, but we can measure the vibration of the vocal folds. Here, we do that using an electroglottograph (EGG): we apply a small radio-frequency voltage across the neck using skin electrodes at the level of the vocal folds. The magnitude of the current that flows varies as the folds come into contact and separate. The spectra and sound files at the top of the figure are of an EGG signal. Below that, we show the results of measurements of the resonances of the vocal tract, made at the mouth, during speech. This gives a quasi-continuous line whose peaks identify the resonances; it also shows the harmonics of the voice. (We discuss this technique here.) Below that are the spectra measured for that particular vowel, in the same gesture.
Here we contrast two vowels: At left is the vowel /ɜː/, as in ‘heard’ (like the example used in the preceding figure). At right, [o], as in ‘hot’. The top graphs and sound files are experimental measurements of the vocal fold contact. Note that this measurement of the source shows little difference between the two vowels: the filter has little or no effect on the source. The next pair of graphs are measurements of the vocal tract, made from the mouth, during the vowel. (More on this technique here.) The broad peaks identify resonances of the vocal tract; the sharp lines are the harmonics. Here, because the tract is in a different configuration for the two vowels, the resonances occur at different frequencies. The next two rows show the voice output for voiced speech and for whispering, measured in the same vocal gesture. More detail on these examples here.
Before we leave this brief overview, it is worth noting that much about the voice is still incompletely understood. One of the reasons for this is the difficulty of doing experiments. Some of the data that we should like to have – the gain function of the vocal tract sketched above, or the mass and force distribution in the vocal folds, for instance – cannot be measured while the voice is operating, for practical as well as ethical reasons.
For most human physiology, much information has been obtained from other species, whose organs function in similar ways. When it comes to the voice, however, there is no such similar species – no-one is very interested in the voice of the lab rat. Much of our knowledge comes from experiments using just the sound of the voice as experimental input. Other knowledge comes from medical imaging. Another approach is to use a mathematical model: one can treat the vocal folds as collections of masses on springs, and the vocal tract as an oddly shaped pipe that transmits sound. The next step is to solve the equations for this simple system and to predict the sound it would make, and to see how this correlates with sounds of speech or singing. Another is to make artificial systems with the shape of the vocal tract and some sort of aero-mechanical oscillator at the position of the glottis. Yet other knowledge comes from other experiments and observations that are often, for practical and ethical reasons, somewhat indirect. Because of the importance of the human voice, these are all active research areas.
We now look more closely at some of the topics introduced above. Other reviews are given by, for example, Lieberman and Blumstein, 1988; Stevens, 1999; Hardcastle and Laver, 1999; Johnson, 2003; Clark et al., 2007; Wolfe et al., 2009. References are given below.
The source at the larynx
To speak or to sing, we usually expel air from the lungs. The air passes between the vocal folds, which are muscular tissues in the larynx. If we get the air pressure and the tension and position of the vocal folds just right, the folds vibrate at acoustic frequencies. This means we have an oscillating valve, letting puffs of air flow into the vocal tract at some frequency f0.
These sketches illustrate the larynx, viewed from above, in position for phonation and for breathing.
Technically, we move the arytenoid cartilages closer together than in their separated breathing position, which brings the vocal folds closer to each other: this is called adduction (Scherer, 1991). The reduced aperture between the folds is called the glottis. Compared to the breathing position, the narrow glottis restricts the flow of air, so the steady pressure drop across the larynx is greater when the aperture is small. The higher pressure drop means that the speed of air through the glottis is high, but the small cross section means that the volume flow (in litres per second) is less. Experiment: take a deep breath and time how quickly you can breathe it out completely with your larynx relaxed. Now do the same while pronouncing a whispered ‘ah’, and again while singing a (loud) ‘ah’. Which breath lasts longest (i.e. which has the lowest flow)?
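A rough quasi-steady Bernoulli estimate illustrates the point about flow: the jet speed through the glottis scales as √(2Δp/ρ), while the volume flow is that speed times the (small) glottal area. The pressure and area used below are illustrative guesses of typical orders of magnitude, not measurements.

```python
import math

RHO_AIR = 1.2  # kg/m^3, approximate density of air

def glottal_jet_speed(delta_p):
    """Quasi-steady Bernoulli estimate: v = sqrt(2*dp/rho).
    Ignores viscosity and the oscillation of the folds."""
    return math.sqrt(2.0 * delta_p / RHO_AIR)

def volume_flow(delta_p, glottal_area):
    """Volume flow = jet speed x glottal cross section, in litres/s."""
    return glottal_jet_speed(delta_p) * glottal_area * 1000.0

# Illustrative values: ~1 kPa subglottal pressure, glottis ~0.05 cm^2
v = glottal_jet_speed(1000.0)   # ~ 41 m/s: high speed...
q = volume_flow(1000.0, 5e-6)   # ~ 0.2 L/s: ...but small volume flow
```

A flow of a couple of tenths of a litre per second empties a deep breath far more slowly than relaxed exhalation, which is why the phonated breath in the experiment lasts longest.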
Different registers and vocal mechanisms
How can the voice cover a wide range of pitch? Let’s compare it with musical instruments. On a violin or guitar, one can change the length of a string but, to cover a large range, one can also cross to a new string. On trumpets, trombones, clarinets, flutes, etc., one can change the length of a pipe (with valves, a slide or keys), but one can also change registers, which means changing the mode of vibration in the pipe.
In the voice, we can change the muscle tension and the pressure to vary the pitch. However, to cover a range of a few octaves, we usually need to use different registers (Garcia, 1855). The distinctions among registers in singing are not always clear, however, because changing registers corresponds to both laryngeal and vocal tract adjustments (Miller, 2000). The vocal folds can vibrate in (at least) four different ways, called mechanisms (Roubeau et al., 2004; Henrich, 2006).
Although some people use M0 in speech, especially at the end of sentences, and coloratura sopranos are said to use M3 in their highest range, speech and singing usually use M1 and M2. Men and women typically change from M1 to M2 at about 350-370 Hz (F4-F#4) (Sundberg, 1987). Consequently, with their lower overall range, men typically use M1 for nearly all speech and most singing. However, in some styles of pop music and some operatic styles, men use M2 extensively: men who sing alto are usually using M2.
There is usually a pitch and intensity range over which singers can use either M1 or M2 (Roubeau et al., 2004), and trained singers are good at disguising the transition. Sometimes, as in yodeling, the transition is a feature. Experiment: if you try to produce a smooth pitch change or glissando over your whole range, you will probably notice a discontinuity: a jump in pitch and a change in timbre at a pitch somewhere near the bottom of the treble clef. This is where you change from M1 to M2. At the pitch of that break, you may also produce a break by singing a crescendo or decrescendo at constant pitch (see Svec et al., 1999; Henrich, 2006).
The next figure shows a spectrogram of a glissando through the four mechanisms.
A spectrogram plots frequency (vertical axis) against time (horizontal axis), with sound level shown in colour or grey-scale: here, dark represents high power. This one shows the four laryngeal mechanisms on an ascending glissando sung by a soprano. Notice the discontinuities in frequency (clearer in the higher harmonics) at the boundaries M1-M2 and M2-M3. The horizontal bands in the broad-band M0 section clearly show four broad peaks in the spectral envelope; these may also be seen, to varying degrees, in the subsequent harmonic sections.
Producing a sound
The processes that convert the ‘DC’ or steady pressure in the lungs into ‘AC’ or oscillatory air flow and vocal fold vibration are necessarily nonlinear. First, the ‘Bernoulli’ suction between the folds is proportional to the square of the flow velocity (see this link). Second, the collision of the folds when the glottis closes is also highly nonlinear (Van den Berg, 1957; Flanagan and Landgraf, 1968; Elliot and Bowsher, 1982; Fletcher, 1993).
The terms linear and nonlinear are often used loosely. In science, linear just means that the graph of one variable against another is a straight line: a change in one variable produces a proportional change in the other. We show elsewhere that an oscillator with a linear force law vibrates in a pure sine wave, which has just one spectral component. Conversely, anything with a nonlinear force law does not vibrate sinusoidally, and so has more than one frequency component.
Because of these nonlinearities, the fold vibration is non-sinusoidal and therefore has many frequency components. In M1, M2 and M3, the motion is (almost exactly) periodic, so the spectral components are harmonic: a microphone or flow meter placed at any point in the tract would indicate components at the fundamental frequency f0 and its harmonics 2f0, 3f0, etc., as shown in the figures above. (Follow this link for harmonic spectra.)
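A quick numerical sketch shows the same thing: synthesise a toy periodic, non-sinusoidal ‘glottal flow’ and take its spectrum; the peaks fall at f0 and its integer multiples. The pulse shape below (raised-cosine puffs with a 30% open phase) is arbitrary, chosen only to be periodic and non-sinusoidal, not a model of real glottal flow.

```python
import numpy as np

fs = 44100                       # sample rate, Hz
f0 = 140.0                       # fundamental frequency, Hz
t = np.arange(fs) / fs           # one second of signal

# Toy 'glottal flow': periodic puffs of air, zero during the closed phase
phase = (t * f0) % 1.0
flow = np.where(phase < 0.3, 0.5 * (1 - np.cos(2 * np.pi * phase / 0.3)), 0.0)
flow -= flow.mean()              # remove the DC (steady) component

spectrum = np.abs(np.fft.rfft(flow))
freqs = np.fft.rfftfreq(len(flow), 1 / fs)

# The strongest spectral components fall at integer multiples of f0
peaks = freqs[np.argsort(spectrum)[::-1][:5]]
```

Because the waveform is periodic but not sinusoidal, `peaks` contains only multiples of 140 Hz: a harmonic spectrum, as described above.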
Generally, the amplitude of harmonics decreases with increasing frequency, though there are important exceptions. The negative slope in the spectral envelope (called the ‘spectral tilt’) is different for types of speech or singing (Klatt and Klatt, 1990). To some extent, this slope is compensated by the response of the human ear, which is usually more sensitive to the higher harmonics than to the fundamental (see Hearing). More power in the high harmonics makes a sound bright and clear; weakening the high harmonics makes a mellow, darker or muffled sound. If you have a sound system with bass and treble or tone controls, or a sound editing program, you can experiment with strengthening and weakening the high harmonics using the treble or tone control. (Some filtered voice sound examples here.)
A breathy voice has a spectrum with a strongly negative slope; this voice is produced when the glottis doesn’t close completely. The spectral envelope is flatter (the higher harmonics are less attenuated) in loud speech or singing, which have an abrupt closure of the vocal folds and a short open phase of the glottis (Childers and Lee, 1991; Gauffin and Sundberg, 1989; Novak and Vokral, 1995). This flatter spectrum has relatively more power in the frequency range 1–4 kHz, to which the ear is most sensitive.
It is possible to make high-speed video images of the vocal folds using an optical device (endoscope) inserted in either the mouth or nose (Baken and Orlikoff, 2000; Svec and Schutte, 1996). Electroglottography (Childers and Krishnamurthy, 1985), which is described above, is less invasive but less direct. Although the flow through the glottis cannot be measured, it can be estimated from the flow from the mouth and nose, which can be measured using a face mask (Rothenberg, 1973) or from the sound radiated from the mouth. Both techniques require inverse filtering (Miller 1959), which in turn requires knowledge of or assumptions about the acoustic effects of the vocal tract.
When is the source independent of the filter?
As explained above, one cannot do the direct experiments that would allow us to answer this question directly, so we are obliged to rely on indirect evidence, or on theoretical or numerical models.
Fletcher (1993) uses the ‘Bernoulli’ nonlinearity in a simple but general analysis of resonator-valve interaction with different valve geometries. He derives equations and inequalities relating the natural frequencies of the valve, the resonance frequency of the filter (or resonator) and the fundamental frequency of the sound produced. Treating the vocal folds (or a trombonist’s lips) as a valve that opens when the upstream pressure excess is increased, this model gives results consistent with what we know about the voice and trombones: when the resonance falls at a frequency slightly above that of the valve, a sufficiently strong resonance can ‘control’ the oscillation regime. If the resonances are at much higher frequencies, they have little influence on the fundamental frequency at which the valve vibrates.
Resonances, spectral peaks, formants, phonemes and timbre
Acoustic resonances in the vocal tract can produce peaks in the spectral envelope of the output sound. In speech science, the word ‘formant’ is used to describe either the spectral peak or the resonance that gives rise to it. In acoustics, it usually means the peak in the spectral envelope, which is the meaning on this site. We discuss the different uses in more detail on What is a formant?, but for the moment note that ‘formant’ should be used with care.
In non-tonal languages such as English, vowels are perceived largely according to the values of the formants F1 and F2 in the sound (Peterson and Barney, 1952; Nearey, 1989; Carlson et al., 1970). F3 has a smaller role in vowel identification. F4 and F5 affect the timbre of the voice, but have little influence on which vowel is heard (Sundberg, 1970). We repeat below the plots of (F2, F1) for two accents of English. Note that, in these graphs, the axes do not point in the traditional Cartesian directions: instead, the origin is beyond the top right corner. The reason is historical: phoneticians have long plotted jaw height on the y axis and ‘fronting’, the place of tongue constriction, on the x axis. This choice approximately maintains that tradition.
These maps were obtained in a web experiment, in which listeners judge what vowel has been produced in synthetic words (Ghonim et al., 2007) in which F1, F2 and F3 are varied, as well as the vowel length and the pitch of the voice. Experiment: using that web site you can make a map of the vowel plane of your own accent.
We repeat the figure showing the vowel planes for US and Australian English measured in an on-line survey (Ghonim et al., 2007).
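As a sketch of how such a map can be used, the toy classifier below assigns a measured (F1, F2) pair to the nearest vowel centre by distance in the vowel plane. The centre values are rough, textbook-style illustrations for a handful of h[vowel]d words, not the survey data plotted above.

```python
import math

# Hypothetical (F1, F2) centres in Hz for a few vowels, in the
# h[vowel]d frame used above. Illustrative values only.
VOWELS = {
    "heed":  (300, 2300),
    "had":   (750, 1700),
    "hod":   (650, 1000),
    "who'd": (320, 900),
}

def nearest_vowel(f1, f2):
    """Classify a measured (F1, F2) pair by Euclidean distance
    in the vowel plane: a crude nearest-neighbour sketch."""
    return min(VOWELS, key=lambda v: math.hypot(f1 - VOWELS[v][0],
                                                f2 - VOWELS[v][1]))
```

For example, a measurement near (310, 2200) Hz lands on ‘heed’, and one near (700, 1050) Hz on ‘hod’. A real map, of course, has many more vowels, accent-dependent centres, and fuzzy boundaries.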
The vocal tract as a pipe or duct
To understand how the resonances work in the voice, we can picture the vocal tract (from the glottis to the mouth) as a tube or acoustical waveguide. It has approximately constant length, typically 0.15–0.20 m, a little shorter in women and children. However, the cross section varies along the length, in ways controlled by the geometry of the tongue, mouth, etc. The frequencies of the resonances depend upon this shape. The frequencies of the first, second and ith resonances are called R1, R2, ... Ri ..., and those of the spectral peaks produced by these resonances are called F1, F2, ... Fi ... (See this link for a discussion of the terminology.)
When pronouncing vowels, R1 takes values typically from 200 Hz (small mouth opening) to 800 Hz. Increasing the mouth opening gives a large proportional increase in R1. Opening the mouth also affects R2, but this resonance is more strongly affected by the place at which the tongue most constricts the tract. Typical values of R2 for speech range from about 800 to 2000 Hz. The resonant frequencies can also be changed by rounding or spreading the lips, or by raising or lowering the larynx (Sundberg, 1970; Fant, 1960).
We’ll return to discuss this below, but for the moment, let’s note that, if the open end of a tube is widened, the resonant frequencies rise, which explains the mouth effect. Similarly, reducing or enlarging the cross section near a pressure node respectively lowers or raises the resonance frequency. Conversely, reducing or enlarging the cross section near a pressure anti-node respectively raises or lowers the resonance frequency. This explains some features of the tongue constriction. The nasal tract has its own resonances, and the nasal (nose) and buccal (mouth) tracts together have different resonances. Lowering the velum or soft palate couples the two, which affects the spectral envelope of the output sound (Feng and Castelli, 1996; Chen, 1997).
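The perturbation rules just described can be tabulated in a few lines. This is a purely qualitative sketch: it records only the sign of the frequency shift, not its size.

```python
def shift_sign(site, change):
    """Sign of the resonance-frequency shift for a small change in
    bore cross section. site: 'pressure_node' or 'pressure_antinode';
    change: 'constrict' or 'enlarge'. Returns +1 (f rises) or -1 (f falls)."""
    rules = {
        # Near a pressure node (flow antinode), a constriction adds
        # inertance, lowering the resonance; enlarging raises it.
        ("pressure_node", "constrict"): -1,
        ("pressure_node", "enlarge"): +1,
        # Near a pressure antinode (flow node), a constriction reduces
        # the compliant volume, raising the resonance; enlarging lowers it.
        ("pressure_antinode", "constrict"): +1,
        ("pressure_antinode", "enlarge"): -1,
    }
    return rules[(site, change)]
```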
Nasal vowels or consonants are produced by lowering the velum (or soft palate, see Figure 1). The nasal tract also exhibits resonances. Coupling the nasal to the oral cavity not only modifies the frequency and amplitude of the oral resonances, but also adds further resonances. The interaction can produce minima or pole-zeros of the vocal tract transfer function, with resultant minima or ‘holes’ in the spectrum of the output sound (Feng and Castelli, 1996; Chen, 1997).
Resonances, frequency, pitch and hearing
Some comments about frequency and hearing are appropriate here. The voice pitch we perceive depends largely on the spacing between adjacent harmonics, especially those harmonics with frequencies of several hundred Hz (Goldstein, 1973). For a periodic phonation, the harmonic spacing equals the fundamental frequency of the fold vibration, but the fundamental itself is not needed for pitch recognition.
Except for high voices, the fundamental usually falls below any of the resonances, and so is often weaker than one of the other harmonics. However, its presence is not needed to convey either phonemic information or prosody in speech. The pass band of telephones is typically about 300 to 4000 Hz, so the fundamental is usually much attenuated. The loss of information carried by frequencies above 4000 Hz (e.g. the confusion of ‘f’ and ‘s’ when spelling a name) is noticed in telephone conversation, but the loss of low frequencies is much less important. (An experiment: next time you are put ‘on hold’ on the telephone, listen to the bass instruments in the music. Their fundamental frequencies are not carried by the telephone line. Can you hear their pitch? Of course, they are less ‘bassy’ than if you heard them live, but is the pitch any different?)
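The ‘missing fundamental’ idea can be sketched numerically: estimate the pitch from the common spacing of the harmonics that survive a telephone-like pass band, even though the fundamental itself is absent. The harmonic values below are constructed for illustration.

```python
import numpy as np

def pitch_from_harmonics(harmonic_freqs):
    """Estimate perceived pitch as the common spacing of adjacent
    harmonics: the fundamental itself need not be present."""
    return float(np.median(np.diff(sorted(harmonic_freqs))))

# A 110 Hz voice through a telephone band (~300-4000 Hz): the
# fundamental (110 Hz) and 2nd harmonic (220 Hz) are lost, but
# the spacing of the surviving harmonics is unchanged.
heard = [330.0, 440.0, 550.0, 660.0, 770.0]
pitch = pitch_from_harmonics(heard)
```

Here `pitch` comes out at 110 Hz, even though no component at 110 Hz reaches the ear: the pitch is carried by the harmonic spacing, consistent with Goldstein (1973).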
Our hearing is most sensitive for frequencies from 1000 to 4000 Hz. Consequently, the fundamentals of low voices, especially low men's voices, contribute little to their loudness, which depends more on the power carried by harmonics that fall near resonances and especially those that fall in the range of high aural sensitivity. (Another experiment: you can test your own hearing sensitivity on this site.)
Timbre and singing
Varying the spectral envelope of the voice is part of the training for many singers. They may wish to enhance the energy in some frequency ranges, either to produce a desired sound, to produce a high sound level without a high energy input, or to produce different qualities of voice for different effects. Characteristic spectral peaks or tract resonances have been studied in different singing styles and techniques (Stone et al., 2003; Sundberg et al., 1993; Bloothooft and Pomp, 1986a; Hertegard et al., 1990; Steinhauer et al., 1992; Ekholm et al., 1998; Titze, 2001; Vurma and Ross, 2002; Titze et al., 2003; Bjorkner, 2006; Garnier et al., 2007b; Henrich et al., 2007). In this laboratory, we have been especially interested in three techniques: resonance tuning, harmonic singing and the singers formant.
Now the mouth is open to the outside world, but the sound wave is not completely ‘free’ to escape, because of Zrad, the impedance of the radiation field outside the mouth. A pressure p at the lips is required to accelerate a small mass of air just outside the mouth, so the inertance is not zero, but Zrad is usually small. At high frequency, however, larger accelerations are required for any given amplitude, so Zrad increases with frequency. In a confined space (inside the vocal tract), acoustic flow does not spread out, so impedances are usually rather higher than Zrad.
As we explain in this link, Z in a pipe (or in the vocal tract) depends strongly on reflections that occur at open or closed ends. A strong reflection occurs at the lips, going from generally high Z inside to low Z in the radiation field. Suppose that a pulse of high-pressure air is emitted from the glottis just when a high-pressure pulse returns from a previous reflection: the pressures add and Z is high. Conversely, if a reflected pulse of suction cancels the input pressure excess, Z is small. This effect produces the large range of Z shown in the previous figure. High output levels occur at the lips when the glottal source drives the tract at a frequency near one of these maxima in Z.
For the sake of simplicity, let’s imagine the tract as a tube, nearly closed at the glottis but open at the mouth. In fact, for /3/ (the vowel in the word "heard"), the resonances shown in the figure above fall at the frequencies expected for a cylindrical tube of length 170 mm, open at the mouth and nearly closed at the glottis. Now, for a simple tube of length L, open at the far end, the behaviour is shown by the dashed line in the preceding figure. The wavelengths that give maxima in Z are approximately λ1 = 4L, λ3 = 4L/3, λ5 = 4L/5, etc., and so fall at the frequencies f1 = c/4L, f3 = 3c/4L = 3f1, f5 = 5c/4L = 5f1, etc. Minima occur half way between the maxima.
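For an ideal tube closed at one end (glottis) and open at the other (mouth), these frequencies are easy to compute. This is a sketch only: end corrections and the real tract’s non-uniform bore are ignored.

```python
C_AIR = 343.0  # speed of sound in air, m/s (approx.; warm, moist
               # air in the tract is slightly faster)

def closed_open_resonances(length, n=3):
    """First n resonance frequencies of an ideal closed-open tube:
    f = (2k - 1) * c / (4 L), i.e. odd multiples of c/4L."""
    return [(2 * k - 1) * C_AIR / (4.0 * length) for k in range(1, n + 1)]

freqs = closed_open_resonances(0.17)  # 170 mm tract, as in the text
```

For L = 0.17 m this gives roughly 500, 1500 and 2500 Hz, consistent with the resonances quoted above for the vowel in ‘heard’.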
Now let’s add the glottis, giving a local constriction at the input. The solid line shows the new input impedance Z. The maxima in Z (pressure antinodes or flow nodes) are hardly changed. This makes sense: a local constriction (of small volume) at the input has little effect on a maximum in Z, where flow is small. For modes where the flow is large, however, the air in the glottis must be accelerated by pressures acting on only a small area. So the frequencies of the minima in Z (pressure node, flow antinode) fall at lower frequencies. If the glottis is sufficiently small, Z(f) falls abruptly from each maximum to the next minimum, which thus occur at similar frequencies. So do the maxima in the transfer functions.
So far, we haven’t mentioned the impedance of the subglottal tract leading to the lungs. This is difficult to measure. However, there are good reasons to expect no strong resonances in the audio frequency range. The lungs have complicated geometry, with successively branching tubes, extending to quite small scale at the alveoli. This is expected to produce little reflection in the range of frequencies that interest us (see Fletcher et al., 2006). As mentioned above, many of the obvious experiments for studying vocal tract resonances are impossible. A number of less obvious techniques exist, however. One of our papers reviews these (Wolfe et al, 2009).
Tract-wave interactions: Do the ‘source’ and the ‘filter’ affect each other?
As we explained above, the resonances of the vocal tract occur at frequencies well above those of the fundamental frequency – at least for normal speech and low singing. Further, the frequencies of vocal fold vibration (which gives the voice its pitch) and those of the tract resonances (which determine the timbre and, as we have seen, the phonemes) are controlled in ways that are often nearly independent. In most singing styles, the words and melody of a song are prescribed. Conversely, in speech, we have the subjective impression that we can vary the prosody independently of the phoneme – for example, one can often replace a key word in a sentence without changing the prosody at all.
As mentioned above, the voice is unlike a trombone or other wind instrument*, in which one of the resonances of the air column drives the player's lips or reed (respectively) at a frequency close to its resonant frequency. In the voice, there is usually no simple relation between the frequencies: a singer may cover a range of two or more octaves (i.e. vary the frequency by a factor of 4 or more) with relatively little change in the shape and size of the vocal tract. Further, although there is typically a difference of an octave (a factor of two in wavelength) between the fundamental frequencies of male and female singing voices, there is a much smaller difference in the lengths of the tracts.
From this we can conclude that the resonances of the tract do not normally control the pitch frequency of the voice. Nevertheless, the glottal source and the vocal tract resonances may be interrelated in a number of ways. First, there are direct, physical interactions: the mode of phonation affects the reflections of sound waves at the glottis, and so affects standing waves in the tract (cf. Fig. 4). Second, pressure waves in the tract can influence the air flow through the glottis or the motion of the vocal folds. Third, there is the possibility that speakers and singers may consciously or unconsciously use combinations of fundamental frequency and resonance frequency for different effects, in particular to improve their efficiency. We discuss these in turn.
* Is there an acoustic instrument like the voice? Not really, but one can mention some similarities with the harmonica or mouth organ. In that instrument, the pitch is largely determined by mechanical properties of a metal reed that controls the air flow. The pitch may, however, be affected by the acoustic field nearby, e.g. by cupping the hands over the instrument to ‘bend’ tones. Like the voice, the harmonica may produce sounds whose wavelengths are much larger than the size of the instrument and, like the voice, one can modify the spectral envelope by changing the geometry of the air space through which it radiates.
Does the glottis affect the tract resonances?
The glottis is very much smaller than the cross-section of the vocal tract, which is why, in the simplistic figure above, we treated the vocal tract as a pipe open at the mouth and closed at the glottis. This is an exaggeration, of course! The average opening of the glottis depends on what fraction of the time it is open (its ‘open quotient’) and how far it opens (Klatt and Klatt, 1990; Alku and Vilkman, 1996; Gauffin and Sundberg, 1989), which in turn depend on the voice register and pitch.
For a duct that is almost closed at one end and open at the other, the frequency of the first resonance increases as the opening increases. Various researchers have shown that, when the glottis is somewhat open for whispering, the resonance or formant peaks occur at higher frequencies (Kallail and Emanuel, 1984a,b; Matsuda and Kasuya, 1999; Itoh et al., 2002; Barney et al., 2007; Swerdlin et al., 2010).
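The effect of the glottal opening on the resonances can be illustrated with the two idealized limits of a uniform cylindrical duct: fully closed at the glottal end (a quarter-wavelength resonator) and fully open at both ends (a half-wavelength resonator). This is only a sketch: the tract length of 0.17 m and the cylindrical geometry are assumed, illustrative values, and a partly open glottis gives resonances between the two limits.

```python
# Idealized cylindrical vocal tract. Closed at the glottis, it behaves
# as a quarter-wavelength resonator; open at both ends, as a
# half-wavelength resonator. A partly open glottis (as in whispering)
# lies between these limits, so the resonances move up in frequency
# as the glottal opening increases.

C = 343.0   # speed of sound, m/s (approximate)
L = 0.17    # effective tract length, m (assumed, illustrative)

def closed_open_resonances(length, n_modes=3):
    """f_n = (2n - 1) c / 4L for a pipe closed at one end."""
    return [(2 * n - 1) * C / (4 * length) for n in range(1, n_modes + 1)]

def open_open_resonances(length, n_modes=3):
    """f_n = n c / 2L for a pipe open at both ends."""
    return [n * C / (2 * length) for n in range(1, n_modes + 1)]

print(closed_open_resonances(L))  # roughly 500, 1500, 2500 Hz
print(open_open_resonances(L))    # roughly 1000, 2000, 3000 Hz
```

The first resonance roughly doubles between the two limits, which is consistent with the direction of the formant shifts reported for whispering, though the real shift is much smaller because the glottis opens only slightly.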
Do pressure waves affect the vocal fold vibration?
This is an area in which it’s hard to do the experiments that would most clearly answer the question. However, there has been a lot of work on numerical models. Some of these predict that the air flow through the glottis and the vocal fold vibrations depend on the pressure difference across the glottis and folds, and thus on the waves in the tract (Rothenberg, 1981; Titze, 1988, 2004). Not surprisingly, the phase of the pressure wave is important in these models: whether a pressure decrease outside the vocal folds will tend to open them will depend on when during a cycle it arrives.
Can one observe the effect of pressure waves on the motion of vocal folds experimentally? Hertegard et al. (2003) used an endoscope (a camera looking down the throat) to film the larynx while singers mimed singing, and a tube sealed at the lips provided artificial pressure waves. They reported larger vibrations when the pressure waves had frequencies near those of normal singing. In our lab (Wolfe and Smith, 2008), we used electroglottography (EGG, described above) to monitor the vocal fold vibration, and used a didjeridu to produce the pressure waves. We found that the didjeridu signal could drive the folds at a level comparable with that generated by singing. All the above evidence suggests that the standing waves in the ‘filter’ can interact strongly with the source.
Do singers and speakers use tract resonances and pitch in a coordinated way?
If you want to sing or to speak loudly, you might want to take advantage of the resonances of the vocal tract to improve the efficiency with which energy is transmitted from the glottis to the outside sound field. The most studied example is the problem faced by sopranos. The range of R1 (about 300 to 800 Hz, roughly D4 to G5) overlaps approximately the range of the soprano voice. If a soprano did nothing about this, she’d have a serious problem: first, for many note–vowel combinations, the f0 of the note would fall above R1, so she would lose the power boost from R1. This is a particular difficulty for opera singers, who must compete with an orchestra without the aid of a microphone. There is also the problem that her voice quality would tend to change when she crossed the R1 = f0 line.
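The overlap described above is easy to check with equal-tempered note frequencies (A4 = 440 Hz). The short sketch below converts the note names mentioned in the text to frequencies; the MIDI-style note numbering is just a convenient convention, and the 300–800 Hz R1 range is the rough figure quoted above, not a measurement.

```python
# Equal-tempered frequencies for the notes named in the text, to show
# how a soprano's f0 range overlaps the typical range of R1.

A4 = 440.0

def note_freq(midi_number):
    """A4 is MIDI note 69; f = 440 * 2**((m - 69) / 12)."""
    return A4 * 2 ** ((midi_number - 69) / 12)

D4, G5, C6 = 62, 79, 84        # MIDI numbers for the named notes
R1_RANGE = (300.0, 800.0)      # rough R1 range quoted in the text, Hz

for name, m in [("D4", D4), ("G5", G5), ("C6", C6)]:
    f = note_freq(m)
    inside = R1_RANGE[0] <= f <= R1_RANGE[1]
    print(f"{name}: {f:.0f} Hz, within R1 range: {inside}")
```

D4 (about 294 Hz) and G5 (about 784 Hz) bracket the quoted R1 range, while C6 (about 1047 Hz) lies above it, which is where, as described below, sopranos run out of room to tune R1.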
Sundberg and colleagues pointed out that, in classical training, sopranos learn to increase the mouth opening as they ascend the scale (Lindblom and Sundberg, 1971; Sundberg and Skoog, 1997), and measured this opening as a function of pitch. They deduced that the singers were tuning R1 to a value near f0.
Our experiments, using acoustic excitation at the mouth, confirmed this (Joliveau et al., 2004a,b). When f0 was low enough, sopranos used typical values of R1 and R2 for each vowel. However, when f0 was equal to or greater than the usual value of R1, they increased R1 so that it was usually slightly higher than f0. For vowels with low R1, this tuning of R1 to f0 starts at a lower pitch, and it continues almost up to 1 kHz. Here is a web page about this research, including some sound files.
We don’t know exactly how they learn to do this: it might be that they respond, probably subconsciously, when the sound is louder for a given effort. Or it may be that vibrations are easier to produce when the resonance is appropriately tuned. Either way, they could learn to reproduce this effect. A simple model shows that ‘outward opening’ valves (those that open in the direction of the steady flow) tend to be driven most easily at frequencies a little below the resonance: the model valves ‘drive’ inertive loads better than compliant ones (Fletcher, 1993).
What about other singers? In much of the alto range, and for some vowels in the high range of men’s voices, the same problem arises and, although it is much less studied, similar effects are occasionally, but not universally, observed (Henrich et al., 2011). Further, some singers seem to tune R1 to the second harmonic (i.e. to 2f0) over a limited range (Smith et al., 2007). In another study, a practitioner of a very loud Bulgarian women’s singing style was found to tune R1 to 2f0 (Henrich et al., 2007).
Finally, it is worth noting that it is difficult to tune R1 much above 1 kHz, in part because it is hard to open one's mouth wide enough. Some sopranos who practise the very high range of the coloratura soprano, or the whistle voice in pop music, tune R2 to f0 above about C6, which gives them up to another octave or so in their whistle or M3 mechanism (Garnier et al., 2011).
This figure, from Kob et al. (2011), shows the different tuning strategies that may be used by different voice categories. Oversimplifying for the sake of brevity: low voices may tune R1 (or R2) to harmonics of the voice; altos, especially in belting and in the Bulgarian style, tune R1 to the second harmonic; sopranos tune R1 to f0 up to high C, and above that tune R2 to f0. See Henrich et al. (2011) for details.
In a range of styles known as harmonic or overtone singing, practitioners use a constant, rather low fundamental frequency (in a range where the ear is not very sensitive). They then tune a resonance to select one of the high harmonics, typically from about the fifth to the twelfth (Kob, 2003; Smith et al., 2007). We have a web site about this.
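As a rough illustration of the arithmetic: the drone's harmonics lie at integer multiples of f0, and a narrow tract resonance tuned near one of them makes that harmonic stand out as the melody note. The drone frequency of 140 Hz below is an assumed, illustrative value, not taken from the studies cited.

```python
# In overtone singing, the drone's harmonics lie at n * f0; tuning a
# tract resonance near one of them selects that harmonic.
# f0 = 140 Hz is an assumed, illustrative drone frequency.

f0 = 140.0

def harmonic(n, fundamental=f0):
    """Frequency of the n-th harmonic of the drone."""
    return n * fundamental

def nearest_harmonic(resonance_hz, fundamental=f0):
    """Which harmonic number a tract resonance would select."""
    return round(resonance_hz / fundamental)

# Harmonics 5 to 12: the range the text says is typically selected
print([harmonic(n) for n in range(5, 13)])   # 700.0 Hz up to 1680.0 Hz

# A resonance tuned to, say, 1100 Hz sits nearest harmonic 8 (1120 Hz)
print(nearest_harmonic(1100))
```

With this illustrative drone, harmonics 5 to 12 span roughly 700 to 1700 Hz, comfortably within the range over which R1 and R2 can be varied.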
Is resonance tuning used in speech?
Some speakers (actors, public speakers, teachers) have to speak long and loud. Resonance tuning might be easier for them in one sense: unlike (most) singers, they get to choose the pitch for every word. Some preliminary research suggests that resonance tuning is used in shouting (Garnier et al., submitted).
The singers formant
Male, classically-trained singers often show a spectral peak in the range 2–4 kHz, a range where the ear is quite sensitive. This spectral peak is called the singers formant (Sundberg, 1974, 2001; Bloothooft and Plomp, 1986b). This vocal feature has the further advantage that orchestras have relatively little power in this range, which might allow opera soloists to ‘project’ or to be heard above a large orchestra in a large opera hall.
Singers formants are either weaker, not usually observed, or harder to demonstrate in women singers (Weiss et al., 2001). This is not surprising: high voices have wide harmonic spacing, which makes it hard to define a formant in the spectrum of any single note. (While one can find a peak in the time-averaged spectrum of many notes, this is not necessarily the same as a formant, because it depends on which notes are sung.) Further, a resonance in this range is of less use to a high alto or soprano: if the gain in a singer’s vocal tract had a bandwidth of a few hundred hertz (the typical width of the singers formant), then for many notes in the high range it would fall between two adjacent harmonics. High voices also have the advantage that the fundamental, usually the strongest harmonic, falls in the range of sensitive human hearing. Finally, high voices can use resonance tuning more effectively than other singers, and may therefore have less need of a singers formant.
Sundberg (1974) attributes the singers formant to a clustering of the third, fourth and/or fifth resonances of the tract. (Measuring the resonances associated with it is an ongoing project in our lab.) Singers produce this formant by lowering the larynx and narrowing the vocal tract just above the glottis (Sundberg, 1974; Imagawa et al., 2003; Dang and Honda, 1997; Takemoto et al., 2006). A vocal tract with this geometry should work better to transmit power from the glottis to the sound field outside the mouth.
When a strong singers formant is combined with the strong high harmonics produced by rapid closure of the glottis, the effect is a very considerable enhancement of output sound in the range 2-4 kHz – i.e. in a range in which human hearing is very acute and in which orchestras radiate relatively little power. It is not surprising that these are among the techniques used by some types of professional singers who perform without microphones.
Increasing the fraction of power at high frequencies has a further advantage: at wavelengths long in comparison with the size of the mouth, the voice radiates almost isotropically. As the frequency rises and the wavelength decreases, the voice becomes more directional, and proportionally more of the power is radiated in the direction in which the singer faces, which is usually towards the audience (Flanagan, 1960; Katz and d’Alessandro, 2007; Kob and Jers, 1999). So increasing the power at high rather than low frequencies, via rapid glottal closure and/or a singer’s formant, helps the singer not to ‘waste’ sound energy radiated up, down, behind and to the sides.
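A rough way to quantify 'wavelengths long in comparison with the size of the mouth' is the dimensionless parameter ka = 2πfa/c, where a is an effective mouth radius: radiation is nearly isotropic when ka is much less than 1 and becomes noticeably directional as ka approaches 1. The sketch below uses an assumed, illustrative radius of 3 cm.

```python
# Dimensionless size parameter ka for the radiating mouth: small ka
# means nearly isotropic radiation; ka near or above 1 means the
# radiation becomes directional. a = 0.03 m is an assumed value.
import math

C = 343.0   # speed of sound, m/s (approximate)
a = 0.03    # effective mouth radius, m (assumed, illustrative)

def ka(frequency_hz):
    """ka = 2*pi*f*a / c."""
    return 2 * math.pi * frequency_hz * a / C

for f in (200, 1000, 3000):
    print(f"{f} Hz: wavelength {C/f:.2f} m, ka = {ka(f):.2f}")
```

With these figures, ka is about 0.1 at 200 Hz (nearly isotropic) but about 1.6 at 3 kHz, so power placed in the singers formant range is radiated preferentially forward.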
A number of studies have investigated a speaker’s formant or speaker’s ring in the voices of theatre actors or in the speaking voice of singers (Pinczower and Oates, 2005; Bele, 2006; Cleveland et al., 2001; Barrichelo et al., 2001; Nawka et al., 1997). Leino (1993) observed a spectral enhancement in the voices of actors, but of smaller amplitude than the singers formant, and shifted about 1 kHz towards higher frequencies. This was interpreted as a clustering of F4 and F5. Bele (2006) reported a lowering of F4 in the speech of professional actors, which contributed to the clustering of F3 and F4 in a prominent peak. Garnier (2007) also reported such a speaker's formant in speech produced in a noisy environment, with a formant clustering that depended on the vowel.
More about speech and singing
Voice science is a broad and active area of research. The references quoted in this essay appear below, and below that is a collection of links. One of the aims of this essay is to provide an introduction to our research on the voice and to our publications on voice and music acoustics.