THE EFFECT OF VOICE QUALITY ON HIRING DECISIONS

This paper examines the effect of voice quality on hiring decisions. Voice quality is an important tool in an individual's self-presentation in the job market: it may well enhance his or her job prospects, while some voice qualities may affect employers' judgments negatively. Five men and five women were recorded reading four different utterances, representing answers to job interviewers' questions, in four different phonation guises: modal, breathy, creaky and pressed. Thirty-eight professional employment interviewers rated the speakers' hireability and personality (likeability, self-confidence and trustworthiness) on 7-point semantic differential scales based on the speakers' voice. The results revealed a significant effect of the phonation guises on the speakers' ratings, with the modal voice being superior to the cluster of non-modal voices. Interestingly, the non-modal guises were evaluated in a very similar way, except for the self-confidence category, where the breathy voice received the lowest scores and the pressed voice was associated with high self-confidence ratings.


Introduction
Efficient sharing of information is one of the most characteristic aspects of the current period; in the digital era, more and more emphasis is being put on an individual's ability to communicate effectively and to convey messages clearly and accurately both verbally and non-verbally. The importance of an individual's voice in everyday interpersonal communication can thus hardly be overlooked (Laver, 1980: 1; DeVito, 2016: 48). Considering the contemporary job market context, required educational qualifications and professional experience certainly do not represent the only decisive factors in the recruitment and selection process (DeVito, 2016: 24). It is the overall self-presentation of applicants at job interviews that seems to play a very important role when it comes to hiring decisions. In this respect, voice quality is considered an essential
https://doi.org/10.14712/24646830.2017.37
Although there are many studies of the human voice, voice quality and its social role did not attract much attention until recently. This question has become of great interest not only to the academic community, but also to a wide range of professionals as well as to the media (Greer & Winters, 2015). However, to the best of our knowledge, not many researchers have examined the importance of voice quality within social interaction in the Czech context.
According to numerous scientific findings, voice quality does play a vital role in interpersonal communication, as it is a significant indicator of the speaker's physical, psychological and social characteristics (Laver, 1980: 1; Kreiman, Vanlancker-Sidtis & Gerratt, 2003; Moisik, 2012). The subject of the present paper is the empirical mapping of voice quality as one of the essential factors in the hiring decision process.

Voice quality and types of phonation
Defining voice quality in a clear-cut, satisfactory and generally acceptable way is a rather challenging task (Kreiman et al., 2003). This term tends to be used in various contexts, e.g. a professional singer approaches voice quality in a different way than a phonetician (Childers & Lee, 1991). Nonetheless, even speech scientists do not seem to refer to voice quality unequivocally. The total auditory impression of the characteristic colouring of an individual speaker's voice may be seen, in a broad sense, as the result of both laryngeal and supralaryngeal features, i.e., differences in phonatory settings and vocal tract resonance characteristics, respectively. In the narrower sense, voice quality could be viewed as deriving entirely from the laryngeal activity (Laver, 1980: 1). This study addresses phonatory modifications, and the term voice quality will thus refer to the laryngeal level only.
The basic type of phonation is modal voice, typical of most speakers; Hollien motivates the term in the following way: "... it includes the range of fundamental frequencies that are normally used in speaking and singing (i.e., the mode)." (1974; cited in Laver, 1980: 109-110). Modal voice is characterised by a neutral phonatory setting; the vibration of vocal folds is periodic without any audible friction and the overall laryngeal tension is moderate (Laver, 1980: 94, 111). This mode of phonation is efficient, with relatively high voice intensity and no special effort required (Skarnitzl, 2016).
However, the neutral laryngeal setting may be modified both voluntarily and unconsciously, reflecting speakers' communication goals (Henton & Bladon, 1985; Anderson, Klofstad, Mayew & Venkatachalam, 2014; Greer & Winters, 2015). Modifications of the neutral phonatory setting may also be caused by changes in speakers' state of health, affective states, or may derive from voice pathology (Tykalová, Rusz, Čmejla, Růžičková & Růžička, 2014). The most common non-modal phonation types differing from the modal one in at least one parameter are breathy voice, creaky voice and pressed voice (Laver, 1980: chapter 3). These types of phonatory modifications are included in our experiment.
In breathy voice, the mode of vocal fold vibration is inefficient compared with that for modal voice and is accompanied by slight audible friction; vocal folds do not come fully together, which leads to a higher rate of airflow than in modal voice. Consequently, a considerable amount of air is wasted and speakers might need to pause more often to draw breath. Both the intensity and fundamental frequency of breathy voice tend to be rather low (Laver, 1980: chapter 3; Henton & Bladon, 1985). If the laryngeal setting is of permanent nature, it is mostly the case of pathological speech (Shipley & McAfee, 2009; cited in Skarnitzl, 2016). Finally, let us note that women's voices are generally breathier than those of male speakers (Henton & Bladon, 1985; Mendoza et al., 1996), a consequence of differences in the shape of the glottis (Titze, 1989).
Creaky voice represents quite a complex phonation type as there exist several different kinds of it (Keating, Garellek & Kreiman, 2015). Generally, it is characterised by a great irregularity of vocal fold vibrations, low fundamental frequency and intensity, accompanied by creaking and popping noises (Anderson et al., 2014; Abdelli-Beruh, Wolk & Slavin, 2014). Hollien and Wendahl describe the auditory effect of creaky voice as "a train of discrete excitations or pulses produced by the larynx" (1968; cited in Laver, 1980: 124). Catford refers to it as "a rapid series of taps, like a stick being run along a railing" (1964; cited in Henton & Bladon, 1988). Both descriptions imply that the frequency of the vibration typical of creaky voice is so low that listeners can often identify individual pulses.
The last non-modal phonation type to be mentioned here is pressed voice, which involves very high laryngeal tension settings, in some cases even accompanied by hyper-tension of the whole body (Gray & Wise, 1959; cited in Laver, 1980: 129). Pressed phonation is often described as unpleasant, rough, rasping and strident (see studies cited in Laver, 1980: 127). As in the case of creaky voice, the vibration of vocal folds may be aperiodic, and the voice contains more noise components (Moisik, 2012); however, fundamental frequency (F0) tends to be higher in pressed voice (Laver, 1980: chapter 3).
Spectrograms of an open central vowel [aː] pronounced by one of our female speakers in the four voice qualities mentioned above are shown in Figure 1.

Phonatory modifications in the social context
As mentioned above, various pragmatic reasons may lead speakers to modify neutral phonatory settings while interacting with other people. As for breathy voice, Laver (1980: 135) mentions the paralinguistic use of this phonation type in situations when an interlocutor wishes to communicate messages of a confidential or intimate character. Various studies show that female breathy voices are rated by listeners as more attractive (Henton & Bladon, 1985; Liu & Xu, 2011; Babel, McGuire & King, 2014; Greer & Winters, 2015). Henton and Bladon (1985) argue that British women might want to imitate breathy voice quality in particular communication contexts to increase chances of achieving their goals. Women with a breathy voice may thus be perceived as more desirable and may be given greater recognition by male interlocutors than women speaking with ordinary, modal voice.
The use of creaky voice has become quite widespread in English speakers, especially in the USA (Abdelli-Beruh et al., 2014; Anderson et al., 2014; Greer & Winters, 2015). Greer & Winters (2015) examined the possible social factors behind the increased use of creaky voice by young Americans. They found that creaky quality, traditionally interpreted as a masculine voice quality, contributes to the perception of greater authoritativeness, particularly in young women. Moreover, male speakers with creaky voice are perceived as more "cool" and more attractive. Young Americans may thus exploit this phonation type when attempting to establish authority; additionally, women may tend to use it more to gain the perceived higher status of men. Some studies show that American women who speak with creaky voice are often very successful and work in sectors that are traditionally male-dominated, e.g. finance and print media (Carney, 2012; Lepore, 2012, cited in Anderson et al., 2014). Creaky voice, which is characterised by low fundamental frequencies, seems to be exploited when communicating intelligence, seriousness and determination. These findings are similar to those concerning the perception of pitch: speakers with lower-pitched voices tend to be perceived as stronger and more dominant (Puts, Hodges-Simeon, Cárdenas & Gaulin, 2007; Borkowska & Pawlowski, 2011).
Nonetheless, Anderson et al. (2014) conducted an experiment showing a rather negative perception of creaky voice in American women. In that study, 400 male and female listeners from across the United States rated audio recordings of seven women (average age 24) speaking in modal and creaky voice. Women using creaky voice were perceived as less competent, less educated, less trustworthy, less attractive, and less hireable, regardless of the listeners' gender, age and region.
Finally, pressed voice is often used to signal anger and hostility (Moisik, 2012); a scalar relationship is often suggested between the degree of tenseness and the degree of anger expressed (Laver, 1980: 131-132).It is worth pointing out that some authors talk about harsh voice; this is typically understood as a more extreme setting of pressed voice.
According to Gobl and Ní Chasaide (2003), speakers with pressed voice may be perceived as stressed, but also as confident or even formal. Moisik (2012) argues that harsh voice quality is exploited as a means of representing social identity and stereotypes, namely racial stereotypes of African Americans in the USA.
The survey of literature presented above shows that a speaker's voice quality may be an important tool in his or her self-presentation. On the one hand, the given voice quality may advance a speaker's status, but on the other hand, some voice qualities may affect listeners in a negative way. The aim of this study is to map the effect of the four types of phonation (modal, breathy, creaky and pressed) on the perception and ratings of speakers as job applicants in the job market, using the matched guise technique. Although most speakers use modal voice most of the time, this phonation type can be modified for various reasons. We examine hireability ratings in relation to the different phonation types and personality judgments (likeable, self-confident and trustworthy).

Stimuli
Our stimuli were produced by five male (M1-M5) and five female (F1-F5) speakers (average age 25 years, range 19-38 years). The choice of speakers, who were experienced students of phonetics or philology, and phoneticians, was based on their ability to mimic non-modal phonation types. Before recording the stimuli, all the speakers were thoroughly instructed and provided with examples of non-modal voice qualities. They were recorded while reading several repetitions of four different utterances (with an average duration of 15 seconds), each in one of the four different types of phonation (modal, breathy, creaky, pressed). The recordings were made at 48 kHz sampling frequency and 16-bit resolution using an AKG C4500 B-BC condenser microphone in the recording studio of the Institute of Phonetics, Charles University in Prague.
The utterances were designed by the authors so as to sound like answers to questions a job applicant is likely to be asked within a job interview context. Colloquial Czech features were thus used, as illustrated in the following example (for an English translation, see Note 1): Angličtina co se týče takový tý běžný komunikace vůbec není problém. V němčině jsem si jistější, když píšu, než kdybych měl/a třeba s někým mluvit po telefonu. Ale třeba číst maily a odepisovat nebo tak, to je bez problémů; jenom prostě nejsem tak pohotovej/pohotová jako v tý angličtině.
All the recorded stimuli used in the perception test were inspected aurally and visually (using the waveform and spectrogram) by all three authors. The objective of this inspection was to choose each speaker's best rendition of each guise (i.e., phonation type), in other words to ensure that the stimuli truly represent the respective voice qualities, as well as to ensure they were free from any speech errors and non-speech noise. An interested reader may find details about some acoustic analyses performed on the selected stimuli (mean F0 and spectral emphasis measured in [a aː] vowels) in the Appendix. The final set of utterances subsequently served as the basis for the perception test in which listeners evaluated the speakers' guises on various dimensions.

Perception test and participants
The perception test consisted of 40 stimuli (10 speakers × 4 stimuli), which were administered in one of four orders, in four blocks containing 10 stimuli each, with a short pause between the blocks. A short tone was used to signal the onset of each stimulus; the stimuli were then followed by a two-second pause and a desensitization sound.
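The composition of the test can be illustrated with a short Python sketch. It is only a possible reconstruction: the text states that there were 40 stimuli, four blocks of 10 and four presentation orders, but the rotation of utterances across guises, the rule that each speaker occurs once per block, and the seeds used below are assumptions made for illustration, not the procedure actually followed.

```python
import random

SPEAKERS = [f"M{i}" for i in range(1, 6)] + [f"F{i}" for i in range(1, 6)]
GUISES = ["modal", "breathy", "creaky", "pressed"]
UTTERANCES = [1, 2, 3, 4]


def build_stimuli():
    """Pair every speaker with every guise (10 x 4 = 40 stimuli).

    Assumption: utterances are rotated so that each speaker reads
    a different utterance in each guise.
    """
    stimuli = []
    for s_idx, speaker in enumerate(SPEAKERS):
        for g_idx, guise in enumerate(GUISES):
            utterance = UTTERANCES[(s_idx + g_idx) % len(UTTERANCES)]
            stimuli.append((speaker, guise, utterance))
    return stimuli


def build_blocks(stimuli, seed):
    """Split the stimuli into four blocks of ten.

    Assumption: each speaker is heard exactly once per block, and the
    order inside a block is random.
    """
    rng = random.Random(seed)
    blocks = [[] for _ in GUISES]
    for speaker in SPEAKERS:
        own = [s for s in stimuli if s[0] == speaker]
        rng.shuffle(own)
        for block, stimulus in zip(blocks, own):
            block.append(stimulus)
    for block in blocks:
        rng.shuffle(block)
    return blocks


stimuli = build_stimuli()
# Four hypothetical presentation orders, one per seed.
orders = [build_blocks(stimuli, seed) for seed in range(4)]
print(len(stimuli), [len(block) for block in orders[0]])
```

Keeping each speaker to one appearance per block is one way of preventing a listener from hearing two guises of the same voice in close succession; whether the original test imposed such a constraint is not stated.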
The listeners were asked to record their ratings of speakers in an answer sheet which contained, for each item, four 7-point semantic differential scales: likeable / unlikeable; self-confident / unconfident; trustworthy / untrustworthy, and I would employ / I would not employ [the speaker], as illustrated in Figure 2. The test itself was preceded by three trial items in which the respondents familiarized themselves with the task. The listeners were instructed that they would hear recordings of various male and female candidates applying for a job position which requires interactions with customers. They were asked to try to rate the speakers based on the sound of their voice rather than the content of the utterances. The participants of the perception test were professional employment interviewers and executives from various companies located in Prague who conduct job interviews and make hiring decisions as part of their regular job routine. A total of 38 subjects, 10 men and 28 women (mean age 36.8 years; range 21-57 years), participated in the experiment and were offered 100 Czech crowns as compensation for their participation.
The perception test was administered by the authors of the paper; each participant performed the test individually, in a quiet room using high-quality Sennheiser HD 201 headphones. Praat (Boersma & Weenink, 2015) was used to play the files.
Subsequent statistical analyses and data visualisation were conducted in R (R Core Team, 2016), using the packages effects (Fox, 2003) and ggplot2 (Wickham, 2009).

Results and discussion
Overall, it can be stated that the voice manipulations performed by our speakers had a significant effect on the evaluation of the four characteristics, as shown by the results of a repeated measures ANOVA, with Phonation type being the independent variable within the variable Speaker: for likeability, F(3, 27) = 32.6, p < 0.001; for self-confidence, F(3, 27) = 48.0, p < 0.001; for trustworthiness, F(3, 27) = 49.2, p < 0.001; and for employability, F(3, 27) = 48.1, p < 0.001. More detailed results are illustrated in Figures 3-6 for each personality characteristic.
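The reported degrees of freedom, (3, 27), are those of a one-way repeated measures design with four phonation types and ten speakers: 4 − 1 = 3 for the effect and (4 − 1)(10 − 1) = 27 for the error term. The following Python sketch shows how such an F ratio is obtained. It is an illustration only: the analyses in this study were run in R, the function name `rm_anova` is hypothetical, and the ratings below are randomly generated placeholders rather than our data (we assume one mean rating per speaker and guise).

```python
import random


def rm_anova(scores):
    """One-way repeated measures ANOVA.

    scores: one row per subject (here: speaker), one column per
    condition (here: phonation type).
    Returns (F, df_effect, df_error).
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of conditions
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    # Error term: what remains after removing condition and subject effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_cond / df_effect) / (ss_error / df_error)
    return f_ratio, df_effect, df_error


# Placeholder data (NOT the study's ratings): 10 speakers x 4 guises,
# columns ordered modal, breathy, creaky, pressed.
rng = random.Random(0)
guise_levels = [5.5, 3.2, 3.3, 3.4]
hypothetical = [[level + rng.uniform(-0.5, 0.5) for level in guise_levels]
                for _ in range(10)]

f_value, df_effect, df_error = rm_anova(hypothetical)
print(f"F({df_effect}, {df_error}) = {f_value:.1f}")
```

Removing the between-speaker sum of squares from the error term is what distinguishes this design from an ordinary one-way ANOVA: stable differences between speakers do not count against the phonation effect.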
The results suggest that, perhaps not surprisingly, modal phonation was perceived by our listeners as superior to all other phonation types (in other words, its ratings for likeability, trustworthiness and employability were generally higher; see below for self-confidence ratings). In addition, most of the non-modal guises were evaluated in a very similar way, especially in trustworthiness (Figure 5) and employability (Figure 6).
The most important exceptions to this general finding are visible in the self-confidence ratings (Figure 4). First, eight of the ten speakers were rated similarly for self-confidence in their modal and pressed phonation guise (i.e., modal and pressed phonation scores did not differ significantly); second, breathiness in one's voice impacted self-confidence ratings most negatively. It is interesting to point out that breathy phonation correlates with lower self-confidence ratings not only in male voices but also in female voices. This may be taken as lending indirect support to the study of Anderson et al. (2014) and others cited in section 1.2, according to which creaky phonation, which lies at the opposite end of the continuum between open and closed glottis configuration from breathy phonation, is associated with confidence and authoritativeness. Table 1 shows the summary of post hoc pairwise t-tests, which were conducted for the ratings of the individual dimensions in the four guises. The data confirm what was mentioned above, namely that in most cases the rating of modal phonation significantly differs (p < 0.05) from the ratings of the other phonatory modifications, and that pressed phonation is rated differently from the other voice guises on the self-confidence dimension.
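The logic of such pairwise comparisons can be sketched in Python as follows. This is an illustration under stated assumptions, not a record of the analysis: it assumes paired t-tests over per-speaker mean ratings (10 speakers, hence 9 degrees of freedom), applies no correction for multiple comparisons, and uses invented ratings. The helper `paired_t` is a hypothetical name; 2.262 is the standard two-tailed critical value of t for α = 0.05 and 9 degrees of freedom.

```python
import math
from itertools import combinations

T_CRIT_DF9 = 2.262  # two-tailed critical t, alpha = 0.05, df = 9


def paired_t(a, b):
    """Paired-samples t statistic for two equally long lists of scores."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


# Invented per-speaker mean ratings on one dimension (NOT the study's data).
ratings = {
    "modal":   [5.6, 5.2, 5.9, 5.4, 5.1, 5.8, 5.5, 5.3, 5.7, 5.0],
    "breathy": [3.1, 3.4, 2.9, 3.6, 3.0, 3.3, 2.8, 3.5, 3.2, 3.0],
    "creaky":  [3.3, 3.2, 3.1, 3.5, 3.2, 3.0, 3.1, 3.4, 3.3, 2.9],
    "pressed": [3.4, 3.0, 3.2, 3.3, 3.1, 3.4, 2.9, 3.6, 3.0, 3.2],
}

# One test per pair of guises: 4 guises give 6 pairs.
significant = {}
for g1, g2 in combinations(ratings, 2):
    t_value = paired_t(ratings[g1], ratings[g2])
    significant[(g1, g2)] = abs(t_value) > T_CRIT_DF9
    print(f"{g1} vs {g2}: t(9) = {t_value:.2f}")
```

With six comparisons per dimension, a correction such as Bonferroni or Holm would normally be considered; the sketch leaves it out for brevity.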
Table 1. Summary of post hoc pairwise t-tests, showing which voice qualities were evaluated significantly differently (p < 0.05) on which characteristics (L = Likeability, S = Self-confidence, T = Trustworthiness, E = Employability).

Considering the studies mentioned in section 1.2 showing that breathy voice in women is perceived as more attractive than modal voice, we expected this non-modal voice quality to yield higher likeability scores for female speakers. According to our results, however, this is not the case, which suggests that likeability and attractiveness are not simply interchangeable categories. Breathy voice, overall, is not considered particularly likeable in the job market context; as indicated by informal responses of some of our subjects after the perception test, it rather implies a candidate's low self-confidence. Additionally, it appears to be perceived as projecting submission in men, which is likely to be considered an undesirable personality characteristic. An individual who sounds submissive or lacking in confidence might not be expected to be effective enough when performing his or her job, particularly in customer support.
Speakers in their pressed phonation guise, on the other hand, were perceived as more self-confident than in the other non-modal phonation guises, which is in line with Gobl and Ní Chasaide (2003). Given that pressed voice is also associated with anger and hostility, it might be expected to get lower ratings for likeability, and/or possibly for trustworthiness. However, our analysis did not reveal any significant differences in ratings for these two categories between the non-modal phonation types. Projecting self-confidence may be an important feature in the job market context and could thus affect the perceived likeability of the individual's voice.

Conclusions
In this study, we focused on voice quality as a means of projecting an individual's personality and thus of influencing employers' hiring decisions. The main result of this study is the contrast in hireability ratings and personality judgments between modal phonation on the one hand and the three non-modal phonation types on the other. However, no significant differences were found within the cluster of non-modal voices, except for the self-confidence ratings.
Future research may thus focus on non-modal phonation types to further explore their effect on the speaker's ratings. It would also be interesting to investigate whether Czech listeners tend to perceive some of these phonation modifications differently from listeners in other linguistic communities.

REFERENCES
Tykalová, T., Rusz, J., Čmejla, R., Růžičková, H. & Růžička, E. (2014). Acoustic investigation of stress patterns in Parkinson's disease. Journal of Voice, 28(1), 129.e1-129.e8.
Wickham, H. (2009). ggplot2: Elegant graphics for data analysis (use R!). New York: Springer.

NOTES
1. The English version of the provided example of an utterance used in the perception test: My everyday communication in English, it's not a problem at all. And my German, well, I feel more confident when I write than when speaking to someone on the phone, you know. But, for example, I can read emails and reply to them, that's alright. It's just that I am not as prompt in German as in English.

APPENDIX
Mean fundamental frequency (F0) and spectral emphasis (SE) computed in Praat from 10 randomly selected [a aː] vowels in the four guises by individual speakers. We used a simplified SE measure: SE = SPL_full - SPL_0, where SPL_full corresponds to the sound pressure level (SPL) of the full spectrum (0-8 kHz) of the given vowel and SPL_0 is the SPL of the low-frequency band cut off at a variable threshold of 1.5 × mean F0 in the vowel (Traunmüller & Eriksson, 2000).
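The simplified SE measure can be illustrated with a small Python sketch. It is not the Praat procedure used for the measurements reported here: the function names are hypothetical, the band energies come from a naive discrete Fourier transform, and the input is a synthetic two-component signal rather than a recorded vowel. Levels are expressed in uncalibrated dB, which is sufficient because the reference pressure cancels out in the difference SPL_full - SPL_0.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes for bins with f_lo <= f <= f_hi."""
    n = len(signal)
    total = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            total += abs(coeff) ** 2
    return total


def spectral_emphasis(signal, fs, mean_f0, full_band_hz=8000.0):
    """SE = SPL_full - SPL_0 in dB, with the low band cut off at 1.5 * mean F0."""
    p_full = band_power(signal, fs, 0.0, full_band_hz)
    p_low = band_power(signal, fs, 0.0, 1.5 * mean_f0)
    return 10.0 * math.log10(p_full / p_low)


# Synthetic test signal (an assumption for illustration): F0 = 100 Hz plus
# one equally strong component at 300 Hz, 50 ms at a 16 kHz sampling rate.
FS = 16000
N = 800
signal = [math.sin(2 * math.pi * 100 * t / FS)
          + math.sin(2 * math.pi * 300 * t / FS)
          for t in range(N)]

se = spectral_emphasis(signal, FS, mean_f0=100.0)
print(f"SE = {se:.2f} dB")
```

In this toy signal, half of the energy lies above the 150 Hz cut-off, so SE equals 10·log10(2), roughly 3 dB. A voice with more energy in the higher harmonics yields a larger SE, whereas a spectrum dominated by the fundamental yields a value close to 0 dB.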

Figure 1.Illustration of the four voice qualities described in this section.The horizontal line above each spectrogram corresponds to 10 msec.

Figure 2. A sample item from the perception test (labels translated from Czech).

Figure 4. Self-confidence scores for individual speakers in their modal, breathy, creaky, and pressed phonation.

Figure 5. Trustworthiness scores for individual speakers in their modal, breathy, creaky, and pressed phonation.