This is another in a series of blog posts written by PhD students in my Public Intellectuals class.

“Thinking through voice: Sound, identity, and race”

Edward B. Kang

If you’re like me, the pandemic-induced migration of social life to Zoom (the face-to-face parts at least) has really shed light on how jarring it is to hear random disruptions (silences) in speech. To be fair, my Internet connection sucks, but the effects of it were tolerable until now. It’s truly frustrating when I have to text my colleagues to ask what was just said, or to avoid being annoying, just listen through a patchy conversation in which my Internet sporadically glitches at just the right moments to make the discussion just the right level of incomprehensible. But beyond the frustrating disconnects that interrupt my ability to listen to others, I’m also made hyper-cognizant of how my voice might sound when I’m speaking on Zoom. I mean it probably goes both ways, right? Perhaps somewhat resonant with, but of course not nearly as enduring as, the way one is socially conditioned to feel self-conscious if s/he has a heavy accent or a distinct vocal timbre, my unstable Internet connection oddly manifests as a kind of temporary but still relevant and embodied ailment that mediates my voice in Zoom space.

I want to take some time here to think about voice and all of the different things it stands for. As a budding scholar broadly interested in technological mediations of the voice and their manifestations in various sociocultural contexts, from Voice ID and voice biometrics/analytics to interactive virtual assistants, I often force myself (and am also pressured by the structure of academia itself) to locate specific sites in which my “intersection” of “voice/sound, identity, and technology” materializes, so I can analyze them for the purposes of producing a CV-worthy paper or a chapter for my doctoral dissertation. As rigorous and enlightening as this institutionalized method of critical and structured thinking can be, it can also take away from the practice of just thinking for thinking’s sake. Without having to “delineate my disciplinary boundaries,” “carefully lay out the limitations of my thinking,” “detail the methodological advantages of my objects of analyses,” “make interventions in current scholarly debates,” or write with the unavoidable factor of pleasing journal reviewers in mind, I want to take some time here to just think about voice. Not “examine the ways, in which” or “drawing from the frameworks of.” Just think.

Voice is messy. And it can mean a lot of different things to different people in different contexts. We talk about “fighting for a voice,” by which we mean something along the lines of staking a claim to our political identities. A right to express our personhood. We “read other voices” as cues to interiority or as registers of well-being: “she said this, but I think she actually meant this.” “I can hear it in her voice.” We all have “inner” voices to think. Some of us have “outer” voices to speak. Someone might be the “voice of a community,” as an individual representing a collective. We also treat voice, in its most physical sense, as a kind of “sound object,” if you will. Something to be liked, circulated, compared, and bought, even. For vocalists, voice is something that can be trained, refined and, to some, maybe even perfected. It’s also the means by which they make a living. Think about the ways some talents are evaluated on the hit television show, The Voice. As William Cheng, Associate Professor of Musicology at Dartmouth College, observes in his book, Loving Music Till It Hurts, the contestants’ impressive singing voices become technologies of super-humanization or romanticized correctives for those with disabilities. Voices deemed impressive can be deifying. But those deemed not can be crippling.

“The Blind Auditions: Dylan Marguccio sings ‘I Want You Back’ | The Voice Australia 2020”

As an ethnic and racial minority in America, I’m often told I don’t sound Asian. Without immediately denigrating these comments as ignorant, I’m inclined to say that the prevalence of these kinds of encounters for ethnic minorities across North America (actually probably across the world) really does speak to a larger cultural imagination (one that we are all responsible for) that affixes voice to identity. We talk about voices that are appropriate for radio or opera. We often understand dialects as ways of categorization and identification. But also, in terms of ownership and authenticity. What do we make of Awkwafina’s “blaccent?”

“Crazy Rich Asians: Rachel Chu and Peik Lin Goh scenes”

Voice, as we know it, is raced, gendered, spatialized, and classed. It’s possible to have a voice in one sense but be completely devoid of it in another. It’s possible to have a voice that doesn’t “fit” you. It’s possible to use your voice. It’s possible to have it taken away. Voice is not singular, but multiple.

Thinking about voice is complicated precisely because of this multiplicity. When a bank asks me to set up a Voice ID as part of a more secure two-factor authentication method, which part of my voice is it using as the ID? I don’t think it’s measuring my ability to express my personhood. I’m pretty sure those “without” much of a voice in this sense can still technically set up a Voice ID at Charles Schwab. In fact, it’s been reported that prisons across the United States are coercing inmates to enroll in their voice biometric identification systems in order to maintain phone access. Let’s add “voice as object of control or surveillance” to the list as well.

It’s also probably not trying to identify hidden meanings that might be gleaned through the way I say something. If anything, a reliable Voice ID should be able to match me with my voice regardless of whether I’m feeling down or excited, sick or well, right? That gets a little trickier because the actual tonality and the timbral qualities of our voices do change based on our emotions and health. And vocal timbre is actually one of the aspects of voice that gets factored into constructing a Voice ID. But the question is, how does it account for that inevitable variability inherent to vocal expression? Without getting into too much detail of how voiceprint technologies operate, I’ll just say that as a doctoral student researcher who’s been looking at patents of these kinds of technologies, they technically can’t, which is (1) why they are almost always used as supplements and not alternatives to passphrases and (2) why there are numerous cases of expert impersonators deceiving these Voice ID systems.

“Dialect Coach Guesses Who Is Faking An American Accent”

Expert impersonators, voice actors, accent coaches, and even singers share a relationship to voice that really foregrounds that link we make between voice and identity. For one, they simultaneously riff on the singularity of voice as well as its collectivity. The fascination that follows a good vocal impersonation is based on the idea that we understand individual voices as just that – individual. And yet the perceptual similarity of the impersonation also questions that individuality. We’re confronted with a performance that questions the intimate relationship we have with our voices. If my voice is unique, why does that person sound exactly like me? Where do we locate the uniqueness of voice?

Accent coaches operate in a similar way. Without going into too much detail about the different ways that accents and dialects are positioned as sociocultural markers (Basil Bernstein or William Labov can tell you more about that elsewhere), we generally understand that they are often used to gauge other kinds of information about speakers. They are often linked to identity in ways that position the speakers as part of larger collectives (a Brooklyn accent, an Indian accent, an Oxford accent etc.) through which we try to gather additional sociocultural information.

And yet, the idea that we are able to gather such information by listening to accent or dialect is confounded by individuals who have learned to code switch effortlessly. I, for one, did not have the slightest clue that Alfred Enoch, who played Wes Gibbins on the American television series, How to Get Away with Murder, was a British actor until I watched this interview (and then I remembered he was Dean Thomas in the Harry Potter films).

“Alfred Enoch Shows Off His British and American Accents”

I find Enoch’s effortless switch from a British accent to an American one impressive, and based on the clip, I’d say the audience and the hosts seem to agree. But we need to remember that discussions around accent, dialect, and code switching inevitably also necessitate conversations around authenticity, ownership, and power. Where does one draw the line between code switching and cultural appropriation? At their most fundamental levels, both practices involve the adoption of different dialects or ways of speaking/voicing that presumably deviate from the way individuals might “originally” talk. Why do discussions around Eminem or Awkwafina’s cultural appropriation of the “blaccent” seem appropriate? And yet why does it seem odd to accuse Key & Peele of culturally appropriating White Americanness in this clip below?

“Key & Peele – White-Sounding Black Guys”

As Keegan-Michael Key and Jordan Peele “dial down their blackness” and speak in a way that “sounds whiter than Mitt Romney in a snowstorm,” they say they’re doing so with the hopes of not intimidating anybody, thus hinging their joke on a politics of respectability that tells Black Americans to police their own “intimidating” voices. More generally, this concept of respectability politics refers to a moralistic discourse that polices individuals from marginalized or minority groups into adhering to constructed standards of hegemonic “respectability.” In the context of language, this means that specific vernaculars are suppressed and replaced with what is generally understood to be a more “standard” – i.e. white – dialect. W.E.B. Du Bois, in The Souls of Black Folk, referred to this “double consciousness” among Black Americans as the position in which one is forced to look at and evaluate oneself through the eyes of others. This performance in the clip below by Keegan-Michael Key and Barack Obama also riffs on this same idea. Here Key is not only Obama’s anger translator, but also his vernacular code switcher.

“President Obama’s Anger Translator at White House Correspondent’s Dinner”

If we understand that Black Americans code switch in this way, as part of a larger system of oppression that necessitates a politics of respectability as a method of survival, how does this play into its separation from the flip side of that discourse in cultural appropriation? Perhaps we can try to unpack that difference by attending to the ways that Black Americans negotiate social pressures to conform to a standardized English at the moment in which they code-switch back. Ida Harris, writer and assistant editor for Blavity, talks about the shame she feels when she finds herself abandoning her “native tongue – African American Vernacular English,” in order to assimilate into the role of an instructor in a classroom. As a means of dealing with that shame, she references Derrick Harriell, Associate Professor of English and African American Studies at the University of Mississippi, suggesting that “the ability to code switch back into our Black selves is another way we subsist, feel whole and in some regard redeemed.”

Individuals who need to code-switch into what is standard American English (as opposed to those who just speak it) can, in this way, be seen to have an intimate relationship with the dialect that they switch back to. There is a sense of inwardness or affinity that only those who are burdened with the social pressure to code-switch share at the moment they return to their native dialect. At least, I know that that’s the case for me. It feels awkward (even strangely elitist or at least pretentious) to speak to other Korean people in English. It’s a space only available to us. Let’s cherish it. So, in addition to the exploitation, fetishization of culture as “exotica,” and the overall alienation that characterizes cultural appropriation, maybe on a more personal level, there’s also a sense of infringement on that intimate space of momentary redemption. When Jordan Peele says, “you never want to be the whitest sounding Black guy in a room,” it makes sense to me too.

But I also want to ask, without negating the above-mentioned dispossessions that follow cultural appropriation, what does policing the boundaries of those spaces of intimacy under the righteous duty to undo cultural appropriation necessarily achieve? In simpler words, what does the negativity associated with cultural appropriation miss about the fundamental multiplicity of culture, and thus also the voices associated with it? Marxist intellectual and past Professor of International Studies at Trinity College, Vijay Prashad, in Everybody Was Kung Fu Fighting: Afro-Asian Connections and the Myth of Cultural Purity, makes the provocative suggestion that although as a defense tactic, laying claim to certain cultures and lineages may protect minority groups from the cruelties of racism, as a strategy for freedom, it only reifies culture as a separate and distinct artifact, thus taking away from the grander project of collective liberty which requires that we see all cultures as fundamentally interlinked. As Robin Kelley, Professor of American History at UCLA, asserted in 1999 for ColorLines Magazine, “All of us, and I mean ALL of us, are the inheritors of European, African, Native American, and even Asian pasts, even if we can’t exactly trace our blood lines to all of these continents.”

I’ve tried to trace this multiplicity of voice/identity/culture by sifting through the different ways it is sounded, taken, claimed, and replicated. And yet, I must admit it still weirdly makes sense to think of voice as something intimate and unique. And despite the inherent variability that arises even within an individual’s voice through emotion, age, culture, physical environment, and health, the idea of a voiceprint or Voice ID, which positions our voice as an invariant biometric identifier, is strangely seductive. Voice, like culture, oddly feels like something I can own as part of my identity.

Thinking about the voice is, in this way, an incessantly undulating and polymorphic process. It requires acknowledging the enormously variegated channels, abstract and concrete, through which it takes form and occupies our political, social, and cultural lives. It requires us to negotiate those irresistibly tempting understandings of voice as a unique marker of identity with the equally accurate and critical perspectives that tell us voice, identity, and culture are never fixed but always rearranging according to the specific relations from which they emerge. This unruliness is precisely what makes voice such a difficult object/phenomenon/concept – thing – to study. But understood differently, this conceptual intractability is also what allows me to use it as the malleable mouthpiece through which I explore and comment on the multiplicity of culture, society, and politics writ large. It’s what allows me to link Schwab’s Voice ID to Key & Peele. Barack Obama to The Voice. And respectability politics to glitchy Zoom calls.

As Nina Sun Eidsheim, Professor of Musicology at UCLA, reminds us, we must resist the temptation to know sound, and instead find ways to engage with it as a complex system of knowledge in and of itself.

Edward B. Kang is a PhD student at the Annenberg School for Communication and Journalism and Assistant Editor for the International Journal of Communication. His research concerns the social and cultural dimensions of digital technologies with a specific focus on the relationship between surveillance, race, and identity. Currently, he is interested in exploring the broader cultural imaginations around voice embedded into the operational logics of voiceprint technologies (voice biometrics, voice analytics). Apart from his own research, he has served as a committee member for Annenberg's annual Communication and Cultural Studies graduate student conference Critical Mediations, as well as led Music Production workshops for Annenberg's Critical Media Project with California Humanities.

Pop Junctions