John Ashbery’s Reading Voice



“75 at 75,” a special project from the 92nd Street Y in celebration of the Unterberg Poetry Center’s seventy-fifth anniversary, invites contemporary authors to listen to a recording from the Poetry Center’s archive and write a personal response.

John Ashbery at 92Y in 1970 – Frank O’Hara Tribute reading (photo by Jack Prelutsky)

The Unterberg Poetry Center at the 92nd Street Y has a seventy-year archive of recordings—it began hosting readings in 1939 and recording them in 1949—and it offers a unique opportunity to study poets’ voices and reading styles. Between 1952 and 2014, John Ashbery made seventeen appearances on the stage of the Poetry Center. He read with other poets—Barbara Guest, Mark Ford, Jack Gilbert, John Hollander, J. D. McClatchy, W. S. Merwin, Kenneth Koch, Ron Padgett, and James Schuyler. He read with painters—Jane Freilicher and Larry Rivers. And he joined in readings honoring other poets—tributes to Frank O’Hara (1970), Elizabeth Bishop (1979) and Marianne Moore (1987). Ashbery, who made regular Poetry Center appearances from the ages of twenty-four to eighty-seven, is on a short list of poets whose Y readings spanned so many decades (others include W. S. Merwin, Gwendolyn Brooks, Adrienne Rich, Richard Wilbur, and Galway Kinnell).

As a scholar and poet who uses software to analyze performance style in poetry recordings, I was thrilled when Bernard Schwartz, the Poetry Center’s director, invited me to study the archive. The Ashbery readings seemed, to me, like a perfect corpus to begin with.

But even those who loved attending Ashbery’s poetry readings (I am one of them) might feel that he’s the last poet in the world whose performance style is worth studying. He typically read in a restrained, unassuming voice, and the unofficial consensus is that the performative energy of his poetry plays out not in the vocal delivery, but in the slippery syntax, the sly comedy of skewed idioms, the rich mixture of vocabularies and startling tropes, the momentum of swerving thought. His poems can elude the audience’s understanding in a live reading, and they elude many readers on the page as well.

Raphael Allison (as I discussed in Beyond Poet Voice: Sampling the (Non)Performance Styles of 100 American Poets) describes Ashbery’s reading style as “a performance of nonperformance.” He means this as a compliment, particularly in reference to a 1963 reading recorded at the Living Theatre. However, Richard Howard, who actually attended the reading that night, remembers Ashbery as having, in Allison’s words, “read with extreme dramatic flair.” Ashbery was “striding up and down, smoking, wreathed in clouds of smoke … on the set for The Brig [a play about a soldier that went up in May of the same year] behind a lot of barbed wire,” Howard remembered. “It wasn’t certain on that occasion whether the wire was to keep him from us or us from him.” Clearly Howard ascribes a certain power to Ashbery’s physical presence, while Allison has only the recording to judge from.

Here’s another perspective. “John Ashbery’s near monotone suggests a dreamier dimension than the text sometimes reveals,” writes Charles Bernstein. Once we have heard a poet like Ashbery read, he feels, “we change our hearing and reading of their works on the page as well.”

I witnessed Ashbery read on four occasions. At one of these, on April 8, 2001, he cleared the room—the beautiful Morrison Library reading room at the University of California, Berkeley. The reading began as standing-room only. My boyfriend and I were the last to squeeze in the door. Charles Altieri introduced Ashbery with sincere, abstrusely articulated enthusiasm, sat down, and soon fell asleep.

I watched as many in the audience became visibly, unduly mystified by the poetry, or Ashbery’s manner of reading it, or both. Or they were simply bored. The undergraduates, drawn by the aura of Ashbery’s name, streamed quietly out of the room in ones and twos and threes, until it was more than half empty. But I was committed to the end. I was writing my dissertation in part on Ashbery’s poetry, and his writing had changed my attitude to boredom, to poetry, to language itself.

It is common to be bored at a poetry reading, or at least under-stimulated, especially by poets esteemed in the academy. My own inarticulate pleasure in, and intense irritation with, certain poetry reading styles is what led me to research poetry performance in the first place.

If I’ve learned anything in this rather rarefied line of research, it’s that the voice is a slippery thing, and so is our perception of it. Speech scientists concur with Robert Frost that the “tone of meaning … without the words”—the intonation and rhythm of the voice—are often perceived as more important than the words. Whenever we listen to a voice, we bring all sorts of unconscious and half-conscious expectations and biases to the experience.

Those who walked out on Ashbery at Berkeley in 2001 probably thought, in some way, that he wasn’t reading the way a poet should, or that his poems were not what they thought poems should be. When we hear a voice, and especially when we listen to a disembodied voice, we listen with expectations and biases in regard to gender, age, race, ethnicity, class, sexual orientation, cultural or religious background, educational background, region, nationality, mood, et cetera. We try to pin down the speaker’s identity, and complain if they do not fit our expectations. The Berkeley undergraduates of 2001 might have expected Ashbery’s vocal delivery to sound more like a poet, or more queer, or more like a New Yorker, to correspond with whatever vocal stereotypes or conventions they had in mind for these roles or identities.

In her recent book, The Race of Sound: Listening, Timbre and Vocality in African American Music, Nina Sun Eidsheim gives us a term for the question we ask when we listen to a voice: Who is this? We don’t just ask this when we answer a phone call from an unknown number. When we listen to any stranger’s voice, we try to pin down the speaker’s identity—and thus radically reduce that voice’s individuality to conform to or be rejected by our expectations. Eidsheim calls this the acousmatic question—after Pierre Schaeffer, who “derive[s] the … root [of acousmatic] from an ancient Greek legend about Pythagoras’s disciples listening to him through a curtain.” She argues that it relies on fundamental misunderstandings of the human voice and our own listening practices, particularly in regard to vocal timbre. One of her case studies is the voice of Jimmy Scott, a jazz singer who was sometimes characterized as a freak (as he arguably was in Episode 29 of Twin Peaks). Though he was a cisgender male, Scott suffered from Kallmann syndrome (delayed or absent puberty), and had a limited career due in part to racialized assumptions about how a black man should sound.

Who is this? is often the wrong question, but we are always asking it anyway. The next time you listen to a recorded voice without knowing the speaker’s identity, ask yourself what assumptions you are making about their identity, and why.

In The Audible Past: Cultural Origins of Sound Reproduction, Jonathan Sterne advances a persuasive critique of conventional assumptions about hearing versus seeing, which he calls “the audiovisual litany,” including the notions that “hearing tends toward subjectivity, vision tends toward objectivity” and “hearing is a temporal sense, vision is primarily a spatial sense.”

Of course, seeing is no more objective than hearing, and both hearing and seeing operate spatially and temporally. But a poem holds still on the page when we study it. A recorded poem does not. Nor do our perceptions of what we’ve heard. And so what I call slow listening, that is, listening repeatedly to the same recording and making some attempt to analyze recorded voices as physical phenomena, to visualize their effects, and to analyze quantitative data about them, can illuminate (there’s the hegemony of the visual for you!) what it is we have just heard. Slow listening serves as a refinement of, and sometimes a corrective to, our impressionistic perceptions; developing this technique has made me more aware of my own biases as a listener, and it has made me listen more precisely.

So what was Ashbery up to as a reader? Studied calm? Dramatic flair? Trance-inducing monotone? Was he an unusually inexpressive reader, not to say boring? And did he always read in a similar manner? What was characteristic of his voice, anyway?

When I analyze a poet’s voice, I start with pitch and timing patterns. Based on some linguistic research and our own intuitions about what makes a voice sound expressive, neurobiologist Lee M. Miller and I have developed a toolbox of prosodic measurements called Voxit.

Pitch is typically measured in hertz (Hz), or cycles per second; with the human voice, this means the number of times the vocal cords vibrate per second. Among the fifty male American poets I sampled in “Beyond Poet Voice,” the average pitch was 115 Hz. (Richard Blanco, Carl Phillips, Ted Kooser, Robert Pinsky, Matthew Zapruder, Peter Gizzi, and Mark Doty ranged from 81 to 91 Hz, while CA Conrad, Amiri Baraka, Joshua Clover, Robert Hass, Juan Felipe Herrera, and Alberto Rios were at the upper end, from 139 to 151 Hz.)
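For readers curious how an average pitch figure of this kind might be computed, here is a minimal sketch in Python. It assumes a pitch tracker has already produced one estimate in Hz per analysis frame, with `None` for frames where the voice is silent or unvoiced. The function names and the toy numbers are invented for illustration; they are not Voxit's code and not measurements from any recording.

```python
from statistics import mean


def mean_pitch_hz(contour):
    """Average pitch in Hz over voiced frames.

    `contour` is a list of pitch estimates, one per analysis frame;
    None marks frames with no voicing (pauses, unvoiced consonants).
    """
    voiced = [f0 for f0 in contour if f0 is not None]
    if not voiced:
        raise ValueError("contour has no voiced frames")
    return mean(voiced)


def period_ms(f0_hz):
    """Duration of one vibration cycle, in milliseconds."""
    return 1000.0 / f0_hz


# Invented toy data, not measurements from any recording.
toy_contour = [110.0, 115.0, None, None, 120.0, 115.0]
print(mean_pitch_hz(toy_contour))   # 115.0
print(round(period_ms(115.0), 1))   # 8.7
```

At 115 Hz, in other words, each cycle of vocal cord vibration lasts a little under nine milliseconds.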

What about Ashbery? In a sampling of recordings drawn from his readings at the Poetry Center, his pitch ranged from 100 to 149 Hz. As a generalization, Ashbery seemed to use lower pitch when he was younger and higher pitch when he was older. A much larger sample would be needed to confirm this, but the finding aligns with the research: the pitch of male voices tends to rise with age.

People may raise their pitch when emotions become more intense, as when Ashbery read Elizabeth Bishop’s “Over 2,000 Illustrations and a Complete Concordance” at the Y’s Earth Day event in 1997. In The Last Avant-Garde: The Making of the New York School of Poets, David Lehman remembered that, “When [Ashbery] reached the last stanza, he cried.” As you can hear, Ashbery starts to sound hoarse and teary around line 54 (“asking for cigarettes”), part way through the second stanza. He recovers and breaks down again for much of the last stanza, beginning with “Why couldn’t we have seen / this old Nativity while we were at it?”

When a speaker changes pitch faster, either up or down—we measure this as pitch speed and pitch acceleration—they sound more expressive. In these terms, Ashbery uses his most expressive pitch—his fastest pitch speed—in his youth, and, not surprisingly, when he reads humorous crowd pleasers, no matter the year. For instance, the masterful sestina “The Painter” in his 1952 debut reading; or “The Songs We Know Best,” in both 1981 and 2008; or the comic sestina “Faust,” in 1967 and 2008, which was inspired (as Ashbery explains in 2008) by a comic strip about The Phantom of the Opera in the Montpellier newspaper. He uses pitch least expressively—his slowest pitch speed—when he reads Marianne Moore’s poem “Abundance,” at the tribute in 1987. “Abundance” is a highly formal poem of nine stanzas, and like many of Moore’s poems, the poem’s mood is one of quiet, restrained amusement; perhaps Ashbery reads it with rather flat intonation to enact a deadpan tone. Perhaps he does this all the time, to some degree.
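One way to make "pitch speed" and "pitch acceleration" concrete is to treat them as the first and second rates of change of a pitch contour. The sketch below rests on several assumptions that the essay does not specify: pitch is converted to octaves so that equal musical intervals count equally, the contour is sampled at a fixed frame interval, and measurement restarts after every pause. The function names are hypothetical, and Voxit's own definitions may differ in units, smoothing, and averaging.

```python
import math


def hz_to_octaves(f0_hz, ref_hz=100.0):
    """Express pitch as octaves above an (arbitrary) reference frequency."""
    return math.log2(f0_hz / ref_hz)


def pitch_speed_and_acceleration(contour_hz, frame_s=0.01):
    """Return (mean |speed| in octaves/s, mean |acceleration| in octaves/s^2).

    `contour_hz` holds one pitch value per frame; None marks a pause or an
    unvoiced frame, and differences are never taken across such gaps.
    """
    speeds, accels = [], []
    prev_pitch = None
    prev_speed = None
    for f0 in contour_hz:
        if f0 is None:
            prev_pitch = None
            prev_speed = None
            continue
        pitch = hz_to_octaves(f0)
        if prev_pitch is not None:
            speed = (pitch - prev_pitch) / frame_s
            speeds.append(abs(speed))
            if prev_speed is not None:
                accels.append(abs(speed - prev_speed) / frame_s)
            prev_speed = speed
        prev_pitch = pitch
    mean_speed = sum(speeds) / len(speeds) if speeds else 0.0
    mean_accel = sum(accels) / len(accels) if accels else 0.0
    return mean_speed, mean_accel


# Invented contours: a near-monotone delivery and a livelier one.
flat = [120.0, 120.0, 120.0, 120.0, 120.0]
lively = [100.0, 130.0, 110.0, 150.0, 105.0]
print(pitch_speed_and_acceleration(flat))
print(pitch_speed_and_acceleration(lively))
```

Under these assumptions the flat contour scores zero on both measures, while the lively one scores well above zero, which matches the intuition that faster pitch movement sounds more expressive.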

What about rhythm? How quickly a poet speaks, how much their speaking rate varies, how often they pause, and for how long: these factors influence the perception of rhythm and how regular the rhythm is. Long pauses in speech create suspense, and, if they do not recur, they can break a rhythmic pattern. As a generalization, the lower a poet’s rhythmic complexity (that is, the more predictable the rhythm), the more formally they may read, whether the poem they are reading is written in a fixed form or not. In my research, I have found that Allen Ginsberg exhibits very low rhythmic complexity, or a predictable rhythm, in reading Howl, for instance, while a conversational poet such as Dean Young sometimes uses very high rhythmic complexity, or an unpredictable rhythm.
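The idea of a predictable versus an unpredictable rhythm can be illustrated with a small sketch. It assumes a recording has been reduced to a string of frames, "1" for speech and "0" for pause, and it uses Lempel-Ziv (1976) complexity as a stand-in for rhythmic complexity: the fewer new patterns needed to describe the sequence, the more predictable it is. The essay does not say how rhythmic complexity is actually calculated, so this choice of measure, the function names, and the toy sequences are all assumptions made for illustration.

```python
def pause_durations(frames, frame_s=0.1):
    """Lengths in seconds of each run of pause frames ("0") in `frames`."""
    durations, run = [], 0
    for frame in frames:
        if frame == "0":
            run += 1
        elif run:
            durations.append(run * frame_s)
            run = 0
    if run:
        durations.append(run * frame_s)
    return durations


def lz76_complexity(frames):
    """Number of components in a Lempel-Ziv (1976) parsing of `frames`.

    Each component is the shortest stretch, starting where the previous
    one ended, that has not already occurred earlier in the sequence.
    """
    n = len(frames)
    i, count = 0, 0
    while i < n:
        length = 1
        # Grow the candidate while it can be copied from earlier material.
        while i + length <= n and frames[i:i + length] in frames[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


# Invented sequences: a strictly repeating pattern and an irregular one.
regular = "1110" * 8
irregular = "10110100111000101101110010011101"
print(len(pause_durations(regular)), lz76_complexity(regular))
print(len(pause_durations(irregular)), lz76_complexity(irregular))
```

In this toy case the repeating pattern needs only three components no matter how long it runs, while the irregular sequence needs more. A real analysis would also have to normalize for the length of the recording before comparing two readings.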

So when does Ashbery read most formally, in terms of regular rhythm? And when does he use a more irregular rhythm that is more typical of conversation than formal poetry? Does the rhythm he deploys in the reading of the same poem shift over time? Below are samples of the same two poems, “Faust” and “Rivers and Mountains,” from 1967 and 2008.

In the 1967 reading, Ashbery tended toward a more predictable, formal rhythm. Perhaps he was feeling rather formal that night, or in that era? Ashbery never wrote a great deal in fixed forms, and perhaps he moved away from them more as his poetry developed. But in 2008, he read a number of poems that use anaphora and other forms of verbal or rhythmic or even musical repetition and catalog (“He,” “Default Mode,” “They Knew What They Wanted” and “The Songs We Know Best”) and read them with a more conversational, less predictable rhythm than he might have in 1967. Of course, irregular rhythm, shifting emphases, and long pauses also play well for comedy and dramatic suspense.

On the 2008 recording, Ashbery sounds like he is in good spirits; he decides to read two more poems, rather than one more, at the end. It’s as if he is having a lively conversation, albeit one-sided, with an appreciative, frequently chuckling audience. It reminds me of the best reading I ever heard him give, at the New School’s John Ashbery Festival in 2006, when he read “Litany” with Ann Lauterbach, James Tate, and Dara Wier. The poem is famously written in “two columns meant to be read as simultaneous but independent monologues.” The reading was deeply funny and poignant at once: Ashbery at his best, feeding off the energy of conspiratorial collaboration.

The best way to appreciate Ashbery’s reading style is to listen, of course, yet Ashbery himself was not always sold on poetry in performance. In a 1966 interview (included in the Y’s 2017 memorial tribute to Ashbery), he said: “Well, I’m against poetry being read out loud. That may sound funny. When I hear poems read out loud, I really don’t get very much from them. I have to see the poem and hear it in my mind for it to really mean something. In fact, when I’ve read poems out loud, sometimes people will say, Oh I really understood that when you read it, I got a great deal more out of it, which is not what I want to happen. Because, I mean, if I had written the poem right, it should mean more when it was read on the page.”

Ashbery did not particularly like his own voice, or at least his native upstate accent. Of meeting Frank O’Hara for the first time, he remembered: “It was rather a surprise when I overheard a ridiculous remark such as I liked to make uttered in a ridiculous nasal voice that sounded to me like my own, and to realize the speaker was Frank … Though we grew up in widely separated regions of the Northeast, we both inherited the same twang, a hick accent so out of keeping with the roles we were trying to play that it seems to me we probably exaggerated it, later on, in hopes of making it seem intentional.”

My favorite way to read Ashbery is on the page, while listening to his recorded voice. His 1952 reading of “The Painter” is especially delightful. At the age of twenty-four, he reads “The Painter” with the broad vowels of his upstate accent fully intact.

Listening to Ashbery’s comparatively expressive early reading of “The Painter” reminds me how crucial different voices are to his poetics, whether they are explicitly different characters in a poem or simply contending points of view within a single consciousness. It’s no surprise that his own voice and performance style change, perhaps more than we would have thought, poem by poem and over the years.


Marit MacArthur is a lecturer in the University Writing Program and an affiliate faculty member in Performance Studies at the University of California, Davis.