AI is changing the way we write songs—but music has always embraced machine language.
Soon after I’d arrived in New York in the late nineties, I found a job in a vintage synthesizer shop (now gone) where I presided over restored Moog monophonic keyboards and was paid in rubber-banded rolls of twenty-dollar bills. I devised a lunch-break ritual: I’d walk a few blocks up to Gourmet Garage at Broome and West Broadway, where I would get a sourdough baguette and seltzer water. Then I’d head over to the bus shelter around the corner, where I’d sit and write in a spiral notebook.
The entry for October 16, 1998, has the title “Franchise a rock band.” Meaning: invent a logo, which would be both the band name and the brand; then write, record, and copyright a bunch of material, post it online as a step-by-step kit that anyone could download for the licensing and intellectual property, along with PDFs of lead sheets (shorthand scores with chord diagrams and notated melody), and some further specifications about instrumentation, lighting, sound effects, outfits, and so on. Anyone who had the kit could set up wherever they were—Orlando, Helsinki, Tokyo, Cairo, Ann Arbor, Madrid, Singapore—and perform the material as the band, just as someone with overhead and staff could open a Taco Bell or a Dunkin’ Donuts. Different locales would introduce shades of difference in performance—surely the Helsinki band would sound different from the Orlando band—and then live recordings of the different instantiations could be compiled and released in elaborate vinyl anthologies with liner notes featuring various experts discussing the nature of authenticity, the vexed relation between art and commerce, and so on.
This wasn’t about trying to get rich; I had no interest in making a profit. It stemmed rather from my desultory toilet reading in Andy Warhol’s POPism and also from my sense of the dreary uniformity of “indie rock”: always the same lanky guys (and the occasional girl) with carefully mussed hair looking identically “authentic,” dispensing more or less indistinguishable chords and melodies. Since my days not working in the Moog shop were spent making nine-minute songs with titles like “The Continuing Adventures of Cardinal Caterpillar” on a cassette multitrack recorder in a tiny room in Brooklyn, subsisting entirely on street-vendor coffee, bagels, SweeTarts, tap water, and Parliament Lights, the Franchised Band idea was a desperately contrived fantasy meant to achieve a conceptual sophistication along the lines of Warhol’s Brillo Boxes, but within the constrained format of the rock band. This all strikes me now as completely preposterous—and to some degree it’s been superseded by the hyperefficient Swedish studio wizards who crank out perfect megahits for Britney Spears, Katy Perry, et al. But at the time I thought it was revolutionary. I pitched it to a bunch of label people in New York. As I explained it, every last one of them started to giggle.
Now an algorithm has written some pop songs. This news led me initially to despair, then to scoffing disbelief. Computers writing songs! I imagined a monstrous aural mistake, as if Amazon’s coercive “suggestions” (We see you’ve purchased x—you’ll love y!) had spawned a disgusting musical tchotchke. This, I thought, was the sort of thing historians of the future would hold up as evidence of twenty-first-century humans in the grip of an insipid techno-utopianism; or, worse, as proof that we’d ceased to be messy, feeling creatures and had at last succeeded in turning ourselves into code.
When I finally got around to listening to “Daddy’s Car,” the Sony CSL “Flow Machines” algorithm’s pastiche of midsixties Beatles, I was—after swooning with awe—immediately brought back to the idea of franchising a band. The CSL algorithm analyzed a database of fifty Beatle lead sheets—of exactly the kind I’d imagined as part of the downloadable kit—and made an aggregate distillation of melodic and chordal patterns, so turning Beatles music into a computational object. Does this sound sinister? It doesn’t matter, because the resulting song is so eerie and strange and weirdly likeable, it will lead you to forget about whatever pieties you may be clinging to about computers making music. Eerie and strange first of all because the proportions are all wrong. The music is put together at odd, severe angles; conventional song structure buckles under the motiveless literalism of the algorithm. The parts roll out as if pulled haphazardly from a spool of Beatle fabric. And likeable because somewhere in there is that familiar ratio of Lennon sour to McCartney sweet: the primary colors of “Got to Get You Into My Life,” the dry leaves of “Nowhere Man,” the sugar high of “Here, There and Everywhere,” the psych-fuzz of “She Said She Said,” the wistful “Girl,” the normcore “Michelle,” all of it melted down and recast as modular Legos: blown up, reversed, inflated, glazed, airbrushed, cropped and lit. It sounds like the future of song.
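To make "an aggregate distillation of chordal patterns" a little more concrete, here is a minimal sketch of one very simple way to do such a thing: count which chord tends to follow which across a set of lead sheets, then walk those counts to spin out a new progression. This is a toy first-order Markov chain, offered only as an illustration; it is not a description of how the CSL system actually works. The chord lists are invented for the example, and the names `LEAD_SHEETS`, `build_transitions`, and `generate` are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy "lead sheets": chord sequences invented for illustration,
# not transcriptions of any real song and not the Sony CSL corpus.
LEAD_SHEETS = [
    ["C", "Am", "F", "G", "C"],
    ["C", "F", "G", "C"],
    ["Am", "F", "C", "G", "Am"],
]

def build_transitions(sheets):
    """Record, for every chord, the chords that follow it anywhere in the sheets."""
    table = defaultdict(list)
    for chords in sheets:
        for current, following in zip(chords, chords[1:]):
            table[current].append(following)
    return dict(table)

def generate(table, start, length, seed=0):
    """Walk the table to produce a new chord sequence that echoes the corpus."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    sequence = [start]
    while len(sequence) < length:
        options = table.get(sequence[-1])
        if not options:  # dead end: no chord ever followed this one
            break
        sequence.append(rng.choice(options))
    return sequence

transitions = build_transitions(LEAD_SHEETS)
print(generate(transitions, "C", 8))
```

Every step in the generated sequence is a move that occurs somewhere in the source sheets, which is one plausible reason such output can sound familiar in its details and odd in its overall proportions.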
Before I go totally over the top with love for “Daddy’s Car” (no doubt some kind of compensatory reversal of my original knee-jerk rejection) I ought to note that a human being is involved in all this: the French composer Benoit Carré, who took the raw data generated by the CSL and matched it to audio from existing recordings using something called the Rechord system, described as a “concatenative synthesis engine dedicated to the generation of accompaniment tracks.” As I understand it, the engine was made to collaborate with the style deduced from the lead sheets, at which point it started coming up with sui generis Beatle parts. Benoit then arranged, produced, and mixed the tracks that resulted from this collaboration. Carré has also written the words. Here is what he’s come up with for the CSL to sing:
In daddy’s car, it sounds so good
something new, it turns me on,
good day sunshine in the back seat car
wish that road would never stop
down on the ground
the rainbow led me to the sun
please mother drive
and then play it again.
Was Benoit setting out to fashion a collage of vaguely Beatles-y sentiment, or of the Technicolor free-association of LSD bards like Syd Barrett? I don’t know, but the effort to simulate AI style here ends up more like the anodyne art one finds in dentist-office waiting rooms—all those cookie-cut Kandinskys and muted Pollocks—and misses the off-kilter frisson of the genuinely artificial. (Notice I’ve achieved a complete 180-degree reversal of my initial recoil from AI music, which is now being held up as a benchmark for a new kind of authenticity.)
It’s not as though AIs haven’t shown a flair for verbal invention. In one recent algorithmic excursion into aesthetics, a neural network studied the whole array of Sherwin-Williams hues and invented some “new colors,” all of which look like they’ve been soaked in cat urine. But the names the AI came up with for the colors are incredible: Snowbonk, Stargoon, Grade Bat, Bank Butt, et cetera. Another AI reimagined the possibilities of gustatory pleasure with recipes for dishes like “Cream of Sour Cream Cheese Soup” and “Chocolate Chocolate Chocolate Cake” and “Chocolate Chicken Chicken Cake.” What would the lyrical equivalent of all this be? My sense was that “Daddy’s Car,” lyrically, could have benefitted from more of a “stargoon” vibe, like, say: “in Stargoon’s car / it sounds so snowbonk … ” It’s a better fit with the CSL song structure, and it taps into the side of the Beatles that began to evangelize about “The Eggman” and sang “Goo Goo Ga Goob!”
Another of the CSL’s compositions, the excellently titled “The Ballad of Mr. Shadow,” is in the classic style of Tin Pan Alley. The song itself isn’t as immediately striking as “Daddy’s Car,” but it’s still fascinating. Facets of Irving Berlin, Jerome Kern, Hoagy Carmichael, and Cole Porter, along with a more anonymous open-range cowboy lyricism, seem to refract off one another. Something about the greater historical remoteness of the style makes the song sound creepier, and the vocal modeling is more aggressive: melodic intervals leap out with a glutinous, almost pornographic clarity. The video for “The Ballad of Mr. Shadow” shows a gray blob that looks like an undulating fingerprint riding a squiggly horse along a digitized beach.
But even pop songs of the early twentieth century were beginning to undergo translation into machine language. One of the great architects of Tin Pan Alley style, George Gershwin, recalled standing in front of a penny arcade on 125th Street at age six (1904), thrilled by the “peculiar jumps” of “an automatic piano leaping through Rubinstein’s Melody in F.” The player piano he heard was essentially a computer that ran on air: foot pedals pumped a current through a rotating, pneumatic cylinder whose raised teeth picked out patterns punched into paper rolls, these being the “software.” The gonzo auteur of player-piano music is Conlon Nancarrow, who composed almost exclusively for the instrument and with a perverse attentiveness to its nonhuman possibilities: he attached metal strips to the top of the hammers inside the piano so that each note had a piercing, ice-pick-to-the-forehead clarity.
Player-piano tech went fully digital in the 1980s with Musical Instrument Digital Interface (MIDI), a computer protocol for assigning numbers to notes, thus allowing different electronic instruments to “play” one another. If you’ve ever seen a MIDI “matrix” editor on a computer, you’ll notice immediately that it’s pretty much exactly like a player-piano roll, with notes represented as rectangles of different lengths cut out from a moving grid. “Black MIDI” computer musicians, who slam the matrix-note editor to the point where the digital “roll” is near-completely blacked out, are in essence the inheritors of Nancarrow’s vision. There were also, in the nineties, some piano nerds in Germany who programmed a Yamaha Disklavier—a huge beautiful Yamaha grand piano fitted with a MIDI interface—to read some of the player-piano rolls Gershwin himself had punched in the twenties. They made gorgeous recordings of “Sweet and Lowdown,” “Swanee,” “Rhapsody in Blue,” “I Got Rhythm,” “An American in Paris” and a bunch of others. With the Disklavier, acoustic tech goes needlessly, deliciously meta on digital, to produce a rich, scarily lucid machine music.
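The kinship between MIDI and the piano roll is easy to see in a few lines of code. What follows is a minimal sketch, not any particular editor's implementation: it assigns numbers to notes using the common convention that middle C (C4) is note 60, then draws a handful of notes as rectangles on a text grid. The helper names `note_number` and `piano_roll` and the three-note figure are hypothetical, made up for illustration.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_number(name, octave):
    """MIDI note number, assuming the common convention that middle C (C4) is 60."""
    return 12 * (octave + 1) + NOTE_NAMES.index(name)

def piano_roll(notes, steps):
    """Render (note_number, start_step, length) triples as rows of a text grid."""
    rows = []
    # Highest pitch on top, as in most matrix editors.
    for pitch in sorted({n for n, _, _ in notes}, reverse=True):
        cells = ["."] * steps
        for n, start, length in notes:
            if n == pitch:
                for step in range(start, min(start + length, steps)):
                    cells[step] = "#"
        rows.append(f"{pitch:3d} " + "".join(cells))
    return "\n".join(rows)

# A hypothetical three-note figure: C4, E4, and G4, entering one after another.
melody = [
    (note_number("C", 4), 0, 4),
    (note_number("E", 4), 2, 3),
    (note_number("G", 4), 4, 4),
]
print(piano_roll(melody, 8))
```

Run as written, this prints three rows, one per pitch, with `#` marking where each note sounds:

```
 67 ....####
 64 ..###...
 60 ####....
```

Fill every cell of a much larger grid and you have, more or less, a Black MIDI score.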
I can remember my first dawning intimation that music had something to do with numbers: the mornings spent looking at patterns on an overhead projector while I and my fellow second graders, each with a pair of wooden sticks, clicked and counted in Morse-like unison to notated rhythms projected on a screen. That year I was selected to be the “drummer boy” for the school’s Christmas concert, which meant I sat in front of the whole school and kept time on a ringing tom-tom.
A few years later, my parents allowed me to order a book called The Music of Frank Zappa, which arrived in the mail in a manila envelope after what felt like years of waiting. The book, which had been translated from French into choppy English, noted connections between Zappa’s music and something called “serialism.” The word was accompanied by an inset photo of an old, extravagantly bald man with a crisply defined blood vessel zigzagging along his left temple. Under the picture it said “Arnold Schoenberg.”
I had no idea what Zappa’s music could possibly have to do with this desiccated ghoul, so I went to the set of Encyclopedia Britannica in my father’s study, got out the volume covering “Schnook-Tirah” and flipped to “Schoenberg.” This led to a cross-reference, “Music, Western, Twentieth century”—some twenty-five pages of tiny print. I managed to learn that there were a bunch of guys in the twentieth century, mostly European, who began to write music based on a system of rules for treating all twelve tones of the chromatic scale equally, rather than subordinate them to conventions of harmony and voice leading. Some composers had taken this numerical determinism all the way into rhythm and timbre, making for a totally quantified yet mostly aleatory music. This game, the entry noted, represented for the “serialists” the most progressive advance in music after the chromaticism of Claude Debussy’s Prélude à l’après-midi d’un faune, which the article named as an important turning point in modern music.
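The rules in question reduce to arithmetic on the numbers 0 through 11. As a minimal sketch, assuming the usual textbook account of twelve-tone technique: a "row" is an ordering of all twelve pitch classes with none repeated, and a piece draws on the row and its standard transformations. The function names below are hypothetical, and the row is shuffled at random purely for illustration; a composer would choose one deliberately.

```python
import random

def make_row(seed=0):
    """A tone row: all twelve pitch classes (0 = C ... 11 = B), each used once."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    row = list(range(12))
    rng.shuffle(row)
    return row

def transpose(row, interval):
    """Shift every pitch class by the same number of semitones, wrapping at 12."""
    return [(p + interval) % 12 for p in row]

def invert(row):
    """Mirror every interval around the row's first note."""
    first = row[0]
    return [(2 * first - p) % 12 for p in row]

def retrograde(row):
    """Play the row backward."""
    return row[::-1]

row = make_row()
forms = [
    ("prime", row),
    ("inversion", invert(row)),
    ("retrograde", retrograde(row)),
    ("transposed +5", transpose(row, 5)),
]
for label, form in forms:
    print(f"{label:14s}", form)
```

Each transformed row still contains all twelve pitch classes exactly once, which is the sense in which no tone is privileged over any other.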
Tucked away at the end was an enticing addendum: “Pythagoreanism” (Proboscidea-Rubber). Beyond some captivating biographical details—Pythagoras hid behind a curtain whispering cryptic injunctions to his followers, among them to abstain from eating beans—was the idea that this pre-Socratic polymath was the first to codify relations between numerical ratios and harmonic intervals. In other words: it turned out music had been nothing but numbers for twenty-five hundred years.
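The ratios traditionally attributed to the Pythagoreans are small enough to check by hand, or in a few lines. This is a minimal sketch: the three ratios are the standard ones (2:1 for the octave, 3:2 for the fifth, 4:3 for the fourth), while the 220 Hz reference pitch and the helper name `interval_frequency` are arbitrary choices made for illustration.

```python
from fractions import Fraction

# Ratios traditionally attributed to the Pythagoreans.
INTERVALS = {
    "octave": Fraction(2, 1),
    "perfect fifth": Fraction(3, 2),
    "perfect fourth": Fraction(4, 3),
}

def interval_frequency(base_hz, ratio):
    """Frequency of the note lying the given ratio above the base pitch."""
    return float(base_hz * ratio)

BASE_HZ = 220  # arbitrary reference pitch (the A below middle C), for illustration

for name, ratio in INTERVALS.items():
    print(f"{name}: {ratio} -> {interval_frequency(BASE_HZ, ratio):.2f} Hz")

# A fourth stacked on a fifth lands exactly on the octave: 3/2 * 4/3 = 2.
assert INTERVALS["perfect fifth"] * INTERVALS["perfect fourth"] == INTERVALS["octave"]
```

Using `Fraction` rather than floating point keeps the ratios exact, so the closing check on the octave holds as a matter of arithmetic and not of rounding.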
The unnatural digitized terracing of the CSL’s vocal style bears some resemblance to the scene in 2001: A Space Odyssey when the HAL 9000 computer sings a song while having its mind turned off. HAL’s brain is an oblong room full of hundreds of glass cartridges, which slide out of their containers with dramatic slowness. “You are destroying my mind … I will become nothing,” HAL says as the one surviving crew member unplugs HAL and the brain cartridges slide out. As the voice slows to a gurgle, HAL sings, “Daisy, Daisy, give me your answer, do”: a song, “Daisy Bell (A Bicycle Built for Two),” written in 1892 by Harry Dacre and taught to the Bell Labs IBM 7094 computer in 1961, with vocoder programming by John Kelly and Carol Lockbaum.
As a kid, I found 2001 incredibly boring. We had a copy in the Sony “Beta” videocassette format and my dad would watch it about once a year. Its glacial pace didn’t register with my science-fiction sensibilities, shaped by the Star Wars franchise—not just films but toy replicas of every character and vehicle, T-shirts, posters, baseball hats, socks, trading cards, drinking glasses, swimming trunks, board games, and my first school lunch box: a squat, tin affair with an X-Wing battling a TIE fighter on one side, and R2-D2 and C-3PO on the other. By 1981, my friends and I could imitate most of R2’s repertoire of bloops and bleeps—not incomparable to the first-wave computer music of the fifties—which covered an expressive range, from plaintive reflection to disputatious nit-picking to sarcasm, sadness, boredom, and glee.
That R2’s computer speech had itself become a meme was evident in the era’s network TV, which may as well have been churned out by a Star Wars algorithm. Battlestar Galactica—the first one, not the reboot—featured a robot dog named Daggit, clearly an R2 knockoff. Another, Buck Rogers in the 25th Century, had a diminutive child butler named Twiki who would preface his cheeky one-liners with the noise “Bi-dee-bi-dee-beee.”
I’ve since learned that 2001 is a great work of art, that the Star Wars TV copies are kitsch at best, and that the film’s power had a lot to do with music. The black slab of the monolith had its mysterious power infinitely enhanced by the vocal music of György Ligeti. Richard Strauss’s “Thus Spake Zarathustra” is inseparable from the proto-hominids’ ecstasy upon accidentally discovering tools. A real stroke of genius, though, was to cut the music entirely for the long, hypnotic middle section detailing HAL’s takeover of the ship—the AI itself a distant consequence of the apes’ momentous invention—with the exception of his pitch-shifted, a cappella performance of “Daisy.”
Around the same time as the Star Wars immersion I would follow a routine during Christmas visits to my grandparents’ house in Madison, Wisconsin. When I got to their house (which smelled pleasantly of wool and coffee and oranges), I’d immediately run to the window-lined portico at the back of the first floor—the “sun room”—where there stood a rosewood upright piano, which my grandmother kept in tune with annual visits from a tuner, though she didn’t play herself. The keyboard reached to my neck, so that my hands, bent at the elbow, were positioned as if I were hanging off the edge of the piano. The tactile sensation of pressing the keys, and the way it led somehow to sound, was addicting. The first few years I always played the same thing: a pattern of alternating fourths and fifths in a see-saw rhythm in 4/4 time; simple steps I’d memorized and repeated like a recipe and which led, without fail, to pleasure. I had no idea what the intervals were, could not have named the notes, and did not know the tune was in 4/4, or any other time signature. On another Christmas visit a few years later there was a blizzard and the sky and ground formed a horizonless white wall, which my father ominously referred to as a “white out.” When I went to the piano to play the pattern, the same chords sounded completely different, but also appropriate for white-out conditions.
Sometime around the age of nine I had a serious fever and spent days in bed, alternating chills and sweats, eating very little beyond sour-tasting medicine administered to me in a measuring spoon. My sole entertainment was a boom box and three cassettes: AC/DC’s High Voltage, The Magic of ABBA, and Let It Be, by the Beatles. The albums merged in the lurid prism of my fever. Composite musical dreams and hallucinations started to form: the power chords of AC/DC were rows of stone towers broadcasting searchlights over an abandoned city; ABBA were chanting an ancient Nordic spell as they marched over a distant hill; Lennon and McCartney singing “Two of Us” and “Dig A Pony” and “Across the Universe” were a pneumatic pump inside my chest and head, as if rubber insulation contoured to the inner surface of my body were slowly inflating, then deflating, then inflating again. The cassette player had what was at the time a cutting-edge feature called “auto-reverse,” which meant that it would keep playing until you manually pressed Stop. I’d wake up and it was suddenly dark outside, the music having continued all the while, heightening the tricks the fever played with time.
An addendum to this fever world resurfaced in a recent dream—no doubt set off by work on this essay—about the exhibition of a Paul McCartney android at Madame Tussauds wax museum, in Times Square. McCartney’s robot is in the Let It Be rooftop concert look—puffy and scruffy. Before the performance, a man in a tuxedo who looks like Billy Bob Thornton emerges from a red curtain and gives a lecture about CSL Flow Machines: a sinister infomercial touting the singing specs of the android, with its “four-octave range, thirty-two-bit digital, 44.1 kHz sample rate; MIDI interface, auto-reverse; onboard Rechord system; all Snowbonk frequencies, Chocolate Chicken Chicken Cake…”
I’ve been thinking maybe now’s the time to resuscitate my Franchised Band idea. It wouldn’t need to be ostentatiously “conceptual.” Think of it as a natural extension of the iTunes-Spotify-YouTube archipelago—there wouldn’t even have to be a physical instantiation of the band, just an online theater in which the group knocks out new tunes on the fly, an endless set, a stretch of infinite pop. We could call the whole thing “Stargoon’s Car” and just let it run by itself; subscribers would log in whenever they felt like it. Stargoon would always be playing and the material would always be new. Soon there’d be a whole new species of fandom, new addicts and priests, an online academy of interpreters, delirious rabbis of the ever-expanding Stargoon oeuvre, elucidating hidden patterns, parsing rivers of numbers. Stargoon would be the AI pop prophet, its new consciousness a phase shift into a new medium: online art at last. Long live the new flesh.
Paul Grimstad is a writer and musician living in New York. His most recent film score is for Thirst Street, for which he also wrote the theme song. His writing on music and literature has appeared in Bookforum, n+1, the London Review of Books, Music and Literature, New Republic, The New Yorker, the Times Literary Supplement, and other journals and magazines.