The Time We Faced Doom: On the Dangers & Complications of AI-Generated Music

Evan Nabavian dives into the rapid growth of music soullessly created using Artificial Intelligence, which removes the creative aspect from the process entirely.
By Evan Nabavian | July 10, 2023

Image via Generative Fill prompts on Adobe Photoshop (Beta).

Show your love of the game by subscribing to Passion of the Weiss on Patreon so that we can keep churning out interviews with legendary producers, feature the best emerging rap talent in the game, and gift you the only worthwhile playlists left in this streaming hellscape.

Evan Nabavian’s YouTube algorithm is chaotic good.


Silicon Valley has a well-rehearsed bromide about the transformational power of its latest tech. You can find versions of it splayed across the websites of early-stage companies in clipped sentences and sans serif typefaces. It is meant to elicit awe from customers and cash infusions from Sand Hill Road. Year by year, consumer and enterprise technology steadily advances to make life and work incrementally more convenient, but rarely in recent history have the shamans of northern California delivered on their lofty promises of world-changing tech. Perhaps they stopped even aspiring to it.

Artificial intelligence feels different. ChatGPT and its ilk perform tasks that only recently required the judgment, experience, and skills of a human. Computers used to be persnickety about the inputs you gave them. They would hassle you about order and syntax and they would demand that you learn an interface or a language to complete your task. No longer. A new crop of machines can field your query in plain English and sometimes respond in kind. They instantaneously furnish text, images, and sound that pass for human handicraft. Some people approach these new possibilities with glee. Others gird themselves for chaos.

The music cognoscenti gnash their teeth in distaste at artificial intelligence. The editor of this website used it as a shorthand for trendy pablum. Ice Cube called it demonic. Young Guru wrote in a since-deleted Instagram post that “we are in a very groundbreaking but dangerous moment.” Artists have to contend with the possibility that this technology will render them obsolete, that people and businesses will be able to procure recorded music without paying a person to make it. Rights holders face the dubious prospect of controlling how AI tools use their music. Rank and file listeners are left to wonder if technology will sap the soul and idiosyncrasy that make music special in the first place.

Stanford computer scientist John McCarthy defined artificial intelligence as “the science and engineering of making intelligent machines, especially intelligent computer programs.” In the 1980s, researcher Geoffrey Hinton pushed forward the idea that artificial intelligence should rely on systems called neural networks that mimic the human brain. Rather than retaining a massive list of explicitly defined logic, a neural net trains on a dataset and learns to identify patterns and relationships. Once trained, a neural net can make predictions and classify new data. In the 2000s, researchers gained access to faster chips and more training data, which led to more powerful “deep learning” networks. In 2017, a paper by a team at Google Brain introduced the transformer, a new type of neural net architecture wherein every element of an input sequence weighs its relevance to every other element, a mechanism called attention. This was the breakthrough that gave AI its current moment. The “GPT” in ChatGPT stands for Generative Pre-trained Transformer.
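For readers who want to see the moving parts, the attention idea at the heart of the transformer can be sketched in a few lines of Python. This is a toy illustration with made-up random vectors, not how production models are implemented:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: every position in the sequence
    scores its relevance to every other position, then outputs a
    weighted mix of the values."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # blend of value vectors

# A "sequence" of 3 tokens, each represented as a 4-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = attention(x, x, x)   # self-attention: queries, keys, values all from x
print(out.shape)           # (3, 4): one output vector per input token
```

Real transformers stack many of these attention layers, with learned projections for the queries, keys, and values, but the core computation is this small.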

That knotty, jargon-filled explanation besmirches these pages because it is necessary for understanding how someone can create an eerily convincing audio snippet of Frank Sinatra covering Lil Jon or Eric Cartman covering Jennifer Lopez. People have begun training AI models on the annals of recorded music. Today, amateurs trade vocal models of popular artists on Discord. Google and Meta have developed AI tools that can generate music from a sample melody or a text prompt like “An ‘80s driving pop song with heavy drums and synth pads in the background.” A service called Boomy lets people create music based on predefined inputs; hundreds of thousands of songs made with Boomy have been released on Spotify.

Technology has foisted seismic change onto music several times in recent history, notably with the introduction of file sharing and streaming. These were changes to the way music is distributed. Artificial intelligence represents a shock to the way music is made. The recent spate of novelty AI cover songs may presage bigger changes the likes of which recording artists, their labels, and their audience are only starting to consider.


Questlove often shares the story of the first time he watched J Dilla make a beat from scratch. Dilla was listening to “Ain’t Got Time” by Roy Ayers Ubiquity, which Pete Rock had previously sampled on an interlude on The Main Ingredient. Dilla listened to the record for the better part of an hour, looking for another portion he could sample, but he and Questlove lamented that no part was “clean” enough. Roy Ayers talks so much that it was impossible to find a part without vocals that Dilla could easily cut out and loop. Instead, Dilla meticulously chopped out more than a dozen sections of the record, each a fraction of a second in duration, and then reassembled them into a seamless instrumental loop using the pads on his MPC 3000. A visual demonstration of this feat circulates online. Black Star later used the beat for the song “Little Brother.” Questlove and others point to Dilla’s painstaking process as evidence of his genius.

Software called AudioShake uses artificial intelligence to separate the elements of an audio track. You input an audio file and it automatically distinguishes the drums, bass, vocals, and guitar and separates them into isolated tracks. Such tools have been around for decades, but AudioShake is the first that can isolate an individual guitar, a feat the company accomplished with a deep learning network. Last year, Rodney Jerkins used AudioShake to isolate Ol’ Dirty Bastard’s vocals from a VHS tape so he could sample them on “Forgiveless” by SZA. Paul McCartney announced that he used source separation software to retrieve John Lennon’s vocals from an old demo to make a “new” and final Beatles song.
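The concept behind source separation can be sketched without any machine learning. In this toy Python example, a mix of two synthetic tones is pulled apart by masking a frequency spectrum; a tool like AudioShake learns its masks from data with a deep network rather than hard-coding a cutoff, and the 400 Hz threshold and tone frequencies here are invented purely for illustration:

```python
import numpy as np

sr = 8000                        # sample rate in Hz
t = np.arange(sr) / sr           # one second of audio
bass = np.sin(2 * np.pi * 110 * t)    # low "bassline" tone
vocal = np.sin(2 * np.pi * 880 * t)   # high "vocal" tone
mix = bass + vocal                    # the mixed-down track

# Move to the frequency domain, as spectrogram-based separators do.
spectrum = np.fft.rfft(mix)
freqs = np.fft.rfftfreq(len(mix), d=1 / sr)

# A real separator *learns* this mask; here we hard-code a 400 Hz split.
low_mask = freqs < 400
bass_est = np.fft.irfft(spectrum * low_mask)     # keep low frequencies
vocal_est = np.fft.irfft(spectrum * ~low_mask)   # keep high frequencies

print(np.allclose(bass_est, bass, atol=1e-6))    # True
print(np.allclose(vocal_est, vocal, atol=1e-6))  # True
```

Pure tones at different pitches separate perfectly this way; real instruments overlap heavily in frequency, which is exactly why a learned, time-varying mask is needed instead of a fixed threshold.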

Technology that aids, simplifies, or automates the creation of music is perennial. Drum machines, synthesizers, Pro Tools, FL Studio, and Serato are tools that streamline or eliminate laborious tasks in the process of making music. Many, if not all, were met with skepticism or opprobrium from the old guard when they were introduced. In the 1890s, composers worried that player pianos would kill demand for sheet music; this culminated in a lawsuit by a music publisher that went before the Supreme Court in 1908. One hundred years later, KRS-One rapped, “If you got Serato, bravo / But if you can’t cut vinyl records, you won’t be able to follow.” And then a few bars later, “F*ck the computer, it’s you and me.”

The sample library Splice has a tool that uses AI to automatically suggest sounds to producers. Its website insists that the tool is meant “to help, not replace, the creative process.” Masterchannel and LANDR use AI to automate the process of mastering songs. Atlas 2 automatically organizes your sample library. These companies are at pains to demonstrate that AI can be an asset to musicians and perform tasks other than the wholesale creation of music. Even AI generated vocals of famous artists have their defenders. Timbaland played a snippet of a song he made with AI generated Biggie vocals and he says he’s working on a startup to commercialize the idea. AudioShake CEO Jessica Powell wrote that AI generated covers might not be so different from remixes and samples.

AI tools could have saved Dilla a lot of time when he was trying to flip that Roy Ayers record. Had he used this shortcut, no one would have written a reverential post on the Okayplayer board celebrating his deed. “Little Brother” likely would have sounded different or Dilla might not have challenged himself to make it in the first place. One after another, the analog challenges of the twentieth century are vanquished by ingenious software. It remains to be seen what will happen to music when it becomes so much easier to make. The old methods may become a niche for purists and auteurs. Music will take new and unpredictable forms when it can be made with a few keystrokes and an idea.


There are approximately nine ways for a recording artist to make money in 2023:

1. Sell digital or physical copies of your music.
2. License your music to a streaming service that will pay you according to the number of times someone listens to your music and some other opaque factors.
3. License your music for use in other media like movies and TV. Or license your music to other artists who want to sample it.
4. Work for hire; e.g. work as a session musician or a songwriter.
5. Get paid to perform your music live.
6. Sell merchandise related to your music.
7. Get patrons to sponsor you.
8. Use your music knowledge and expertise to help others, for example, as a consultant, A&R, or manager. Or start your own record label.
9. Use your persona or fame to enter a business adjacent to music; e.g. Beats Electronics, Fenty Beauty, the seven Fast and Furious movies that have Ludacris in them.

AI generated music is poised to compete with human artists for the first four of those, possibly more, though to varying degrees. In April, Universal Music Group CEO Lucian Grainge told investors, “[T]he recent explosive development in [our] generative AI will, if left unchecked, […] increase the flood of unwanted content hosted on platforms.” For AI generated music to represent meaningful sales or streams, it has to stand next to the output of human artists in terms of quality; people making music with AI also have to figure out how to market and distribute it. The size and scope of these challenges suggest that it will be a long time before music created entirely with AI encroaches on the sales and streams of human-made music. Music publishers may have more to worry about if businesses discover that they can avoid the cost of licensing music by generating it with AI. Likewise, audio engineers and other people involved in making music may have cause to worry if AI drives down the cost of their skills.

One wonders what the Spotify brass are talking about behind closed doors right now. No one is satisfied with the economics of streaming music except perhaps the listener. Spotify is beholden to a handful of licensors for most of the music on its service, which gives rights holders enormous leverage. Spotify pays approximately 70% of its revenue to rights holders. Apple Music faces the same challenges, but it is a loss leader for a company with other interests. The unit economics of streaming music would change dramatically if streaming platforms owned the music themselves – or if no one owned it. Spotify would keep 100% of every dollar it earns on streams of a royalty-free song. The company hasn’t indicated whether it’s flirting with this idea. During its Q1 earnings call, CEO Daniel Ek said the company’s focus with regard to AI is “allowing innovation of creative works” and “protect[ing] the creators and the artist ecosystem.” He touted AI DJ, which creates a personalized DJ set from Spotify’s existing catalog, complete with AI generated banter.

Jessica Litman of the University of Michigan Law School said on a panel in April that AI presents three problems for copyright law. First, is the output of generative AI protected by copyright? Second, if an AI tool produces a song that sounds like an existing song, is that copyright infringement? Third, is it copyright infringement to train an AI tool on existing works of music? For now, the answers to those questions are, respectively, no, yes, and no. The Financial Times reported in April that UMG told the streaming services to block AI tools from training on their catalogs. Courts will adjudicate lawsuits brought in the past twelve months that may change these rules and further complicate who can do what with music, legally speaking.

Ben Thompson wrote last year that generative AI unbundles the creation of an idea from the substantiation of that idea. You can unearth a rare funk break and dream up a great way to sample it, but you can’t make that idea a reality unless you know how to use a DAW or an MPC. AI has the potential to let anyone substantiate their ideas, regardless of their familiarity with instruments and equipment. Thompson writes, “relatively undifferentiated creators […] will be reduced to competing with zero marginal cost creators for attention generated and directed from Aggregators; highly differentiated creators, though, who can sustainably deliver both creation and substantiation on their own will be even more valuable.” For music, this means voice, perspective, and craft may still be worth something.


Detroit impresario Hex Murda once relayed a story about Baatin from Slum Village. In 2006, Slum Village was booked to perform at the Montreux Jazz Festival in Switzerland. (“That’s some prestigious shit right there,” Hex wrote.) Phat Kat and Slum Village performed and then Pete Rock came out for the headline set. Between their sets, Baatin drank a bottle of wine which was risky because he was on medication. Pete Rock started performing a masterful DJ set. Suddenly, Baatin ran out on stage and started doing the Errol Flynn in front of the crowd. Hex went out and pulled Baatin back to the side while Pete looked at them askance. Hex turned his back and Baatin threw up everywhere before running back out on stage. Hex pulled Baatin back a second time and asked him what was wrong with him. Baatin grinned and said “It’s the music.”

Guru’s non sequitur endorsement of lemonade on “DWYCK.” ODB saluting the 52 states on “Shimmy Shimmy Ya.” DJ Quik’s errant Bollywood sample on “Addictive.” The Fugees sparring with a conspicuously Chinese restaurateur on “The Beast.” Nate Dogg’s a cappella call to arms at the end of “The Next Episode.” Nas revealing a predilection for Barbra Streisand on “It’s Mine.” Joyce wrote, “A man of genius makes no mistakes. His errors are volitional and are the portals of discovery.” Artificial intelligence that triangulates a song from a corpus of recorded music cannot match the caprice and fecundity of the human mind. And when AI does hallucinate and produce something unexpected, it doesn’t mean anything. It’s not there because an artist made an unusual choice but because the technology happened to produce a faulty simulacrum of your prompt.

We interrogate music by considering its authors and their geographic, social, economic, and political circumstances. We trace music’s lineage and debate its influences. We enumerate the decisions and mistakes that gave it form. We marvel at technique. None of this is possible when music has no author. Understanding the provenance of music generated by AI becomes an engineering problem. You don’t interrogate AI generated music, you debug it. Its only value is in its surface level aesthetics. Maybe it can perform a function like helping you exercise or meditate. But there is a hard limit to how much it can mean.

Writers who pass their time compiling Toru Takemitsu soundtracks shouldn’t presume to predict the tastes of the masses. According to the IFPI, the most streamed songs in 2022 globally belonged to Harry Styles, Glass Animals, The Kid Laroi & Justin Bieber, Elton John & Dua Lipa, and The Weeknd. With the exception of Glass Animals, all of these acts had meaningful chart performance prior to 2022. It would seem people want music to come from artists they recognize or artists they can grow to love. AI can ingest the zeitgeist and return an adequate pastiche. Pop stars do this too, but they used to have to do it manually. AI will make sameness easier to manufacture, but mass market music will still need a face.

Outside of boardrooms, some follow developments in generative AI with trepidation or visceral disgust because they see the gifts of AI as a perversion of something they love. It cedes some or all of the creative process to a thoughtless black box that feigns sentience. It is a step change in technology that has the potential to remake centuries-old practices. The angst from this group of devotees is warranted, but overblown. Music will meet this moment as it did industrialization, the rise and fall of empires, and the internet. People will continue to make and listen to music. Most of it will be immediately forgotten and some will remind you of your humanity.

We rely on your support to keep POW alive. Please take a second to donate on Patreon!