The streets around New York’s Grand Central Station are bustling with summer crowds when Joe Toplyn, one of the city’s top TV comedy writers, looms into view. The 69-year-old sports baggy shorts, a sun hat — and a white T-shirt that screams “Writers Guild on Strike!”
“We are demonstrating,” he says, explaining that he has just been with WGA members picketing the NBC studios at Rockefeller Plaza.
There is a striking 21st-century plot twist here: even as the union fears that studios might use technologies such as artificial intelligence to cut jobs, Toplyn is embracing generative AI himself — for laughter.
Six years ago, Toplyn started feeding daily news headlines into a comedy algorithm that he had written, powered by natural language processing. He now posts the output from this system, called Witscript, on social media. “I sent ChatGPT to joke school,” he says.
To cite two recent examples: in response to the prompt “Three different special counsels are now investigating President Biden, Hunter Biden, and Donald Trump”, the @witscript handle replies, “If they keep this up, we might actually get a female president!”
Or with the prompt “Kuwait and Lebanon are banning the movie Barbie”, Witscript responds, “Lebanon and Kuwait are banning the movie Barbie because they think she’s too western. Mattel is like, ‘What? We made her in China.’”
Some readers might question whether this is funny; humour is a matter of personal taste. But irrespective of whether you chuckle, it is worth paying attention.
Back in 1950, when the British scientist Alan Turing devised the “imitation game” — later known as the Turing test — to probe whether a machine could convince us it was human, he warned that it would be fiendishly hard for machines to pass by displaying a sense of humour. The reason is that comedy is a profound example of the ambiguous and often contradictory aspects of human culture that, unlike chess, are not easily defined with logic. As Tony Veale, an Irish computer scientist, writes in his book Your Wit Is My Command, a joke “is like a magic trick”, since when you coldly explain it, it stops working.
But Toplyn believes that recent breakthroughs in AI, of the sort that have yielded tools such as ChatGPT, are now breaching this final frontier. He is not the only one. This month’s Edinburgh Festival has featured performances where comedians don’t just laugh about robots — they laugh with them.
Take Improbotics, a theatre group with a Fringe show created by two actor-scientists, Piotr Mirowski and Kory Mathewson. In this, a prompt is sent to an AI-based chatbot — now based on the GPT-3, GPT-4 and Llama 2 models, according to the group’s website — and the results are then broadcast to human comedians, so that they can react.
The comedy arises from how the humans respond to these AI outputs — and a game in which they guess what is human or not, a modern Turing test. “It is pure imitation,” explains Mirowski, whose day job is developing AI for tasks such as meteorological forecasting for the Google-owned DeepMind tech group.
But can this imitation really make us laugh? Or are these automated tools the comedic version of bubblegum — something whose artificiality quickly feels predictably bland? The question goes way beyond humour. If robots can breach this inner citadel of culture, then the whole notion of human exceptionalism is looking much less secure.
I have long been fascinated by these questions. Before I became a journalist I was trained in anthropology, the branch of social science dedicated to studying human culture. If you ask most non-academics what the word culture means, they will probably point to a museum of artefacts or an opera house. However, anthropologists use culture more broadly to describe the fabric of assumptions and practices that define and enable social groups to interact and organise their world.
Some aspects of this shared cultural map are readily visible; hence those museum artefacts. But many aspects of our culture are hard to see or define, precisely because our shared assumptions are so deeply ingrained that we rarely notice them. As the Chinese say: “A fish cannot see water.”
Jokes epitomise the complexity of the cultural water we all swim in. To appreciate this, just ask yourself this: why exactly do you ever laugh at anything at all? In ancient Greece, the philosopher Plato thought it was because jokes reflected and reproduced social hierarchies: powerful people loved laughing at the weak. The 17th-century English philosopher Thomas Hobbes broadly agreed, noting: “Sudden glory is the passion that maketh these grimaces called laughter; and is caused either by some sudden act of their own that pleaseth them; or by the apprehension of some deformed thing in another, by comparison whereof they suddenly applaud themselves.”
However, the 19th-century philosopher Arthur Schopenhauer thought instead that “the cause of laughter in every case is [a] sudden perception of . . . incongruity”, namely a desire to reconcile the contradictions that always exist in our cultural maps. Many modern psychologists agree, viewing humour as an evolutionary tool to resolve stress and air topics we usually ignore, or social silences.
The one thing that almost all students of humour agree on, though, is that jokes enable social groups to bond. The reason is that comedy usually rests on shared assumptions, albeit often half-concealed. Thus what one group finds funny, another does not — which means that if you get a joke, you are an insider, and if you do not, you are not. Laughter is context dependent — and tribal.
This makes it challenging for robots; or so it used to seem. When the AI field developed in the second half of the 20th century, scientists looked for top-down, consistent rules about how human thought worked, which could be replicated via sequences of symbols. This so-called symbolic system works relatively well in tackling problems that are logical, universal and consistent. But human culture is not like that.
So although scientists have been trying to use computers to create jokes for several decades — in a subfield called computational humour — the results have previously been pretty feeble, mostly limited to puns that depended on a template.
But just as the word humour can mean different things, the term artificial intelligence now has several meanings. And a significant shift has recently occurred in how AI systems operate. One reason is that in 2017 researchers from Google Brain developed a new AI architecture, the so-called transformer, which learns from what humans do on a massive scale, via vast quantities of data (producing “large language models”), and then employs statistical analysis to replicate patterns by predicting what word (or other piece of data) is likely to follow another.
The details are extremely complex. But the essential point to understand is that, whereas the old AI systems tried to mimic human thought by creating a universal set of logical, handcrafted rules, statistical AI simply mimics the bottom-up patterns it observes, even if these seem illogical — like a child learning a language, or a foreigner trying to fit into a new culture.
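The next-word idea can be illustrated with a toy model: a simple bigram counter over an invented corpus. This is vastly smaller than any real transformer and is only a sketch of the statistical principle, not of how large language models actually work internally.

```python
from collections import Counter, defaultdict

# Toy illustration of next-word prediction: count which word follows
# which in a tiny invented corpus, then predict the most likely successor.
# Real large language models use transformer networks trained on billions
# of words; this bigram counter only shows the statistical principle of
# learning patterns bottom-up from data rather than from handcrafted rules.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    """Return the most frequently observed successor of `word`."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))   # "cat": the most common word after "the"
print(predict_next("sat"))   # "on": "sat" is always followed by "on" here
```

No rule in this sketch says anything about grammar or meaning; the predictions emerge purely from observed frequencies, which is the bottom-up character of statistical AI that the old symbolic systems lacked.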
“Symbolic frameworks give a top-down shape to AI systems, whereas [statistical] data-driven analyses capture the nuance and variability that we cannot box in with straight lines and hard rules,” writes Veale. And this potentially makes it possible for robots to mimic our humour since, as Mirowski notes, “The statistical system allows AI to fetch [comedic] elements without having to design rules.”
The implications of this are deeply humbling for somebody like me. After my training in anthropology, I used to assume that the peculiarities of culture were what made humans different from robots.
A few years ago, I gave several speeches, based on conversations I’d had with AI scientists, which declared that the “one thing a robot will never be able to do is tell a really good joke” — precisely because comedy is so tribal, contradictory and based on the type of social silence that Big Data does not capture. However, the better large language models become, the more impressive their ability to perform that “imitation game”, to cite Turing again. In plain English, that means I now suspect that my former confidence about AI was wrong.
But does this mean that robots can be funny? A few weeks ago, I went to a trendy backroom bar in Brooklyn to find out. Full of millennials, the dark, noisy venue was hosting a performance by a troupe of AI experts turned comedians (and vice versa) known as ComedyBytes, formed late last year, who conduct “roasts” — contests where comedians try to outwit each other with jokes.
Traditionally, such roasts occur between human performers. But ComedyBytes pits one human against an AI bot, which uses AI tools such as ChatGPT. Essentially what happens is that the ComedyBytes troupe “train” their bots by feeding text prompts to ChatGPT, testing how it responds, and then curating the funniest jokes and interactions. Strictly speaking, the performance is not scripted, because nobody knows exactly how the bot will respond to a prompt; but the show is not as random as a human conversation, since the comedians know roughly what prompts they will use.
Sometimes these bots are trained with “facts” emanating from human celebrities, including the comedian Sarah Silverman and the crypto entrepreneur Sam Bankman-Fried, who is (in)famously fascinating for millennials. However, there is a growing controversy around such data-scraping: Silverman recently joined class-action lawsuits with two authors against OpenAI and Meta, alleging copyright infringement because her book was used to train their AI tools. And while the outcome of these suits is still unclear, legal challenges such as this might eventually clip the wings of the AI comedy world, just as they are threatening the use of AI to create music.
The ComedyBytes team, for their part, insist they respect copyrighted material. But alongside the celebrity bots they are also now creating likenesses of themselves, based on their own material. Thus the roast I witnessed in Brooklyn featured one contest between a comedian called Matt Maran, dressed in a vest and baseball cap, jousting with a bot in his own image — and a second round with an AI bot of Silverman, who tossed out jokes such as “You’re as edgy as a butter knife” and “The only thing getting hurt now is your eardrums.” The audience was then asked to vote on which was funnier — and overwhelmingly chose the robots.
As it happened, I personally did not find the roasting from either the bot or humans that funny. Maybe that was because I am not part of the right tribe; I am not a cool Brooklyn millennial. But I also suspect a key reason the audience laughed at the bots — and declared that they had beaten the human — was novelty. As in so much of the AI world, the reality of the innovations does not yet match the feverish hype — and jokes being generated still tend to rely on wordplay or formulaic templates, and can completely misfire in surreal ways (much like the broader outpourings from ChatGPT).
The type of truly creative humour that produces genuine belly laughter — not groans — remains a struggle for the robots. Or, as AI scientists Sophie Jentzsch and Kristian Kersting recently noted in a research paper (“ChatGPT is fun, but it is not funny!”), while “computational humor is a longstanding research domain . . . the developed machines are far from ‘funny’”. Indeed, when they analysed 1,008 jokes generated by ChatGPT, they found that more than 90 per cent “were the same 25 jokes” — ie rehashed ideas, not true innovation.
ComedyBytes’ Eric Doyle stresses that the team is racing to improve the coding, to create more spontaneous repartee. “Probably 85 per cent of [responses that] the bot produces are not funny, but some are brilliant,” he says. Or as Erin Mikail Staples, another ComedyBytes performer (and one of the relatively few women in the field), says: “It’s amazing how quickly AI is advancing. When we started these roasts, the humans always won, but [last month] the AI won all three rounds!”
In any case, as Toplyn notes, human writers rarely deliver perfect jokes “on the first take” either: he used trial and error during his four-decade career writing for stars such as Jay Leno and David Letterman — during which he won Emmy awards and penned an iconic book, Comedy Writing for Late-Night TV. And he is applying this to AI: although his first version of Witscript used the old-style version of AI, it now incorporates newer statistical systems too. “Transformers changed everything,” he says.
While audiences used to consider “only” about 40 per cent of Witscript’s jokes to be funny (compared with a 70 per cent hit rate for humans), the newer version is delivering a higher hit rate. Take the machines’ responses to the prompt line “The Guggenheim Museum is installing a solid gold toilet”. A mainstream GPT-3 platform, not trained for comedy, simply responded: “This is an interesting bit of news.” The old version of Witscript said “Gold toilet? Yep, to make a toilet pure.” That is surreal.
But the newer version of Witscript replied: “The Golden Throne. Yeah, it’s a little gaudy, but it’s perfect for a museum that’s already full of crap” — while a human comedian responded: “It’s perfect for people who have just eaten 24 carrots!” Those two offerings might almost work for TV, if they were tweaked by humans at the end, in a process where these tools deliver augmented — not artificial — intelligence.
If so, this has two further implications: first, the future of “augmented” intelligence might not be one that destroys writers’ jobs; instead, machines could serve as their assistants. That is why Toplyn sees no contradiction in the fact that he is simultaneously developing his own AI tools — and participating in the Writers Guild strike. Or as Mirowski says: “Our audiences are attracted to our shows by robots, but it is the humans that make them laugh.”
Second, insofar as humans and robots start cracking jokes together, this might yet help build some bridges between the public and tech. After all, if comedy is one of the things that most reflect and define our humanity, it could be easier for people to accept AI tools if they are witty — particularly when dealing with tasks that require empathy, such as teaching or nursing. “We are facing an epidemic of loneliness. If an AI companion learns humour, it can help to combat that,” Toplyn insists.
Many people might hate the idea of robots appearing more human — a computerised imitation game is not the same thing as flesh-and-blood creativity, love, empathy or care. But, as Veale points out, we may soon approach the day “where buying an AI without a sense of humour will seem as unwise as buying a car without shock absorbers and airbags”.
In this future there will also be a new category of job: AI comedy creator. Which, of course, is why Toplyn, Veale, Mirowski, Staples and others are now spinning the gags — even if it is unclear whether humans or robots will ultimately have the last laugh.
Gillian Tett is chair of the editorial board and editor-at-large, US of the Financial Times
ComedyBytes is performing on Wednesday August 30 at Crystal Lake, Brooklyn