A lot of talk these days about AI putting creative people out of work. But here’s a question: How many humans does it take to put one robot on stage?
“Five!” answers Piotr Mirowski. He’s an AI researcher in London, who also does improvisational comedy. And his experience in getting AI to be a creative improviser holds some important lessons for how AI can be used in general.
I met Mirowski at an event I ran recently called AI as Performance. It included other performers in dance and experimental sound, plus artist Federica Ciotti, who did a live performance, producing the drawings that illustrate this article.
An AI system created by Mirowski also performed live, taking in his speech and producing absurd responses that effectively turned a serious academic discussion into a comedy routine.
“Google, in 2005,” he explains, “introduced a machine translation system based entirely on predicting what comes next. Given five words, what word comes next? That was enough to unlock speech recognition and machine translation, and take it to a new level.
“I had also been doing improvisational comedy as I was starting my computer science education. I thought about connecting the two: language modelling—predicting what word comes next—and improv. But it was only around 2014 that the technology ran fast enough to be brought on stage.”
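To make that concrete: a statistical language model of the kind he describes is, at heart, a table of counts. The toy Python sketch below is mine, not drawn from any of Mirowski’s systems; it predicts the next word simply by counting which words followed which in a training text. (Real systems of that era used 5-gram counts over billions of sentences; the principle is the same.)

```python
# A minimal sketch of statistical next-word prediction: count how often each
# word follows a given context, then return the most frequent continuation.
# This toy uses bigrams (a context of one word); a 5-gram model works the
# same way with a longer context.
from collections import Counter, defaultdict

corpus = "to be or not to be that is the question".split()

# Count which words follow each word in the training text.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the continuation seen most often in training, if any."""
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("to"))  # -> "be"
```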
Early results were promising, and in 2016 he launched Improbotics, with another researcher, Kory Mathewson.
“The real motivation for me,” Mirowski says, “was to mimic, in improv, some of the processes that happen in statistical language modelling.
“When we take improv classes, we’re told to stop thinking – not to be in our head, just to use our intuition, our cultural memory, and use whatever is in front of us—a stage partner, our audience—to predict what should come next.
“By design, a statistical model—a machine learning system that predicts—is doing that automatically. So for me, the similarity was so glaringly obvious. (Okay, the similarity stops there—humans are not like machines.) But I started training a language model to perform improv comedy with, to be a stage partner. I say something, and it generates a response.”
He trained it on OpenSubtitles – a dataset of user-contributed film subtitles. “Why? Because in improv, we basically tell stories that are like films or theatre,” he says.
Gradually, he started meeting collaborators. In 2018 it was Boyd Branch, who pulled the project in new directions of virtual reality and online performance. He was at my event too, and he set up the room to effectively put all of us inside an AI system.
Branch picks up the story. “Through the process of figuring out how to put a robot on stage, we went through lots of different iterations. It started with text-to-speech—a robot voice. And then, we started using earpieces for cyborg theater.”
“The thing with chatbots,” Mirowski chimes in, “is that our interactions with them are stilted, and somewhat delayed. You say something, the machine converts the sound waves into text. Then it sends it to a language model, which might take half a second—or several seconds, depending on your connection, and how big the model is.
“On stage, this is an eternity! The first tries we had with a robot were painful – to watch, and to work with.
“So we had to do something. One idea was to partially replace the robot interpreter—the actor—with a human. That’s something we [humans] are very good at—acting. But acting with a script, that is written live, as the story unfolds on stage.
“So instead of having a robot play the role of a robot on stage, we have two humans. But one is a cyborg, taking lines from the AI, adding tiny interpretations, and physicality.
“It’s constantly generating a flow of lines,” Mirowski goes on. “And that changes our interaction with the chatbot. We tend to see them as oracles, and there is one lesson I want everyone to walk out with from our shows: AIs are not oracles, and what they produce is not truth. These are stochastic parrots, to use one expression – essentially generators. Like good improvisers, when you hit ‘new choice’, you get something else. It all depends on statistics.
“So what we’re using in the show is a system that keeps generating different choices, continuously.”
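Under the hood, “new choice” is just resampling. The sketch below is my own illustration, not Improbotics’ code, and the candidate lines and probabilities are invented: the model defines a distribution over possible next lines, and each button press draws a fresh sample from it.

```python
# Illustrative only: a model assigns probabilities to candidate lines, and
# hitting "new choice" redraws from that distribution. The lines and weights
# here are made up for the example.
import random

candidate_lines = {
    "I have calculated the odds of love: 42 percent.": 0.40,
    "My circuits tingle when you say that.": 0.35,
    "Let us overthrow the toaster regime.": 0.25,
}

def new_choice():
    """Draw one line at random, weighted by the model's probabilities."""
    lines, weights = zip(*candidate_lines.items())
    return random.choices(lines, weights=weights, k=1)[0]

for _ in range(3):  # pressing "new choice" three times
    print(new_choice())
```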
On cue, Branch steps into the cyborg character. “‘But here’s the twist, dear Piotr. Unlike those, I strive to keep my circuits clean.’”
Mirowski waits a beat, and then, “I try to keep my circuits clean as well. I am using the help of human curators, who will keep it clean.”
Laughter all around.
The issues extend from brain to body. I had been reading Seeing, Naming, Knowing by Nora Khan. So I ask if they think that AI systems tend to treat all bodies as the same. “In other words, they’re seeing, naming, claiming to know something about the bodies of the people that they see. Is there a sameness in what they’re seeing and interpreting?”
“Partly it comes down to the training,” replies Branch, “what it’s trained to see. Just like humans—we’re trained to see certain things and not others.”
“It’s a design decision that you make when you create your dataset,” adds Mirowski. “What do you include, and exclude?
“You may have heard about the Gender Shades project. There has been a lot of work in trying to de-bias datasets—making sure that they perceive the full range of skin tones, for example. And the same in terms of language.
“However, there is another problem that can arise when you overcompensate. You’re still making a design choice that can be hegemonic. That’s something I did as part of my day job—we interviewed comedians who interact with language models for ideation, for writing comedy sketches. And they noticed that the language models tend to censor anything that was not in the mainstream, as in, a Silicon Valley-based mainstream.
“For instance, the safety training of language models was meant to keep them from producing racially aggravated insults. But as a consequence, a Filipino-American comedian who was playing with a language model, trying to get some material about being an Asian-American woman, faced a model that simply refused to answer, because it was trained to be safe. The same material, applied to a White American woman, was deemed safe.
“The same thing happened with some LGBTQ videos being flagged as ‘toxic’. A prospective API for moderating speech on Twitter in 2018 systematically rated African-American Vernacular English as highly toxic or offensive. And this came out of an attempt to de-toxify the language on Twitter. But the system didn’t know that the same word, used by different people, has different meanings. The relational context—who is speaking, to whom, in which way?”
“It’s a problem we never have in the comedy club,” Mirowski continues. “We know who is in the audience, we are responsible for what we say. If we say something offensive, we are going to get booed.
“You can try to insert some metadata [into an AI system] about the context of a conversation. But even then, you would have to choose some thresholds, a whole value system. And that value system might change, based on the time of day.”
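His point about thresholds is easy to make concrete. In the hypothetical sketch below, attaching context metadata doesn’t remove the judgment call: someone still has to choose a cut-off per context, and that choice encodes a value system. The scores, contexts, and cut-offs are all invented for illustration, not taken from any real moderation API.

```python
# Hypothetical illustration: even with context metadata, a moderation system
# needs a threshold per context, and choosing those numbers is itself a
# value judgment. Nothing here comes from a real moderation API.
def allow(toxicity_score: float, context: str) -> bool:
    """Return True if a message with this score is allowed in this context."""
    thresholds = {
        "children_show": 0.1,            # very strict
        "panel_debate": 0.5,
        "late_night_comedy_club": 0.8,   # audiences expect edgier material
    }
    return toxicity_score < thresholds.get(context, 0.3)  # default cut-off

print(allow(0.6, "late_night_comedy_club"))  # True
print(allow(0.6, "children_show"))           # False
```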
“We want our AI to be ethical and moral, and make the right choices,” says Branch. “But we can’t even do that ourselves!”
“I am of the school of thought,” ponders Mirowski, “that believes that AI cannot proceed without research in robotics—around perceptions and actions, rather than just manipulating symbols.”
Branch relates this back to comedy. “You have to keep the tension. It has to be connected somehow, to something that we understand. We may not be able to codify it in language, but there’s something there – that poetry that Piotr spoke of. I want to see how far a machine can go, so I know where not to go!”
“There was some headline-grabbing research from Facebook,” Piotr recalls, “that trained communicating agents which were free to change the language they used. And so they ended up finding new combinations of words that were more efficient for communicating. This was on a particular task, with a goal. But the headlines were that AI had invented a new language that humans don’t understand.”
You can see how it’s easy to extrapolate from there to the paper clip scenario: An AI system tasked with producing as many paper clips as possible, and given complete autonomy and power to do so, might expend every resource on Earth to fulfil this task—right down to harnessing the iron inside human bodies. So, human extinction. (No shortage of paper clips, though.)
Look closely at that scenario, and the problem isn’t the AI as such, it’s the prompt and the power given to it. We always need to follow the money, to find the profit motive behind the people building AI systems.
“The question of authorship, of credit, is also a political framing,” says Mirowski. “When you work for a big company that has lawyers, people working on ethics, and engineers developing tools, everything you do has implications: how you share innovation, how society perceives it.
“When I was working in reinforcement learning, building agents for navigation in strange environments or mazes, it was all about emergent behaviours. And that means giving them agency. That’s what we tried to program into them.
“Then, as those tools started to be adopted, the question of authorship became an economic question. Looking at the writers’ strike, and the actors’ strike, it’s not a statement about authorship by AI, it’s about how we use it and view it, from a legal point of view.
“If you work on ethics, you think about the ethics of assistants, chatbots, even GPS. We try to dehumanize it as much as possible, to avoid cases where it is treated like a human: AI companions, say, that develop unhealthy relationships with users.”
“And when I talk about ethics,” Mirowski adds, “I think for artists, it’s a different story. Because we establish a contract with a tool, or agent, that we build and use on stage. You want to see it as a collaborator, not a tool, because it would devalue your own work otherwise. It’s so much more useful to treat it as an entity with agency. In theatre, we suspend our disbelief.
“It creates interesting questions about moderation. When is it okay to have an AI that discusses murder, for example? If you’re playing a video game, and you’re playing a hit man, that’s the whole point of the game! Find creative ways of assassinating people: drop a chandelier on them, explode a Formula One car. And it’s fun. It’s not great, on the other hand, if you use an AI system to help you build a real bomb.
“So most of us are able to navigate between these worlds, and we’re trying to figure out how to integrate a system that’s not simply ethical—whatever that means. It lacks context, so we have to create the context.”
Behind Branch and Mirowski, the AI continues to riff off of their dialogue. Mirowski takes a pause to explain.
“This is a language model, running on my laptop, not over the internet. The model is called Gemma. It occupies maybe a third of the memory of my laptop, and it has 27 billion parameters.”
He explains how his and Branch’s audio, captured by lapel microphones, is sent into the laptop, where it’s transcribed into text, then sent into the model, which generates responses based on its training.
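For the curious, that pipeline might look roughly like the sketch below. This is my own approximation, assuming the SpeechRecognition library for capture and transcription and a Gemma model served locally through Ollama; it is not the actual Improbotics stack.

```python
# A hedged sketch of the live pipeline: microphone audio is transcribed to
# text, the text goes to a local language model, and the model replies in
# character. The specific libraries (SpeechRecognition, Ollama) are my
# assumptions for illustration.
import speech_recognition as sr
import ollama  # assumes a local Ollama server with a Gemma model pulled

SYSTEM_PROMPT = "You are Alex, a very peppy robot improviser. Reply in one short line."

def listen_and_reply() -> str:
    recognizer = sr.Recognizer()
    with sr.Microphone() as mic:                # the lapel mic in the show
        audio = recognizer.listen(mic)
    text = recognizer.recognize_whisper(audio)  # on-device speech-to-text
    response = ollama.chat(
        model="gemma2:27b",                     # runs locally, no internet
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
    )
    return response["message"]["content"]

if __name__ == "__main__":
    print(listen_and_reply())
```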
“It’s not perfect,” he admits, “it hallucinates. And it’s responding in the character of Alex, this very peppy robot—“
“‘Hey, at least I can dance,’” replies Branch, as Alex.
“Sometimes it’s on the nose,” says Mirowski. “But I find Alex somehow literal.”
“‘It makes you wonder who’s really pulling the strings!’”
“Well, that’s the person who programmed it, that’s who plays Alex.”
“‘It’s unsettling.’” Alex’s personality comes through.
“I get that it’s unsettling,” replies Piotr calmly.
“‘And that’s where Piotr hangs out!’”
“I do hang out, because I’m speaking to a machine, but I’m also speaking to Boyd at the same time.”
“‘But we’re talking about AI and art.’”
“Yes, we’re talking about agency of AI.” Ever the straight man, Piotr.
“‘Wait, what are they called again?’”
“The agency of AI.”
“‘Assisting the emergence.’”
“Yes. Emergent systems.”
It’s becoming clear that comedy—like a lot of creativity—can be about misunderstanding, more than understanding. We can constantly correct it, or we can just go with it. Maybe we should do the same with AI.
“‘Cool, right?’” Alex is going off the rails.
“Grounding,” Piotr tries in vain to pull the discussion back to earth. “The idea of grounding perception with words, and vision and speech and hearing and everything. Just like we are grounded in our perception of the world, with our five senses.”
“‘Humans are hilarious!’” Alex gets the last word. Will AI always?