Recently, several disinformation campaigns have been spotted using deepfake avatars. In February a pro-CCP campaign built around a fictitious outlet called “Wolf News” was reported by Graphika (pdf). In January similar, albeit somewhat more rudimentary, videos were circulating in support of the junta in Burkina Faso. And most recently a YouTube ‘news’ channel supporting Venezuelan government interests was taken down. What’s interesting is that all three were created using Synthesia, a paid synthetic video service based in Britain.

It’s worth reading up on the Venezuela campaign in particular, since it helps complicate our Great Power and Cold War obsessed narratives about disinformation, who pushes it, and what ends it can be meant to serve. Here are a couple more good links for further reading. It’s also worth seeing how Synthesia describes its own services (shout out to the link “we take pride in being an ethical deepfake app”… which directs you back to that same page).

But more importantly, while the world is busy fetishizing how realistic synthetic video is becoming, it’s worth taking a look at the source… how realistic are we actually talking here? And how does it actually function as a conveyor of information? I mean, are we talking War of the Worlds radio broadcast levels of realism here, or just Avatar 2?

The YouTube channel that hosted the original Venezuela campaign was pulled down shortly after I got a look at it, in the wake of the reporting. It had kind of a CNN-meets-Vitamix thing going on. But Synthesia also offers potential customers the ability to create a free demo video to get a sense of their service. I used it to ask my boss for a raise, which did not work. But I’ve included that video above.

What’s hilarious about this is how utterly unpersuasive it is as a substitute for real video. I’m sure you could find someone out there who would think that voice was coming out of that avatar, but it’s not likely to win the day at Crufts. The same was true of the original Venezuela, Burkina Faso, and Wolf News campaigns. While synthetic video, audio, and text are certainly becoming more realistic and harder to distinguish from organic content, the creators of these campaigns chose to move forward despite their material being obviously janky. Or as Graphika put it, “low-quality and spammy in nature.” Which raises the question: what was their strategy?

One plausible explanation is that they were aiming for the Max Headroom effect: pushing content that sits in the uncanny valley between cartoon and reality, trading on a sci-fi cool factor for credibility and, hopefully, virality with its audience. It certainly seems to have worked with a certain slice of the US commentariat! Another is that Synthesia’s deepfake service simply lowers costs so much that even targeting only the gullible becomes worthwhile.

But these videos are also kind of… hilarious? And via that humor they feel familiar and “true” in whole other registers than realism. Give that video above another watch and think cringe, or shitpost. Think “TikTok voiceover explaining someone’s doofy pet”. Think WikiHow, think meme. These videos aren’t realistic. But in exchange they kinda work because they play into an array of tropes and aesthetics that are not only familiar but can actually signal authenticity online. IYKYK. Lulz. They don’t know I’m a deepfake marketing bot. Yah we do. We literally invited u.

I’m writing this all very much in the spirit of an essay, but I think the core idea here is a pretty important one. If the people involved in relevant public policy debates - be it free expression, content moderation, media literacy, public health, or a hundred other topics - can’t move past a model of truth and trustworthiness grounded in a positivist understanding of knowledge and expression that was groundbreaking two hundred years ago, we’re in for a world of hurt. Realism matters, but realism is just one genre among a thousand. And that’s going to matter a great deal, even after synthetic content crosses the valley.