A spectre is about to haunt the Near East: the spectre of deepfakery. Deepfakery is the use of a particular kind of advanced machine-learning technology to put words, quite literally, into the on-video mouths of well-known people without their participation or knowledge, for a variety of imaginable purposes, most of them nefarious. The same spectre haunts much of the rest of the world too, of course, but to the best of my knowledge no one has yet given much thought—in public, at least—to how the challenge might manifest itself, and to what effect, among Arabs, Iranians, Turks, Israelis, Kurds, Amazigh, and assorted others.
Now, deception and false-flagging are nothing new. States have been trying to mislead one another and otherwise manage impressions since the time of the Amarna letters—in the 14th century BCE! They have sometimes even managed to hide their own hand in the deception of others, though attempts to do this have been more frequent than successes. But the advance of technology has now reached a point where non-state actors—small groups and even individuals—can deploy technologies in ways that have massively disproportionate effects on whole societies, and where the anonymity of actors is far easier to achieve.
This is due to the defining qualities of the present age of cybernetic technovelty: disintermediation and hyperconnectivity. Disintermediation refers to the elimination of layers, or filters, between an actor and the actor’s intended targets of influence. Donald Trump prefers tweets to press conferences for communicating with the American people because tweeting allows him to bypass both the bureaucracies of the Executive Branch and the media, both of which he considers his adversaries. Hyperconnectivity refers to the linking of very many individual technical platforms together in near real time. More than five billion people now have mobile devices, over half of them smartphones that connect reliably to the internet. Considering that the iPhone came on the market only in June 2007, and took a decade to reach initial market saturation, this is a very fast development.
At present hyperconnectivity pertains mainly to more affluent, better-educated, and younger people, but the volume of communication within such cohorts, which tend to be more politically aware anyway, is sufficient to cause nationwide effects. Note too that cybernetic hyperconnectivity tends to leapfrog over legacy landline systems in developing countries, so mobile device ownership and use is typically no lower in poorer countries than in rich ones. Even more important, deepfake videos can be spread through television by complicit or duped media as well as through mobile devices. So societies that suffer a relative deficit in deep literacy and average educational sophistication, along with a related tendency to credit conspiracy theories, tend to be more vulnerable to efforts at deception and destabilization than more literate, better-educated societies.
Moreover, the speed with which such effects can be generated is blinding, and the relevant technologies are relatively easy to acquire and to learn to use well enough to do mischief. The ubiquity of access gives a decided advantage to offense over defense, particularly in countries that are less technologically sophisticated. Technologically advanced countries at least have some chance of using legal and regulatory means to deter, defeat, and limit the damage done by various kinds of disinformation attacks, whether they originate from within or from outside. That makes countries like Mali and Egypt more vulnerable to one-off disinformation attacks and sustained campaigns than countries like Germany and Israel.
Because of the sharp technological asymmetries among states, implications for international relations abound. Some technologically weaker states may elect to “hire” or ally with technologically stronger states to help them defend against malign actors, and to do so weaker states may have to cede or limit some aspects of their freedom of action. Other states may secretly “hire” services from technologically advanced states not for defensive purposes but for offensive ones. A future global communications environment in which cyber-deception is rife could therefore be a world primed, among other things, to reimport limited forms of imperialism in the guise of cybersecurity insurance, or capabilities augmentation, purchased from more adept powers.
One more piece of prolegomenon is necessary before coming to examples and possible Near Eastern implications. As already noted, technical trickery in service to political agendas has been going on for a very long time. It has been part and parcel of espionage, military-tactical misdirection, false-flagging subterfuge, and more for centuries on end. But again, such techniques have typically been aimed at governing and policy-implementing elites, not at mass publics. Their aim has been to confound decision-making at upper levels of authority, not to destabilize societies from below. That is now changing, and changing fast.
Techniques of mass deception have a history, like everything else. Roman colosseums were technological marvels of their day, but the number of listeners who could hear a speaker in them was still very limited. The advent of mass literacy and print widened considerably the audience for potential impression management, but not until the invention of radio in the 20th century did the next order-of-magnitude innovation appear. Radio made it possible to broadcast falsely attributed statements and intentions to mass publics, and Mussolini’s fascist regime was a pioneer in this regard in the mid-1920s. The Nazis soon followed, as did pro-Nazi propagandists like Father Coughlin in the United States. The creation of the Federal Communications Commission in the United States in 1934 testified to the concern of democracies over the power of radio to summon mobs into the streets.
Sound is one thing, but sight is another. Whether wisely or not, people tend to believe their eyes before they believe their ears. And now we have a situation in which the manipulation of mediated visual images has made very great leaps forward, even as the purposes to which those images are put have become decidedly blurred.
We’ve come a long, long way since 1985, when Max Headroom made his debut as a fictional artificial-intelligence character beamed to viewers through their television sets. The technology was primitive compared to today’s, but two things about it are noteworthy: it was once hijacked, and its purpose was mere entertainment.
As to the latter, no one beyond the age of, say, nine who watched Max Headroom thought he was real in any literal sense, and no attempt was made to suggest otherwise. As to the former, on November 22, 1987, two Chicago TV stations had their signals hijacked by unknown individuals, one of whom wore a Max Headroom look-alike costume. The fake Max rambled on for about 90 seconds, contemning the real Max’s commercial endorsements and concluding with a pair of exposed buttocks being whacked by a fly swatter before normal programming resumed. The culprits were never apprehended or even identified.
What does this tell us? For one thing, it tells us—as if we did not already know—that life imitates art, even bad art. What started as an entertainment industry innovation has now migrated into deadly serious domains—just the opposite of the pattern half a century ago, when military and space-race technologies tended to migrate into the commercial sector. From the very beginning, from Max Headroom, adjuncts to entertainment fare were vulnerable to repurposing for political messaging—in the November 1987 case for anti-commercial purposes. Moreover, the commercial nature of entertainment media makes separating what we may generously call art from advertising artifice a somewhat artificial enterprise. Since politics necessarily involves marketing, and since the trend over time has been for commercial marketing techniques to bleed into political marketing techniques, you can readily see the complications building.
From Max Headroom we rapidly passed to anime and CGI technology. Computer-generated graphics really wowed us in the beginning, and depending on the framing of the experience in which CGI plays a role it matters, or not, that many viewers cannot tell the difference between what is real and what isn’t. If you’re sitting down to watch some escapist fantasy movie, you are inclined to conspire in the CGI-enhanced fiction because that helps you become engrossed in it. When you watch a movie like Avatar or the new Disney Lion King, it is downright stupid to be constantly calling attention to the technological sleights of hand that are going on. Calling attention to the brackets that frame the fiction ruins the fun.
But CGI is child’s play compared to the technology behind deepfakes. The advent of GANs technology—generative adversarial networks—since 2014 promises to take the matter to an entirely new level.
GANs technology has many uses, a good number of them promisingly positive in physics, medicine, construction, and other areas. Basically, a GAN works as a form of machine learning by pitting a generator against a discriminator in a game-like environment in which each can “learn” from the other by honing the combination of the algorithms and data sets they are given. Both the generator and the discriminator are neural networks, and the interchange between them works very fast. GANs technology is capable of aiding both unsupervised and supervised learning. It can generate new photographs from real ones that look authentic to human observers. It can create images of imaginary fashion models, making the hiring of real ones unnecessary. It can improve astronomical images by filling in statistically what real cameras cannot capture. It can generate showers of particles for high-energy physics experiments. It can be used to visualize industrial designs. It can reconstruct three-dimensional images of objects from two-dimensional photos of them. It can generate photographic images from voice recordings. It can be used to visualize motion in static environments, which could help, for example, to find people lost (or hiding) in forests or jungles. It can improve climate change modeling. In 2016 GANs technology was used to generate new molecules for a variety of protein targets in cells implicated in fibrosis, inflammation, and cancer. In short, it holds medical research potential with major clinical implications.
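For readers who want a concrete sense of the adversarial “game” just described, the sketch below shows the idea in code. It is a minimal, illustrative example only—written in Python with the PyTorch library, which is my choice for illustration rather than anything specified by the systems discussed here—in which a tiny generator learns to mimic a toy one-dimensional distribution while a discriminator learns to tell real samples from generated ones. The network sizes, learning rates, and toy data are arbitrary assumptions; real deepfake pipelines use vastly larger networks and image or video data, but the training loop has the same shape.

```python
# Minimal GAN sketch: a generator learns to mimic samples from a Gaussian
# distribution while a discriminator learns to tell real from generated.
# Illustrative only; not a production deepfake system.
import torch
import torch.nn as nn

def real_batch(n=64):
    # Toy "real" data: samples drawn from a normal distribution N(4, 1.25)
    return torch.randn(n, 1) * 1.25 + 4.0

# Generator: maps random noise to a fake sample
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
# Discriminator: outputs the probability that its input is "real"
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

for step in range(2000):
    # --- Train the discriminator: label real samples 1, generated ones 0 ---
    real = real_batch()
    fake = G(torch.randn(64, 8)).detach()   # detach: don't update G here
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + \
             loss_fn(D(fake), torch.zeros(64, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # --- Train the generator: try to make D label its fakes as real ---
    fake = G(torch.randn(64, 8))
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()

print("mean of generated samples:", G(torch.randn(1000, 8)).mean().item())
```

The point the sketch makes plain is the one just stated in prose: each network improves only because the other does, and nothing in the loop cares whether the thing being imitated is a bell curve or a politician’s face.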
Alas, along with the upside comes a downside. GANs can be used to generate photographic images of people who do not exist, or to doctor images of people who do exist so that they seem to be someone else. They can match video images to voices in ways that seem authentic to human perception, and so can be used to make people appear to say things they never said or would say. They can even be used to create pornography indistinguishable from the real kind, using images of actual people without their knowledge. That prospect has led the State of California to propose criminalizing such activity.
The implications for politics, and for international politics, are or should be pretty obvious. Again, some upsides are imaginable: GANs technology could reduce the dangers posed by “gray” or ungoverned spaces that terrorists and criminal syndicates use to stage their activities. But the downsides seem more attention-arresting. Russian state actors’ and contractors’ placement of bots and trolls on social media to influence the November 2016 U.S. election is one thing. Getting political figures to “say” things they did not say is another. There is already on the loose a manipulated video of U.S. Speaker of the House Nancy Pelosi that makes her appear to slur her words in a drunken stupor she was never in.
It’s a bad fake, easy to recognize. But the technology is readily available, not wildly expensive, and the threshold of skill needed to use it effectively is not inordinately high. So, naturally, concerns about the integrity of the November 2020 election in the United States are generating significant anxiety and attention to the problem. DARPA’s Media Forensics program is studying ways to identify and counteract fake video media. Senator Ben Sasse (R-Nebraska) and Representative Yvette Clarke (D-New York) have introduced legislation designed to combat malicious use of GANs and other technologies. More effort will follow, no doubt, but it will predictably lag behind reality, and, remember, thanks to the nexus of disintermediation and hyperconnectivity the advantage seems inherently to favor offense. By the time authorities detect and debunk a fake—whether or not they ever identify its source—it may be too late to warn people before mobs or incensed individuals do something drastic and violent.
This is another case of where, as Bob Dylan once wrote, you don’t need a weatherman to know which way this wind is blowing. In this regard two recent examples, more or less innocuous in and of themselves, serve as harbingers.
This past May, not long before I embarked for a year in Singapore, Calvin Klein ran an ad in which supermodel Bella Hadid engaged in what amounted to a brief lesbian make-out with Lil Miquela. Criticism rained down almost immediately. LGBTQAI++ warriors accused Calvin Klein of “appropriating” lesbian behavior for clickbait attention, and of depicting lesbian habits and other forms of “queerness” in a surreal atmosphere. Calvin Klein tried to defend itself but ended up having to apologize.[1]
Just when the general malarkey seemed to die down, however, another facet of the episode rose to take center stage: Lil Miquela, it turned out, was not real. She was a from-whole-cloth AI creation. This mattered, to some, because Lil Miquela had been used by “her” creator to “raise awareness” of causes such as Black Lives Matter and Planned Parenthood. She even engaged with a fictional pro-Trump avatar in a stunt meant to mimic the nation’s political polarization, weighing in exclusively on one side. For all “her” troubles, Lil Miquela acquired a Facebook following of multiple thousands of presumably real people who thought she too was real. In short, Lil Miquela was a kind of pop-cultural video troll, and her creator used her for what were clearly political-advocacy purposes.
What to make of this? An answer could be a very long story, but it won’t be told here and now. Suffice it to say that if, thanks to Citizens United, corporations are entitled to political voice as if they were individual citizens, then creating and using influencers like Lil Miquela could well fall under the free speech protections of the First Amendment, despite the obviously deceptive nature of the technique. It is not innocent, but neither does it obviously butt up against the limit of First Amendment rights famously defined by Supreme Court Justice Oliver Wendell Holmes, Jr.: falsely shouting “fire!” in a crowded theater. It sits somewhere between incitement to violence and harmlessness, and by current U.S. legal logic, whatever you may think of it, it is an arguable case.
The larger point here is that, once again, law and regulation lag far behind technological innovation, creating gaps exploitable by bad actors at home. Technological countermeasures that would enable us to identify fakes and communicate such identifications to relevant audiences in near-real time also lag, and here the implications go far beyond the domestic legal framework of any given country. Hacking and planting trolls on social media for purposes of propaganda and influence require very simple technology compared to GANs, which by comparison amount to a “nuclear option.” And even the simpler forms of deception have proven effective, as the Calvin Klein example shows.
Then, in early August, when I had just gotten over jetlag here in Singapore, I saw an item in the Straits Times that reminded me of Lil Miquela.[2] A popular Chinese vlogger named Qiaobiluo Dianxia, who used anime graphics—also far less sophisticated than GANs technology—to present herself as a beautiful young woman, was accidentally unmasked during a July 25 live stream as a frumpy 58-year-old. Performing as a gaming goddess of sorts, Qiaobiluo Dianxia had amassed a following of more than 130,000 fans on DouYu, a Chinese live-streaming platform that allows viewers to donate money to streamers, somewhat like the crowdfunding platforms used in the United States and Europe. She became quite popular among ethnic Chinese in Singapore, who make up nearly three-quarters of the population. Qiaobiluo had posted that she would meet her followers in real life for 100,000 yuan each (about $14,540), but that she would not reveal herself in the flesh until she had 100,000 such customers.
After her unmasking, most of her duped younger fans expressed dismay, and many tech-savvy critics mocked them for not knowing that many vloggers artificially enhance their looks with anime technology. Indeed, the internet “love” scamming that goes on in China and in Singapore, known colloquially as “catfishing”, works by the same basic method. But many older women became new fans, sending her following up to 400,000. She has become known affectionately as Granny, and at last mention was said to be working on a rap track. No harm done?
Only a few days later a third example, of sorts, came to light involving the Jewish-American comedienne Sarah Silverman. The technology involved in the fakery was primitive compared to GANs, but the example is interesting because it illustrates a multiparty incident.
As many know, Sarah Silverman has lived her professional life on the edge of outrage. One of her routines is called “Magic Jesus.” She does the routine “in character”, so it is clear to everyone who sees it in context that it is comedy, not serious stuff. It is not even blurred into partial reality the way Sacha Baron Cohen’s bizarre genius tricks people into becoming part of his antic comedy. Nor is it meant as serious satire of the kind the ill-fated staff of Charlie Hebdo produced, as they learned to their sorrow in January 2015.
But a rightwing provocateur made a meme of a joke from Silverman’s routine and presented it as if it came from a press conference. Silverman is pictured in the fake saying, “I’m glad the Jews killed Jesus. I’d do it again!” As it happened, a mouth-foaming anti-Semitic Baptist preacher from Florida named Adam Fannin saw the fake, was taken in by it, and publicly called for Silverman’s death. Fannin labeled her a “God-hating whore of Zionism”, adding that “she is a wicked person and she is like a perfect representation of religious Judaism. . .” As a result, Silverman has had to hire extra protection for her performances, fearing that some nutcase will take the misled Fannin seriously. It is an example of what Americans call “whisper down the lane”: the progression of gossip and innuendo from one person to the next in a sequence. Here the sequence works like this: from Silverman the comedienne, to the recontextualizing faker, to Fannin the gullible anti-Semite, to some potential random would-be murderer.
Human beings can be mistaken about things even when no one is trying to deceive them. They can be mistaken about things by means of self-deception. They can be mistaken about things because of conformity-inducing peer pressure. They can be mistaken about things because someone has unwittingly misled them. They can be mistaken because someone has knowingly lied to them for one or any number of purposes, which is different. They can be mistaken for combinations of the foregoing reasons—and all of this has been going on for millennia. But only in the past few years has it been possible for governments, small groups of non-state actors, and even clever willful individuals to delude millions of targeted people near-instantaneously through the manipulation of mediated images. This is new, and very scary.
Everyone who speaks any Arabic or Hindi knows what a fakir is: an indigent mystic who survives on alms. By an amusing twist of accidental linguistic concordance, fakir sounds like faker, and of course some fakirs have been fakers. (A new movie called The Extraordinary Journey of the Fakir illustrates the comedic potential of the accident.) Not so funny is the potential for those who think of themselves as poor, dispossessed victims of stronger powers bent on extirpating them to attack those stronger powers by means of mass deception and incitement.
ISIS and al-Qaeda have both proven more technologically adept than was at first thought. Jama’a al-Islamiyah and Abu Sayyaf less so, for now, but they can learn. With GANs technology, terrorist organizations could potentially incite riots from Karachi to Fez and everywhere in between simply by faking video, carried by the internet and television, of various national leaders saying things anathema to public sensibilities at tender moments—of which there is no shortage.
Recent protests in Egypt have not been stimulated by Muslim Brotherhood GANs-driven deception. They originated in social media videos posted from Spain by Mohamed Ali, a military construction contractor who presented evidence of massive corruption. But the next time they might be, and the effects could well drive Abdel Fattah al-Sisi from power—if the current protests fail to do that.
One can easily imagine the Qatari-Turkish alliance using GANs deceptions to attack the Saudi-Emirati partnership, and the other way around, by putting fake words in the mouths of the Al Thani, the Al Khalifa, the Al Saud, the Al Sabah, and so on.
Or imagine audiences throughout the Near East waking up one morning to video of Crown Prince Mohammed bin Salman claiming that Saudi military forces, having secretly obtained three nuclear weapons from Pakistan, were preparing to bomb Tehran, Qom, and Bandar Abbas within 24 hours unless the Supreme Leader confessed his sins and relinquished power. Is it real or not? Will publics and governments believe it is real? If governments are not sure, how will they behave? Who will be charged with finding out whether it is real? What will third parties, like the Israeli or U.S. governments, do if they know? Who will and will not believe them?
What if we encounter seemingly competitive fakes that actually are not? For example, what if some unidentifiable miscreants put incendiary words in the mouths of both Recep Tayyip Erdogan and Abdullah Ocalan in hopes of stimulating mass violence between Turks and Kurds in eastern Turkey or northern Syria?
Israel is regularly accused by Arab governments and media alike of amazing nefarious feats of which it is not capable. Some years ago Egyptians blamed Israeli skullduggery for an increase in shark attacks around Sharm el-Sheikh. Ayaan Hirsi Ali reminds us that in traditional Somali culture nearly all ill-fated events are ascribed by kneejerk reflex to “al-Yahud.”[3] And this is true beyond Somalia. So the Israeli government should expect to be the target of many kinds of fakery in the future; but, of course, Israeli technical means enable the Israeli authorities to respond in kind and beyond. The problem thus falls into a widened spectrum of deterrence requirements for Israel. Additionally, Israel’s formidable technical capabilities in this and other areas may make it a more appealing partner for some of its neighbors who feel themselves at a disadvantage relative to aggressive others.
The possible permutations of mischievous fakery are not endless, but they are vast. GANs-generated fakes could come from inside societies aimed at governments, or from governments aimed at societies. They could come from Russia or China, India or . . . well, from anywhere. Attribution could be claimed, hidden, disavowed, imported, or exported. They could be regular and systemic, or rare, or even singular. They could be designed to stimulate instability and violence, or they could achieve those ends accidentally. You see the scope of the problem.
Solutions? They come in three flavors. First is technological: responsible actors, in government and out, need to develop means of identifying fakes and communicating about them quickly to publics. Second is legal: responsible governmental actors have to define criminality in the misuse of various technologies and find ways to erect deterrents against malicious acts. And third is political: malicious use of GANs and other technologies often feeds on a variety of grievances believed to be in need of redress. Some of these grievances may be fanciful, but many are real enough. To the extent that governments are willing and able to address real grievances, within their societies and between neighboring nations, they can cut into the motives that generate deepfakes.
All three modalities of redress are tall orders at present. Offense has the advantage over defense for now, and may hold that advantage for quite some time. There is work to be done, for it is safe to say that the Near East already has enough problems without having to go looking for more. But it is with deepfakes as Trotsky reputedly said of war: “You may not be interested in war (read: deepfakes), but war (read: deepfakes) is interested in you.”
Dr. Adam Garfinkle, a regular al-Mesbar columnist, is currently a Distinguished Visiting Fellow at the S. Rajaratnam School of International Studies at Nanyang Technological University in Singapore. He is also Founding Editor of The American Interest.
[1] For more details, see Emilia Petrarca, “Calvin Klein Apologizes for Bella Hadid and Lil Miquela Campaign,” The Cut, May 20, 2019.
[2] See Kimberly Anne Lim, “Popular Chinese vlogger posing as young woman exposed as 58-year-old after livestream glitch,” Straits Times, August 1, 2019, p. 3.
[3] See Ayaan Hirsi Ali, “Can Ilhan Omar Overcome Her Prejudice?” Wall Street Journal, July 12, 2019.