May 9, 2024
Ross Lazerowitz
Co-Founder and CEO
Lately, there’s been a ton of buzz about deepfakes in the news. Honestly, I’m not sure whether they’re going to destroy democracy and all that, but I am convinced of one thing: trying to detect them is a lost cause. My argument has three prongs: first, reliable detection may soon be impossible; second, even if we could detect deepfakes, integrating detection into real products and user experiences is untenable; and lastly, our focus should shift toward verifying authenticity instead of trying to detect fakeness.
Detection: a transient ability
Deepfake detection may go the way of the GPT text detector OpenAI hosted before GPT-4. After GPT-4’s release, OpenAI took the detector down, stating that it no longer had a reliable way to identify text generated by its models. As the models advanced, their ability to mimic natural language became increasingly sophisticated: a broader distribution of writing styles, topics, and linguistic nuances made detection effectively impossible, and improvements in contextual understanding and adaptability let GPT-4 produce output virtually indistinguishable from human writing.
A state-of-the-art deepfake detection system typically uses a neural network to find artifacts left behind by the generation process, whether in audio, video, or images. For example, early iterations of Midjourney were terrible at creating hands, so detection models learned to look for artifacts there. When Midjourney V6 came out, it got much better at generating realistic hands, and the detection models had to lean on other features. There’s good reason to think this cycle will continue until we hit a GPT-4-like state where foundation models leave almost no discernible artifacts. At that point, the efficacy of detection collapses.
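To make that concrete, here is a minimal sketch of what an artifact-based detector looks like in practice: a small binary classifier that scores a crop of an image for generation artifacts. The architecture, names, and shapes below are illustrative assumptions, not any vendor’s actual model.

```python
# Minimal sketch of an artifact-based deepfake classifier (illustrative only).
# A real system would be trained on large labeled sets of real and generated media.
import torch
import torch.nn as nn

class ArtifactDetector(nn.Module):
    """Tiny CNN that scores an image crop for generation artifacts."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, 1)  # single logit: higher = more likely fake

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.classifier(h)

model = ArtifactDetector()
crop = torch.rand(1, 3, 224, 224)          # stand-in for a face or hand crop
p_fake = torch.sigmoid(model(crop)).item() # probability the crop is generated
print(f"P(fake) = {p_fake:.2f}")
```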
There are open-source detection models, but the supposedly best ones, from companies like Pindrop and Reality Defender, are closed-source. That means we have to take their benchmark claims at their word. If they did open-source their models, attackers could train another neural network to add noise to deepfakes and potentially beat them. You are going to have AIs trying to trick other AIs. Don’t just take my word for it: I spoke about this with Lily Clifford, CEO and founder of the text-to-speech company Rime Labs and someone far smarter than me on the subject. She said, “The reason why you see model-as-a-service companies like OpenAI taking a multi-pronged approach to content provenance, including metadata, is that deepfake detection is notoriously hard. In addition, because of the way generative models work, it’s easier to tell if an image or audio clip comes from the model that generated it, provided you have access to the model. In an ecosystem where both the models and the ‘deepfake detectors’ are closed-source, it may be impossible.”
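For illustration, here is a rough sketch of that “AIs tricking AIs” dynamic, assuming white-box access to an open-sourced detector like the toy one above: a gradient-based perturbation (in the spirit of FGSM) nudges a deepfake just enough to lower its “fake” score. The function names and parameters are hypothetical.

```python
# Sketch of an evasion attack against a publicly available detector (illustrative only).
import torch
import torch.nn.functional as F

def evade(detector, fake_image, epsilon=0.01):
    """Add a small gradient-based perturbation that lowers the detector's 'fake' score."""
    x = fake_image.clone().requires_grad_(True)
    logit = detector(x)                                   # higher logit = more likely fake
    loss = F.binary_cross_entropy_with_logits(
        logit, torch.zeros_like(logit))                   # pretend the target label is "real"
    loss.backward()
    # Step against the gradient, then clamp back to a valid pixel range.
    adversarial = (x - epsilon * x.grad.sign()).clamp(0, 1)
    return adversarial.detach()

# Usage with the hypothetical ArtifactDetector above:
# stealthier = evade(model, crop)
# print(torch.sigmoid(model(stealthier)).item())  # the fake score typically drops
```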
An integration albatross
Let’s assume for a moment that a highly accurate detection model were possible, even against improved generative models. How would we integrate it, and what would the user experience look like? Consider robocalls. Are users going to run something that taps their phone calls and processes the audio, looking for a deepfake? That could violate two-party consent laws, since it amounts to recording the call. How would the user be notified about a deepfake, and what would they do about it? For images and video, will every social media platform adopt these models and add notices to posts?
It would be unfair of me to say there aren’t use cases where the models could be integrated. Call centers, or platforms that confirm user identity and perform KYC, could put them to work. But I’m bearish that detection will be effective when a single person is targeted.
A flawed approach
Imagine I handed you a $100 bill and asked whether it was counterfeit or genuine. Would you compare it to every fake $100 bill ever produced, looking for artifacts? No. You wouldn’t try to detect that it’s fake; you would confirm that it’s real by checking security features like the embedded thread that glows under UV light. That confirms the bill came from the U.S. Bureau of Engraving and Printing, since no one else has the capability to add those features.
Detecting fakeness is the wrong approach to a world where multimedia can be faked easily. We need to rely instead on tools that confirm trust and identity. Mobile devices can already host private keys and contain high-end biometric sensors; there have to be better ways to verify user identity than hunting for fakes. When it comes to mass and social media, there’s a promising new standard for media traceability from the Coalition for Content Provenance and Authenticity, or C2PA. Backed by Google, Microsoft, Adobe, and other tech giants, it seeks to create an easy way to sign and verify content. It’s very early, but if the major social platforms and smartphone makers adopt it, with tight integration into Adobe's creative tools, it could make the web more trustworthy.
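To show the principle rather than the actual C2PA manifest format (which specifies manifests, certificate chains, and embedding rules), here is a minimal sketch of signature-based provenance: the producer signs a hash of the media with a private key, and anyone holding the public key can verify it later. The key handling below is a simplified assumption.

```python
# Minimal sketch of signature-based content provenance (not the C2PA spec itself).
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature
import hashlib

# A capture device or creative tool holds the private key; verifiers hold the public key.
device_key = Ed25519PrivateKey.generate()
public_key = device_key.public_key()

media_bytes = b"...raw image or audio bytes..."   # stand-in for the actual media
digest = hashlib.sha256(media_bytes).digest()     # sign a hash, not the raw file
signature = device_key.sign(digest)

def is_authentic(data: bytes, sig: bytes) -> bool:
    """Confirm the content is untampered and came from the key holder; no fake detector involved."""
    try:
        public_key.verify(sig, hashlib.sha256(data).digest())
        return True
    except InvalidSignature:
        return False

print(is_authentic(media_bytes, signature))                # True
print(is_authentic(media_bytes + b"tampered", signature))  # False
```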
The Future
Trying to detect deepfakes might seem like a practical response, but it’s proving to be a losing battle. Much like AI-generated text, deepfake detection for other modalities, whether images, video, or voice, is quickly becoming a transient ability. Just as OpenAI’s detector struggled to keep pace with the sophistication of generated text, deepfake technology has evolved to the point where its output is often indistinguishable from genuine content. Moreover, the challenges of integrating detection systems into daily digital interactions raise significant privacy and usability concerns.
I don’t have the exact solution, but I know it must start with people. That is our approach at Mirage, and why our first product focuses on attack simulations and training. All the pieces exist for a technology solution like C2PA; the hard part is assembling them in a way a five-year-old can use. If we’re not careful, the verification process could follow the path of PGP: after 34 years, it remains largely unusable for most people. Deepfakes are a novel problem and require a novel solution.
Try Mirage
Learn how to protect your organization from spearphishing.