The Other AI Arms Race: “Defender” AI vs Deepfakes

With Ben Colman - CEO of Reality Defender


Episode description:

Welcome to In Reality, the podcast about Truth, Disinformation, and the Media. I’m Eric Schurenberg, a long-time journalist and media exec, now the founder of the Alliance for Trust in Media.

Just when you thought it could not get harder to recognize truth in your newsfeed, along comes artificial intelligence. Now it’s child’s play for bad actors to create fake news, fake videos, fake pornography at digital scale and in quality all but impossible to detect. As deepfakes multiply, reality itself becomes just one of several options your algorithm can serve, and not necessarily the most convincing one. 

Our guest today leads a company that offers an intriguing defense to this dystopia. Ben Colman is CEO of Reality Defender, whose technology exposes deepfakes in real-time across voice, images and text. I’ve seen the demo, and it’s impressive. 

Ben’s background is in cybersecurity. We’ve seen this in other disinformation-fighting technologies: cybersecurity is in many ways a logical foundation for defending information integrity.

Ben and Eric cover some chilling real-world examples of deepfakes; Ben explains how Reality Defender’s technology works and why usability is critical to scaling truth’s defenses. They also discuss whether deepfake detection could strengthen trust in traditional media.

 

Transcript

Eric Schurenberg (00:03.576)
Ben, welcome to In Reality.

Ben Colman (00:06.376)
Thank you. Thank you, Eric.

Eric Schurenberg (00:08.566)
It’s nice to have you here. You’re the CEO of Reality Defender, which uses AI to detect deepfakes and, as your name implies, to defend information integrity. But before we get into the company, let’s talk about you. Your background is in cybersecurity. The first company you started was in that field, and you worked at Goldman Sachs in that field as well. Things have changed since those days. The gold rush is on in artificial intelligence, and with your background, you could have done anything in AI. Why did you choose to take on the role of, kind of, AI cop?

Ben Colman (00:45.23)
Yeah, I think that I’ve really been obsessed with cybersecurity for many years. I think that, you know, what’s interesting about cybersecurity is that it’s truly equal opportunity. You know, everybody is at risk, everybody is a target. And so I’ve done a number of projects.

Some have turned into companies, some have just remained really interesting science projects, all involved and focused on either finding or protecting information, whether it’s for consumers, whether it’s their voting record, whether it’s for companies, whether it’s personal data that might be leaked out. But for me, this is just a continuation of a singular journey, as my wife says, into the dark world of cybersecurity.

Eric Schurenberg (01:31.682)
There have been a lot of horror stories about deepfakes already since AI began its march towards Armageddon, or wherever it’s going. There was, for example, the Hong Kong company that sent $25 million to scammers based on an order from an AI-generated deepfake executive. Then of course, non-consensual pornography is a horrible problem. But when it comes to…

Ben Colman (01:56.302)
Dread.

Eric Schurenberg (01:57.63)
misinformation and the news media, and the media’s ability to do its role as a source of information in the public interest, I hear conflicting things about how widespread deepfakes are. You can imagine a lot of mayhem being created there, but up to this point, my impression is that the bulk of misinformation is accomplished with old-fashioned…

Ben Colman (02:08.585)
Right.

Eric Schurenberg (02:25.272)
photo altering, or just simply making stuff up on a video and sending out false claims over social media. You don’t need deepfakes to spread rumors about floods in North Carolina, for example, or to claim that…

Ben Colman (02:33.114)
Right.

Ben Colman (02:36.591)
Yeah, I wouldn’t say that AI singularly is creating different kinds of fraud. I think it’s just probably accelerating and expanding what already existed. If you think about it, AI is just a fantastic form of massive automation. And so whereas before, whether you’re trying to exaggerate something or create some kind of a different narrative, maybe false, maybe not,

you could really only do so much at a time, but now you can do such a massive expansion in content and context. But taking a bit of a step back here, you know, the real focus I just wanted to start with is that AI is changing the world. AI is dramatically accelerating creativity and productivity, and the vast majority of use cases for AI are truly positive for the world. You know, the challenge here with generative AI, particularly around deepfakes, is that…

…in the absence of simple, you know, kind of AI-first regulations, as we call them, bad actors are really the best product managers, because they will use new tools to do their job better and faster, whether it’s committing fraud or defaming people or creating disinformation.

Eric Schurenberg (03:56.206)
Let’s talk about Reality Defender then, the company that you’ve created. What are the use cases and who are your clients?

Ben Colman (04:09.807)
So Reality Defender really started as a science project, partially as a nonprofit. And our focus was on looking wherever AI can adjust, manipulate, or generate different types of information. And so while we focus across audio and video and images and text, both in terms of files asynchronously but also real-time communications…

…our focus has moved more toward the real-time communications. And so think about scanning real-time phone calls for, you know, choose your favorite bank, or scanning real-time video, for example, on Zoom or Teams or other platforms. Our view is that that is the number one threat in the world right now that’s measurable. It’s also the number one threat that companies understand and can measure in terms of financial risk. And so there’s a whole additional world in terms of non-consensual AI pornography, or different kinds of online social media information, or misinformation, depending on how you think about it. But until there are requirements for platforms to scan for those, choose your favorite social media platform, they’re not gonna do it. Or worse, they’re gonna farm it out to an offshore team and it’s gonna be done,

Eric Schurenberg (05:32.269)
Mm-hmm.

Ben Colman (05:38.063)
not exactly consistently. So for us as a company, our focus today and into the future is going to be on securing real-time communications that can be manipulated with generative AI, and protecting identities, whether it’s on a phone call, on a Zoom, or online as well. That’s only going to expand as the technology expands.
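
To make the real-time scanning Ben describes concrete, here is a minimal sketch of windowed scoring over a live audio stream. It is purely illustrative: the function names and the rolling-average approach are assumptions, not Reality Defender’s actual product or API.

```python
# Minimal sketch of windowed real-time voice scanning (illustrative only;
# score_chunk is a hypothetical stand-in for a deepfake-detection model).
from collections import deque
from typing import Callable, Iterable, Iterator

def monitor_call(
    chunks: Iterable[bytes],
    score_chunk: Callable[[bytes], float],  # returns P(synthetic) for one window
    window_size: int = 5,                   # smooth over the last few windows
    alert_threshold: float = 0.8,
) -> Iterator[float]:
    """Yield a rolling probability that the caller's voice is AI-generated."""
    window: deque[float] = deque(maxlen=window_size)
    for chunk in chunks:
        window.append(score_chunk(chunk))
        rolling = sum(window) / len(window)
        if rolling > alert_threshold:
            print(f"ALERT: likely synthetic voice (score {rolling:.2f})")
        yield rolling
```

Smoothing over several windows, rather than alerting on a single chunk, is one plausible way to keep a live check from firing on momentary noise.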

Eric Schurenberg (06:04.654)
A lot of the work you do, maybe not the bulk of it and certainly not the most monetizable portion of it, is helping media companies verify the images they receive before they put them on TV or onto their website. I was reminded of going to a journalism conference and seeing a presentation from the Washington Post about how…

…they avoid being fooled by fake images. This was a couple of years ago, and the advice was, well, look at the hands and see if there are six fingers, things like that. I think your technology is a lot more sophisticated than that. Certainly the ability of generative AI to create verisimilitude…

Ben Colman (06:35.32)
Mm-hmm.

Ben Colman (06:41.473)
Right.

Ben Colman (06:49.207)
Right. It’s absolutely much more, yeah. And so, if we’d had this conversation a year ago, I’d have gone through all the telltale signs of deepfakes: we look at different types of symmetry or hourglassing or pixelation or anti-aliasing. But the technology has gotten so fantastic that even the PhDs on our team (you know, we’re a team of almost 50 people) can’t tell the difference between real and AI-generated. And so what does it mean for an average person?

They don’t stand a chance.

Eric Schurenberg (07:20.536)
Tell us a bit about the technology that Reality Defender uses. How is it different from, say, the approach embraced a while ago by the Biden administration of watermarking images?

Ben Colman (07:35.373)
There’s this kind of classical dilemma in security around provenance versus inference. With provenance, you’re focusing on watermarking or hash values, effectively fingerprinting a piece of media or a real-time communication stream. The challenge with provenance is it’s either yes or no: either you’re Eric or you’re not. And so when you’re wrong, you’re completely wrong.

Eric Schurenberg (07:43.095)
Hmm.

Ben Colman (08:05.591)
It also requires a lot of personal data. You know, choose your favorite bank or online brokerage: the claim that your voice is your password means they’re retaining your voice print, or if they’re doing facial recognition, they’re retaining your face print. And a face print, a voice print, a fingerprint, once it’s stolen, you can’t just reset it like the password to your email. And so we also assume that any piece of media, for example a video or audio of you, Eric, has gone through all kinds of different…

…codec or compression conversions. Maybe you uploaded it on a single platform, it went to TikTok, to YouTube, to WhatsApp, each time changing the fidelity, changing the compression, changing how exactly the file has been created. And what that means is that any kind of provenance might not be persistent. And worse, most bad actors might just ignore it completely. And you’re seeing media come out around different earnings announcements or different geopolitical events.

It’s spreading on Telegram or WhatsApp, which don’t actually care if there is or isn’t a watermark. And the fact that you can upload any piece of media onto Adobe and add a watermark: that’s not saying it’s real, it’s just saying that at one point it passed through Adobe. So putting all that aside for the moment, our view is that generative AI is so challenging, and the ways it can be propagated are so broad, that…

…we do what’s called inference, which is more probabilistic. We’re making calculations on the probability that different features we’re able to extract out of a piece of media, whether it’s real-time voice or an image of a face, indicate a specific known generative model, or an unknown model but a technique we believe is indicative of generative AI. It means we’re never 100%.

Eric Schurenberg (09:57.237)
Mm. Mm.

Ben Colman (09:59.631)
You know, our highest confidence level is, you know, 99%, and that’s because we never have the ground truth. We never have the original photo, video, or audio of Eric. It also means we can tell our clients we don’t touch any personal data. Don’t need it, don’t want it; there are no API calls to even send it to us. And so that’s how we were able to move really quickly with, you know, choose your favorite tier-one bank: the majority of them are either working with us or testing our solution to begin

Eric Schurenberg (10:24.381)
Uh-huh.

Ben Colman (10:28.719)
working with us shortly.
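
To make the provenance-versus-inference distinction concrete, here is a minimal sketch, assuming an exact-hash provenance check and a simple weighted-feature inference score. Both functions are hypothetical illustrations, not Reality Defender’s implementation.

```python
# Contrast between provenance (binary fingerprint match) and inference
# (probabilistic scoring). Hypothetical sketch for illustration only.
import hashlib

def provenance_check(media_bytes: bytes, known_hashes: set[str]) -> bool:
    """Yes-or-no: does this exact file match a registered fingerprint?
    A single re-encode (TikTok to YouTube to WhatsApp) changes every byte,
    so the hash no longer matches and the check silently fails."""
    return hashlib.sha256(media_bytes).hexdigest() in known_hashes

def inference_score(features: dict[str, float], weights: dict[str, float]) -> float:
    """Probabilistic: combine per-feature evidence into P(AI-generated).
    Clamped below 100% because there is no ground-truth original."""
    score = sum(weights[name] * value for name, value in features.items())
    return min(max(score, 0.01), 0.99)  # "our highest confidence level is 99%"
```

The provenance check fails outright after any re-encoding, which is exactly the fragility Ben describes; the inference score degrades gracefully instead of flipping from yes to no.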

Eric Schurenberg (10:31.448)
How successful is the inference model? What percentage of fakes can evade detection? And also, how do you avoid turning up false positives?

Ben Colman (10:43.279)
I don’t want to sound like a marketing commercial, but we’re state of the art. We’re consistently the most accurate in the market across accuracy and precision and recall, which measure true versus false positives and negatives. And what that means is that we can demonstrate, using measurable benchmarking results, how accurate we are.

Any company can claim anything. You know, a lot of companies claim they’re a hundred percent accurate, which is impossible. What’s much more important is when our clients test our solution with their own benchmarking data set, or other companies like Accenture or IBM or Booz or KPMG or Deloitte do it on behalf of clients. You know, that’s really where the rubber meets the road. That way we prove we don’t see the answers before the exam. And our clients are consistently getting…

Ben Colman (11:39.673)
…you know, accuracy very close to a hundred percent, and false positives, you know, less than one-tenth of a percent. Now, there’s a constant opportunity to continuously retest; we’re pushing out updated benchmarking results monthly, and our clients are doing that as well. And I think the results speak for themselves: the fact that, you know, we have top-five global banks wanting to get on the phone with all their competitors to say, here are our results from Reality Defender, because they don’t see this as a competitive differentiator. This is an industry challenge against trust, which, as I mentioned before: fraud is equal opportunity, and trust should be universal.
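
For reference, the metrics Ben cites are standard and easy to compute from a labeled benchmark. The numbers below are invented purely to illustrate what a false-positive rate of one-tenth of a percent means in practice.

```python
# Standard detection metrics from a confusion matrix of detector verdicts
# on labeled test media (illustrative numbers, not actual results).
def benchmark(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),            # of items flagged fake, how many were
        "recall": tp / (tp + fn),               # of actual fakes, how many were caught
        "false_positive_rate": fp / (fp + tn),  # real media wrongly flagged
    }

# 10,000 test clips, half real and half fake: a false-positive rate of
# one-tenth of a percent means only 5 of 5,000 real clips get flagged.
print(benchmark(tp=4990, fp=5, tn=4995, fn=10))
```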

Eric Schurenberg (12:17.698)
Yes.

Reality Defender’s model, as I imagine any artificial intelligence has to be, is trained on data. Where do your data sets come from?

Ben Colman (12:34.255)
I mean, if any of your listeners are data professionals, we’d love to talk to them. We partner with many, many different groups. We internally have a team building the models; we have over a dozen patents and white papers, and we have been consistently selected for peer review at all the major research conferences: ECCV, AAAI, NeurIPS, CVPR. Excuse me, but that’s only half of our R&D.

The other half is the data side. And so we’ve had dedicated data professionals since very early in our company. We’re focused on building repositories of millions, tens of millions, of both real and AI-generated media, across every dimension of variables. And that’s age, gender; we don’t use race, we use a skin tone scale developed at Stanford. It’s a gradient…

…and so, choose every language, every dialect, every region: we’ve trained on every combination of all of those. We’re also partnering with industry. So, you know, we might remember the deepfake audio of President Biden that was unfortunately created using ElevenLabs. We’ve partnered with ElevenLabs and Respeecher and about a half a dozen others that have come to us to effectively form a data partnership, giving us early access to their models and their data sets…

…which allows us to say with even more accuracy that we’re confident when we detect something, and they can then prove to the world they’re thinking very carefully ahead of regulations that are gonna start requiring them to really limit how and where people can use their solutions. So I think that, you know, we’re very pro-AI, very pro-AI-innovation. I think we have some room to go in terms of regulations that still support AI innovation. But a lot of the platforms that are mass-marketed are realizing that they need to do something as well, and they’ve partnered with us to really think long and hard about solving that problem.

Eric Schurenberg (14:39.554)
Okay, that’s interesting. By the way, just to refer back: the Joe Biden deepfake you mentioned was during, I think, the New Hampshire primary, where a voice that sounded just like him robocalled a number of Democratic voters, giving them false information about how to vote for him.

Ben Colman (14:57.539)
And what’s wild about that one is that it was one-to-many: recorded once, shared with many. But it wasn’t real time. What’s possible right now, using the same tools, is a real-time voice-to-voice deepfake. So it’s not so much worrying about, you know, President Biden calling you, because that’s not going to happen. But what happens if, you know, very junior election workers get a phone call from their boss saying, we need you somewhere else across town? What if my wife calls me and says, you know, are you coming home early?

Eric Schurenberg (15:10.414)
Hmm.

Eric Schurenberg (15:15.022)
Mm-hmm.

Ben Colman (15:27.119)
What if my boss calls me asking me to do a wire transfer? These things don’t need to be recorded. They can be in real time and actually react in real time as well. And the next step is the massive scale of it: you can go real time to millions of people at the same time.

Eric Schurenberg (15:41.826)
You’re scaring me. What about media companies?

Ben Colman (15:44.592)
There is a solution here. For our clients in telecom and media, let’s kind of split it up. In telecom, our end-state goal is to be on-device, similar to a spam blocker. So you get a phone call. Maybe you’re okay talking to an AI version of me, because it’s my AI, my agent. Or maybe when you get a phone call from your airline asking you to upgrade your ticket, you say, you know what, it’s a robocall, I don’t want to take it.

Eric Schurenberg (15:55.256)
Uh-huh.

Ben Colman (16:11.811)
With media, it’s a lot more comprehensive, in the sense that, without naming names, we work with two of the four biggest broadcasting networks. And they’re thinking about this in terms of all the media they receive from everywhere else. They’re showing constant different news stories from around the world. How do they prove that what they see, what they hear, who they’re talking to is actually real? And so we’re partnering with them. They’re building what are called these

Ben Colman (16:41.199)
kind of fact-checking desks, where digitally they’re scanning all media, very similar to antivirus software. And to be very, very frank here, we’re not saying whether it’s true or false. All we’re saying is whether or not AI was used in the generation of it.
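
A sketch of what such a desk’s ingest hook could look like follows. The scan_for_ai callable is a hypothetical placeholder for a detection API, and, as Ben says, the score speaks only to AI generation, not truth or falsity.

```python
# Hypothetical newsroom ingest pipeline in the spirit of the fact-checking
# desks described above; scan_for_ai stands in for a real detection API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ScanResult:
    filename: str
    ai_probability: float  # likelihood AI was used in generation, NOT truth/falsity

def ingest(
    paths: list[str],
    scan_for_ai: Callable[[str], float],  # path -> P(AI-generated)
    review_threshold: float = 0.5,
) -> list[ScanResult]:
    """Scan every incoming file on arrival, antivirus-style, and route
    high-scoring items to a human editor instead of straight to air."""
    results = [ScanResult(p, scan_for_ai(p)) for p in paths]
    for r in results:
        if r.ai_probability >= review_threshold:
            print(f"{r.filename}: hold for editorial review ({r.ai_probability:.0%})")
    return results
```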

Eric Schurenberg (16:56.142)
Well, speaking of content moderation and fact-checking, that service, valuable as it may have been, always seemed to be a whack-a-mole problem: a competition in which the bad guys had all the advantages, because the false information tended to be more dramatic than the true information, and so on.

Cybersecurity is also an arms race, and what Reality Defender is protecting against seems like an arms race between the bad guys and the good guys. Is this, like content moderation, one in which the bad guys have an advantage?

Ben Colman (17:43.535)
You know, I think the question is, is this whack-a-mole? Is this a cat-and-mouse game? It certainly is, but so is all of cybersecurity. It is absolutely an arms race, but I’d argue it’s much more a feature than a bug. Given our research-focused strategy, we tend to be able to ensure that different types of modalities and different types of data are in our data set

before we see them in the real world. And so, you know, thinking about the latest generative techniques like diffusion: when we saw the image of an explosion at the Pentagon, which we later found out was fake but had led to, I think it was, a hundred-billion-dollar flash crash in the market, we’d never seen diffusion-based imagery in the real world. But our models were able to identify that something was anomalous. And so, you know, I think it’s our opportunity to always be on the forefront in terms of research, and in terms of partnerships as well. But if you think about it,

Eric Schurenberg (18:35.159)
Interesting.

Ben Colman (18:42.575)
we’re looking for indicators of both fakeness and realness. And, you know, the fakeness is changing, getting better, but the realness isn’t changing. So go back to your example of, you know, does Ben have six fingers or not? We wrote a paper and have a patent on what we’re calling common-sense reasoning, which is to say: if we see that Eric has six fingers, what is the probability that you actually have six fingers? There are certainly people in the world who have six fingers, but it’s obviously a risk metric that should be

Eric Schurenberg (18:51.234)
Mm-hmm.

Ben Colman (19:11.257)
fed into kind of a larger score. And so we’re doing thousands of these calculations, then weighting how important each one is, and then simplifying it down to a number that a non-technical user can use to decide whether to block or flag a user, an action, a piece of media, a wire transfer request, or an identity verification.
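
A toy version of that aggregation is sketched below, assuming each check emits a probability and a hand-set importance weight. The signal names, weights, and values are invented for illustration; they are not Reality Defender’s actual features or scoring.

```python
# Toy weighted aggregation of many per-check probabilities into one 0-100
# score a non-technical user can act on. Signals and weights are invented.
def aggregate(signals: dict[str, float], weights: dict[str, float]) -> int:
    """Each signal is P(anomaly) for one check, e.g. the six-fingers test;
    weights encode how strongly each check should count."""
    total_weight = sum(weights[name] for name in signals)
    score = sum(weights[name] * p for name, p in signals.items()) / total_weight
    return round(score * 100)  # 0 = looks real, 100 = almost certainly fake

signals = {"extra_fingers": 0.9, "lighting_mismatch": 0.3, "audio_artifacts": 0.7}
weights = {"extra_fingers": 2.0, "lighting_mismatch": 1.0, "audio_artifacts": 1.5}
print(aggregate(signals, weights))  # prints 70: flag for human review
```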

Eric Schurenberg (19:34.754)
Okay, good. So the decision-makers who are using the service are not technicians, not technologically savvy themselves. They are administrators or reporters or bank employees.

Ben, you’ve consulted with governments and technology companies and financial services companies around the world. How is the U.S. stacking up in its ability to defend citizens against deepfakes?

Ben Colman (20:07.321)
You know, I’d say we’re certainly leading the world in terms of AI innovation, even setting aside the last few days in terms of DeepSeek. On the regulatory side, we’re very optimistic that we’ll have some simple, AI-innovation-first regulations. But right now, we have some room to grow. And I think some of the challenge is whether it’s a state

or a federal focus area. Obviously the internet doesn’t stop at state lines, and different states’ rulings don’t really have a huge opportunity to protect you. I had some identity theft; I went to the NYPD and then didn’t really know what to do. It was probably someone in a different state or a different country. So I think there’s really an opportunity here to lead in terms of regulations, and I’m optimistic that over the next three months we’ll start seeing a lot more activity.

Eric Schurenberg (21:07.564)
Hmm. Well, I hope you’re right about that. I have not seen it myself, but I’m certainly open to the possibility and rooting for it to happen. My question about this is whether, in some kind of unexpected, unanticipated way, the prevalence of deepfake tools and the ability to create

false information using artificial intelligence is going to benefit traditional media brands, which have the wherewithal to afford deepfake detection like Reality Defender, and the incentive to be a brand where you can believe what you see. Do you think these developments will push news consumers back to traditional media?

Ben Colman (21:45.287)
Okay.

I consume a lot of media online. I’d say it’s a combination of traditional media, social media, and everything else in between. I think we’re at an interesting inflection point where a lot of folks don’t know what or where to look for things. My parents will send me things and I’ll say, that looks a little bit funny. And then I’ll look at the hyperlink and I’ll say, I’ve never heard of that. It’s not a name-brand

platform; it’s not, you know, choose your favorite TV channel, choose your favorite newspaper. And so, you know, having small kids myself, who haven’t experienced legacy media of the past 20 years like I have, I think it’s a real challenge, but also an opportunity to educate folks on understanding how and where and when they’re getting their information.

Eric Schurenberg (22:49.656)
Let me ask you, in parting, to look ahead to the future, and let me posit a couple of scenarios. One is that the arms race between the good guys and the bad guys stays more or less in balance, and commerce and media survive the evolution of AI. The other scenario is that technology favors the bad guys, and we get much of what some would argue

we did with social media: you know, you just can’t believe anything you see, and you get the dystopia that Hannah Arendt used to apply to the Soviet Union. It’s like, everything is possible and nothing is true. Which of those scenarios do you think is more likely?

Ben Colman (23:34.475)
Mm-hmm.

Ben Colman (23:39.621)
Everything is possible and nothing is true. I certainly think everything is possible, and I don’t believe that nothing is true. But I think there are different levels of experience for most people. The biggest challenge we think about in terms of our software is that while we’re selling it to large companies, we’re selling it to kind of an enterprise consumer.

Eric Schurenberg (23:48.686)
Why not?

Ben Colman (24:02.595)
You know, we don’t sell to an expert in AI or cybersecurity. They may use it, but we also need to ensure that, you know, the busiest mid-level risk analysts, or somebody in a call center, who don’t have the time to become experts at these things and might not even know what a deepfake is, can use it. And so the opportunity with our solution at Reality Defender, and really with most security software, is to make it understandable to everybody.

My seven-year-old should have a general sense of what’s real and fake. My seventy-year-old parents should as well. And it’s been a journey for us. Technology and security companies start off with very technical software, with huge metrics and report cards. Now it’s moved to color coding, grading, zero-to-100 scoring. And so I think that taking a product-first view in terms of usability is fundamental, because even if everyone should use it, if they don’t know how to use it or don’t understand how it works, it doesn’t really mean anything. And so I’m excited for technologists and product-focused platforms to really take complicated tools and make them usable and insightful for everybody.

Eric Schurenberg (25:20.054)
All right, good. Well, let’s leave it there. Ben, thank you for joining us on In Reality and thank you for the work you’re doing at Reality Defender.

Ben Colman (25:29.743)
Thank you for the opportunity.


Created & produced by: Podcast Partners / Published: Feb 13 2025

