Computer scientists have shown for the first time that systems designed to detect deepfakes – videos that manipulate real-life imagery through artificial intelligence – can be deceived. The work was presented at the WACV 2021 conference, which took place online from January 5 to 9, 2021.
The researchers showed that detectors can be defeated by inserting inputs called adversarial examples into every video frame. Adversarial examples are slightly manipulated inputs that cause artificial intelligence systems, such as machine learning models, to make mistakes. In addition, the team showed that the attack still works after videos are compressed.
“Our work shows that attacks on deepfake detectors could be a real-world threat,” said Shehzeen Hussain, a UC San Diego computer engineering Ph.D. student and first co-author of the WACV paper. “More alarmingly, we demonstrate that it’s possible to craft robust adversarial deepfakes even when an adversary is not aware of the inner workings of the machine learning model used by the detector.”
In deepfakes, a subject’s face is modified to create convincingly realistic footage of events that never actually happened. As a result, typical deepfake detectors focus on the face in videos: first tracking it and then passing the cropped face data to a neural network that determines whether it is real or fake. For example, eye blinking is not reproduced well in deepfakes, so detectors focus on eye movements as one way to make that determination. State-of-the-art deepfake detectors rely on machine learning models to identify fake videos.
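The crop-then-classify pipeline described above can be sketched in a few lines of PyTorch. This is a toy illustration, not the study's code: the classifier is a tiny stand-in for a backbone like XceptionNet, and `face_detector` is a hypothetical helper that returns a face bounding box.

```python
# Toy sketch of a crop-then-classify deepfake detection pipeline.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrameClassifier(nn.Module):
    """Tiny CNN standing in for a detector backbone such as XceptionNet."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)  # logits for [real, fake]

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def detect_frame(frame, face_detector, classifier):
    """frame: 1x3xHxW tensor in [0, 1]; returns P(fake) for the face crop.

    face_detector is a hypothetical helper returning a (x0, y0, x1, y1) box.
    """
    x0, y0, x1, y1 = face_detector(frame)
    crop = frame[:, :, y0:y1, x0:x1]
    crop = F.interpolate(crop, size=(299, 299), mode="bilinear", align_corners=False)
    probs = torch.softmax(classifier(crop), dim=1)
    return probs[:, 1]  # assumes index 1 = "fake"
```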
XceptionNet, a deepfake detector, labels an adversarial video created by the researchers as real. Credit: University of California San Diego
The widespread distribution of fake videos through social media platforms has raised significant concerns worldwide, particularly hampering the credibility of digital media, the researchers said. “If the attackers have some knowledge of the detection system, they can design inputs to target the blind spots of the detector and bypass it,” said Paarth Neekhara, the paper’s other first co-author and a UC San Diego computer science student.
The researchers created an adversarial example for every face in a video frame. While standard operations such as compressing and resizing video usually remove adversarial perturbations from an image, these examples are built to withstand those processes. The attack algorithm does this by estimating, over a set of input transformations, how the model ranks images as real or fake. From there, it uses this estimate to transform images in such a way that the adversarial image remains effective even after compression and decompression.
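One way to read that description is as an expectation-over-transforms attack loop: the perturbation is optimized against the detector's loss averaged over random compression-like transforms, so it tends to survive re-encoding. The sketch below illustrates the idea under stated assumptions and is not the authors' released implementation: `classifier` is assumed to map a face crop to [real, fake] logits, and compression is approximated by a cheap downscale-upscale transform rather than a real video codec.

```python
# Illustrative expectation-over-transforms perturbation loop (not the paper's code).
import torch
import torch.nn.functional as F

def random_transform(x):
    """Cheap stand-in for compression/resizing artifacts: random downscale-upscale."""
    s = float(torch.empty(1).uniform_(0.5, 1.0))
    h, w = x.shape[-2:]
    small = F.interpolate(x, scale_factor=s, mode="bilinear", align_corners=False)
    return F.interpolate(small, size=(h, w), mode="bilinear", align_corners=False)

def robust_adversarial_crop(crop, classifier, steps=100, eps=8 / 255, alpha=1 / 255):
    """Perturb a 1x3xHxW fake-face crop in [0, 1] so the detector labels it
    'real' even after the crop is transformed (a proxy for video compression)."""
    delta = torch.zeros_like(crop, requires_grad=True)
    real_label = torch.tensor([0])  # assumes index 0 = "real"
    for _ in range(steps):
        grad = torch.zeros_like(crop)
        for _ in range(8):  # average gradients over sampled transforms
            loss = F.cross_entropy(classifier(random_transform(crop + delta)), real_label)
            loss.backward()
            grad += delta.grad.detach()
            delta.grad.zero_()
        # step toward the "real" label; keep the perturbation small and pixels valid
        delta.data = (delta.data - alpha * grad.sign()).clamp(-eps, eps)
        delta.data = (crop + delta.data).clamp(0, 1) - crop
    return (crop + delta).detach()
```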
The modified version of the face is then inserted into the video frame. The process is repeated for every frame in the video to create an adversarial deepfake video. The attack can also be applied to detectors that operate on entire video frames rather than just face crops.
The team refused to release their code so that it would not be used by hostile parties.
High success rate
The researchers tested their attacks in two scenarios: one where the attackers have complete access to the detector model, including the face extraction pipeline and the architecture and parameters of the classification model; and one where the attackers can only query the machine learning model to find out the probability that a frame is classified as real or fake.
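In the second, query-only scenario, a common way to proceed is to estimate the gradient of the detector's score from probability queries alone, for example with finite differences over random directions. This is a generic technique assumed here for illustration, not a description of the authors' exact method; `query_real_prob` is a hypothetical placeholder for the detector's query interface.

```python
# Sketch of score-based gradient estimation in a query-only (black-box) setting.
import torch

def estimate_gradient(crop, query_real_prob, sigma=1e-3, n_samples=40):
    """Approximate d P(real) / d crop from 2 * n_samples probability queries.

    query_real_prob is a hypothetical callable: crop tensor -> float probability.
    """
    grad = torch.zeros_like(crop)
    for _ in range(n_samples):
        u = torch.randn_like(crop)
        p_plus = query_real_prob(crop + sigma * u)
        p_minus = query_real_prob(crop - sigma * u)
        grad += (p_plus - p_minus) / (2 * sigma) * u
    return grad / n_samples
```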
In the first scenario, the attack’s success rate is above 99 percent for uncompressed videos. For compressed videos, it was 84.96 percent. In the second scenario, the success rate was 86.43 percent for uncompressed and 78.33 percent for compressed videos. This is the first work to demonstrate successful attacks on state-of-the-art deepfake detectors.
“To use these deepfake detectors in practice, we argue that it is essential to evaluate them against an adaptive adversary who is aware of these defenses and is intentionally trying to foil them,” the researchers write. “We show that the current state-of-the-art deepfake detection methods can be easily bypassed if the adversary has complete or even partial knowledge of the detector.”
To improve detectors, the researchers recommend an approach similar to adversarial training: during training, an adaptive adversary keeps generating new deepfakes that can bypass the current state-of-the-art detector, and the detector keeps improving in order to detect the new deepfakes.
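A minimal sketch of that training loop, assuming a PyTorch detector and a `make_adversarial` routine standing in for the adaptive adversary (all names here are illustrative placeholders, not the authors' code):

```python
# Sketch of an adversarial-training-style defense: each step, craft adversarial
# deepfake crops against the current detector and train on them with the clean batch.
import torch
import torch.nn.functional as F

def train_step(detector, optimizer, crops, labels, make_adversarial):
    adv_crops = make_adversarial(detector, crops, labels)  # adaptive adversary
    x = torch.cat([crops, adv_crops])
    y = torch.cat([labels, labels])  # adversarial copies keep their true labels
    optimizer.zero_grad()
    loss = F.cross_entropy(detector(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```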
Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples
* Shehzeen Hussain, Malhar Jere, Farinaz Koushanfar, Department of Electrical and Computer Engineering, UC San Diego
* Paarth Neekhara, Julian McAuley, Department of Computer Science and Engineering, UC San Diego