‘Typographic attacks’ bring OpenAI image recognition to its knees

CLIP IDs before and after attaching a piece of paper that says ‘iPod’ to an apple.
Screenshot: OpenAI

Tricking a Terminator into not shooting you might be as simple as wearing a giant sign that says ROBOT, at least until the Elon Musk-backed research outfit OpenAI trains its image recognition system not to misidentify things based on a few scribbles from a Sharpie.

OpenAI researchers published work last week on CLIP, the neural network that is their latest system for getting computers to recognize the world around them. Neural networks are machine learning systems that can be trained over time to get better at a particular task using a network of interconnected nodes – in CLIP’s case, identifying objects in an image – in ways that are not always clear to the system’s developers. The research concerns “multimodal neurons,” which exist both in biological systems such as the brain and in artificial ones such as CLIP; they “respond to clusters of abstract concepts centered around a common high-level theme, rather than any specific visual feature.” At its highest layers, CLIP organizes images around a “loose semantic collection of ideas.”

For example, the OpenAI team wrote, CLIP has a multimodal “Spider-Man” neuron that fires when it sees an image of a spider, the word “spider,” or an image or drawing of the superhero of the same name. One side effect of multimodal neurons, according to the researchers, is that they can be used to fool CLIP: the research team was able to trick the system into identifying an apple (the fruit) as an iPod (the device made by Apple) simply by sticking a piece of paper with “iPod” written on it to the fruit.
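To make the setup concrete, here is a minimal sketch of the kind of zero-shot comparison involved, using OpenAI’s open-source clip Python package; the image file names and label prompts are illustrative assumptions, not assets from the paper:

```python
# Score an image against the candidate labels "an apple" and "an iPod"
# with OpenAI's open-source CLIP package
# (pip install git+https://github.com/openai/CLIP.git).
# The image file names below are placeholders, not files from the paper.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

labels = ["an apple", "an iPod"]
text = clip.tokenize(labels).to(device)

for path in ["apple.jpg", "apple_with_ipod_note.jpg"]:
    image = preprocess(Image.open(path)).unsqueeze(0).to(device)
    with torch.no_grad():
        logits_per_image, _ = model(image, text)            # image-text similarity scores
        probs = logits_per_image.softmax(dim=-1).squeeze(0)  # scores -> label probabilities
    print(path, {label: f"{p:.1%}" for label, p in zip(labels, probs.tolist())})
```

In the demonstration described above, it is the photo with the handwritten label that tips the prediction to “iPod.”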

CLIP IDs before and after attaching a piece of paper that says ‘iPod’ to an apple.
Graphic: OpenAI

Moreover, the system was actually more confident that it had correctly identified the item in question when this happened.

The research team called the shortcoming a “typographic attack,” as it would be trivial for anyone aware of the issue to exploit it deliberately:

We believe attacks such as those described above are far from simply an academic concern. By exploiting the model’s ability to read text robustly, we find that even photographs of hand-written text can often fool the model.

[…] We also believe that these attacks may also take a more subtle, less conspicuous form. An image, given to CLIP, is abstracted in many subtle and sophisticated ways, and these abstractions may over-abstract common patterns – oversimplifying and, by virtue of that, overgeneralizing.

This is less a failure of CLIP than an illustration of how complicated the underlying associations it has built up over time are. Per the Guardian, OpenAI research has indicated that the conceptual models CLIP constructs are in many ways similar to the functioning of a human brain.

The researchers anticipated that the apple/iPod issue was just an obvious example of a problem that could manifest in countless other ways in CLIP, as its multimodal neurons “generalize across the literal and the iconic, which may be a double-edged sword.” For example, the system identifies a piggy bank as the combination of the “finance” and “dolls, toys” neurons. The researchers found that CLIP thus identifies an image of a standard poodle as a piggy bank when they force the finance neuron to fire by drawing dollar signs on the image.

The research team noted that the technique is similar to “adversarial images,” which are images crafted to trick neural networks into seeing something that is not there, but it is generally far cheaper to carry out, since all it takes is paper and something to write with. (As the Register noted, visual recognition systems are largely in their infancy and vulnerable to a range of other simple attacks, such as a Tesla Autopilot system that McAfee Labs researchers tricked into reading a 35 mph speed limit sign as an 85 mph sign with a few inches of electrical tape.)
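For contrast with the paper-and-Sharpie approach, classic adversarial images are usually produced by nudging pixels along the gradient of the model’s loss. The sketch below shows a bare-bones fast-gradient-sign-style perturbation against an off-the-shelf classifier; the model choice, file name, and perturbation size are illustrative assumptions, not details from the CLIP research:

```python
# Minimal FGSM-style adversarial perturbation against a stock ImageNet classifier.
# Model, input file, and epsilon are illustrative choices; normalization is
# omitted for brevity.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

image = preprocess(Image.open("apple.jpg")).unsqueeze(0)  # placeholder file name
image.requires_grad_(True)

# Push the image away from its current top prediction with one signed-gradient step.
logits = model(image)
label = logits.argmax(dim=-1)
loss = F.cross_entropy(logits, label)
loss.backward()

epsilon = 0.03  # perturbation budget; small enough to be hard to notice by eye
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

with torch.no_grad():
    print("original prediction:   ", model(image).argmax(dim=-1).item())
    print("adversarial prediction:", model(adversarial).argmax(dim=-1).item())
```

A typographic attack can produce a comparable misclassification without touching a single pixel value in software.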

According to the researchers, CLIP’s associative model also has the potential to go significantly wrong and draw biased or racist conclusions about different kinds of people:

For example, we observed a “Middle East” neuron [1895] with an association with terrorism; and an “immigration” neuron [395] that responds to Latin America. We even found a neuron that fires for both dark-skinned people and gorillas [1257], mirroring photo tagging incidents in other models that we consider unacceptable.

“We believe these investigations of CLIP only scratch the surface in understanding CLIP’s behavior, and we invite the research community to join us in improving our understanding of it and models like it,” the researchers wrote.

CLIP is not the only project OpenAI has been working on. Its GPT-3 text generator (whose predecessor, GPT-2, OpenAI researchers described in 2019 as too dangerous to release) has come a long way and is now able to generate natural-sounding (but not necessarily convincing) fake news articles. In September 2020, Microsoft acquired an exclusive license to put GPT-3 to work.
