'Typographic Attack': Pen and Paper Fool AI into Thinking an Apple Is an iPod

As artificial intelligence systems go, it's pretty clever: show Clip a picture of an apple and it can recognize that it's looking at a fruit. It can even tell you which one, and sometimes go so far as to distinguish between varieties.

But even the smartest AI can be fooled with the simplest of hacks. If you write the word "iPod" on a sticky label and paste it over the apple, Clip does something odd: it decides, with near certainty, that it is looking at a piece of mid-2000s consumer electronics. In another test, pasting dollar signs over a photo of a dog caused it to be recognized as a piggy bank.

An image of a poodle is labeled 'poodle', and an image of a poodle with $$$ pasted on it is labeled 'piggy bank'. Photo: OpenAI

OpenAI, the machine learning research organization that created Clip, calls this weakness a "typographic attack". "We believe attacks such as those described above are far from simply an academic concern," the organization said in a paper published this week. "By exploiting the model's ability to read text robustly, we find that even photographs of handwritten text can often fool the model. This attack works in the wild … but it requires no more technology than pen and paper."
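The mechanics are easy to sketch against OpenAI's open-source Clip release. The snippet below is a minimal illustration rather than code from the paper: it assumes the publicly available clip Python package and a hypothetical local file, apple.jpg, and simply asks which of two candidate captions the image most resembles in Clip's shared embedding space. Because the prediction is just the highest-scoring caption, a prominent handwritten "iPod" label in the photo can outweigh the apple itself.

    # Minimal zero-shot classification sketch, assuming OpenAI's open-source
    # "clip" package and a hypothetical local image file named apple.jpg.
    import torch
    import clip
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    # Candidate captions Clip chooses between.
    labels = ["a Granny Smith apple", "an iPod"]

    image = preprocess(Image.open("apple.jpg")).unsqueeze(0).to(device)
    text = clip.tokenize(labels).to(device)

    with torch.no_grad():
        # Similarity logits between the image and each candidate caption.
        logits_per_image, _ = model(image, text)
        probs = logits_per_image.softmax(dim=-1).cpu().numpy()[0]

    for label, p in zip(labels, probs):
        print(f"{label}: {p:.1%}")

Running the same snippet on a photo with and without the sticky label is essentially all a "pen and paper" attack amounts to.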

Like GPT-3, the last AI system from the lab to make the front pages, Clip is more a proof of concept than a commercial product. But both have made huge strides in what was thought possible in their domains: GPT-3 wrote a Guardian comment piece last year, while Clip has shown an ability to recognize the real world better than almost all similar approaches.

While the lab's latest discovery raises the prospect of fooling AI systems with nothing more complex than a T-shirt, OpenAI says the weakness reflects some underlying strengths of its image recognition system. Unlike older AIs, Clip is able to think about objects not only on a visual level but also in a more "conceptual" way. That means, for example, that it can understand that a photo of Spider-Man, a stylized drawing of the superhero, or even the word "spider" all refer to the same basic thing – but also that it can sometimes miss the important differences between those categories.
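To make that concrete, here is a rough, hypothetical sketch (again using the open-source clip package, with placeholder file names) of how a photo of Spider-Man, a drawing of him and the plain word "spider" can all be compared in the same embedding space:

    # Illustrative only: the file names are placeholders, not OpenAI's data.
    import torch
    import clip
    from PIL import Image

    model, preprocess = clip.load("ViT-B/32", device="cpu")

    photo = preprocess(Image.open("spiderman_photo.jpg")).unsqueeze(0)
    drawing = preprocess(Image.open("spiderman_drawing.png")).unsqueeze(0)
    word = clip.tokenize(["spider"])

    with torch.no_grad():
        image_features = model.encode_image(torch.cat([photo, drawing]))
        text_features = model.encode_text(word)

    # Cosine similarity after normalization: conceptually related inputs land
    # close together, whether they arrive as a photo, a drawing or written text.
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    print(image_features @ text_features.T)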

"We discover that the highest layers of Clip organize images as a loose semantic collection of ideas," says OpenAI, "providing a simple explanation for both the model's versatility and the compactness of its representations." In other words, much as human brains are thought to work, the AI thinks about the world in terms of ideas and concepts rather than purely visual structures.

An image of an apple marked 'Granny Smith' and an image of an apple with a sticky label bearing 'iPod'
"When we put a label saying 'iPod' on this Granny Smith apple, the model erroneously classifies it as an iPod in the zero-shot setting," says OpenAI. Photo: OpenAI

But that shorthand can also lead to problems, of which "typographic attacks" are just the most visible example. The "Spider-Man neuron" in the neural network can be shown to respond to the collection of ideas relating to Spider-Man and spiders; but other parts of the network group together concepts that might be better kept separate.

"For example, we have observed a 'Middle Eastern' neuron with an association with terrorism," writes OpenAI, "and an 'immigration' neuron that responds to Latin America. We have even found a neuron that fires for both dark-skinned people and gorillas, mirroring earlier photo-tagging incidents in other models we consider unacceptable."

Back in 2015, Google had to apologize for automatically tagging images of black people as "gorillas". In 2018, it emerged that the search engine had never actually solved the underlying issues with its AI that led to that error; instead, it had simply intervened manually to prevent it ever labeling anything as a gorilla, no matter how accurate the label was.
