Why a YouTube Chat About Chess Got Flagged for Hate Speech

Last June, Antonio Radić, the host of a YouTube chess channel with more than a million subscribers, was livestreaming an interview with grandmaster Hikaru Nakamura when the broadcast was suddenly cut off.

Instead of a lively discussion about chess openings, famous games, and iconic players, viewers were told that Radić’s video had been removed for “harmful and dangerous” content. Radić saw a message saying that the video, which contained nothing more scandalous than a discussion of the King’s Indian Defense, had violated YouTube’s community guidelines. It stayed offline for 24 hours.

Exactly what happened still isn’t clear. YouTube declined to comment beyond saying that removing Radić’s video was a mistake. But a new study suggests the episode reflects shortcomings in artificial intelligence programs designed to automatically detect hate speech, abuse, and misinformation online.

Ashique KhudaBukhsh, a project scientist specializing in AI at Carnegie Mellon University and himself a serious chess player, wondered whether YouTube’s algorithm might have been confused by discussions involving black and white pieces, attacks, and defenses.

So he and Rupak Sarkar, an engineer at CMU, designed an experiment. They trained two versions of a language model called BERT, one using messages from the racist far-right website Stormfront and the other using data from Twitter. They then tested the algorithms on the transcripts and comments of 8,818 chess videos and found them to be far from perfect. The algorithms flagged about 1 percent of the transcripts or comments as hate speech. But more than 80 percent of those flagged were false positives: read in context, the language was not racist. “Without a human in the loop,” the pair write in their paper, relying on the classifiers’ predictions about chess discussions “can be misleading.”
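The researchers’ own training setup isn’t reproduced here, but a minimal sketch of the kind of test they describe might look like the following, assuming an off-the-shelf BERT-based toxicity classifier from the Hugging Face Hub (the model name below is an assumption, not the one used in the paper) and a few hypothetical chess-commentary snippets standing in for video transcripts.

```python
# Sketch: run a BERT-based toxicity classifier over chess commentary and
# count how often innocuous chess language gets flagged. Model choice is an
# assumption; any fine-tuned hate-speech/toxicity classifier would do.
from transformers import pipeline

classifier = pipeline("text-classification", model="unitary/toxic-bert")

# Hypothetical chess-commentary snippets, not real transcript data.
chess_comments = [
    "White's attack on the kingside is brutal; Black has no defense.",
    "Black sacrifices the bishop and destroys White's position.",
    "The King's Indian Defense gives Black strong counterplay.",
]

flagged = []
for text in chess_comments:
    result = classifier(text)[0]  # e.g. {"label": "toxic", "score": 0.93}
    if result["score"] > 0.5:
        flagged.append((text, result["score"]))

# In the study, roughly 1 percent of transcripts/comments were flagged,
# and more than 80 percent of those flags were false positives in context.
print(f"Flagged {len(flagged)} of {len(chess_comments)} comments")
for text, score in flagged:
    print(f"  {score:.2f}  {text}")
```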

The experiment exposed a core problem for AI language programs. Detecting hate speech or abuse is about more than just catching offensive words and phrases. The same words can have very different meanings in different contexts, so an algorithm must infer meaning from a string of words.
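A toy illustration of the point (purely illustrative, not taken from the paper): a naive filter that flags any text containing words like “black,” “attack,” or “destroy” trips constantly on ordinary chess commentary, because it ignores the surrounding context entirely.

```python
# Toy keyword filter, for illustration only: the blocklist is hypothetical.
NAIVE_BLOCKLIST = {"attack", "destroy", "threat", "black", "white"}

def naive_flag(text: str) -> bool:
    """Flag text if it contains any blocklisted word, ignoring context."""
    words = {w.strip(".,!?'\"").lower() for w in text.split()}
    return bool(words & NAIVE_BLOCKLIST)

print(naive_flag("Black's attack destroys White's pawn structure."))  # True: a false positive
print(naive_flag("A quiet positional game with no tactics at all."))  # False
```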

“Fundamentally, language is still a very subtle thing,” says Tom Mitchell, a CMU professor who has previously worked with KhudaBukhsh. “These kinds of trained classifiers are not going to be 100 percent accurate any time soon.”

Yejin Choi, an associate professor at the University of Washington who specializes in AI and language, says she is “not at all” surprised by the YouTube takedown, given the limits of language understanding today. Choi says further progress in detecting hate speech will require major investment and new approaches. Algorithms work better, she says, when they analyze more than just a piece of text in isolation, incorporating, for example, a user’s history of comments or the nature of the channel where the comments are posted.

But Choi’s research also shows how hate-speech detection can perpetuate bias. In a 2019 study, she and others found that human annotators were more likely to label Twitter posts by users who self-identify as African American as abusive, and that algorithms trained to identify abuse using those annotations will repeat those biases.


Companies have spent many millions collecting and annotating training data for self-driving cars, but Choi says the same effort has not gone into annotating language. So far, no one has collected and annotated a high-quality dataset of hate speech or abuse that includes many ambiguous, borderline cases. “If we invested that level of data collection, or even a small fraction of it, I’m sure AI could do much better,” she says.

Mitchell, the CMU professor, says YouTube and other platforms likely have more sophisticated AI algorithms than the ones KhudaBukhsh built, but even those are still limited.
