Facebook’s next big AI project is training its machines on users’ public videos

One of the most difficult challenges – and biggest potential breakthroughs – in machine learning is teaching AI systems to understand what happens in videos as fully as a human can. Today, Facebook announced a new initiative that it hopes will give it a head start in this consequential work: training its AI on Facebook users’ public videos.

Access to training data is one of the biggest competitive advantages in AI, and by gathering this resource from millions and millions of users, technology giants like Facebook, Google, and Amazon have been able to pull ahead in a range of fields. And while Facebook has already trained machine vision models on billions of images collected from Instagram, it has yet to announce projects of similar ambition for video understanding.

“By learning from global streams of publicly available videos spanning nearly every country and hundreds of languages, our AI systems will not just improve accuracy but also adapt to our fast-moving world and recognize the nuances and visual cues across different cultures and regions,” the company said in a blog post. The project, titled Learning from Videos, is also part of Facebook’s “broader efforts to build machines that learn like humans do.”

The resulting machine learning models will be used to create new content recommendation systems and moderation tools, Facebook says, but could do much more in the future. AI that can understand the content of videos could give Facebook unprecedented insight into users’ lives, letting it analyze their hobbies and interests, their preferences in brands and clothing, and countless other personal details. Of course, Facebook already has access to such information through its current ad-targeting operation, but analyzing video with AI would add an incredibly rich (and invasive) source of data to its stores.

Facebook is vague about its future plans for AI models trained on user videos. The company told The Verge such models have many potential uses, from captioning videos to creating advanced search features, but did not answer a question about whether they would be used to gather information for ad targeting. Similarly, when asked whether users can opt out of having their videos used to train Facebook’s AI, the company responded only by noting that its Data Policy states that uploaded content can be used for “product research and development.” Facebook also did not respond to questions about exactly how much video will be collected to train its AI systems, or how access to this data by the company’s researchers will be overseen.

In its blog post announcing the project, the social network did point to one speculative future use: using AI to retrieve “digital memories” captured by smart glasses.

Facebook plans to release a pair of consumer smart glasses sometime this year. Details about the device are vague, but it is likely that these or future glasses will include integrated cameras to capture the wearer’s point of view. If AI systems can be trained to understand the content of video, users will be able to search past recordings, just as many photo apps let people search for specific places, objects, or people. (Incidentally, that information is often indexed by AI systems trained on user data.)

Facebook has released images showing a prototype of its augmented reality smart glasses.
Image: Facebook

As video recording with smart glasses “becomes the norm,” says Facebook, “people need to be able to recall specific moments from their vast digital memories just as easily as they capture them.” It gives the example of a user searching with the phrase “Show me every time we sang happy birthday to Grandma,” before being served the relevant clips. As the company notes, such a search would require AI systems to make connections between types of data, teaching them “to match the phrase ‘happy birthday’ to cakes, candles, people singing various birthday songs, and more.” Like humans, the AI would need to understand rich concepts composed of different types of sensory input.
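At its core, this kind of cross-modal search is often framed as embedding a text query and video clips into a shared vector space, then ranking clips by similarity. The sketch below is purely illustrative, not Facebook’s actual system: the embedding vectors are hand-made toys, and in a real system separate trained text and video encoders would produce them.

```python
import numpy as np

def rank_clips(query_embedding, clip_embeddings):
    """Rank clips by cosine similarity to a text query embedding.
    Assumes both query and clips live in the same embedding space."""
    q = query_embedding / np.linalg.norm(query_embedding)
    c = clip_embeddings / np.linalg.norm(clip_embeddings, axis=1, keepdims=True)
    scores = c @ q                       # cosine similarity per clip
    return np.argsort(scores)[::-1], scores  # best matches first

# Toy embeddings (hypothetical): nearby vectors mean related content.
clips = np.array([
    [0.90, 0.10, 0.00],   # clip 0: singing around a birthday cake
    [0.10, 0.80, 0.20],   # clip 1: beach holiday
    [0.85, 0.20, 0.10],   # clip 2: another birthday party
])
query = np.array([1.0, 0.1, 0.05])  # "sang happy birthday to Grandma"

ranking, scores = rank_clips(query, clips)
print(ranking.tolist())  # → [0, 2, 1]: both birthday clips outrank the beach clip
```

The key design point is that ranking reduces to a dot product once vectors are normalized, which is why such systems can search millions of clips with approximate nearest-neighbor indexes.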

Looking further into the future, the combination of smart glasses and machine learning could enable what is known as “world scraping”: capturing granular data about the world by turning smart glasses wearers into roving CCTV cameras. As the practice was described in a report last year in The Guardian: “Each time someone walked around a supermarket, their smart glasses would be recording real-time pricing data, stock levels, and browsing habits; every time they opened a newspaper, their glasses would know which stories they read, which adverts they looked at, and which celebrity beach pictures their gaze lingered on.”

This is an extreme outcome, and not an avenue of research Facebook says it is currently pursuing. But it does illustrate the potential significance of pairing advanced AI video analysis with smart glasses – something the social network apparently wants to do.

By comparison, the only use of its new AI video analysis tools that Facebook is revealing today is relatively mundane. Alongside the Learning from Videos announcement, Facebook says it has deployed a new content recommendation system based on its video-processing work in its TikTok clone, Reels. “Popular videos often consist of the same music set to the same dance moves, but created and acted by different people,” says Facebook. By analyzing the content of videos, Facebook’s AI can suggest similar clips to users.

Such content recommendation algorithms are not without problems, though. A recent report from MIT Technology Review highlighted how the social network’s emphasis on growth and user engagement has stopped its AI team from fully addressing how its algorithms can spread misinformation and encourage political polarization. As the Technology Review article put it: “The [machine learning] models that maximize engagement also favor controversy, misinformation, and extremism.” This creates a conflict between the work of Facebook’s AI ethics researchers and the company’s imperative to maximize growth.

Facebook is not the only large technology company pursuing advanced AI video analysis, nor is it the only one leveraging users’ data to do so. For example, Google maintains a publicly accessible research dataset containing 8 million curated and partially labeled YouTube videos to “help accelerate research on large-scale video understanding.” The search giant’s business could also benefit from AI that understands the content of videos, even if the end result is simply serving more relevant ads on YouTube.

Facebook, though, believes it has one particular advantage over its competitors. Not only does it have plenty of training data, but it is also pushing more and more resources into an AI method known as self-supervised learning.

Usually, when AI models are trained on data, those inputs have to be labeled by humans: tagging objects in pictures, for example, or transcribing audio recordings. If you’ve ever solved a CAPTCHA identifying fire hydrants or pedestrian crossings, you’ve likely labeled data that helped train AI. But self-supervised learning does away with the labels, speeding up the training process and, according to some researchers, producing deeper and more meaningful analysis as the AI systems teach themselves to connect the dots. Facebook is so optimistic about self-supervised learning that it has called it “the dark matter of intelligence.”
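To make the idea concrete, one widely used self-supervised objective is a contrastive loss (as in methods like SimCLR): two augmented views of the same clip should embed close together, while other clips in the batch act as negatives, so the pairing itself supplies the supervision signal with no human labels. The toy numpy sketch below is illustrative only; the `info_nce_loss` helper and random “views” are assumptions, not Facebook’s implementation.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Contrastive (InfoNCE) loss: anchor i should match positive i
    (another view of the same clip) against every other clip in the
    batch. No labels are needed; the pairing is the supervision."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature             # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    # the "correct class" for anchor i is positive i (the diagonal)
    return -np.log(np.diag(probs)).mean()

rng = np.random.default_rng(0)
views_a = rng.normal(size=(3, 8))                      # one view per clip
views_b = views_a + 0.01 * rng.normal(size=(3, 8))     # slightly perturbed views
aligned = info_nce_loss(views_a, views_b)              # correct pairing
mispaired = info_nce_loss(views_a, np.roll(views_b, 1, axis=0))  # wrong pairing
assert aligned < mispaired  # matched views yield a much lower loss
```

Minimizing this loss over millions of unlabeled clips is what lets the encoder learn useful video representations before any human annotation enters the picture.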

The company says its future work on AI video analysis will focus on semi- and self-supervised learning methods, and that such techniques “have already improved our computer vision and speech recognition systems.” With such an abundance of video content available from Facebook’s 2.8 billion users, skipping the labeling part of AI training certainly makes sense. And if the social network can teach its machine learning models to understand video seamlessly, who knows what they might learn?
