Facebook has started a project called Learning from Videos, which uses artificial intelligence to learn audio, textual and visual representations from public user videos on the social network.
Learning from Videos has a number of objectives, such as improving the Facebook AI systems behind content recommendations and policy enforcement. The project is at an early stage, but it is already bearing fruit. Facebook says it has used the technology to improve recommendations in Instagram Reels, for example by surfacing videos of people doing the same dance to the same music. The system has also reduced speech recognition errors, which could improve auto-captioning and make it easier to detect hate speech in videos.
Facebook says the project will help AI researchers reduce their reliance on labeled data, and that it is part of a broader effort to build systems that learn the way humans do. As such, Learning from Videos will “enable entirely new experiences.” The company did not provide much detail, other than describing a possible feature that would let AI retrieve digital memories, including recordings captured through augmented reality glasses. For example, you could ask such a system to show you “every time we sing for grandma”, and it would surface the relevant clips. It is a very Facebook sort of idea.
The company says the project learns from videos in hundreds of languages and from almost every country. This breadth should make AI systems more accurate and enable them to “adapt to our fast-moving world and recognize the nuances and visual cues in different cultures and regions.”
Facebook says it is keeping privacy in mind with Learning from Videos. “We are building and maintaining a strong privacy foundation that uses automated solutions to enforce privacy on a large scale,” it wrote in a blog post. “By embedding this work at the infrastructure level, we can continuously apply privacy requirements to our systems and provide support such as AI. This includes implementing technical warranties during the data lifecycle.”
Understanding what is happening in videos can be a very difficult task for AI systems. Obstacles include background noise, which makes speech hard to understand, and the way language changes over time. Even so, less than a year after the start of the Learning from Videos project, Facebook is already putting what the system has learned to practical use in other areas.