AI: Facebook’s new algorithm trained on one billion Instagram photos


Facebook’s researchers have introduced a new AI model that can learn from any random group of unlabeled images on the internet.

Image: Facebook AI

Facebook’s researchers have unveiled a new AI model that can learn from any random group of unlabeled images on the internet, a breakthrough that, although still in its early stages, the team expects will trigger a “revolution” in computer vision.

Called SEER (SElf-supERvised), the model was fed one billion publicly available Instagram images that had not been manually curated or labeled beforehand. Even without the labels and annotations that usually go into algorithm training, SEER was able to work its way through the dataset on its own, learning as it went, and eventually achieved top levels of accuracy on tasks such as object detection.

The method, called self-supervised learning, is already well established in the field of AI: it consists of creating systems that can learn directly from the information they are given, without relying on carefully labeled datasets to teach them how to perform a task such as recognizing an object in a photo or translating a block of text.

Self-supervised learning has received a great deal of scientific attention recently, because it means far less data has to be labeled by humans – a meticulous, time-consuming task that most researchers would rather avoid. And because self-supervised models don’t need curated datasets, they can be trained on larger and more diverse collections of data.

In some fields, especially natural language processing, the method has already led to breakthroughs: algorithms trained on ever-larger amounts of unlabeled text have enabled advances in applications such as question answering, machine translation, natural language inference and more.

Computer vision, on the other hand, has yet to fully embrace the self-supervised learning revolution. As Priya Goyal, software engineer at Facebook AI Research, explains, SEER marks a first for the field. “SEER is the first fully self-supervised computer vision model trained on random internet images, compared to existing computer vision models that are supervised and trained on the highly curated ImageNet dataset,” she tells ZDNet.

ImageNet is, in effect, a large database of millions of images labeled by researchers and opened up to the wider computer vision community to advance the development of AI.

That database was used as a benchmark by Facebook’s researchers to evaluate SEER’s performance, and they found that the self-supervised model outperformed state-of-the-art supervised AI systems on tasks such as low-shot learning, object detection, segmentation and image classification.

“SEER outperforms existing self-supervised models by training only on random images,” says Goyal. “This result essentially indicates that we don’t need highly curated datasets like ImageNet in computer vision, and that self-supervision on random internet images yields very high-quality models.”

For all the sophistication of self-supervised learning, the researchers’ work was not without challenges. With text, AI models are tasked with assigning meaning to words; with images, however, the algorithm must decide how each pixel corresponds to a concept, while accounting for the different angles, views and shapes that a single concept can take across different photos.

In other words, the researchers needed a lot of data and a model that could deduce every possible visual concept from this complex pool of information.

To accomplish the task, Goyal and her team adapted SwAV, a new algorithm that emerged from Facebook AI’s existing work on self-supervised learning, which clusters images showing similar concepts into groups. The scientists also designed a convolutional network – a deep-learning architecture loosely modeled on the connectivity patterns of neurons in the human brain, which assigns importance to different objects in an image.
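
To make that idea concrete, the snippet below is a minimal, illustrative sketch of the “swapped assignment” trick behind SwAV-style self-supervised training, written in PyTorch. It is not Facebook’s implementation: the backbone, projection size, number of prototypes and the plain-softmax targets are all simplifying assumptions (the real SwAV balances cluster assignments with a Sinkhorn-Knopp step and normalizes its prototypes).

```python
# Minimal, illustrative sketch of SwAV-style self-supervised learning.
# NOT Facebook's implementation; sizes and hyperparameters are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

# Backbone: any convolutional network that maps images to feature vectors.
backbone = torchvision.models.resnet50()
backbone.fc = nn.Identity()                   # strip the supervised classifier head
projector = nn.Linear(2048, 128)              # project features to a small embedding
prototypes = nn.Linear(128, 300, bias=False)  # learnable cluster "prototypes"

params = list(backbone.parameters()) + list(projector.parameters()) + list(prototypes.parameters())
optimizer = torch.optim.SGD(params, lr=0.1, momentum=0.9)

def embed(images):
    """Embed a batch of images and L2-normalize the result."""
    return F.normalize(projector(backbone(images)), dim=1)

def training_step(view_a, view_b, temperature=0.1):
    """One self-supervised step on two augmented views of the same images.

    Each view is scored against the prototypes; the (detached) cluster
    assignment of one view becomes the target for the other view, so no
    human-provided labels are needed anywhere.
    """
    scores_a = prototypes(embed(view_a))
    scores_b = prototypes(embed(view_b))

    # Real SwAV derives targets with a Sinkhorn-Knopp balancing step;
    # a plain softmax is used here only to keep the sketch short.
    with torch.no_grad():
        targets_a = F.softmax(scores_a / temperature, dim=1)
        targets_b = F.softmax(scores_b / temperature, dim=1)

    # "Swapped" prediction: view A predicts view B's assignment and vice versa.
    # (F.cross_entropy accepts probability targets in PyTorch >= 1.10.)
    loss = (F.cross_entropy(scores_a / temperature, targets_b)
            + F.cross_entropy(scores_b / temperature, targets_a)) / 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```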

With a billion-strong Instagram-based dataset, the scale of the system was, to say the least, huge. Facebook’s team used Nvidia V100 GPUs with 32GB of RAM, and as the model grew, it had to be made to fit within that available memory. Goyal explains that further research will be needed to make sure compute capabilities keep pace with systems of this size.

“As we train the model on more and more GPUs, the communication between those GPUs needs to be fast to allow faster training. Such challenges can be tackled by developing software and research techniques that are efficient for a given memory and runtime budget,” she says.
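
For readers unfamiliar with the scaling problem she describes, the following is a generic sketch of multi-GPU data-parallel training in PyTorch. It is not SEER’s actual training setup: the model, batch size and placeholder loss are assumptions chosen only to show where the gradient communication between GPUs happens.

```python
# Generic sketch of multi-GPU data-parallel training in PyTorch, illustrating
# the inter-GPU communication cost Goyal describes. Not Facebook's SEER setup;
# the synthetic data and placeholder loss exist only to make the script runnable.
import os
import torch
import torch.distributed as dist
import torchvision

def main():
    # One process per GPU; `torchrun` sets RANK, LOCAL_RANK and WORLD_SIZE.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    device = f"cuda:{local_rank}"
    torch.cuda.set_device(local_rank)

    model = torchvision.models.resnet50().to(device)
    # DDP averages gradients across all GPUs after every backward pass; this
    # all-reduce is the communication that must stay fast as GPU counts grow.
    model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

    for _ in range(10):  # synthetic batches stand in for a real, sharded dataset
        images = torch.randn(32, 3, 224, 224, device=device)
        # Placeholder loss; SEER's real objective is self-supervised.
        loss = model(images).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()   # gradients are all-reduced across GPUs here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    # Example launch: torchrun --nproc_per_node=<num_gpus> train_sketch.py
    main()
```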

Although there is still some work to be done before the technology can be applied to real-world use cases, Goyal argues that its impact should not be underestimated. “With SEER, we can now make further progress in computer vision by training large models on large quantities of random internet images,” she says.

“This breakthrough could enable a self-supervised learning revolution in computer vision, similar to what we have seen with text in natural language processing.”

Within Facebook, SEER could be used for a wide variety of computer vision tasks, ranging from automatically generating image descriptions to identifying policy-breaking content. Outside the company, the technology could also be useful in fields with limited images and metadata, such as medical imaging.

Facebook’s team has called for more work to be done to take SEER to its next stage of development. As part of the research, the team developed VISSL, a PyTorch-based library for self-supervised learning, which has been open-sourced to encourage the broader AI community to experiment with the technology.
