Meta Announces New Advances in AI Technology, Featuring Robots That Learn From Interaction With Humans

The logo for Meta (formerly Facebook) is seen on a sign at the company's corporate headquarters location in Menlo Park, Calif., on Nov. 9, 2022. (Josh Edelson/AFP via Getty Images)

Meta Platforms Inc. has announced new developments in AI technology. According to a statement, the company has made new advances that enable robots to learn through direct interaction with humans in the real world.

The advances enable robots to perform “challenging sensorimotor skills.”

The company made the announcement in a March 31 press release, in which it specified that the new approach involves “training a general-purpose visual representation model (an artificial visual cortex) from a large number of egocentric videos.”

The videos come from Meta’s open-source dataset, which depicts people engaging in everyday routines such as grocery shopping and preparing meals.

Part of the approach involves training robots through the development of an artificial visual cortex, created by Meta’s Facebook AI Research (FAIR) team, which is working to recreate the region of the human brain that converts vision into movement.

Using the dataset, known as Ego4D, the robots learn a wide variety of tasks, from cooking and cleaning to sports and craft work. The dataset consists of several thousand hours of wearable-camera video in which research participants carry out the daily activities the robots aim to replicate.

“To study the effect of pre-training data scale and diversity, we combine over 4,000 hours of egocentric videos from 7 different sources (over 5.6 [million] images) and ImageNet to train different-sized vision transformers using Masked Auto-Encoding (MAE) on slices of this data,” according to Meta.
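
The pre-training recipe in that quote can be illustrated with a minimal sketch of Masked Auto-Encoding in PyTorch: random patches of each frame are hidden, a small vision transformer encodes only the visible patches, and a lightweight decoder learns to reconstruct the hidden ones. This is not Meta’s code; the tiny model sizes, the 75 percent masking ratio, and the random tensors standing in for egocentric video frames (and for the ImageNet mixing the quote mentions) are illustrative assumptions.

    import torch
    import torch.nn as nn

    PATCH, IMG, DIM, MASK_RATIO = 16, 224, 256, 0.75
    N = (IMG // PATCH) ** 2      # 196 patches per 224x224 frame
    PDIM = PATCH * PATCH * 3     # flattened pixels per patch

    def patchify(imgs):
        # (B, 3, 224, 224) -> (B, 196, 768) flattened patches
        B = imgs.shape[0]
        p = imgs.unfold(2, PATCH, PATCH).unfold(3, PATCH, PATCH)
        return p.permute(0, 2, 3, 1, 4, 5).reshape(B, N, PDIM)

    class TinyMAE(nn.Module):
        def __init__(self):
            super().__init__()
            self.embed = nn.Linear(PDIM, DIM)
            self.pos = nn.Parameter(torch.zeros(1, N, DIM))
            self.mask_token = nn.Parameter(torch.zeros(1, 1, DIM))
            enc = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
            dec = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(enc, num_layers=4)
            self.decoder = nn.TransformerEncoder(dec, num_layers=2)
            self.head = nn.Linear(DIM, PDIM)

        def forward(self, imgs):
            patches = patchify(imgs)
            tokens = self.embed(patches) + self.pos
            B = tokens.shape[0]
            n_keep = int(N * (1 - MASK_RATIO))
            shuffle = torch.rand(B, N, device=imgs.device).argsort(dim=1)
            keep, masked = shuffle[:, :n_keep], shuffle[:, n_keep:]
            # Encode only the visible (unmasked) patches.
            visible = torch.gather(tokens, 1, keep.unsqueeze(-1).expand(-1, -1, DIM))
            encoded = self.encoder(visible)
            # Re-insert mask tokens at the hidden positions, then decode.
            full = self.mask_token.expand(B, N, DIM).clone()
            full.scatter_(1, keep.unsqueeze(-1).expand(-1, -1, DIM), encoded)
            recon = self.head(self.decoder(full + self.pos))
            # Reconstruction loss is computed on the masked patches only.
            tgt = torch.gather(patches, 1, masked.unsqueeze(-1).expand(-1, -1, PDIM))
            pred = torch.gather(recon, 1, masked.unsqueeze(-1).expand(-1, -1, PDIM))
            return ((pred - tgt) ** 2).mean()

    model = TinyMAE()
    opt = torch.optim.AdamW(model.parameters(), lr=1.5e-4)
    frames = torch.rand(8, 3, IMG, IMG)  # stand-in for a mini-batch of egocentric video frames
    loss = model(frames)
    loss.backward()
    opt.step()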

The company added in the press release that its research team has also devised “CortexBench,” a benchmark for evaluating visual foundation models. It features 17 different sensorimotor tasks in simulation, “spanning locomotion, navigation, and dexterous and mobile manipulation.”

“The visual environments span from flat infinite planes to tabletop settings to photorealistic 3D scans of real-world indoor spaces,” the company says.

“Although prior work has focused on a small set of robotic tasks, a visual cortex for embodied AI should work well for a diverse set of sensorimotor tasks in diverse environments across diverse embodiments,” the company added.
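The pattern Meta describes, a single visual representation reused across many tasks and robot bodies, can be illustrated with a short sketch in which a pretrained encoder is frozen and only a small policy head is trained per task. This is not CortexBench itself; the encoder, task names, action dimensions, and the random tensors standing in for demonstrations are placeholders, and real benchmarks use more than the one imitation step shown here.

    import torch
    import torch.nn as nn
    from torchvision.models import vit_b_16

    encoder = vit_b_16(weights=None)   # stand-in for the pretrained "visual cortex"
    encoder.heads = nn.Identity()      # expose 768-d features instead of class logits
    for p in encoder.parameters():
        p.requires_grad = False        # the representation stays frozen across tasks
    encoder.eval()

    TASKS = {"locomotion": 12, "navigation": 4, "manipulation": 7}  # hypothetical action dims

    for task, action_dim in TASKS.items():
        policy = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, action_dim))
        opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
        # Stand-ins for task demonstrations: (camera frames, expert actions).
        frames = torch.rand(16, 3, 224, 224)
        expert_actions = torch.rand(16, action_dim)
        with torch.no_grad():
            features = encoder(frames)  # frozen features; only the small head learns below
        loss = ((policy(features) - expert_actions) ** 2).mean()  # behavior-cloning loss
        loss.backward()
        opt.step()
        print(f"{task}: one imitation step done, loss={loss.item():.3f}")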

The model was tested on a Boston Dynamics Spot robot, the company went on to say. The robot was then tasked with rearranging a set of objects in a 1,990-square-foot apartment and a 700-square-foot university lab, “using adaptive (sensorimotor) skill coordination (ASC).”

“The robot must navigate to a receptacle with objects, like the kitchen counter (the approximate location is provided to it), search for and pick an object, navigate to its desired place receptacle, place the object, and repeat,” according to Meta’s release.
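The loop in that quote can be sketched in a few lines of illustrative Python. The skill functions below are hypothetical stand-ins, not the Spot API or Meta’s ASC implementation; the point is only to show how high-level skill coordination chains navigation, search, picking, and placing, and moves on to the next episode when a step fails.

    import random
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Task:
        source: str   # approximate location of the receptacle holding objects
        target: str   # receptacle where the object should end up

    def navigate_to(place: str) -> bool:
        print(f"navigating to {place}")
        return True

    def search_and_pick(place: str) -> Optional[str]:
        print(f"searching for an object on {place}")
        return random.choice(["mug", "toy", None])   # None = nothing found / pick failed

    def place_object(obj: str, place: str) -> bool:
        print(f"placing {obj} on {place}")
        return True

    def rearrange(tasks: List[Task]) -> float:
        successes = 0
        for t in tasks:
            # The coordinator chains skills and simply skips an episode on failure;
            # a real system would also retry and recover from disturbances.
            if not navigate_to(t.source):
                continue
            obj = search_and_pick(t.source)
            if obj is None:
                continue
            if navigate_to(t.target) and place_object(obj, t.target):
                successes += 1
        return successes / len(tasks)

    episodes = [Task("kitchen counter", "sink"), Task("coffee table", "shelf")]
    print(f"success rate: {rearrange(episodes):.0%}")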

Meta says the experiment achieved “near perfect” results, with a success rate of over 98 percent. The Spot robot effectively overcame challenges including “hardware instabilities, picking failures, and adversarial disturbances like moving obstacles or blocked paths.”

According to the research team, the feat was achieved by having the Spot robot maneuver through a house it had not seen before, collect objects that were out of place, and return them to their correct positions. The test, the team says, allowed the robot to build an effective concept of what houses look like.