Inside Nvidia's GEAR Lab, Robots Now Train Themselves While AI Coding Agents Run the Whole Experiment

Eight robot arms at Nvidia's GEAR lab taught themselves to insert pins, seat graphics cards and cut zip ties using AI coding agents, hitting a 99% success rate across four real-world tasks.

When the Robots Run Their Own Experiments

At Nvidia's GEAR lab, a fleet of eight robot arms spent the past few weeks figuring out, almost entirely on their own, how to insert pins, seat graphics cards and cut zip ties. Once the initial setup was done, the only humans in the picture were the ones who sat down afterward to write the paper.

That capability came from ENPIRE, a framework laid out in a paper published Tuesday by researchers at Nvidia, Carnegie Mellon University and UC Berkeley. ENPIRE hands the entire job of training a robot over to AI coding agents, the same software that already writes and tests its own code, and then lets that process run directly on physical hardware.

Dragging the Loop Off the Screen

Coding agents such as OpenAI's Codex, Anthropic's Claude Code and Moonshot's Kimi Code have spent the past year doing what researchers call autoresearch: writing code, testing it and rewriting it without a person in the loop. Until now that loop has mostly lived on a screen, where restarting a failed experiment costs almost nothing. ENPIRE pulls it into the physical world, where resetting an experiment means physically moving a real robot arm.

How ENPIRE Splits the Job in Two

The system breaks the work into two stages. In the first, a human walks the agent through building two permanent tools. One is a reset routine that returns the workspace to a fresh starting position, and the other is a reward function that watches camera footage and scores how well a task went, essentially a referee that never blinks and never breaks for lunch. That groundwork happens just once and is then reused for every attempt that follows.
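
To make the two tools concrete, here is a minimal Python sketch of how a reset routine and a reward function might plug into a single scored attempt. Every name in it (`TaskTools`, `run_attempt`, the toy `Frame` type, the 1-millimeter threshold) is a hypothetical stand-in rather than ENPIRE's actual code, and the "camera frames" are just made-up distance readings.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a camera frame: here, just the measured
# distance (in millimeters) between the part and its target.
Frame = float


@dataclass
class TaskTools:
    """The two reusable tools that are built once, during setup."""
    reset: Callable[[], None]               # returns the workspace to its start state
    reward: Callable[[List[Frame]], float]  # scores an attempt from camera footage


def run_attempt(tools: TaskTools, policy: Callable[[], List[Frame]]) -> float:
    """Reset the workspace, run one attempt, and return its score."""
    tools.reset()
    frames = policy()
    return tools.reward(frames)


# --- Toy usage with made-up numbers ---
reset_log: List[str] = []


def toy_reset() -> None:
    # A real reset would move the arm and parts; this one only logs the call.
    reset_log.append("reset")


def toy_reward(frames: List[Frame]) -> float:
    # Assumed rule: success if the final frame shows the part within 1 mm.
    return 1.0 if frames and frames[-1] <= 1.0 else 0.0


tools = TaskTools(reset=toy_reset, reward=toy_reward)
good_score = run_attempt(tools, lambda: [12.0, 5.0, 0.4])
bad_score = run_attempt(tools, lambda: [12.0, 9.0, 6.5])
print(good_score, bad_score, len(reset_log))
```

The point of the split is that `reset` and `reward` stay fixed while the policy passed to `run_attempt` keeps changing, which is what lets the later stage run without supervision.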

Once those tools exist, the agent takes over completely. It digs through published research for ideas, chooses among training methods like imitation learning, reinforcement learning or hand-written rules, rewrites its own code and tests the outcome on the robot. None of that requires a person to watch, which feels either liberating or faintly unsettling depending on how you feel about a robot holding a pair of scissors unsupervised.
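
The unattended stage can be pictured as a simple keep-the-best loop. The Python sketch below is an illustration under stated assumptions, not the paper's algorithm: `autoresearch`, `success_rate` and `make_stub_trial` are hypothetical names, the trial outcomes are stubs rather than results from the paper, and the 0.99 target simply mirrors the success rate quoted in this article.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A candidate is a label plus a function that runs one trial on the robot
# and reports whether it succeeded. Both are hypothetical stand-ins.
Candidate = Tuple[str, Callable[[], bool]]


def success_rate(trial: Callable[[], bool], n_trials: int) -> float:
    """Run the trial n_trials times and return the fraction that succeeded."""
    successes = sum(1 for _ in range(n_trials) if trial())
    return successes / n_trials


def autoresearch(
    candidates: Iterable[Candidate],
    n_trials: int = 20,
    target: float = 0.99,
) -> Tuple[Optional[str], float]:
    """Test candidates in order, keep the best, stop once the target is met."""
    best_label: Optional[str] = None
    best_rate = -1.0
    for label, trial in candidates:
        rate = success_rate(trial, n_trials)
        if rate > best_rate:
            best_label, best_rate = label, rate
        if best_rate >= target:
            break
    return best_label, best_rate


# --- Toy usage: stub outcomes, not measurements ---
tested: List[str] = []


def make_stub_trial(label: str, outcome: bool) -> Callable[[], bool]:
    def trial() -> bool:
        if label not in tested:
            tested.append(label)  # record which candidates were actually run
        return outcome
    return trial


candidates = [
    ("hand-written rules", make_stub_trial("hand-written rules", False)),
    ("imitation learning", make_stub_trial("imitation learning", True)),
    ("reinforcement learning", make_stub_trial("reinforcement learning", True)),
]
best_label, best_rate = autoresearch(candidates)
print(best_label, best_rate, tested)
```

In this toy run the loop stops as soon as one candidate clears the target, so the third method is never tried. A real agent would also generate new candidates as it goes instead of working through a fixed list.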

A Fleet That Shares What It Learns

Nvidia ran the experiment across eight bimanual robot stations, each with its own hardware, computer and coding agent. The stations swap progress through Git, the same tool coders use to merge code, so a winning idea spreads across the whole fleet within minutes.
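
The article does not describe the exact Git workflow, so the following Python sketch is only one plausible shape for it. It assumes each station works in a local clone of a shared repository. The helper names `share_commands` and `publish` and the commit-message format are hypothetical, while the Git subcommands themselves are standard.

```python
import subprocess
from typing import List


def share_commands(station_id: int, summary: str) -> List[List[str]]:
    """Build the Git commands one station would run to publish an improvement."""
    message = f"station {station_id}: {summary}"
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        ["git", "pull", "--rebase"],  # pick up other stations' commits first
        ["git", "push"],
    ]


def publish(station_id: int, summary: str, repo_dir: str) -> None:
    """Run the commands in a local clone, stopping at the first failure."""
    for cmd in share_commands(station_id, summary):
        subprocess.run(cmd, cwd=repo_dir, check=True)


# Only build and inspect the commands here; calling publish() needs a real clone.
commands = share_commands(3, "pin insertion policy update")
print(commands[1])
```

Committing before pulling keeps a station's local changes safe while it rebases onto whatever the other stations have already pushed.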

The Numbers Behind the Speed-Up

Researchers measured the payoff on two tasks. The first was “Push-T,” where a robot slides a T-shaped block into a target zone using only pushes, and the second was pin insertion, where it threads pins into 4-millimeter holes. Going from one robot to eight cut the time to master Push-T from roughly five hours down to two, and pin insertion from more than 90 minutes to about 40.
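
Those figures translate into speed-up factors with a little arithmetic. The Python sketch below uses the rounded numbers quoted above (five hours, two hours, 90 minutes, 40 minutes), so the results are rough estimates rather than values from the paper. The function names are illustrative, and "parallel efficiency" here just means the speed-up divided by the number of robots.

```python
def speedup(time_one_robot: float, time_fleet: float) -> float:
    """How many times faster the fleet finished than a single robot."""
    return time_one_robot / time_fleet


def parallel_efficiency(speedup_factor: float, n_robots: int) -> float:
    """Share of the ideal n-fold speed-up that was actually achieved."""
    return speedup_factor / n_robots


N_ROBOTS = 8

# Times in minutes, taken from the article's rounded figures.
push_t = speedup(5 * 60, 2 * 60)
pin_insertion = speedup(90, 40)  # "more than 90" minutes, so this is a floor

print(f"Push-T: {push_t:.2f}x, efficiency {parallel_efficiency(push_t, N_ROBOTS):.0%}")
print(f"Pin insertion: {pin_insertion:.2f}x, "
      f"efficiency {parallel_efficiency(pin_insertion, N_ROBOTS):.0%}")
```

On these numbers, eight robots deliver roughly a 2.5x and 2.25x speed-up, well short of 8x, which suggests the stations overlap in what they try rather than splitting the work cleanly.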

Across the four real-world tasks tested, the agents drove their policies to a 99% success rate, according to the paper. On pin insertion, the agents reached near-perfect reliability faster than a comparable human-in-the-loop method, the kind that still needs someone to show up every morning.

What the Researchers Say

Nvidia's Jim Fan, the GEAR Lab co-lead who directs the company's AI research, described the project as an effort to bring AutoResearch into the physical world for the first time. Fan said the team handed the agents a fleet of robots, a GPU allocation and a token budget, then stepped back and let the agents take over. On June 16, 2026, he wrote:

Today, we enable AutoResearch in the physical world for the first time! Introducing ENPIRE: we give 8 Codex agents a fleet of robots, an allocation of GPUs, and generous token budget. We set them free with a simple goal: solve the task as quickly as possible, keep the robots busy…

Where Simulation Stops and Reality Begins

The gap between simulation and reality showed up almost immediately. According to the paper, all three coding agents solved Push-T inside a simulator, yet two of the three failed once the same task moved onto a physical robot. Simulators do not have friction problems. Real tables do.

Nvidia also put ENPIRE through RoboCasa, a simulated kitchen benchmark that grades robots on chores like opening cabinets or turning off stoves by success rate, mercifully with no risk of burning the place down. There, ENPIRE beat both Nvidia's own end-to-end model GR00T and CaP-X, a tool-using agent that skips the autoresearch loop entirely.

From Eureka to Real Hardware

ENPIRE builds on an idea Nvidia first floated with Eureka, a 2023 system that used a language model to write reward functions for robots inside a simulator instead of relying on human engineers to do it by hand. ENPIRE takes that self-improvement loop off the simulator and onto real hardware, with the agent designing its own tests rather than just its own rewards.

A Race Taking Shape Across the Industry

The release arrived the same week Alibaba unveiled its own embodied-AI push, the Qwen-Robot Suite, a trio of foundation models for robot navigation, manipulation and physics simulation. Alibaba is building software brains for robot bodies it does not manufacture, while Nvidia is testing whether agents can run the entire research loop on hardware it owns end to end. Both point to the same trend, that physical robots are fast becoming the next arena for coding agents to compete in.

Questions & Answers

What is ENPIRE and who built it?
ENPIRE is a framework that hands the entire job of training a robot over to AI coding agents. It was developed by researchers at Nvidia, Carnegie Mellon University and UC Berkeley.
What did the robots learn and how successful were they?
Eight robot arms learned to insert pins, seat graphics cards and cut zip ties, reaching a 99% success rate across four real-world tasks.
How much did scaling from one robot to eight help?
Mastering Push-T dropped from roughly five hours to two, and pin insertion fell from more than 90 minutes to about 40.
What difference showed up between simulation and real robots?
All three coding agents solved Push-T in a simulator, but two of the three failed once the same task moved onto a physical robot, where real-world effects such as table friction come into play.