New Gradient-Based Planner 'GRASP' Overcomes Long-Horizon Challenges in World Models

Urgent: Researchers Unveil GRASP, a Breakthrough for Long-Horizon Planning in Learned Dynamics

In a major advance for artificial intelligence, a team of researchers has introduced GRASP, a gradient-based planner designed to make long-horizon planning in learned world models practical for the first time. The planner tackles the fragility that has plagued optimization over extended time horizons in high-dimensional latent spaces.

Source: bair.berkeley.edu

"Planning over many steps with modern world models has been surprisingly fragile, with optimization becoming ill-conditioned and prone to bad local minima," said Amir Bar, co-author of the study. "GRASP fundamentally changes this by restructuring how gradients flow through the model."

Background: The World Model Challenge

Learned world models have grown increasingly powerful, capable of predicting long sequences of future observations in high-dimensional visual spaces. These models generalize across tasks, resembling general-purpose simulators rather than task-specific predictors.

However, using these models for control and planning over long horizons has remained a critical bottleneck. Traditional gradient-based methods suffer from ill-conditioned gradients, non-convex optimization landscapes, and subtle failure modes unique to high-dimensional latent representations.
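One way to get a feel for the ill-conditioning is a toy calculation. The sketch below assumes a scalar linear system, s_{t+1} = a * s_t + u_t, which is an illustrative stand-in and not a model from the study; the function names are hypothetical. It computes how strongly the final state depends on each action when gradients are passed back through the whole rollout.

```python
def action_sensitivities(a: float, horizon: int) -> list[float]:
    """Return d s_T / d u_t for t = 0..horizon-1 under s_{t+1} = a * s_t + u_t."""
    # u_t enters s_{t+1} with weight 1 and is then multiplied by `a`
    # once for every remaining step of the rollout.
    return [a ** (horizon - 1 - t) for t in range(horizon)]


def sensitivity_spread(a: float, horizon: int) -> float:
    """Ratio of the largest to the smallest per-action sensitivity."""
    sens = [abs(g) for g in action_sensitivities(a, horizon)]
    return max(sens) / min(sens)


if __name__ == "__main__":
    # Toy values only: contracting, neutral, and expanding dynamics.
    for a in (0.8, 1.0, 1.2):
        print(a, sensitivity_spread(a, horizon=50))
```

In this toy setting the spread is exactly 1 only when a = 1; for any other value it grows exponentially with the horizon, so early and late actions receive gradients of wildly different scale. Real latent world models are nonlinear and high-dimensional, but a similar compounding through repeated model applications is one plausible source of the fragility described above.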

The research team, including Mike Rabbat, Aditi Krishnapriyan, Yann LeCun, and Amir Bar, developed GRASP to address these fundamental issues.

How GRASP Works: Three Key Innovations

GRASP introduces three core techniques that together enable robust long-horizon planning. First, it lifts the trajectory into a set of virtual states, allowing optimization to be parallelized across time steps.

Second, the planner adds stochasticity directly to the state iterates during optimization, enhancing exploration and helping to escape bad local minima. Third, it reshapes gradients so that action signals remain clean while avoiding brittle "state-input" gradients through high-dimensional vision models.
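The three ideas can be sketched on a toy problem. Everything in the block below is an assumption made for illustration: the one-dimensional `toy_model`, the dense goal term on each prediction, the linearly decaying noise schedule, and all names and hyperparameters are hypothetical and are not taken from the GRASP code or paper. The sketch only shows one possible reading of the description: states are free variables, noise is added to them during optimization, and no gradient is ever taken through the model's state input.

```python
import math
import random


def toy_model(s: float, u: float) -> float:
    """Stand-in for a learned world model (hypothetical): next state from state and action."""
    return s - 0.5 * math.sin(s) + u


def rollout(s0: float, actions: list[float]) -> float:
    """Apply the model sequentially; used here only to check the final plan."""
    s = s0
    for u in actions:
        s = toy_model(s, u)
    return s


def plan_lifted(s0, goal, horizon=10, iters=2000, noise_iters=500,
                lr=0.1, sigma=0.05, goal_weight=1.0, seed=0):
    rng = random.Random(seed)
    # states[0] is fixed; states[1:] are free "virtual" states.
    states = [s0] * (horizon + 1)
    actions = [0.0] * horizon

    for k in range(iters):
        # Each prediction needs only (states[t], actions[t]), so these model
        # calls are independent across time and could run as a single batch.
        preds = [toy_model(states[t], actions[t]) for t in range(horizon)]
        # Assumed schedule: noise scale decays linearly to zero.
        scale = sigma * max(0.0, 1.0 - k / noise_iters)

        for t in range(horizon):
            resid = states[t + 1] - preds[t]
            # Action gradient: consistency term plus an assumed dense goal term.
            # d(pred)/d(action) = 1 for this toy model.
            grad_u = -2.0 * resid + 2.0 * goal_weight * (preds[t] - goal)
            # A virtual state gets gradient only as a prediction target.
            # Its role as model input is treated as a constant (stop-gradient),
            # so d(model)/d(state) is never computed.
            grad_s = 2.0 * resid
            actions[t] -= lr * grad_u
            states[t + 1] -= lr * grad_s
            # Stochasticity is added directly to the state iterate.
            states[t + 1] += scale * rng.gauss(0.0, 1.0)
    return actions, states


if __name__ == "__main__":
    plan, virtual_states = plan_lifted(s0=0.0, goal=1.0)
    print("final state under a true rollout:", rollout(0.0, plan))
```

In this toy, a sequential rollout of the returned actions should end very close to the goal. The example is far simpler than planning in a visual latent space, where skipping the gradient through the state input would matter most, so it should be read as an illustration of the structure rather than a reproduction of the method.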

"With GRASP, the optimization landscape becomes much more navigable," the team explained. "We effectively bypass the fragile pathways that made previous approaches unreliable."

What This Means for AI Planning

This breakthrough has immediate implications for robotics, autonomous systems, and any domain requiring long-term predictive control. GRASP could enable more reliable planning in tasks such as manipulation, navigation, and video game playing.

By making gradient-based planning practical for longer horizons, the work narrows the gap between powerful learned simulators and their effective deployment in real-world decision-making. The researchers believe this will accelerate progress toward general-purpose agents capable of complex, long-term reasoning.

Further details and implementation notes are expected in an upcoming publication. The team has made the code available for academic use. For more information, see the original research blog.
