I-MPPI for Multi-Segment Soft Robots:
A Minimal Model-Learning Approach

ABSTRACT

While soft robots offer intrinsic compliance and safety, controlling them poses fundamental challenges due to high dimensionality, strong nonlinear and time-varying dynamics, and partial observability. Many learning-based controllers tackle these challenges with complex architectures, yet often suffer from multi-solution ambiguity, distribution shift, and compounding error over long horizons. We pursue a deliberately minimal alternative. We learn a simple dynamics model directly from experimental data for compliant motion, and embed it into a novel iterative Model Predictive Path Integral controller (I-MPPI). In each control cycle, I-MPPI performs multiple short-horizon, uncertainty-aware sampling updates that progressively refine the control sequence via importance weighting, mitigating model bias and long-horizon error accumulation while keeping computation lightweight. This simple two-component design, model learning combined with iterative sampling-based MPC, requires few assumptions, avoids large policy networks or differentiable simulators, and is easy to implement and tune. Experiments on a multi-segment soft robot show real-time, robust, and high-precision trajectory tracking under disturbances and parameter variations, highlighting the practicality and deployability of minimal learning-based control for soft-robotic systems.

METHOD

The proposed I-MPPI architecture

The overall architecture consists of a simple learned model and an iterative MPPI controller. In each control cycle, the controller performs several short-horizon sampling passes with importance-weighted updates, and only the first action of the resulting sequence is executed. The exploration range shrinks from one pass to the next, which improves stability and accuracy.

Simple model

The model is a compact multilayer perceptron (MLP) that acts as a one-step forward predictor and is trained directly on robot interaction data. The dataset includes steady-state responses, random excitation, and trajectory tracking runs to ensure sufficient coverage of the operating envelope.
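
As a rough illustration of what such a one-step predictor can look like, the NumPy sketch below maps the current state and action to the next state. The layer sizes, the tanh activation, the residual form (predicting a state change rather than the state itself), and the 3-D state are assumptions made for this example and are not taken from the original work; only the nine action channels follow the hardware description. The weights are random, so the output is meaningless until the network has been fitted to data.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Randomly initialise (weight, bias) pairs for a small fully connected network."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def predict_next_state(params, state, action):
    """One-step forward prediction: x_next = x + MLP([x, u])."""
    h = np.concatenate([state, action], axis=-1)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)          # hidden layers
    W, b = params[-1]
    return state + h @ W + b            # residual output (assumed design choice)

STATE_DIM = 3    # hypothetical: 3-D tip position
ACTION_DIM = 9   # nine pneumatic actuation units
params = init_mlp([STATE_DIM + ACTION_DIM, 64, 64, STATE_DIM])

x_next = predict_next_state(params, np.zeros(STATE_DIM), np.full(ACTION_DIM, 0.5))
batch_next = predict_next_state(params, np.zeros((5, STATE_DIM)), np.zeros((5, ACTION_DIM)))
```

The function accepts both a single state-action pair and a batch, which is convenient when many sampled rollouts have to be propagated in parallel.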

Iterative MPPI

The controller builds upon MPPI, but performs multiple refinement iterations per cycle and applies a warm-start strategy to reduce variance and accelerate convergence.
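
The refinement-plus-warm-start idea can be sketched as follows. This is a generic MPPI-style update written for illustration, not the authors' implementation: the function names, the toy integrator dynamics, the quadratic cost, and all numeric settings (horizon, sample count, temperature `lam`, noise scale `sigma`) are assumptions. The noise scale is held fixed here to keep the example short.

```python
import numpy as np

def mppi_refine(u_seq, x0, dynamics, cost, sigma, lam, n_samples, rng):
    """One sampling pass: perturb the nominal sequence and re-weight by rollout cost."""
    noise = rng.normal(0.0, sigma, size=(n_samples,) + u_seq.shape)  # (K, H, m)
    candidates = u_seq[None] + noise
    x = np.repeat(x0[None], n_samples, axis=0)                       # (K, n)
    total = np.zeros(n_samples)
    for t in range(u_seq.shape[0]):                                  # short-horizon rollout
        x = dynamics(x, candidates[:, t])
        total += cost(x)
    weights = np.exp(-(total - total.min()) / lam)                   # importance weights
    weights /= weights.sum()
    return u_seq + np.einsum("k,khm->hm", weights, noise)

def control_cycle(u_seq, x0, dynamics, cost, rng,
                  n_iters=3, sigma=0.5, lam=0.05, n_samples=256):
    """Several refinement passes, then return the first action and a warm start."""
    for _ in range(n_iters):
        u_seq = mppi_refine(u_seq, x0, dynamics, cost, sigma, lam, n_samples, rng)
    action = u_seq[0]
    # Warm start: shift the plan by one step and repeat the last action.
    warm_start = np.concatenate([u_seq[1:], u_seq[-1:]], axis=0)
    return action, warm_start

# Toy closed loop (assumed): a 2-D integrator stands in for the learned model.
rng = np.random.default_rng(0)
target = np.array([1.0, -0.5])
dynamics = lambda x, u: x + 0.1 * u
cost = lambda x: np.sum((x - target) ** 2, axis=-1)

x = np.zeros(2)
u_seq = np.zeros((5, 2))            # horizon of 5 steps, 2 inputs
for _ in range(80):
    action, u_seq = control_cycle(u_seq, x, dynamics, cost, rng)
    x = dynamics(x, action)         # only the first action is executed
```

In this sketch the warm start reuses the previous plan shifted by one step, so each cycle begins close to a good solution and needs fewer samples to improve on it.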

Iterative annealing

In each iteration, the sampling noise scale is gradually reduced, transitioning from coarse exploration to local fine-tuning. This annealing mechanism reduces jitter and improves steady-state precision.
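
One simple way to realise such a schedule is a geometric decay of the noise scale across the iterations of a cycle. The sketch below is an assumption about the form of the schedule (the source does not specify it), and the 1-D cost, the helper names, and the numeric values are purely illustrative.

```python
import numpy as np

def annealed_sigmas(sigma_start, sigma_end, n_iters):
    """Noise scales that decay geometrically from sigma_start to sigma_end."""
    if n_iters == 1:
        return np.array([sigma_start])
    exponents = np.arange(n_iters) / (n_iters - 1)
    return sigma_start * (sigma_end / sigma_start) ** exponents

def refine(u, sigma, cost, rng, n_samples=128, lam=0.05):
    """One importance-weighted sampling update of a scalar control."""
    eps = rng.normal(0.0, sigma, size=n_samples)
    c = cost(u + eps)
    w = np.exp(-(c - c.min()) / lam)
    w /= w.sum()
    return u + w @ eps

schedule = annealed_sigmas(0.4, 0.05, 4)     # approximately [0.4, 0.2, 0.1, 0.05]

# Toy use: coarse exploration first, then local fine-tuning around u = 2.
rng = np.random.default_rng(0)
cost = lambda u: (u - 2.0) ** 2
u = 0.0
for sigma in annealed_sigmas(1.0, 0.05, 6):
    u = refine(u, sigma, cost, rng)
```

Early iterations with a large noise scale move the estimate quickly toward the low-cost region, while the later, narrower iterations reduce the sampling jitter that a fixed large noise scale would leave in the final command.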

Soft robot hardware

The experimental platform is a multi-segment rigid–flexible soft robot, equipped with nine pneumatic actuation units and corresponding drivers, as well as an optical tracking system for real-time feedback.

EXPERIMENTS ON REAL ROBOTS

Trajectory tracking (1×)

We evaluated circular, square, and figure-eight reference paths, comparing each reference trajectory with the trajectory the robot actually executed. The method achieves millimeter-level accuracy on all three paths, with the best performance on the smooth references.
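
For readers who want to reproduce this kind of evaluation, the sketch below generates the three reference shapes and computes a root-mean-square tracking error. The planar formulation, the path sizes, the sample counts, and the choice of RMSE as the metric are assumptions for illustration; the source does not state how the references were parameterised or which error metric was reported.

```python
import numpy as np

def circle(n, radius):
    """n points on a circle of the given radius (same length unit as the radius)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([radius * np.cos(t), radius * np.sin(t)], axis=1)

def figure_eight(n, radius):
    """n points on a figure-eight (lemniscate of Gerono)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([radius * np.sin(t), radius * np.sin(t) * np.cos(t)], axis=1)

def square(n, half_side):
    """n points along the perimeter of an axis-aligned square."""
    corners = half_side * np.array([[1, 1], [-1, 1], [-1, -1], [1, -1], [1, 1]], dtype=float)
    s = np.linspace(0.0, 4.0, n, endpoint=False)   # position along the four edges
    idx = s.astype(int)
    frac = (s - idx)[:, None]
    return corners[idx] * (1.0 - frac) + corners[idx + 1] * frac

def rmse(reference, actual):
    """Root-mean-square Euclidean distance between matched path samples."""
    return float(np.sqrt(np.mean(np.sum((reference - actual) ** 2, axis=1))))

ref = circle(200, radius=30.0)                     # hypothetical 30 mm circle
measured = ref + np.array([3.0, 4.0])              # synthetic constant 5 mm offset
error_mm = rmse(ref, measured)
```

The synthetic offset only exercises the metric; in practice `measured` would be the tip positions logged by the tracking system, resampled to the same time stamps as the reference.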

Moving target tracking (2×)

Eight sequential targets were placed near the workspace boundaries. The robot converged to each target quickly, with average errors on the order of a few millimeters. Errors were larger for diagonal targets, consistent with stronger coupling and larger actuation ranges, but overall performance followed expected modeling trends.

VIDEO

The complete video demonstrates the deployment of the method on multi-segment soft robots, including model training, the I-MPPI control process, and real-world task execution.