ABSTRACT
While soft robots offer intrinsic compliance and safety, controlling them poses fundamental challenges due to high dimensionality, strongly nonlinear and time-varying dynamics, and partial observability. Many learning-based controllers tackle these challenges with complex architectures, yet often suffer from multi-solution ambiguity, distribution shift, and compounding error over long horizons. We pursue a deliberately minimal alternative. We learn a simple dynamics model for compliant motion directly from experimental data, and embed it into a novel iterative Model Predictive Path Integral controller (I-MPPI). In each control cycle, I-MPPI performs multiple short-horizon, uncertainty-aware sampling updates that progressively refine the control sequence via importance weighting, mitigating model bias and long-horizon error accumulation while keeping computation lightweight. This simple two-component design, which pairs model learning with iterative sampling-based MPC, requires few assumptions, avoids large policy networks and differentiable simulators, and is easy to implement and tune. Experiments on a multi-segment soft robot show real-time, robust, and high-precision trajectory tracking under disturbances and parameter variations, highlighting the practicality and deployability of minimal learning-based control for soft-robotic systems.
METHOD
The proposed I-MPPI architecture
The overall architecture consists of a learned simple model and an iterative MPPI controller. In each control cycle, multiple short-horizon sampling and weighted updates are performed, and only the first action is executed. This iterative sampling progressively reduces the exploration range, improving stability and accuracy.
Simple model
The model is a compact one-step forward predictor (MLP), trained directly from robot interaction data. The dataset includes steady-state responses, random excitation, and trajectory tracking runs to ensure sufficient coverage of the operating envelope.
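As a rough illustration of what a compact one-step forward predictor can look like, the sketch below implements the forward pass of a small MLP in NumPy. It is not the authors' network: the state dimension (a 3D tip position), the hidden width, the tanh activations, and the residual form (predicting a change in state rather than the next state directly) are all assumptions made for the example. Only the nine-dimensional action matches the nine pneumatic units of the platform. The weights are random placeholders; in practice they would be fitted to the recorded interaction data.

```python
import numpy as np

STATE_DIM = 3    # assumed: tip position (x, y, z) from optical tracking
ACTION_DIM = 9   # nine pneumatic actuation units
HIDDEN = 64      # assumed hidden-layer width

def init_params(rng, state_dim=STATE_DIM, action_dim=ACTION_DIM, hidden=HIDDEN):
    """Create placeholder (untrained) weights for a two-hidden-layer MLP."""
    sizes = [state_dim + action_dim, hidden, hidden, state_dim]
    return [(rng.normal(0.0, np.sqrt(1.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def predict_next(params, state, action):
    """One-step prediction: next_state = state + MLP([state, action])."""
    h = np.concatenate([state, action], axis=-1)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return state + h @ W + b  # residual (delta) form is an assumption

rng = np.random.default_rng(0)
params = init_params(rng)
states = np.zeros((5, STATE_DIM))                       # batch of 5 states
actions = rng.uniform(0.0, 1.0, size=(5, ACTION_DIM))   # batch of 5 actions
next_states = predict_next(params, states, actions)
print(next_states.shape)  # (5, 3)
```

Because the function operates on the last axis, the same code handles a single state-action pair or a whole batch of sampled rollouts, which is convenient for a sampling-based controller.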
Iterative MPPI
The controller builds upon MPPI, but performs multiple refinement iterations per cycle and applies a warm-start strategy to reduce variance and accelerate convergence.
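The idea of several refinement iterations per cycle with a warm start can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' controller: the function names (`rollout_cost`, `i_mppi_plan`), the quadratic tracking cost, the fixed noise scale, the temperature `lam`, and the toy linear dynamics are all hypothetical choices made so the example runs on its own. A learned one-step model would take the place of `toy_dynamics`.

```python
import numpy as np

def rollout_cost(dynamics, x0, U, target):
    """Summed squared tracking error for each sampled control sequence.
    U has shape (K, H, m); the returned costs have shape (K,)."""
    x = np.repeat(x0[None, :], U.shape[0], axis=0)
    cost = np.zeros(U.shape[0])
    for t in range(U.shape[1]):
        x = dynamics(x, U[:, t, :])
        cost += np.sum((x - target) ** 2, axis=-1)
    return cost

def i_mppi_plan(dynamics, x0, u_prev, target, rng,
                n_iters=3, n_samples=256, sigma=0.3, lam=1.0):
    """Refine a control sequence with several MPPI updates in one cycle."""
    # Warm start: shift the previous plan by one step, repeat its last action.
    u_nom = np.concatenate([u_prev[1:], u_prev[-1:]], axis=0)
    for _ in range(n_iters):
        noise = rng.normal(0.0, sigma, size=(n_samples,) + u_nom.shape)
        costs = rollout_cost(dynamics, x0, u_nom[None] + noise, target)
        # Importance weights: softmax of negative cost (min subtracted for stability).
        w = np.exp(-(costs - costs.min()) / lam)
        w /= w.sum()
        # Weighted average of the perturbations updates the nominal plan.
        u_nom = u_nom + np.tensordot(w, noise, axes=1)
    return u_nom

def toy_dynamics(x, u):
    """Stand-in for the learned model (assumed linear for illustration)."""
    return x + 0.1 * u

rng = np.random.default_rng(0)
x = np.zeros(2)
target = np.array([1.0, -0.5])
plan = np.zeros((10, 2))  # horizon of 10 steps, 2 control inputs
for _ in range(50):
    plan = i_mppi_plan(toy_dynamics, x, plan, target, rng)
    x = toy_dynamics(x[None], plan[:1])[0]  # execute only the first action
print(np.linalg.norm(x - target))
```

Reusing the shifted plan means each cycle starts from a sequence that was already refined, so a small number of iterations and samples per cycle can be enough to keep improving it.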
Iterative annealing
In each iteration, the sampling noise scale is gradually reduced, transitioning from coarse exploration to local fine-tuning. This annealing mechanism reduces jitter and improves steady-state precision.
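One simple way to realize such a schedule is a geometric decay of the noise scale across the refinement iterations of a cycle. The sketch below assumes this geometric form and uses made-up start and end values; the source does not specify the schedule actually used, and `annealed_sigmas` is a hypothetical helper name.

```python
def annealed_sigmas(sigma_start, sigma_end, n_iters):
    """Geometrically interpolate the noise scale from sigma_start to sigma_end.
    The geometric form is an assumption, not taken from the source."""
    if n_iters == 1:
        return [sigma_start]
    ratio = (sigma_end / sigma_start) ** (1.0 / (n_iters - 1))
    return [sigma_start * ratio ** k for k in range(n_iters)]

sigmas = annealed_sigmas(0.4, 0.05, 4)
print([round(s, 3) for s in sigmas])  # [0.4, 0.2, 0.1, 0.05]
```

Refinement iteration k would then draw its control perturbations with standard deviation `sigmas[k]`, so early iterations explore broadly and later ones make small local corrections.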
Hardware of soft robot
The experimental platform is a multi-segment rigid–flexible soft robot, equipped with nine pneumatic actuation units and corresponding drivers, as well as an optical tracking system for real-time feedback.
EXPERIMENTS ON REAL ROBOTS
Trajectory tracking (1×)
We evaluated circular, square, and figure-eight reference paths, comparing each reference trajectory with the trajectory the robot actually executed. The method achieves millimeter-level accuracy on all three paths, with the best performance on smooth references.
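For readers who want to reproduce this kind of comparison, the sketch below generates two of the reference shapes and computes a root-mean-square tracking error. The path parametrizations, the 50 mm radius, and the RMSE metric are assumptions chosen for illustration; the measured trajectory here is synthetic (the reference shifted by 1 mm) and is not experimental data.

```python
import numpy as np

def circle_path(radius, n):
    """Planar circle sampled at n evenly spaced points."""
    s = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([radius * np.cos(s), radius * np.sin(s)], axis=1)

def figure_eight_path(radius, n):
    """Lissajous curve with a 1:2 frequency ratio (one common figure-eight form)."""
    s = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([radius * np.sin(s), radius * np.sin(2.0 * s) / 2.0], axis=1)

def tracking_rmse(reference, measured):
    """Root-mean-square Euclidean distance between matched path samples."""
    return float(np.sqrt(np.mean(np.sum((reference - measured) ** 2, axis=1))))

ref = circle_path(50.0, 200)             # hypothetical 50 mm radius
measured = ref + np.array([1.0, 0.0])    # synthetic 1 mm offset, not real data
print(tracking_rmse(ref, measured))      # approximately 1.0
```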
Moving target tracking (2×)
Eight sequential targets were placed near the workspace boundaries. The robot converged to each target quickly, with average errors on the order of a few millimeters. Errors were larger for diagonal targets, consistent with the stronger coupling and larger actuation ranges these targets require; otherwise, performance followed the trends expected from the model.
VIDEO
The complete video demonstrates the deployment of the method on multi-segment soft robots, including model training, the I-MPPI control process, and real-world task execution.