Optuna PID Tuning — Design

Date: 2026-03-10
Status: Approved

Problem

The AR4 arm's PID gains for gravity-enabled Gazebo simulation were tuned manually. Joint 5 (wrist pitch) sagged by 27 degrees before joint dynamics were injected. The current manual tuning achieves ~1.5 degrees of error, but systematic optimization should do better and can also cover trajectory tracking.

Architecture

+-----------------------------+     +----------------------------------+
|  Optuna Optimizer Container |     |  Sim Container (existing)        |
|  (standalone, no ROS)       |     |                                  |
|                             |     |  Gazebo + ros2_control + MoveIt  |
|  optuna + zenoh + pycdr2    |     |         | ROS 2 DDS              |
|                             |     |  Tuning Bridge Node (ROS+Zenoh)  |
|  Publishes:                 |     |    - reconfigures controller     |
|    ar4/tuning/gains         |<--->|    - executes trajectories       |
|    ar4/tuning/command       |     |    - reports status and errors   |
|  Subscribes:                |     |                                  |
|    ar4/joint_states         |     |  zenoh-bridge-ros2dds            |
|    ar4/tuning/status        |     |    - bridges /joint_states       |
+-----------------------------+     +----------------------------------+
            |                                    |
        Zenoh Router (eclipse/zenoh:latest)

The optimizer is fully decoupled from ROS 2: it communicates only via Zenoh. This follows the same pattern used in turtlebot-maze for the YOLO detector and SLAM containers.

Trial Sequence

Each Optuna trial runs this sequence:

  1. Optimizer samples 24 parameters (P, I, D, i_clamp x 6 joints)
  2. Publishes gains JSON to ar4/tuning/gains
  3. Publishes reconfigure to ar4/tuning/command
  4. Bridge node deactivates JTC, updates params, reactivates JTC
  5. Bridge node publishes ready on ar4/tuning/status
  6. Phase 1 (Home hold): Optimizer observes ar4/joint_states for ~10s sim time, computes steady-state error at home position (all zeros)
  7. Optimizer publishes move_to_reach to ar4/tuning/command
  8. Bridge node sends trajectory goal to reach pose via JTC action
  9. Phase 2 (Trajectory tracking): Optimizer observes joint_states for ~20s sim time, computes settling time and steady-state error at reach pose
  10. Optimizer computes cost, returns to Optuna
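The optimizer side of this sequence can be sketched as below. This is only an illustration: the transport is injected through three hypothetical callables (`publish`, `wait_for_status`, `observe`) so the ordering is visible without a Zenoh session, and the `{"command": ...}` payload shape is an assumption, since the JSON schema is not specified in this design.

```python
import json

GAINS_KEY = "ar4/tuning/gains"
COMMAND_KEY = "ar4/tuning/command"


def run_trial(gains, publish, wait_for_status, observe):
    """Drive one trial and return (home_samples, reach_samples).

    publish(key, payload), wait_for_status(state) and observe(duration_s)
    are hypothetical callables supplied by the caller; in the real
    optimizer they would wrap a Zenoh session.
    """
    # Steps 2-3: send the sampled gains, then ask the bridge to apply them.
    publish(GAINS_KEY, json.dumps(gains))
    publish(COMMAND_KEY, json.dumps({"command": "reconfigure"}))

    # Steps 4-5 happen in the bridge node; block until it reports "ready".
    wait_for_status("ready")

    # Step 6, phase 1: hold at home for ~10 s of sim time.
    home_samples = observe(10.0)

    # Steps 7-9, phase 2: command the reach pose and watch for ~20 s.
    publish(COMMAND_KEY, json.dumps({"command": "move_to_reach"}))
    reach_samples = observe(20.0)

    return home_samples, reach_samples
```

Passing the transport in as callables also lets the sequencing be checked with plain fakes before the simulation container is running.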

Optimization Parameters

Search Space (per joint, 24 total)

Param     Range         Scale
P         100 - 10000   log-uniform
I         10 - 2000     log-uniform
D         5 - 500       log-uniform
i_clamp   10 - 500      log-uniform
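A minimal sketch of how this search space might be sampled inside an Optuna objective follows. It relies only on `trial.suggest_float(name, low, high, log=True)`, which is how Optuna draws log-uniform values. The joint names and the `joint.param` naming scheme are assumptions made for illustration.

```python
# Hypothetical joint names; the real controller parameter names may differ.
JOINTS = [f"joint_{i}" for i in range(1, 7)]

# (low, high) bounds from the search-space table; all sampled log-uniformly.
SEARCH_SPACE = {
    "p": (100.0, 10000.0),
    "i": (10.0, 2000.0),
    "d": (5.0, 500.0),
    "i_clamp": (10.0, 500.0),
}


def sample_gains(trial):
    """Sample 4 parameters for each of 6 joints (24 values in total).

    `trial` is expected to be an optuna.trial.Trial, but any object with a
    compatible suggest_float method works.
    """
    gains = {}
    for joint in JOINTS:
        gains[joint] = {
            name: trial.suggest_float(f"{joint}.{name}", low, high, log=True)
            for name, (low, high) in SEARCH_SPACE.items()
        }
    return gains
```

The returned nested dictionary can be serialized directly to JSON for publication on ar4/tuning/gains.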

Objective Function

cost = w1 * sum(steady_state_error_per_joint) + w2 * max(settling_time_per_joint)
  • steady_state_error: mean absolute error over last 2s of each phase
  • settling_time: time for joint error to stay within 0.01 rad (~0.6 degrees) of target
  • Weights: w1=10 (prioritize accuracy), w2=1 (secondary: speed)
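The metrics and cost above can be sketched in plain Python as follows. The function names are illustrative, and two details are assumptions not stated in this design: a joint that never settles is charged the full phase duration, and the caller decides which phases contribute to the per-joint steady-state errors.

```python
def steady_state_error(times, errors, window=2.0):
    """Mean absolute error over the last `window` seconds of a phase."""
    t_end = times[-1]
    tail = [abs(e) for t, e in zip(times, errors) if t >= t_end - window]
    return sum(tail) / len(tail)


def settling_time(times, errors, tol=0.01):
    """Time from phase start until |error| stays within `tol` rad.

    Assumption: if the error is still outside the band at the final
    sample, the full phase duration is returned as a penalty.
    """
    last_violation = None
    for t, e in zip(times, errors):
        if abs(e) > tol:
            last_violation = t
    if last_violation is None:
        return 0.0  # inside the band for the whole phase
    for t in times:
        if t > last_violation:
            return t - times[0]  # first sample after the last violation
    return times[-1] - times[0]  # never settled


def trial_cost(sse_per_joint, settling_per_joint, w1=10.0, w2=1.0):
    """cost = w1 * sum(steady-state errors) + w2 * max(settling times)."""
    return w1 * sum(sse_per_joint) + w2 * max(settling_per_joint)
```

For example, two joints with steady-state errors of 0.005 rad and 0 rad and settling times of 2 s and 1 s give a cost of 10 * 0.005 + 1 * 2 = 2.05.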

Reach Pose

A joint configuration that exercises all joints within safe limits:

reach_pose = [0.5, 0.3, -0.4, 0.8, -0.5, 0.3]  # radians

Zenoh Interface

Keys

Key                  Direction             Format   Purpose
ar4/tuning/gains     optimizer -> bridge   JSON     PID gain vector (24 params)
ar4/tuning/command   optimizer -> bridge   JSON     Trial commands: reconfigure, move_to_reach, move_to_home
ar4/tuning/status    bridge -> optimizer   JSON     Trial state: ready, moving, settled, error
ar4/joint_states     bridge -> optimizer   CDR      Bridged from ROS /joint_states

Zenoh Storage

In-memory storage for ar4/tuning/* keys and ar4/joint_states, configured in zenoh/zenoh-storage.json5.

Components

Component                 Location                                      Dependencies                            ROS?
pid_optimizer.py          ar4_skills/tuning/pid_optimizer.py            optuna, zenoh, pycdr2, numpy            No
tuning_bridge_node.py     ar4_skills/ar4_skills/tuning_bridge_node.py   rclpy, zenoh, controller_manager_msgs   Yes
Dockerfile.tuning         docker/Dockerfile.tuning                      python:3.12-slim base                   No
zenoh-storage.json5       zenoh/zenoh-storage.json5                     -                                       No
docker-compose services   docker-compose.yaml                           -                                       -

Docker Compose Services

zenoh-router:
  image: eclipse/zenoh:latest
  network_mode: host
  command: --config /config/zenoh-storage.json5
  volumes:
    - ./zenoh:/config:ro
 
zenoh-bridge:
  extends: overlay
  command: zenoh-bridge-ros2dds -e tcp/localhost:7447 --mode client
  # requires zenoh-bridge-ros2dds installed in overlay image
 
pid-tuner:
  build:
    context: .
    dockerfile: docker/Dockerfile.tuning
  network_mode: host
  command: python pid_optimizer.py --connect tcp/localhost:7447 --n-trials 300
  volumes:
    - ./data/tuning:/data/tuning:rw

Persistence

  • Optuna study: data/tuning/optuna_study.db (SQLite, resumable)
  • Best gains: data/tuning/best_gains.yaml (written after study completes)
  • Trial logs: data/tuning/trials.jsonl (per-trial metrics for analysis)
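As an illustration of the per-trial log, the sketch below appends one JSON object per line to a JSONL file using only the standard library. The record fields (`trial`, `cost`, plus whatever metrics the caller passes) are assumptions; the actual log schema is not defined in this design.

```python
import json
from pathlib import Path


def append_trial_log(path, trial_number, cost, metrics):
    """Append one trial record as a single JSON line.

    `metrics` is a dict of extra per-trial values (hypothetical schema).
    Appending keeps earlier records intact when a study is resumed.
    """
    record = {"trial": trial_number, "cost": cost, **metrics}
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Because each line is an independent JSON document, the file can be read back line by line for analysis even if a run was interrupted partway through.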

Run Configuration

  • Trials: 300
  • Sampler: TPE (Optuna default, suited to the 24-dimensional search space)
  • Estimated time: ~30s per trial, ~2.5 hours total
  • Resumable: yes, Optuna SQLite storage allows stopping and resuming

Success Criteria

  • All joints hold home position within 0.5 degrees steady-state error
  • Trajectory to reach pose settles within 5 seconds sim time
  • No oscillation (velocity stays below 0.1 rad/s after settling)
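These three criteria could be checked against the best trial with a small helper like the one below. It is a sketch under stated assumptions: home errors are given in radians per joint, the settling time is the worst case across joints, and the velocity samples are those recorded after settling. The function and argument names are hypothetical.

```python
import math


def meets_success_criteria(home_errors_rad, reach_settling_time_s,
                           post_settle_velocities):
    """Return True if all three success criteria hold.

    home_errors_rad: steady-state error per joint at home, in radians.
    reach_settling_time_s: worst-case settling time to the reach pose.
    post_settle_velocities: joint velocities (rad/s) sampled after settling.
    """
    home_ok = all(math.degrees(abs(e)) <= 0.5 for e in home_errors_rad)
    settle_ok = reach_settling_time_s <= 5.0
    no_oscillation = all(abs(v) < 0.1 for v in post_settle_velocities)
    return home_ok and settle_ok and no_oscillation
```

Note that 0.5 degrees is about 0.0087 rad, which is slightly tighter than the 0.01 rad settling band used in the objective.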
