Architecture

AR4 Physical-AI is a VLA (Vision-Language-Action) platform layered on top of the AR4 ROS driver and LeRobot. The key architectural decisions:

  1. LeRobot-native — uses lerobot-ros as the bridge between ROS 2 and LeRobot for recording, training, and inference (see Research: LeRobot Integration)
  2. Submodule for driver — upstream ar4_ros_driver stays independently updateable
  3. Docker-first — multi-stage GPU containers (base → overlay → dev) with docker-compose orchestration
  4. Simulation-first — physics-enabled Gazebo world with gravity, contact properties, and graspable objects for policy development
  5. Zenoh middleware — decouples non-ROS components from DDS for future inference pipelines
  6. LeRobot dataset format — episodes recorded via lerobot-record, stored in LeRobot v3.0 format

System Overview

LeRobot Integration

The platform uses lerobot-ros, written by the same author as the AR4 ROS driver. It provides a LeRobot Robot plugin (AnninAR4) that bridges ROS 2 topics to LeRobot's recording, training, and inference pipeline.

See Research: LeRobot Integration for the full investigation and rationale.

Data flow

  1. Recording: lerobot-record calls ROS2Robot.get_observation() (subscribes to /joint_states) and ROS2Robot.send_action() (publishes to MoveIt Servo or trajectory controller). Episodes are stored in LeRobot v3.0 dataset format.
  2. Training: lerobot-train trains an ACT, Diffusion Policy, or VLA model from the recorded dataset.
  3. Inference: lerobot-evaluate runs the trained policy through the same ROS2Robot interface back into Gazebo or real hardware.
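
The observe/act loop behind recording and inference can be sketched without ROS. This is a minimal illustration, not the lerobot-ros implementation: FakeRobot is a hypothetical in-memory stand-in for ROS2Robot, and the joint names and the toy policy are assumptions made for the example. Only the two method names, get_observation() and send_action(), come from the description above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical joint names; the real plugin defines its own.
JOINT_NAMES = [f"joint_{i}" for i in range(1, 7)] + ["gripper"]

Observation = Dict[str, float]
Action = Dict[str, float]


class FakeRobot:
    """In-memory stand-in for ROS2Robot (no ROS topics involved)."""

    def __init__(self) -> None:
        self._state = {name: 0.0 for name in JOINT_NAMES}

    def get_observation(self) -> Observation:
        # The real robot would read the latest /joint_states message here.
        return dict(self._state)

    def send_action(self, action: Action) -> Action:
        # The real robot would publish to MoveIt Servo or a trajectory controller.
        self._state.update(action)
        return action


def run_episode(
    robot: FakeRobot,
    policy: Callable[[Observation], Action],
    num_frames: int,
) -> List[Tuple[Observation, Action]]:
    """Alternate observe -> decide -> act, keeping each (observation, action) pair."""
    frames = []
    for _ in range(num_frames):
        obs = robot.get_observation()
        action = policy(obs)
        robot.send_action(action)
        frames.append((obs, action))
    return frames


def nudge_policy(obs: Observation) -> Action:
    # Toy policy: move every joint 0.1 units past its current position.
    return {name: value + 0.1 for name, value in obs.items()}


frames = run_episode(FakeRobot(), nudge_policy, num_frames=3)
print(len(frames), frames[1][0]["joint_1"])
```

During recording the policy slot is filled by a teleoperator; during inference it is filled by the trained model. The robot interface stays the same in both cases, which is why the same bridge serves Gazebo and real hardware.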

VLA backend progression

Phase   Backend                     Notes
v1      LeRobot ACT                 Imitation learning baseline, trained on AR4 teleop data
v1.5    Cross-embodiment transfer   Fine-tune SO-101 policies on AR4 data
v2      Pi0 / GR00T N1.5            Foundation VLA models, zero-shot or fine-tuned

Simulation

The annin_ar4_gazebo package (from the upstream vendor submodule) provides the Gazebo Harmonic simulation. Launch with:

ros2 launch annin_ar4_gazebo gazebo.launch.py ar_model:=mk5

Or via Docker:

docker compose up sim-tabletop moveit

Docker Infrastructure

All services run in Docker containers orchestrated by docker-compose.yaml. The Dockerfile uses a multi-stage build (base → overlay → dev), and each service runs on one of those stages or on a third-party image:

Service                 Image           Purpose
sim-tabletop            overlay         Gazebo GUI simulation
sim-tabletop-headless   overlay         Server-only Gazebo (CI, headless)
moveit                  overlay         MoveIt2 motion planning + RViz2
hardware                base            Real AR4 hardware driver (calibrates on startup)
moveit-hardware         base            MoveIt2 + RViz2 for real hardware (auto-starts hardware)
foxglove-bridge         overlay         WebSocket bridge for Foxglove Studio (:8765)
zenoh-router            eclipse/zenoh   Central Zenoh broker with in-memory storage (:7447)
zenoh-bridge            overlay         DDS-to-Zenoh bridge for cross-container pub/sub
dev                     dev             VS Code devcontainer with source mounts

Zenoh Middleware

Zenoh provides a lightweight pub/sub transport layer that decouples non-ROS components (like the Optuna PID tuner) from the DDS discovery mesh. This avoids requiring ROS 2 in every container.
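
As an illustration of a non-ROS component talking to the router, the sketch below publishes a set of PID gains as JSON using the zenoh Python bindings (the eclipse-zenoh package). The key expression ar4/pid/gains, the JSON payload layout, and the localhost address are assumptions for the example; only the router port 7447 comes from the service list. The tuner's actual message format is not described here.

```python
import json

# Port 7447 is the zenoh-router port; the host name is an assumption.
ROUTER_ENDPOINT = "tcp/localhost:7447"
# Hypothetical key expression, chosen only for this example.
GAINS_KEY = "ar4/pid/gains"


def encode_gains(kp: float, ki: float, kd: float) -> bytes:
    """Serialize PID gains as UTF-8 JSON (an assumed payload format)."""
    return json.dumps({"kp": kp, "ki": ki, "kd": kd}, sort_keys=True).encode("utf-8")


def publish_gains(kp: float, ki: float, kd: float) -> None:
    """Connect to the router as a client and publish one sample."""
    # Imported lazily so encode_gains() works without eclipse-zenoh installed.
    import zenoh

    config = zenoh.Config()
    config.insert_json5("connect/endpoints", json.dumps([ROUTER_ENDPOINT]))
    session = zenoh.open(config)
    try:
        session.put(GAINS_KEY, encode_gains(kp, ki, kd))
    finally:
        session.close()


payload = encode_gains(1.2, 0.05, 0.3)
print(payload.decode("utf-8"))
```

Calling publish_gains(1.2, 0.05, 0.3) requires a reachable router. The container running it needs only Python and the Zenoh client library, with no ROS 2 installation.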

CycloneDDS Configuration

This host has multiple network interfaces (Docker bridges, Tailscale, Cloudflare WARP), which confuse CycloneDDS multicast discovery. Two configs restrict DDS traffic to the loopback interface:

  • docker/cyclonedds.xml (bundled at /etc/ros/cyclonedds.xml): used by the sim and hardware containers
  • zenoh/cyclonedds-bridge.xml: used by zenoh-bridge; it omits the SharedMemory element, which the bundled CycloneDDS does not support

Episode Recording

Episodes are recorded using lerobot-record through the lerobot-ros bridge. Each frame captures:

  • Joint positions (6 DOF + gripper) as observation.state
  • Camera images as observation.images.{camera_name}
  • Joint action commands as action
  • Natural language task instruction

Datasets are stored in LeRobot v3.0 format (Parquet + MP4) and can be pushed to HuggingFace Hub.
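
The per-frame layout listed above can be pictured as a flat dictionary. The helper below is a hypothetical sketch, not part of lerobot or lerobot-ros: make_frame, the camera name wrist, and the assumption that the action vector has the same 7 entries as the state are all illustrative. Images are plain nested lists here to keep the example dependency-free.

```python
from typing import Any, Dict, Mapping, Sequence

NUM_JOINTS = 7  # 6 arm joints + gripper


def make_frame(
    state: Sequence[float],
    action: Sequence[float],
    images: Mapping[str, Any],
    task: str,
) -> Dict[str, Any]:
    """Assemble one frame using the key names described above."""
    if len(state) != NUM_JOINTS or len(action) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} values for state and action")
    frame: Dict[str, Any] = {
        "observation.state": [float(x) for x in state],
        "action": [float(x) for x in action],
        "task": task,
    }
    # One key per camera, following observation.images.{camera_name}.
    for camera_name, image in images.items():
        frame[f"observation.images.{camera_name}"] = image
    return frame


# A 2x2 RGB placeholder; real frames hold full camera images.
tiny_image = [[[0, 0, 0] for _ in range(2)] for _ in range(2)]
frame = make_frame(
    state=[0.0] * NUM_JOINTS,
    action=[0.1] * NUM_JOINTS,
    images={"wrist": tiny_image},
    task="pick up the red cube",
)
print(sorted(frame))
```

Validating the vector length when the frame is built surfaces a missing gripper value immediately, rather than later during training.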

Repository Structure

ar4_msgs/                    Custom ROS 2 interfaces
docker/                      Dockerfile.gpu (multi-stage), entrypoint.sh, cyclonedds.xml
zenoh/                       Zenoh router config, CycloneDDS bridge config
data/                        Local data store (gitignored)
docs/                        Documentation (Docusaurus source)
docs-site/                   Docusaurus website
.devcontainer/               VS Code dev container config
vendor/ar4_ros_driver/       Git submodule — upstream AR4 ROS driver
  annin_ar4_description/       URDF/Xacro + meshes
  annin_ar4_driver/            Hardware interface (Teensy, ros2_control)
  annin_ar4_firmware/          Teensy + Arduino Nano firmware
  annin_ar4_gazebo/            Gazebo Harmonic simulation
  annin_ar4_moveit_config/     MoveIt2 motion planning

External dependencies (pip-installed, not in this repo):

lerobot                      HuggingFace LeRobot framework
lerobot-robot-ros            ROS 2 bridge for LeRobot (AnninAR4 config)
lerobot-teleoperator-devices Keyboard + gamepad teleoperators
