Architecture

AR4 Physical-AI is a VLA (Vision-Language-Action) platform layered on top of the AR4 ROS driver and LeRobot. The key architectural decisions are:

LeRobot-native — uses lerobot-ros as the bridge between ROS 2 and LeRobot for recording, training, and inference (see Research: LeRobot Integration)
Submodule for driver — upstream ar4_ros_driver stays independently updateable
Docker-first — multi-stage GPU containers (base → overlay → dev) with docker-compose orchestration
Simulation-first — physics-enabled Gazebo world with gravity, contact properties, and graspable objects for policy development
Zenoh middleware — decouples non-ROS components from DDS for future inference pipelines
LeRobot dataset format — episodes recorded via lerobot-record, stored in LeRobot v3.0 format

lerobot-ros is written by the same author as the AR4 ROS driver. It provides a LeRobot Robot plugin (AnninAR4) that bridges ROS 2 topics to LeRobot's recording, training, and inference pipelines.

See Research: LeRobot Integration for the full investigation and rationale.

Data flow

Recording — lerobot-record calls ROS2Robot.get_observation() (subscribes to /joint_states) and ROS2Robot.send_action() (publishes to MoveIt Servo or trajectory controller). Episodes are stored in LeRobot v3.0 dataset format.
Training — lerobot-train trains an ACT, Diffusion Policy, or VLA model from the recorded dataset.
Inference — lerobot-evaluate runs the trained policy through the same ROS2Robot interface back into Gazebo or real hardware.
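
The recording step can be sketched as a loop over the two calls named above. This is a minimal illustration, not the lerobot-ros implementation: `FakeRobot` is a hypothetical in-memory stand-in for ROS2Robot (no ROS 2 required), the joint names are invented, and the "teleoperator" is a toy function that nudges every joint by a fixed amount.

```python
from dataclasses import dataclass, field

# Hypothetical joint names; the real AR4 names come from the URDF.
JOINTS = [f"joint_{i}" for i in range(1, 7)] + ["gripper"]


@dataclass
class FakeRobot:
    """In-memory stand-in for ROS2Robot (illustration only, no ROS 2)."""

    positions: dict = field(default_factory=lambda: {j: 0.0 for j in JOINTS})

    def get_observation(self) -> dict:
        # The real bridge would return the latest /joint_states reading.
        return dict(self.positions)

    def send_action(self, action: dict) -> dict:
        # The real bridge would publish to MoveIt Servo or a trajectory
        # controller; here the command is applied instantly.
        self.positions.update(action)
        return action


def record_episode(robot, teleop, num_frames: int) -> list:
    """Collect one (observation, action) pair per control step."""
    frames = []
    for _ in range(num_frames):
        obs = robot.get_observation()
        action = robot.send_action(teleop(obs))
        frames.append({"observation": obs, "action": action})
    return frames


def nudge(obs: dict) -> dict:
    """Toy teleoperator: command each joint 0.25 past its current position."""
    return {name: pos + 0.25 for name, pos in obs.items()}


episode = record_episode(FakeRobot(), nudge, num_frames=3)
print(len(episode), episode[-1]["action"]["joint_1"])  # 3 0.75
```

At inference time the same loop shape applies, with the trained policy taking the place of the teleoperator function.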

VLA backend progression

Phase	Backend	Notes
v1	LeRobot ACT	Imitation learning baseline, trained on AR4 teleop data
v1.5	Cross-embodiment transfer	Fine-tune SO-101 policies on AR4 data
v2	Pi0 / GR00T N1.5	Foundation VLA models, zero-shot or fine-tuned

Simulation

The annin_ar4_gazebo package (from the upstream vendor submodule) provides the Gazebo Harmonic simulation. Launch with:

ros2 launch annin_ar4_gazebo gazebo.launch.py ar_model:=mk5

Or via Docker:

docker compose up sim-tabletop moveit

Docker Infrastructure

All services run in Docker containers orchestrated by docker-compose.yaml. The Dockerfile uses a multi-stage build (base → overlay → dev), and each service runs on one of those stage images or on an upstream image:

Service	Image	Purpose
`sim-tabletop`	overlay	Gazebo GUI simulation
`sim-tabletop-headless`	overlay	Server-only Gazebo (CI, headless)
`moveit`	overlay	MoveIt2 motion planning + RViz2
`hardware`	base	Real AR4 hardware driver (calibrates on startup)
`moveit-hardware`	base	MoveIt2 + RViz2 for real hardware (auto-starts hardware)
`foxglove-bridge`	overlay	WebSocket bridge for Foxglove Studio (:8765)
`zenoh-router`	eclipse/zenoh	Central Zenoh broker with in-memory storage (:7447)
`zenoh-bridge`	overlay	DDS-to-Zenoh bridge for cross-container pub/sub
`dev`	dev	VS Code devcontainer with source mounts

Zenoh Middleware

Zenoh provides a lightweight pub/sub transport layer that decouples non-ROS components (like the Optuna PID tuner) from the DDS discovery mesh. This avoids requiring ROS 2 in every container.

CycloneDDS Configuration

The development host has several network interfaces (Docker bridges, Tailscale, Cloudflare WARP), which confuse CycloneDDS multicast discovery. Two config files restrict CycloneDDS to the loopback interface:

docker/cyclonedds.xml (bundled at /etc/ros/cyclonedds.xml) — sim and hardware containers
zenoh/cyclonedds-bridge.xml — zenoh-bridge (without SharedMemory element, unsupported by bundled CycloneDDS)
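
For orientation, a loopback-only CycloneDDS config can be as small as the one generated below. This is a sketch under assumptions, not a copy of the files listed above: the interface name `lo` and the absence of any other tuning are guesses, and the project's real configs may set additional options.

```python
import xml.etree.ElementTree as ET


def loopback_only_config(interface: str = "lo") -> str:
    """Build a CycloneDDS XML config pinned to a single network interface."""
    root = ET.Element("CycloneDDS")
    domain = ET.SubElement(root, "Domain", Id="any")
    general = ET.SubElement(domain, "General")
    interfaces = ET.SubElement(general, "Interfaces")
    # Assumption: "lo" is the loopback interface name inside the container.
    ET.SubElement(interfaces, "NetworkInterface", name=interface)
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


print(loopback_only_config())
```

CycloneDDS picks up such a file through the `CYCLONEDDS_URI` environment variable, for example `CYCLONEDDS_URI=file:///etc/ros/cyclonedds.xml`.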

Episode Recording

Episodes are recorded using lerobot-record through the lerobot-ros bridge. Each frame captures:

Joint positions (6 DOF + gripper) as observation.state
Camera images as observation.images.{camera_name}
Joint action commands as action
Natural language task instruction

Datasets are stored in LeRobot v3.0 format (Parquet + MP4) and can be pushed to HuggingFace Hub.
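
The per-frame layout can be illustrated with a plain dictionary. This is a hedged sketch, not the LeRobot writer itself: `build_frame` is a hypothetical helper, the `task` key name and the camera name `wrist` are assumptions, and the image is a tiny nested list standing in for a real camera array.

```python
NUM_VALUES = 7  # 6 joints + gripper, as listed above


def build_frame(joint_positions, joint_commands, images, task):
    """Assemble one frame using the key names from the list above."""
    if len(joint_positions) != NUM_VALUES or len(joint_commands) != NUM_VALUES:
        raise ValueError(f"expected {NUM_VALUES} values (6 joints + gripper)")
    frame = {
        "observation.state": list(joint_positions),
        "action": list(joint_commands),
        "task": task,  # assumed key name for the language instruction
    }
    # One image entry per camera, keyed as observation.images.{camera_name}.
    for camera_name, image in images.items():
        frame[f"observation.images.{camera_name}"] = image
    return frame


frame = build_frame(
    joint_positions=[0.0] * 7,
    joint_commands=[0.1] * 7,
    images={"wrist": [[0, 0], [0, 0]]},  # placeholder 2x2 "image"
    task="pick up the red cube",
)
print(sorted(frame))
```

Checking the vector lengths at frame-build time surfaces a mismatched joint count before an episode is written to disk.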

Repository Structure

ar4_msgs/                    Custom ROS 2 interfaces
docker/                      Dockerfile.gpu (multi-stage), entrypoint.sh, cyclonedds.xml
zenoh/                       Zenoh router config, CycloneDDS bridge config
data/                        Local data store (gitignored)
docs/                        Documentation (Docusaurus source)
docs-site/                   Docusaurus website
.devcontainer/               VS Code dev container config
vendor/ar4_ros_driver/       Git submodule — upstream AR4 ROS driver
  annin_ar4_description/       URDF/Xacro + meshes
  annin_ar4_driver/            Hardware interface (Teensy, ros2_control)
  annin_ar4_firmware/          Teensy + Arduino Nano firmware
  annin_ar4_gazebo/            Gazebo Harmonic simulation
  annin_ar4_moveit_config/     MoveIt2 motion planning

External dependencies (pip-installed, not in this repo):

lerobot                      HuggingFace LeRobot framework
lerobot-robot-ros            ROS 2 bridge for LeRobot (AnninAR4 config)
lerobot-teleoperator-devices Keyboard + gamepad teleoperators
