Remote GPU (Inference Server)

If you don't have a local GPU but have access to a remote GPU server, you can run inference there and stream actions back to your local machine over an SSH tunnel.

Your laptop                       Remote GPU server
──────────────────                ──────────────────────────────
Gazebo sim  ──ROS 2──► smolvla ──HTTP──► smolvla-server
RViz2                  (local)    :8000   (remote, has GPU)

Step 1 — Start the inference server on the GPU machine

Only the gpu-server/ar4-infer/ directory, plus the trained checkpoints that DATA_DIR points to, is needed on the GPU server. If you already trained there (Workflow 2 Option B), the repo is already cloned.

# On the GPU server
cd ~/ar4-physical-ai/gpu-server/ar4-infer
 
# Start the server — auto-selects the latest checkpoint under DATA_DIR
DATA_DIR=~/ar4-physical-ai/gpu-server/ar4-train/data/checkpoints/smolvla_ar4 \
  docker compose -f docker-compose.infer-server.yaml up

To pin a specific checkpoint step:

DATA_DIR=~/ar4-physical-ai/gpu-server/ar4-train/data/checkpoints/smolvla_ar4 \
SMOLVLA_CHECKPOINT=/data/checkpoints/040000/pretrained_model \
  docker compose -f docker-compose.infer-server.yaml up
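
To see which steps are available to pin, a small helper can list them. This is only a sketch: it assumes DATA_DIR is mounted at /data inside the container, so that the pinned path /data/checkpoints/040000/pretrained_model corresponds to $DATA_DIR/checkpoints/040000/pretrained_model on the host. The function name list_checkpoint_steps is hypothetical and not part of the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo).
# Assumption: checkpoints live at <data_dir>/checkpoints/<step>/pretrained_model.
# Prints one step name per line, sorted, for every step that has a
# pretrained_model directory.
list_checkpoint_steps() {
  local data_dir="$1"
  local d
  for d in "$data_dir"/checkpoints/*/; do
    # Skip entries without a saved model (also covers an unmatched glob).
    if [ -d "${d}pretrained_model" ]; then
      basename "$d"
    fi
  done | sort
}
```

For example, list_checkpoint_steps ~/ar4-physical-ai/gpu-server/ar4-train/data/checkpoints/smolvla_ar4 prints the step directories found there. If the layout on your server differs, adjust the path pattern accordingly.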

Step 2 — Open an SSH tunnel (keep this terminal open)

# On your local machine: forwards localhost:8000 to port 8000 on the GPU server
ssh -L 8000:localhost:8000 user@gpu-server
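
Before moving on, it can help to confirm from a second local terminal that the tunnel is listening. The sketch below assumes bash with /dev/tcp support; port_open and wait_for_tunnel are hypothetical names. It only checks that something accepts TCP connections on the local end of the tunnel. It does not prove that the inference server behind it is responding.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repo). Assumes bash with /dev/tcp.

# Return 0 if a TCP connection to host:port can be opened.
port_open() {
  local host="$1"
  local port="$2"
  # The subshell closes the descriptor again as soon as it exits.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Poll host:port once per second, up to a fixed number of attempts.
# Defaults: localhost, port 8000, 15 attempts.
wait_for_tunnel() {
  local host="${1:-localhost}"
  local port="${2:-8000}"
  local attempts="${3:-15}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if port_open "$host" "$port"; then
      echo "tunnel is accepting connections on ${host}:${port}"
      return 0
    fi
    # Wait before retrying, but not after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  echo "nothing is listening on ${host}:${port}" >&2
  return 1
}
```

For example, wait_for_tunnel localhost 8000 10 polls for roughly ten seconds and returns non-zero if nothing accepts connections, which usually means the ssh command in Step 2 is not running.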

Step 3 — Start sim and point inference at the tunnel

# Terminal 1 — simulation
docker compose up
 
# Terminal 2 — inference node (uses remote GPU via tunnel)
SMOLVLA_SERVER_URL=http://localhost:8000 docker compose run --rm smolvla

The smolvla container sends observations through the tunnel to the server over HTTP and receives joint targets back, so the arm moves as if the GPU were local.
