Workflows
Train SmolVLA
Training runs outside this Docker Compose stack, so you need a machine or cloud instance with a GPU. The config file is configs/smolvla_ar4.yaml.
Option A — Local GPU
With lerobot installed in your Python environment:
Checkpoints are written to data/checkpoints/smolvla_ar4/ every 10,000 steps (the save_freq default).
Option B — Remote GPU server
The gpu-server/ar4-train/ directory contains everything needed for training, so you don't need the full repo on the GPU machine.
- Get the training files on the server:
- Sync your dataset to the server:
- Train on the server:

  Checkpoints save to data/checkpoints/smolvla_ar4/ under the training directory.
- Copy the checkpoint back locally:
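
The dataset sync and checkpoint copy-back steps above are both file transfers between your machine and the GPU server. The following is a minimal sketch of how they could be expressed as rsync invocations, built in Python and printed rather than executed. The SSH destination (`user@gpu-host`), the remote training directory (`ar4-train`), and the local dataset path (`data/datasets/ar4`) are hypothetical placeholders, not values from this guide; only the checkpoint path comes from the text above.

```python
import shlex

# Hypothetical values: replace with your own host and paths.
REMOTE = "user@gpu-host"              # assumed SSH destination
REMOTE_TRAIN_DIR = "ar4-train"        # assumed location of the training files on the server
LOCAL_DATASET = "data/datasets/ar4"   # assumed local dataset path (not given in this guide)
CHECKPOINT_SUBDIR = "data/checkpoints/smolvla_ar4"  # checkpoint path named in this guide


def push_dataset_cmd():
    """Build an rsync command that uploads the local dataset to the server."""
    src = f"{LOCAL_DATASET}/"
    dst = f"{REMOTE}:{REMOTE_TRAIN_DIR}/{LOCAL_DATASET}/"
    return ["rsync", "-avz", src, dst]


def pull_checkpoints_cmd():
    """Build an rsync command that downloads checkpoints from the server."""
    src = f"{REMOTE}:{REMOTE_TRAIN_DIR}/{CHECKPOINT_SUBDIR}/"
    dst = f"{CHECKPOINT_SUBDIR}/"
    return ["rsync", "-avz", src, dst]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them by hand.
    for cmd in (push_dataset_cmd(), pull_checkpoints_cmd()):
        print(shlex.join(cmd))
```

The trailing slashes make rsync copy directory contents rather than nesting a new directory. By default rsync creates only the final directory of the destination path, so the parent directories need to exist on the receiving side.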
Key training config options
Open gpu-server/ar4-train/configs/smolvla_ar4.yaml to adjust:
| Field | Default | Description |
|---|---|---|
| steps | 50000 | Total training steps |
| batch_size | 32 | Reduce if you get OOM errors |
| save_freq | 10000 | How often to write a checkpoint |
| freeze_vision_encoder | true | Keep VLM backbone frozen (faster, usually better for small datasets) |
| wandb.enable | false | Set to true to log to Weights & Biases |
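
To see how steps and save_freq interact, the sketch below lists the steps at which a checkpoint would be expected. It assumes a checkpoint is written at every multiple of save_freq up to steps; whether the trainer also writes a final checkpoint when steps is not a multiple of save_freq is not covered here. The function name `checkpoint_steps` is illustrative and not part of any library.

```python
def checkpoint_steps(steps: int, save_freq: int) -> list[int]:
    """Return the steps at which a checkpoint is expected.

    Assumption: one checkpoint per multiple of save_freq, up to and including steps.
    """
    if steps < 0 or save_freq <= 0:
        raise ValueError("steps must be >= 0 and save_freq must be > 0")
    return list(range(save_freq, steps + 1, save_freq))


# Defaults from the table above: steps=50000, save_freq=10000.
print(checkpoint_steps(50000, 10000))
# → [10000, 20000, 30000, 40000, 50000]
```

With the defaults this gives five checkpoints. Lowering save_freq produces more frequent checkpoints at the cost of extra disk space.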