GEAR-SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control

Model Description

SONIC (Supersizing Motion Tracking) is a humanoid behavior foundation model developed by NVIDIA that gives robots a core set of motor skills learned from large-scale human motion data. Rather than building separate controllers for predefined motions, SONIC uses motion tracking as a scalable training task, enabling a single unified policy to produce natural, whole-body movement and support a wide range of behaviors.

Key Features

  • 🤖 Unified Whole-Body Control: Single policy handles walking, running, crawling, jumping, manipulation, and more
  • 🎯 Motion Tracking: Trained on large-scale human motion data for natural movements
  • 🎮 Real-Time Teleoperation: VR-based whole-body teleoperation via PICO headset
  • 🚀 Hardware Deployment: C++ inference stack for real-time control on humanoid robots
  • 🎨 Kinematic Planner: Real-time locomotion generation with multiple movement styles
  • 🔄 Multi-Modal Control: Supports keyboard, gamepad, VR, and high-level planning

VR Whole-Body Teleoperation

SONIC supports real-time whole-body teleoperation via a PICO VR headset, enabling natural human-to-robot motion transfer for data collection and interactive control.

Video demonstrations: walking, running, sideways movement, kneeling, getting up, jumping, bimanual manipulation, and object hand-off.

Kinematic Planner

SONIC includes a kinematic planner for real-time locomotion generation: choose a movement style, steer with keyboard/gamepad, and adjust speed and height on the fly.

Video demonstrations: in-the-wild navigation, plus the movement styles run, happy, stealth, injured, kneeling, hand crawling, elbow crawling, and boxing.

Quick Start

📚 See the Quick Start Guide for step-by-step instructions on:

  • Installation and setup
  • Running SONIC with different control modes (keyboard, gamepad, VR)
  • Deploying on real hardware
  • Using the kinematic planner

Key Resources:

Model Checkpoints

All checkpoints (ONNX format) are available directly in this repository. Inference is powered by TensorRT and runs on both desktop and Jetson hardware.

Checkpoint | File | Description
Policy encoder | model_encoder.onnx | Encodes motion reference into latent
Policy decoder | model_decoder.onnx | Decodes latent into joint actions
Kinematic planner | planner_sonic.onnx | Real-time locomotion style planner
Low-latency policy | low_latency/model_encoder.onnx, low_latency/model_decoder.onnx, low_latency/observation_config.yaml | Reduced-lookahead G1 controller variant
Low-latency PyTorch checkpoint | low_latency/last.pt | Training checkpoint and config for the low-latency variant

Quick download (requires pip install huggingface_hub):

from huggingface_hub import snapshot_download
snapshot_download(repo_id="nvidia/GEAR-SONIC", local_dir="gear_sonic_deploy")

Or use the download script from the GitHub repo:

python download_from_hf.py             # policy + planner (default)
python download_from_hf.py --no-planner # policy only
python download_from_hf.py --low-latency # low-latency policy + planner

See the Download Models guide for full instructions.
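
If only some of the checkpoints are needed, snapshot_download from huggingface_hub accepts an allow_patterns filter. The sketch below is an illustration only and does not reproduce the logic of download_from_hf.py: select_patterns and download are hypothetical helpers, and treating the file names in the checkpoint table as paths relative to the repository root is an assumption.

```python
# File names come from the checkpoint table; using them as repo-root-relative
# paths is an assumption, not something this card states explicitly.
DEFAULT_POLICY = ["model_encoder.onnx", "model_decoder.onnx"]
PLANNER = ["planner_sonic.onnx"]
LOW_LATENCY = ["low_latency/*"]


def select_patterns(planner=True, low_latency=False):
    """Return allow_patterns for the requested subset (hypothetical helper)."""
    patterns = list(LOW_LATENCY if low_latency else DEFAULT_POLICY)
    if planner:
        patterns += PLANNER
    return patterns


def download(patterns, local_dir="gear_sonic_deploy"):
    """Fetch only the files matching the given patterns from the model repo."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="nvidia/GEAR-SONIC",
        local_dir=local_dir,
        allow_patterns=patterns,
    )
```

For example, download(select_patterns(planner=False)) would request only the default encoder and decoder files.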

Low-Latency Inference

The low-latency SONIC variant is stored under low_latency/ and does not replace the default top-level policy. To run it with the C++ deployment stack:

git clone https://github.com/NVlabs/GR00T-WholeBodyControl.git
cd GR00T-WholeBodyControl
pip install huggingface_hub
python download_from_hf.py --low-latency

C++ deployment

Simulation:

cd gear_sonic_deploy
./deploy.sh \
    --cp policy/low_latency/model \
    --obs-config policy/low_latency/observation_config.yaml \
    sim

Real robot (VLA or teleoperation):

cd gear_sonic_deploy
./deploy.sh \
    --cp policy/low_latency/model \
    --obs-config policy/low_latency/observation_config.yaml \
    --input-type zmq_manager \
    real

deploy.sh expects --cp to be the shared model prefix and appends _encoder.onnx and _decoder.onnx internally. For the full VLA deployment flow, see the VLA Inference guide.
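
The prefix convention described above can be checked before launching. The following is a small sketch of that naming rule only; it is not taken from deploy.sh, and both helper names are hypothetical.

```python
from pathlib import Path


def expand_checkpoint_prefix(prefix):
    """Return the (encoder, decoder) paths implied by a shared --cp prefix."""
    return Path(f"{prefix}_encoder.onnx"), Path(f"{prefix}_decoder.onnx")


def missing_checkpoint_files(prefix, root="."):
    """List the expected ONNX files that are not present under root."""
    return [
        path.as_posix()
        for path in expand_checkpoint_prefix(prefix)
        if not (Path(root) / path).is_file()
    ]
```

Calling missing_checkpoint_files("policy/low_latency/model") from inside gear_sonic_deploy/ returns an empty list when both files are in place, and otherwise names the ones that still need to be downloaded.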

Python launcher

The Python VLA launcher can start the same low-latency C++ deployment pane and run the Python VLA client, keyboard publisher, and optional data exporter:

python gear_sonic/scripts/launch_inference.py \
    --deploy-checkpoint policy/low_latency/model \
    --deploy-obs-config policy/low_latency/observation_config.yaml \
    --camera-host 192.168.123.164 \
    --prompt "pick up the cup"
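
When the launcher is driven from another script, for instance to try several prompts, it can help to assemble the argument list in Python so that prompts containing spaces are quoted correctly. This is a minimal sketch using only the flags shown above; build_launch_command and as_shell are hypothetical helpers, not part of the repository.

```python
import shlex


def build_launch_command(
    prompt,
    camera_host,
    checkpoint="policy/low_latency/model",
    obs_config="policy/low_latency/observation_config.yaml",
):
    """Assemble the launch_inference.py argument list for a given prompt."""
    return [
        "python", "gear_sonic/scripts/launch_inference.py",
        "--deploy-checkpoint", checkpoint,
        "--deploy-obs-config", obs_config,
        "--camera-host", camera_host,
        "--prompt", prompt,
    ]


def as_shell(command):
    """Render an argument list as a safely quoted, copy-pasteable shell line."""
    return shlex.join(command)
```

The returned list can be passed to subprocess.run(command, check=True) from the repository root, or printed with as_shell for logging.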

Python checkpoint evaluation

For Isaac Lab evaluation or rendering from the released PyTorch checkpoint:

python download_from_hf.py --training --low-latency
python download_from_hf.py --sample

python gear_sonic/eval_agent_trl.py \
    +checkpoint=low_latency/last.pt \
    +headless=False \
    ++num_envs=1 \
    ++manager_env.observations.policy.enable_corruption=False \
    ++manager_env.observations.tokenizer.enable_corruption=False \
    "++manager_env.commands.motion.motion_lib_cfg.motion_file=sample_data/robot_filtered" \
    "++manager_env.commands.motion.motion_lib_cfg.smpl_motion_file=sample_data/smpl_filtered"

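The +key=value and ++key=value arguments above look like Hydra command-line overrides. Assuming eval_agent_trl.py is Hydra-based (an inference from the syntax, not something this card states), + adds a key that is not yet in the config, while ++ sets a key whether or not it already exists. The hypothetical helper below rebuilds the same argument list from Python dictionaries, which can be easier to edit than a long shell line.

```python
def hydra_overrides(add=None, force=None):
    """Format dicts as Hydra-style overrides: '+k=v' for add, '++k=v' for force."""
    add, force = add or {}, force or {}
    return [f"+{k}={v}" for k, v in add.items()] + [
        f"++{k}={v}" for k, v in force.items()
    ]


# Same values as the evaluation command shown above.
eval_args = hydra_overrides(
    add={"checkpoint": "low_latency/last.pt", "headless": False},
    force={
        "num_envs": 1,
        "manager_env.observations.policy.enable_corruption": False,
        "manager_env.observations.tokenizer.enable_corruption": False,
        "manager_env.commands.motion.motion_lib_cfg.motion_file": "sample_data/robot_filtered",
        "manager_env.commands.motion.motion_lib_cfg.smpl_motion_file": "sample_data/smpl_filtered",
    },
)
command = ["python", "gear_sonic/eval_agent_trl.py", *eval_args]
```

Passing command as a list to subprocess.run avoids the shell quoting that the last two overrides need on the command line.
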
Documentation

📚 Full Documentation

Guides

Tutorials

Repository Structure

GR00T-WholeBodyControl/
├── gear_sonic_deploy/     # C++ inference stack for deployment
├── gear_sonic/            # Teleoperation and data collection tools
├── decoupled_wbc/         # Decoupled WBC (GR00T N1.5/N1.6)
├── docs/                  # Documentation source
└── media/                 # Videos and images

Related Projects

This repository is part of NVIDIA's GR00T (Generalist Robot 00 Technology) initiative:

Citation

If you use GEAR-SONIC in your research, please cite:

@article{luo2025sonic,
    title={SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control},
    author={Luo, Zhengyi and Yuan, Ye and Wang, Tingwu and Li, Chenran and Chen, Sirui and Casta\~neda, Fernando and Cao, Zi-Ang and Li, Jiefeng and Minor, David and Ben, Qingwei and Da, Xingye and Ding, Runyu and Hogg, Cyrus and Song, Lina and Lim, Edy and Jeong, Eugene and He, Tairan and Xue, Haoru and Xiao, Wenli and Wang, Zi and Yuen, Simon and Kautz, Jan and Chang, Yan and Iqbal, Umar and Fan, Linxi and Zhu, Yuke},
    journal={arXiv preprint arXiv:2511.07820},
    year={2025}
}

License

This project uses dual licensing:

  • Source Code: Apache License 2.0 - applies to all code, scripts, and software components
  • Model Weights: NVIDIA Open Model License - applies to all trained model checkpoints

Key points of the NVIDIA Open Model License:

  • ✅ Commercial use permitted with attribution
  • ✅ Modification and distribution allowed
  • ⚠️ Must comply with NVIDIA's Trustworthy AI terms
  • ⚠️ Model outputs subject to responsible use guidelines

See LICENSE for complete terms.

Support & Contact

Acknowledgments

This work builds upon and acknowledges:

  • Beyond Mimic - Whole-body tracking foundation
  • Isaac Lab - Robot learning framework
  • NVIDIA Research GEAR Lab team
  • All contributors and collaborators

Model Card Contact

For questions about this model card or responsible AI considerations, contact: gear-wbc@nvidia.com
