* denotes equal contribution and denotes equal advising. Below are selected papers. The full publication list is here. Topic-wise selected papers can be found on the research topic page.

2026

Scalable Behavior Cloning with Open Data, Training, and Evaluation

Arthur Allshire*, Himanshu Gaurav Singh*, Ritvik Singh*, Adam Rashid*, Hongsuk Choi*, David McAllister*, Justin Yu, Yiyuan Chen, Huang Huang, Pieter Abbeel, Xi Chen, Rocky Duan, Phillip Isola, Jitendra Malik, Fred Shentu, Guanya Shi, Philipp Wu, Angjoo Kanazawa

arXiv preprint

TL;DR: Open data, training, and infra for robotics. We release the largest teleop dataset (>3.5K hours, >130K episodes) to date, and extensively investigate training techniques.

ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

Wenli Xiao*, Jia Xie*, Tonghe Zhang*, Haotian Lin*, Letian "Max" Fu, Haoru Xue, Jalen Lu, Yi Yang, Cunxi Dai, Zi Wang, Jimmy Wu, Guanzhi Wang, S. Shankar Sastry, Ken Goldberg, Linxi "Jim" Fan, Yuke Zhu, Guanya Shi

arXiv preprint

TL;DR: ENPIRE gives tool-calling coding agents a real-world feedback loop, enabling autonomous policy self-improvement to 99% success on dexterous manipulation tasks.

Learning Embodiment-Agnostic Intent Models from Unstructured Human Videos for Scalable Dexterous Robot Skill Acquisition

Harsh Gupta, Guanya Shi, Wenzhen Yuan

arXiv preprint

TL;DR: LUCID separates human-video intent from sim-trained control, enabling scalable real-world dexterous manipulation across tasks and embodiments without any real robot data.

PGDG

PGDG: Physically Grounded Data Generation for Robust Bimanual Policy Learning from a Single Demonstration

Cunxi Dai*, Haoran Chang*, Aditya Nisal, Rahul Kumar, Guofei Chen, Tao Chen, Yuzhe Qin, Guanya Shi

arXiv preprint

TL;DR: PGDG expands one demonstration into a physically grounded recovery dataset, enabling robust BC policy learning and fine-tuning.

CaP-X

CaP-X: A Framework for Benchmarking and Improving Coding Agents for Robot Manipulation

Max Fu*, Justin Yu*, Karim El-Refai*, Ethan Kou*, Haoru Xue*, Huang Huang, Wenli Xiao, Guanzhi Wang, Fei-Fei Li, Guanya Shi, Jiajun Wu, Shankar Sastry, Yuke Zhu, Ken Goldberg, Linxi "Jim" Fan

International Conference on Machine Learning (ICML), 2026

TL;DR: CaP-X benchmarks and improves embodied coding agents, enabling LMs to write robot-control code that generalizes zero-shot.

RPL

RPL: Learning Robust Humanoid Perceptive Locomotion on Challenging Terrains

Yuanhang Zhang, Younggyo Seo, Juyue Chen, Yifu Yuan, Koushil Sreenath, Pieter Abbeel, Carmelo Sferrazza, Karen Liu, Rocky Duan, Guanya Shi

arXiv preprint

TL;DR: A single policy trained by RPL enables multi-directional robust humanoid locomotion over various challenging terrains.

Perceptive Humanoid Parkour

Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching

Zhen Wu*, Xiaoyu Huang*, Lujie Yang*, Yuanhang Zhang, Xi Chen, Pieter Abbeel, Rocky Duan, Angjoo Kanazawa, Carmelo Sferrazza, Guanya Shi, C. Karen Liu

Robotics: Science and Systems (RSS), 2026

TL;DR: PHP enables humanoid robots to autonomously perform long-horizon, vision-based parkour across challenging obstacle courses.

VIRAL

VIRAL: Visual Sim-to-Real at Scale for Humanoid Loco-Manipulation

Tairan He*, Zi Wang*, Haoru Xue*, Qingwei Ben*, Zhengyi Luo, Wenli Xiao, Ye Yuan, Xingye Da, Fernando Castañeda, Shankar Sastry, Changliu Liu, Guanya Shi, Linxi Fan, Yuke Zhu

IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2026

TL;DR: VIRAL investigates the scaling law of visual sim-to-real and finds a recipe to achieve zero-shot, robust, and continuous real-world deployment.

UMI-on-Air

UMI-on-Air: Embodiment-Aware Guidance for Embodiment-Agnostic Visuomotor Policies

Harsh Gupta, Xiaofeng Guo, Huy Ha, Chuer Pan, Muqing Cao, Dongjae Lee, Sebastian Scherer, Shuran Song, Guanya Shi

International Conference on Robotics and Automation (ICRA), 2026

TL;DR: EADP steers UMI's embodiment-agnostic diffusion policy using the gradient of the low-level controller's tracking cost for cross-embodiment deployment.

OmniRetarget

OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction

Lujie Yang*, Xiaoyu Huang*, Zhen Wu*, Angjoo Kanazawa, Pieter Abbeel, Carmelo Sferrazza, C. Karen Liu, Rocky Duan, Guanya Shi

International Conference on Robotics and Automation (ICRA), 2026

(Best Conference Paper Award)

(Best Paper Award on Robot Manipulation and Locomotion)

TL;DR: High-quality interaction-preserving motion reference generation that enables agile whole-body skills with minimal RL tracking.

TWIST2

TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System

Yanjie Ze, Siheng Zhao, Weizhuo Wang, Angjoo Kanazawa, Rocky Duan, Pieter Abbeel, Guanya Shi, Jiajun Wu, C. Karen Liu

International Conference on Robotics and Automation (ICRA), 2026

TL;DR: TWIST2 is a portable whole-body humanoid teleoperation system that enables scalable data collection.

Much Ado About Noising

Much Ado About Noising: Dispelling the Myths of Generative Robotic Control

Chaoyi Pan, Giri Anantharaman, Nai-Chieh Huang, Claire Jin, Daniel Pfrommer, Chenyang Yuan, Frank Permenter, Guannan Qu, Nicholas Boffi, Guanya Shi, Max Simchowitz

International Conference on Learning Representations (ICLR), 2026

TL;DR: In most benchmarks, the success of generative policies does NOT come from their distributional-learning formulation.

BFM-Zero

BFM-Zero: A Promptable Behavioral Foundation Model for Humanoid Control Using Unsupervised Reinforcement Learning

Yitang Li*, Zhengyi Luo*, Tonghe Zhang, Cunxi Dai, Anssi Kanervisto, Andrea Tirinzoni, Haoyang Weng, Kris Kitani, Mateusz Guzek, Ahmed Touati, Alessandro Lazaric, Matteo Pirotta, Guanya Shi

International Conference on Learning Representations (ICLR), 2026

TL;DR: BFM-Zero enables zero-shot goal reaching, tracking, and reward optimization (any reward at test time) from one policy.

PLD

Self-Improving Vision-Language-Action Models with Data Generation via Residual RL

Wenli Xiao*, Haotian Lin*, Andy Peng, Haoru Xue, Tairan He, Yuqi Xie, Fengyuan Hu, Jimmy Wu, Zhengyi Luo, Linxi "Jim" Fan, Guanya Shi, Yuke Zhu

International Conference on Learning Representations (ICLR), 2026

TL;DR: Probe, Learn, Distill (PLD): On-policy probing from a base VLA model + off-policy residual RL + distillation for VLA post-training.

FALCON

FALCON: Learning Force-Adaptive Humanoid Loco-Manipulation

Yuanhang Zhang, Yifu Yuan, Prajwal Gurunath, Tairan He, Shayegan Omidshafiei, Ali-akbar Agha-mohammadi, Marcell Vazquez-Chanlatte, Liam Pedersen, Guanya Shi

Learning for Dynamics and Control Conference (L4DC), 2026

(Oral Presentation)

TL;DR: FALCON enables various heavy-duty humanoid loco-manipulation tasks via a new dual-agent force-adaptive RL framework.

TD-M(PC)²

TD-M(PC)²: Improving Temporal Difference MPC Through Policy Constraint

Haotian Lin, Pengcheng Wang, Jeff Schneider, Guanya Shi

Learning for Dynamics and Control Conference (L4DC), 2026

TL;DR: We observe the value overestimation issue in planner-based MBRL and propose a policy constraint solution with SOTA performance.

2025

SPIDER

SPIDER: Scalable Physics-Informed Dexterous Retargeting

Chaoyi Pan, Changhao Wang, Haozhi Qi, Zixi Liu, Homanga Bharadhwaj, Akash Sharma, Tingfan Wu, Guanya Shi, Jitendra Malik, Francois Hogan

International Conference on Intelligent Robots and Systems (IROS), 2026

TL;DR: A dynamically feasible, cross-embodiment retargeting framework for both humanoids and dexterous hands. Human → physics → real at scale.

HDMI

HDMI: Learning Interactive Humanoid Whole-Body Control from Human Videos

Haoyang Weng, Yitang Li, Nikhil Sobanbabu, Zihan Wang, Zhengyi Luo, Tairan He, Deva Ramanan, Guanya Shi

arXiv preprint

TL;DR: From human videos, HDMI learns robust humanoid loco-manipulation skills (e.g., opening a door 67 times in a row).

FastSAC

Learning Sim-to-Real Humanoid Locomotion in 15 Minutes

Younggyo Seo*, Carmelo Sferrazza*, Juyue Chen, Guanya Shi, Rocky Duan, Pieter Abbeel

arXiv preprint

TL;DR: We provide a simple recipe with FastSAC and FastTD3 for rapid sim2real humanoid learning.

SPI-Active

Sampling-Based System Identification with Active Exploration for Legged Robot Sim2Real Learning

Nikhil Sobanbabu*, Guanqi He*, Tairan He, Yuxiang Yang, Guanya Shi

Conference on Robot Learning (CoRL), 2025

(Oral Presentation)

TL;DR: SPI-Active is a general system ID tool based on parallel sampling-based optimization and active exploration, for legged sim2real learning.

Humanoid Policy ~ Human Policy

Ri-Zhao Qiu*, Shiqi Yang*, Xuxin Cheng*, Chaitanya Chawla*, Jialong Li, Tairan He, Ge Yan, David J. Yoon, Ryan Hoque, Lars Paulsen, Ge Yang, Jian Zhang, Sha Yi, Guanya Shi, Xiaolong Wang

Conference on Robot Learning (CoRL), 2025

TL;DR: Co-training a humanoid manipulation policy with egocentric human data using a unified "human-centric state-action space".

ASAP

ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills

Tairan He*, Jiawei Gao*, Wenli Xiao*, Yuanhang Zhang*, Zi Wang, Jiashun Wang, Zhengyi Luo, Guanqi He, Nikhil Sobanbabu, Chaoyi Pan, Zeji Yi, Guannan Qu, Kris Kitani, Jessica Hodgins, Linxi "Jim" Fan, Yuke Zhu, Changliu Liu, Guanya Shi

Robotics: Science and Systems (RSS), 2025

TL;DR: ASAP learns agile whole-body humanoid motions by learning a residual action model from the real world to align sim and real physics.

Flying Hand

Flying Hand: End-Effector-Centric Framework for Versatile Aerial Manipulation Teleoperation and Policy Learning

Guanqi He*, Xiaofeng Guo*, Luyi Tang, Yuanhang Zhang, Mohammadreza Mousaei, Jiahe Xu, Junyi Geng, Sebastian Scherer, Guanya Shi

Robotics: Science and Systems (RSS), 2025

TL;DR: A general-purpose aerial manipulation framework with an EE-centric interface that bridges whole-body control and policy learning.

DIAL-MPC

Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing

Haoru Xue*, Chaoyi Pan*, Zeji Yi, Guannan Qu, Guanya Shi

International Conference on Robotics and Automation (ICRA), 2025

(Best Paper Award Finalist)

TL;DR: DIAL-MPC is the first training-free method achieving real-time whole-body torque control using full-order dynamics.

AnyCar

AnyCar to Anywhere: Learning Universal Dynamics Model for Agile and Adaptive Mobility

Wenli Xiao*, Haoru Xue*, Tony Tao, Dvij Kalaria, John M. Dolan, Guanya Shi

International Conference on Robotics and Automation (ICRA), 2025

TL;DR: AnyCar is a transformer-based dynamics model that can adapt to various vehicles, environments, state estimators, and tasks.

Jumping CoD

Agile Continuous Jumping in Discontinuous Terrains

Yuxiang Yang, Guanya Shi, Changyi Lin, Xiangyun Meng, Rosario Scalise, Mateo Guaman Castro, Wenhao Yu, Tingnan Zhang, Ding Zhao, Jie Tan, Byron Boots

International Conference on Robotics and Automation (ICRA), 2025

TL;DR: Continuous, agile, and autonomous quadrupedal jumping via hierarchical model-free RL and model-based control.

2024

Model-Based Diffusion

Model-Based Diffusion for Trajectory Optimization

Chaoyi Pan*, Zeji Yi*, Guanya Shi, Guannan Qu

Neural Information Processing Systems (NeurIPS), 2024

TL;DR: MBD is a diffusion-based trajectory optimization method that directly computes the score function using models without any external data.

Flying Calligrapher

Flying Calligrapher: Contact-Aware Motion and Force Planning and Control for Aerial Manipulation

Xiaofeng Guo*, Guanqi He*, Jiahe Xu, Mohammadreza Mousaei, Junyi Geng, Sebastian Scherer, Guanya Shi

IEEE Robotics and Automation Letters (RA-L), 2024

TL;DR: Flying Calligrapher enables precise hybrid motion and contact force control for an aerial manipulator in various drawing tasks.

OmniH2O

OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning

Tairan He*, Zhengyi Luo*, Xialin He*, Wenli Xiao, Chong Zhang, Weinan Zhang, Kris Kitani, Changliu Liu, Guanya Shi

Conference on Robot Learning (CoRL), 2024

TL;DR: OmniH2O provides a universal whole-body humanoid control interface that enables diverse teleoperation and autonomy methods.

WoCoCo

WoCoCo: Learning Whole-Body Humanoid Control with Sequential Contacts

Chong Zhang*, Wenli Xiao*, Tairan He, Guanya Shi

Conference on Robot Learning (CoRL), 2024

(Oral Presentation)

TL;DR: WoCoCo is a task-agnostic skill learning framework that requires no motion priors and works by decomposing long-horizon tasks into contact sequences.

H2O

Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation

Tairan He*, Zhengyi Luo*, Wenli Xiao, Chong Zhang, Kris Kitani, Changliu Liu, Guanya Shi

International Conference on Intelligent Robots and Systems (IROS), 2024

(Oral Presentation)

TL;DR: H2O enables real-time whole-body teleoperation of a full-sized humanoid to perform tasks like pick and place, walking, kicking, boxing, etc.

Agile But Safe

Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion

Tairan He*, Chong Zhang*, Wenli Xiao, Guanqi He, Changliu Liu, Guanya Shi

Robotics: Science and Systems (RSS), 2024

(Outstanding Student Paper Award Finalist)

TL;DR: ABS enables fully onboard, agile (>3 m/s), and collision-free locomotion for quadrupedal robots in cluttered environments.

2023 and Before

Optimal Exploration

Optimal Exploration for Model-based RL in Nonlinear Systems

Andrew Wagenmaker, Guanya Shi, Kevin Jamieson

Neural Information Processing Systems (NeurIPS), 2023

(Spotlight, 3.1%)

TL;DR: Not all model parameters are equally important. We develop an instance-optimal exploration algorithm for MBRL in nonlinear systems.

DATT

DATT: Deep Adaptive Trajectory Tracking for Quadrotor Control

Kevin Huang, Rwik Rana, Alexander Spitzer, Guanya Shi, Byron Boots

Conference on Robot Learning (CoRL), 2023

(Oral Presentation, 6.6%)

TL;DR: DATT can precisely track arbitrary, potentially infeasible trajectories in the presence of large disturbances.

Neural-Fly

Neural-Fly Enables Rapid Learning for Agile Flight in Strong Winds

Michael O'Connell*, Guanya Shi*, Xichen Shi, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, Soon-Jo Chung

Science Robotics

TL;DR: Neural-Fly uses adaptive control to online fine-tune a meta-pretrained DNN representation, enabling rapid adaptation in strong winds.

Perturbation-based MPC analysis

Perturbation-based Regret Analysis of Predictive Control in LTV Systems

Yiheng Lin*, Yang Hu*, Guanya Shi*, Haoyuan Sun*, Guannan Qu*, Adam Wierman

Neural Information Processing Systems (NeurIPS), 2021

(Spotlight, 2.9%)

TL;DR: We prove that MPC's dynamic regret and competitive ratio improve exponentially as its prediction horizon gets longer, in LTV systems.

Meta-Adaptive Nonlinear Control

Meta-Adaptive Nonlinear Control: Theory and Algorithms

Guanya Shi, Kamyar Azizzadenesheli, Michael O'Connell, Soon-Jo Chung, Yisong Yue

Neural Information Processing Systems (NeurIPS), 2021

TL;DR: We present an online multi-task learning approach for adaptive nonlinear control with non-asymptotic guarantees.

Fast UQ

Fast Uncertainty Quantification for Deep Object Pose Estimation

Guanya Shi, Yifeng Zhu, Jonathan Tremblay, Stan Birchfield, Fabio Ramos, Animashree Anandkumar, Yuke Zhu

International Conference on Robotics and Automation (ICRA), 2021

TL;DR: We develop a simple and efficient UQ method for 6-DoF pose estimation, and apply it in real-world grasping tasks.

Neural-Swarm2

Neural-Swarm2: Planning and Control of Heterogeneous Multirotor Swarms Using Learned Interactions

Guanya Shi, Wolfgang Hönig, Xichen Shi, Yisong Yue, Soon-Jo Chung

IEEE Transactions on Robotics

TL;DR: Neural-Swarm is a learning-based controller and planner for close-proximity flight of heterogeneous multirotor swarms.

Neural Lander

Neural Lander: Stable Drone Landing Control Using Learned Dynamics

Guanya Shi*, Xichen Shi*, Michael O'Connell*, Rose Yu, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, Soon-Jo Chung

International Conference on Robotics and Automation (ICRA), 2019

TL;DR: Spectrally normalized deep learning and nonlinear control enable provably stable agile drone landing.