BEAVR is an open-source Bimanual, multi-Embodiment, Accessible, Virtual Reality (VR) teleoperation system for Robots, designed to unify real-time control, data recording, and policy learning across heterogeneous robotic platforms. BEAVR enables real-time, dexterous teleoperation using commodity VR hardware, supports modular integration with robots ranging from 7-DoF manipulators to full-body humanoids, and records synchronized multi-modal demonstrations directly in the LeRobot dataset schema.
Our system features a zero-copy streaming architecture achieving ≤35 ms latency, an asynchronous “think–act” control loop for scalable inference, and a flexible network API optimized for real-time, multi-robot operation. We benchmark BEAVR across diverse manipulation tasks and demonstrate its compatibility with leading visuomotor policies such as ACT, Diffusion Policy, and SmolVLA.
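To make the “think–act” split concrete, here is a minimal sketch of the pattern: a background thread runs (slow) policy inference and publishes action chunks, while a fixed-rate control loop replays the freshest chunk. This is an illustration of the general technique, not the BEAVR implementation; `policy`, `get_obs`, and `send_action` are hypothetical placeholders.

```python
import threading
import time
from queue import Empty, Full, Queue

def think(policy, get_obs, chunk_queue, stop):
    """'Think' thread: run slow policy inference, publish action chunks."""
    while not stop.is_set():
        obs = get_obs()                      # latest synchronized observation
        chunk = policy.predict(obs)          # short horizon of future actions
        try:
            chunk_queue.put_nowait(chunk)    # never block inference
        except Full:
            try:
                chunk_queue.get_nowait()     # drop the stale chunk...
            except Empty:
                pass
            chunk_queue.put_nowait(chunk)    # ...and publish the fresh one

def act(send_action, chunk_queue, stop, hz=30):
    """'Act' loop: replay the freshest chunk at a fixed control rate."""
    period, chunk, i = 1.0 / hz, None, 0
    while not stop.is_set():
        t0 = time.perf_counter()
        try:
            chunk, i = chunk_queue.get_nowait(), 0   # newer chunk preempts
        except Empty:
            pass
        if chunk is not None and i < len(chunk):
            send_action(chunk[i])
            i += 1
        time.sleep(max(0.0, period - (time.perf_counter() - t0)))

# Usage: inference runs in the background; control stays at a steady 30 Hz.
# stop = threading.Event()
# q = Queue(maxsize=1)
# threading.Thread(target=think, args=(policy, get_obs, q, stop)).start()
# act(send_action, q, stop, hz=30)
```

A `maxsize=1` queue gives latest-value semantics: the act loop never falls behind inference, it simply keeps executing the most recent chunk.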
Code, datasets, and the VR app are publicly available at https://github.com/ARCLab-MIT/BEAVR-Bot; datasets are also released on Hugging Face.
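Because demonstrations are recorded directly in the LeRobot schema, they can be loaded with the `lerobot` library. A minimal sketch follows; the repo id is a placeholder (not a confirmed release name), and the import path varies across `lerobot` versions.

```python
# Minimal sketch: load a demonstration dataset recorded in the LeRobot schema.
# The repo id below is a hypothetical placeholder; the import path is for
# older lerobot versions (newer ones use lerobot.datasets.lerobot_dataset).
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("arclab-mit/beavr-demos")  # placeholder repo id
print(dataset.num_episodes, "episodes,", len(dataset), "frames")

frame = dataset[0]   # dict of synchronized camera images, states, actions
print(sorted(frame.keys()))
```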
*Figure: End-to-end pipeline from VR input to robot control.*
All experiments were conducted on an Alienware x16 R2 laptop equipped with an Intel Core Ultra 9 185H CPU and an NVIDIA GeForce RTX 4080 Max-Q GPU (12 GB VRAM). The system runs Ubuntu 24.04.2 LTS with CUDA 12.8 and NVIDIA driver version 570.169. We use a fixed-base 7-DoF XArm7 manipulator with a 16-DoF LEAP hand attached to the end effector, operating in a tabletop workspace. Two statically mounted RGB cameras (front-facing and overhead) capture synchronized streams at 480×640 resolution and 30 FPS.
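For reference, a quick sanity check that a camera actually delivers 480×640 frames at ~30 FPS can be done with OpenCV; the device index below is a machine-specific assumption.

```python
import time
import cv2

# Sanity check: confirm a camera delivers 480x640 (H x W) frames at ~30 FPS.
# Device index 0 is a machine-specific assumption.
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
cap.set(cv2.CAP_PROP_FPS, 30)

frames, t0 = 0, time.perf_counter()
while frames < 90:                      # ~3 s worth of frames at 30 FPS
    ok, frame = cap.read()
    if not ok:
        break
    frames += 1
elapsed = time.perf_counter() - t0
print(f"{frames / elapsed:.1f} FPS, shape: {frame.shape if frames else None}")
cap.release()
```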
We conduct three separate experiments:

1. Direct VR teleoperation across six manipulation tasks.
2. Evaluation of visuomotor policies (ACT, Diffusion Policy, SmolVLA) trained on BEAVR demonstrations.
3. A head-to-head comparison with OpenTeach on shared tasks.
BEAVR uses consumer VR and commodity hardware to enable affordable teleoperation without proprietary dependencies. The reference setup pairs a Meta Quest 3 (hand tracking and VR streaming) with standard compute and networked robot controllers, in line with the system's accessibility goals; see the paper for details on latency and system design.
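As an illustration of the latest-value streaming pattern suited to networked, real-time teleoperation, here is a minimal ZeroMQ subscriber with conflation, so the control loop only ever sees the freshest hand pose. This is an assumption about the transport, not a statement of BEAVR's actual network API; the endpoint and message format are placeholders.

```python
import json
import zmq

# Latest-value pose subscriber. ZeroMQ PUB/SUB with CONFLATE is one common
# low-latency pattern; BEAVR's actual network API may differ.
ctx = zmq.Context.instance()
sub = ctx.socket(zmq.SUB)
sub.setsockopt(zmq.CONFLATE, 1)           # keep only the newest message
sub.setsockopt_string(zmq.SUBSCRIBE, "")  # no topic filter
sub.connect("tcp://192.168.1.50:5555")    # placeholder headset/relay address

while True:
    pose = json.loads(sub.recv())         # e.g. {"pos": [...], "quat": [...]}
    # ...feed the freshest hand pose into retargeting / the control loop...
```

Conflation drops stale messages at the socket, so a slow consumer never processes a backlog of outdated poses.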
Teleoperation performance across six manipulation tasks (10 trials each):

Task | Success rate | Avg. time (s) |
---|---|---|
Tape task | 6 / 10 (60%) | 17.61 |
Lantern task | 8 / 10 (80%) | 21.61 |
Stack blocks | 7 / 10 (70%) | 25.89 |
Flip cube | 10 / 10 (100%) | 16.50 |
Pour | 8 / 10 (80%) | 84.67 |
Pick and place | 10 / 10 (100%) | 11.90 |
Learned policies trained on BEAVR demonstrations vs. a human operator (10 trials each):

Policy | Success rate | Avg. time (s) |
---|---|---|
ACT | 10 / 10 (100%) | 9.16 |
Diffusion | 8 / 10 (80%) | 23.88 |
SmolVLA | 7 / 10 (70%) | 33.84 |
Human operator | 10 / 10 (100%) | 12.08 |
Comparison with OpenTeach, success rate on shared tasks:

Task | OpenTeach | BEAVR |
---|---|---|
Flip cube | 1.0 | 1.0 |
Pour | 0.8 | 0.8 |
Pick & Place | 0.8 | 1.0 |
Comparison with OpenTeach, average completion time on shared tasks:

Task | OpenTeach (s) | BEAVR (s) |
---|---|---|
Flip cube | 2.85 | 13.32 |
Pour | 14.83 | 28.92 |
Pick & Place | 11.88 | 9.72 |
Control-loop frequency and jitter benchmark:

Component | Target Hz | Achieved Hz | Jitter (ms) |
---|---|---|---|
XArm7 (single) | 30 | 29.93 | 0.90 |
LEAP (single) | 30 | 29.69 | 0.19 |
XArm7 (bimanual) | 30 | 29.93 | 0.89 |
LEAP (bimanual) | 30 | 29.61 | 0.21 |
XArm7 (high freq) | 90 | 99.18 | 0.75 |
LEAP (high freq) | 90 | 97.22 | 0.13 |
All metrics are taken from the paper (arXiv:2508.09606).
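The achieved-frequency and jitter numbers can be reproduced in spirit with a simple timing harness. In this sketch, jitter is taken as the standard deviation of inter-tick intervals in milliseconds, which is an assumption about the metric's exact definition in the paper.

```python
import statistics
import time

def measure_rate(step, target_hz, n_ticks=300):
    """Run `step` at a fixed target rate; report achieved Hz and jitter.

    Jitter here is the standard deviation of inter-tick intervals (ms),
    an assumed definition of the metric.
    """
    period = 1.0 / target_hz
    ticks = []
    next_t = time.perf_counter()
    for _ in range(n_ticks):
        step()                              # e.g. send one joint command
        ticks.append(time.perf_counter())
        next_t += period
        time.sleep(max(0.0, next_t - time.perf_counter()))
    gaps = [b - a for a, b in zip(ticks, ticks[1:])]
    return 1.0 / statistics.mean(gaps), statistics.stdev(gaps) * 1e3

hz, jitter_ms = measure_rate(lambda: None, target_hz=30)
print(f"achieved {hz:.2f} Hz, jitter {jitter_ms:.2f} ms")
```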
If you use BEAVR in your work, please cite:

```bibtex
@misc{posadasnava2025beavr,
  title         = {BEAVR: Bimanual, multi-Embodiment, Accessible, Virtual Reality Teleoperation System for Robots},
  author        = {Alejandro Posadas-Nava and Alejandro Carrasco and Richard Linares},
  year          = {2025},
  eprint        = {2508.09606},
  archivePrefix = {arXiv},
  primaryClass  = {cs.RO},
  note          = {Accepted for presentation at ICCR 2025, Kyoto},
  url           = {https://arxiv.org/abs/2508.09606}
}
```