U-Arm: Imaging & Teleoperation Platforms
- U-Arm is a term encompassing two distinct systems: a dual-robotic-arm imaging platform for dynamic 4D joint assessment and a 3D-printed leader-follower teleoperation interface.
- The imaging system employs deep learning-enhanced 2D-3D registration achieving sub-voxel accuracy and low radiation dose, while the teleoperation system uses cost-effective hardware with direct joint-angle mapping.
- Both systems illustrate cross-domain innovation, sharing nomenclature despite their distinct hardware configurations, control objectives, and evaluation metrics.
U-Arm is a term used in 2025 arXiv literature for two technically unrelated systems. In medical imaging, the First-Imaging U-Arm denotes a dual-robotic-arm cone-beam CT platform for upright, load-bearing, four-dimensional joint assessment via uni-plane X-ray and 2D-3D registration (Tang et al., 22 Aug 2025). In robot manipulation, U-ARM denotes an ultra low-cost, rapidly adaptable leader-follower teleoperation interface built around 3D-printed leader arms for commercially available robot manipulators (Zou et al., 2 Sep 2025). A common misconception is that the name refers to a single platform; in the cited literature, it instead spans two separate research programs with distinct hardware, kinematics, control objectives, and evaluation criteria.
1. Nomenclature and domain usage
The term appears in two principal forms: First-Imaging U-Arm, centered on dynamic musculoskeletal imaging, and U-ARM, centered on robot teleoperation and manipulation data collection. The overlap is lexical rather than architectural or methodological.
| System | Domain | Core function |
|---|---|---|
| First-Imaging U-Arm | CBCT and fluoroscopy | 4D joint analysis under physiological load |
| U-ARM | Robot teleoperation | Low-cost leader-follower manipulation interface |
The imaging system is defined by a dual robotic arm CBCT configuration, deep learning-based preprocessing, simulated 3D-to-2D projection, iterative registration, and clinical kinematic biomarkers such as tibial plateau motion and medial-lateral variance (Tang et al., 22 Aug 2025). The teleoperation system is defined by three 3D-printed leader-arm configurations, direct joint-angle mapping, low-cost electronics, and comparative experiments against Joycon in tabletop manipulation tasks (Zou et al., 2 Sep 2025).
This distribution of meanings suggests that “U-Arm” functions as a cross-domain label rather than a stable technical standard. In practice, disambiguation depends on surrounding terms such as CBCT, 2D-3D registration, teleoperation, or leader-follower.
2. First-Imaging U-Arm: hardware architecture and acquisition geometry
The First-Imaging U-Arm is an integrated 4D joint analysis platform intended to overcome the limitations of conventional CT for dynamic, weight-bearing joint motion. Its hardware core comprises two ultra-lightweight robotic arms, each approximately , mounted to a rigid floor base; one arm carries the X-ray source and the other the flat-panel detector (Tang et al., 22 Aug 2025).
Each arm includes three rotary degrees of freedom at its shoulder (pan, tilt, and roll), together with one linear “z-lift” for vertical motion and one rotary bearing about the vertical axis. The total workspace in is , and the detector width is . By independently controlling each arm, an arbitrary gantry-free trajectory around a patient standing on the load-bearing platform can be programmed. The patient-centred coordinate system is right-handed, with $x$ directed to the patient’s left, $y$ along the posterior-anterior beam central ray, and $z$ in the inferior-superior vertical direction.
The source focal spot is nominal. The detector is an amorphous-silicon flat panel with pixels and pixel pitch, corresponding to a 0 field of view. Source and detector are fixed to the endpoints of their respective robotic arms, and optical encoders on each joint report the instantaneous 3D pose of source and detector as extrinsic calibration.
The static 3D CBCT trajectory is a 1 reverse spiral around the patient, parameterized by
2
with scan duration 3 and vertical step size 4, reported as 5 per projection. Dynamic 2D fluoroscopy uses single-plane lateral projections acquired at high frame rate, for example 6, while the subject performs controlled knee flexion; the angular span is limited to approximately 7 of arm swing to maximize temporal resolution.
The acquisition sequence is explicitly two-stage. Step 1 acquires the static CBCT at 8, 9, and 0 per projection, with approximately 400 projections over 1, for a total skin dose of approximately 2. Step 2 immediately records a fluoroscopy sequence at 3 for 4, about 90 frames, with each frame using 5, 6, and 7, adding approximately 8. The total radiation burden per 4D study is reported as 9, roughly one-third of a conventional diagnostic knee CT at approximately 0 and also stated as less than 1 of a standard knee CT (Tang et al., 22 Aug 2025).
3. First-Imaging U-Arm: hybrid 4D reconstruction and kinematic quantification
The imaging pipeline combines static 3D CBCT with dynamic 2D X-rays through deep learning-based preprocessing, projection simulation, and iterative registration. In 2D, segmentation is performed by an adapted nnU-Net that enforces temporal consistency by propagating the manual segmentation of frame 1 through subsequent frames. In 3D, TotalSegmentator automatically labels the femur, patella, and tibia-fibula complex (Tang et al., 22 Aug 2025).
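For simulated 3D-to-2D projection, the platform employs DeepDRR. Let the CBCT voxel volume be $V$. The pipeline decomposes the volume into material attenuations with a CNN, performs spectrum-aware ray tracing along each detector ray $\ell_u$, and augments the forward model with scatter and noise, giving a model of the form
$$
I^{\mathrm{DRR}}(u) = \mathcal{A}[V](\ell_u) + S(u) + N(u),
$$
where $\mathcal{A}[V](\ell_u)$ denotes the spectrum-aware attenuation integral of the material-decomposed volume along ray $\ell_u$, $S(u)$ is scatter estimated by a second CNN, and $N(u)$ is Poisson plus electronic noise. The output is a digitally reconstructed radiograph (DRR) intended to match the real fluoroscopy distribution.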
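Registration treats each bone $b$ as a rigid body with 6-DoF transform
$$
T_b = \begin{bmatrix} R_b & \mathbf{t}_b \\ \mathbf{0}^\top & 1 \end{bmatrix}, \qquad R_b \in SO(3),\ \mathbf{t}_b \in \mathbb{R}^3 .
$$
For each frame $t$, the objective is to find the transforms $T_b(t)$ that maximize similarity between the real segmented X-ray $I_t$ and the corresponding DRR $I^{\mathrm{DRR}}_t$. Similarity is measured with normalized cross-correlation:
$$
\mathrm{NCC}\big(I_t, I^{\mathrm{DRR}}_t\big) = \frac{\sum_{u}\big(I_t(u)-\bar{I}_t\big)\big(I^{\mathrm{DRR}}_t(u)-\bar{I}^{\mathrm{DRR}}_t\big)}{\sqrt{\sum_{u}\big(I_t(u)-\bar{I}_t\big)^2}\,\sqrt{\sum_{u}\big(I^{\mathrm{DRR}}_t(u)-\bar{I}^{\mathrm{DRR}}_t\big)^2}},
$$
where $u$ indexes detector pixels and bars denote image means.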
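As a minimal sketch of the similarity measure, the snippet below computes normalized cross-correlation between two small images stored as nested lists. The function name `ncc` and the toy images are illustrative assumptions, not part of the cited pipeline, which operates on segmented fluoroscopy frames and DeepDRR output.

```python
import math

def ncc(image_a, image_b):
    """Normalized cross-correlation of two equally sized 2D images (lists of rows)."""
    xs = [v for row in image_a for v in row]
    ys = [v for row in image_b for v in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("images must be non-empty and the same size")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                    * sum((y - mean_y) ** 2 for y in ys))
    # A constant image has zero variance; report no correlation in that case.
    return num / den if den > 0 else 0.0

# Toy data (hypothetical): a "real" image and two candidate projections.
xray = [[0, 1, 2], [1, 3, 1], [2, 1, 0]]
drr_matched = [[10 + 5 * v for v in row] for row in xray]   # same pattern, other gain/offset
drr_inverted = [[2 - v for v in row] for row in xray]       # inverted contrast

score_matched = ncc(xray, drr_matched)
score_inverted = ncc(xray, drr_inverted)
```

Because NCC subtracts the mean and divides by the standard deviations, it is insensitive to global gain and offset differences between the real and simulated images, which is one reason it is a common choice for intensity-based 2D-3D registration.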
The BoneAxis-Reg workflow proceeds in three stages. First, PCA-derived principal mechanical axes are used with differential evolution for global initialization. Second, a Kinematic Priority Module (KPM) updates transforms with an update of the form
$$
T_b(t) \leftarrow \Delta T_{\mathrm{prior}}\, T_b(t-1),
$$
where $\Delta T_{\mathrm{prior}}$ encodes clinically plausible flexion-extension steps. Third, local refinement uses a hybrid Powell plus Nelder-Mead simplex search to maximize NCC.
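At time $t$, the reconstructed 4D volume is obtained by applying the recovered rigid transforms to static CBCT sub-volumes:
$$
V_{4\mathrm{D}}(t) = \bigcup_{b} T_b(t) \circ V_b ,
$$
where $V_b$ is the segmented sub-volume of bone $b$ in the static CBCT.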
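The idea of posing each static bone with its own per-frame rigid transform can be sketched as follows. This is an illustration on point sets rather than voxel sub-volumes, and the Euler-angle convention (R = Rz·Ry·Rx), the parameter ordering, and the names `apply_rigid` and `pose_bones` are assumptions made for the example, not details from the source.

```python
import math

def euler_to_matrix(rx, ry, rz):
    """Rotation matrix R = Rz(rz) * Ry(ry) * Rx(rx); angles in radians."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def apply_rigid(params, points):
    """Apply a 6-DoF transform (rx, ry, rz, tx, ty, tz) to a list of 3D points."""
    rx, ry, rz, tx, ty, tz = params
    rot = euler_to_matrix(rx, ry, rz)
    trans = (tx, ty, tz)
    return [
        tuple(sum(rot[i][k] * p[k] for k in range(3)) + trans[i] for i in range(3))
        for p in points
    ]

def pose_bones(bone_points, transforms):
    """Move every bone's static points by that bone's transform for one frame."""
    return {name: apply_rigid(transforms[name], pts)
            for name, pts in bone_points.items()}

# Hypothetical single-point "bones" and one frame of recovered transforms.
bones = {"femur": [(0.0, 1.0, 0.0)], "tibia": [(0.0, 1.0, 0.0)]}
frame = {
    "femur": (0.0, 0.0, 0.0, 0.0, 0.0, 0.0),            # identity
    "tibia": (math.pi / 2, 0.0, 0.0, 0.0, 0.0, 0.5),    # 90 deg about x, then lift 0.5
}
posed = pose_bones(bones, frame)
```

Repeating `pose_bones` for every fluoroscopy frame yields a time series of posed bones, which is the point-set analogue of the 4D volume described above.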
The reported simulation performance with the KPM enabled is a target registration error of 7 and a registration success rate of 8, with success defined as 9. The abstract characterizes this as sub-voxel accuracy and states that it outperforms conventional and state-of-the-art registration approaches (Tang et al., 22 Aug 2025).
Clinical quantification is organized around the tibial plateau (TP)-condyle distance, medial-lateral difference (MLD), and distance difference variance. For each frame, the femoral condyle lowest points are projected onto the tibial articular plane and perpendicular distances $d_{\mathrm{med}}$ and $d_{\mathrm{lat}}$ are measured. The medial-lateral difference is
$$
\mathrm{MLD}(t) = d_{\mathrm{med}}(t) - d_{\mathrm{lat}}(t),
$$
and the distance difference variance is the temporal variance of MLD across frames. Trajectory analysis connects the contact points of successive frames into a 2D path; deviations from an ideal arc indicate malalignment or instability. Clinical evaluation further demonstrated accurate quantification of tibial plateau motion and medial-lateral variance in post-total knee arthroplasty patients.
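The two scalar biomarkers can be illustrated with a few lines of code. The sketch assumes a signed medial-minus-lateral difference and a population variance over frames; the source does not state the sign convention or the variance normalization, and the distance values below are made up for the example.

```python
from statistics import pvariance

def medial_lateral_difference(d_medial, d_lateral):
    """Per-frame difference between medial and lateral TP-condyle distances."""
    if len(d_medial) != len(d_lateral):
        raise ValueError("both distance series must cover the same frames")
    return [m - l for m, l in zip(d_medial, d_lateral)]

def distance_difference_variance(mld_series):
    """Temporal (population) variance of the MLD series."""
    return pvariance(mld_series)

# Hypothetical per-frame distances (arbitrary length units).
d_med = [4.0, 4.5, 5.0, 4.5]
d_lat = [4.0, 4.0, 4.0, 4.0]

mld = medial_lateral_difference(d_med, d_lat)
ddv = distance_difference_variance(mld)
```

A flat MLD series gives a variance of zero, so larger values indicate that the medial and lateral compartments move unevenly over the flexion cycle.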
4. U-ARM teleoperation system: mechanical configurations and cost structure
The teleoperation U-ARM is a low-cost and rapidly adaptable leader-follower framework designed to interface with most commercially available robotic arms. It supports teleoperation through three structurally distinct 3D-printed leader arms that share consistent control logic and standardized joint ordering, permitting compatibility with multiple commercial follower-arm families despite differing link lengths (Zou et al., 2 Sep 2025).
The three configurations correspond to common industrial 6-DoF and 7-DoF kinematic families. Config-1 is a 6-DoF XArm/Fanuc-style arm with revolute joints about 4, 5, 6, 7, 8, and 9. Its link lengths are scaled to tabletop use, with upper arm approximately 0, forearm approximately 1, and wrist stack approximately 2 total. The joint ranges are 3, 4, 5, 6, 7, and 8 degrees, and compatible robots include XArm6, Fanuc LR Mate, and KUKA LBR.
Config-2 is a 6-DoF UR-style arm with joint order 9, stated to be identical to UR5 after swapping joints 5 and 6 in the CAD for operator comfort. Its link lengths are upper arm approximately 0, forearm approximately 1, and wrist approximately 2. Joint ranges are 3, 4, 5, 6, 7, and 8 degrees. Compatible robots include UR5, Dobot CR5, and AUBO i5.
Config-3 is a 7-DoF Franka-style arm with anthropomorphic sequence 9. The shoulder-elbow distance is approximately 0, elbow-wrist approximately 1, and wrist stack approximately 2. Joint ranges are 3, 4, 5, 6, 7, 8, and 9 degrees. Compatible robots include Franka Emika Panda and Flexiv Rizon.
All structural parts are printed in PLA with at least 0 wall thickness and dual-axis fixation plates at every joint for durability under repeated use. The 6-DoF bill of materials is itemized as six Zhongling 1 servo encoders with gears removed at 2, a wiring harness and JST connectors at 3, and miscellaneous hardware including M2/M3 screws and bearings at 4. The 7-DoF version adds one extra servo encoder and extra screws, yielding 5. The abstract reports these as the BOM cost for the 6-DoF and 7-DoF versions, respectively (Zou et al., 2 Sep 2025).
5. U-ARM teleoperation system: kinematic model, mapping, and redundancy handling
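The kinematic model is expressed with standard Denavit-Hartenberg (DH) parameters. For a generic anthropomorphic arm with $n = 6$ or $n = 7$ joints, joint angles $\theta_i$, link lengths $a_i$, link twists $\alpha_i$, and link offsets $d_i$, the homogeneous transform from frame $i-1$ to frame $i$ is
$$
A_i = \begin{bmatrix}
c_{\theta_i} & -s_{\theta_i} c_{\alpha_i} & s_{\theta_i} s_{\alpha_i} & a_i c_{\theta_i} \\
s_{\theta_i} & c_{\theta_i} c_{\alpha_i} & -c_{\theta_i} s_{\alpha_i} & a_i s_{\theta_i} \\
0 & s_{\alpha_i} & c_{\alpha_i} & d_i \\
0 & 0 & 0 & 1
\end{bmatrix},
$$
where $c_x = \cos x$ and $s_x = \sin x$. The end-effector pose is
$$
T_0^n = A_1 A_2 \cdots A_n .
$$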
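A short sketch of forward kinematics with the standard DH convention is given below. The two-link planar DH table is a hypothetical stand-in chosen so the result is easy to check by hand; it is not the Config-1 table from the paper, and the helper names are illustrative.

```python
import math

def dh_transform(theta, d, a, alpha):
    """Standard DH homogeneous transform from frame i-1 to frame i."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul4(m1, m2):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_kinematics(dh_rows, joint_angles):
    """Chain the per-joint transforms; each row is (theta_offset, d, a, alpha)."""
    pose = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for (theta_offset, d, a, alpha), q in zip(dh_rows, joint_angles):
        pose = matmul4(pose, dh_transform(q + theta_offset, d, a, alpha))
    return pose

# Hypothetical planar arm: link lengths 1.0 and 0.5, no twists or offsets.
DH_TABLE = [(0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.5, 0.0)]
pose = forward_kinematics(DH_TABLE, [math.pi / 2, 0.0])
position = (pose[0][3], pose[1][3], pose[2][3])
```

With the first joint at 90 degrees and the second at zero, both links point along the base-frame y axis, so the end-effector sits 1.5 units up that axis.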
An example DH table is given for Config-1, including 05, 06, 07, and angle offsets such as 08 and a terminal 09 (Zou et al., 2 Sep 2025).
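The spatial Jacobian $J(q)$ maps joint rates $\dot{q}$ to the end-effector twist $[v^\top\ \omega^\top]^\top$ by
$$
\begin{bmatrix} v \\ \omega \end{bmatrix} = J(q)\,\dot{q},
$$
with columns
$$
J_i = \begin{bmatrix} z_{i-1} \times (p_n - p_{i-1}) \\ z_{i-1} \end{bmatrix},
$$
where $z_{i-1}$ is the axis unit vector of joint $i$ in the base frame and $p_{i-1}$, $p_n$ are link origins obtained from forward kinematics. For 6-DoF arms, inverse kinematics admits the closed-form Pieper solution, first solving the wrist position from the first three joints and then using spherical-wrist decomposition for the last three. For the 7-DoF configuration, the system applies a damped-least-squares (DLS) solver,
$$
\Delta q = J^\top \big(J J^\top + \lambda^2 I\big)^{-1} \Delta x,
$$
with the damping factor $\lambda$ tuned to avoid singularities.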
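To make the damped-least-squares update concrete, here is a minimal sketch on a redundant planar three-link arm with a 2D position task, so that the matrix to invert is only 2x2. The arm geometry, the damping value `lam=0.1`, the step clamp, and all function names are assumptions for illustration; they are not the Config-3 kinematics or the tuned damping from the paper.

```python
import math

def forward(q, links):
    """End-effector (x, y) of a planar serial arm with relative joint angles q."""
    x = y = phi = 0.0
    for qi, li in zip(q, links):
        phi += qi
        x += li * math.cos(phi)
        y += li * math.sin(phi)
    return x, y

def jacobian(q, links):
    """2 x n position Jacobian of the planar arm."""
    n = len(q)
    phis, phi = [], 0.0
    for qi in q:
        phi += qi
        phis.append(phi)
    jac = [[0.0] * n, [0.0] * n]
    for i in range(n):
        for j in range(i, n):  # joint i moves every link j >= i
            jac[0][i] -= links[j] * math.sin(phis[j])
            jac[1][i] += links[j] * math.cos(phis[j])
    return jac

def dls_step(jac, err, lam):
    """dq = J^T (J J^T + lam^2 I)^-1 err for a two-row Jacobian."""
    n = len(jac[0])
    a = sum(v * v for v in jac[0]) + lam * lam
    b = sum(jac[0][k] * jac[1][k] for k in range(n))
    d = sum(v * v for v in jac[1]) + lam * lam
    det = a * d - b * b  # strictly positive whenever lam > 0
    y0 = (d * err[0] - b * err[1]) / det
    y1 = (a * err[1] - b * err[0]) / det
    return [jac[0][k] * y0 + jac[1][k] * y1 for k in range(n)]

def solve_ik(target, q0, links, lam=0.1, iters=200, max_step=0.2):
    """Iterate DLS updates, clamping the task-space error per step."""
    q = list(q0)
    for _ in range(iters):
        x, y = forward(q, links)
        ex, ey = target[0] - x, target[1] - y
        norm = math.hypot(ex, ey)
        if norm > max_step:
            ex, ey = ex * max_step / norm, ey * max_step / norm
        dq = dls_step(jacobian(q, links), (ex, ey), lam)
        q = [qi + dqi for qi, dqi in zip(q, dq)]
    return q

LINKS = [1.0, 1.0, 0.5]      # hypothetical link lengths
TARGET = (1.0, 1.0)          # reachable target inside the workspace
q_solution = solve_ik(TARGET, [0.3, 0.3, 0.3], LINKS)
reached = forward(q_solution, LINKS)
```

The damping term keeps the inverted matrix well conditioned even when the Jacobian loses rank, at the cost of slightly slower convergence far from singularities. For a redundant arm the same update returns the damped minimum-norm joint step among those that reduce the task error.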
Control is based on direct joint-angle mapping. After calibrating leader and follower to nominal home poses $q^{L}_{0}$ and $q^{F}_{0}$, each sampled leader angle $q^{L}_{i}$ is offset and scaled into a follower command:
$$
\Delta q_i = q^{L}_{i} - q^{L}_{0,i},
$$
$$
q^{F}_{i} = q^{F}_{0,i} + s_i\,\Delta q_i,
$$
where $s_i$ compensates for different joint limit spans. Commands are sent to the follower over a low-latency USB link.
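The offset-and-scale mapping can be sketched in a few lines. The clamp to follower joint limits is an added safety assumption that the source does not describe, and the function name, angles, scales, and limits are hypothetical values chosen for the example.

```python
def map_leader_to_follower(leader_angles, leader_home, follower_home, scales, limits):
    """Map sampled leader joint angles to follower commands, joint by joint.

    command_i = follower_home_i + scale_i * (leader_i - leader_home_i),
    then clamped to the follower's (low, high) limit for that joint.
    """
    commands = []
    for q, q_home_l, q_home_f, scale, (low, high) in zip(
        leader_angles, leader_home, follower_home, scales, limits
    ):
        command = q_home_f + scale * (q - q_home_l)
        commands.append(min(max(command, low), high))  # assumed safety clamp
    return commands

# Hypothetical two-joint example, angles in degrees.
commands = map_leader_to_follower(
    leader_angles=[10.0, -20.0],
    leader_home=[0.0, 0.0],
    follower_home=[90.0, 0.0],
    scales=[1.0, 0.5],
    limits=[(-180.0, 180.0), (-5.0, 5.0)],
)
```

Because the mapping works joint by joint, no inverse kinematics is needed on the control path as long as leader and follower share the same joint ordering.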
To mitigate encoder jitter and ensure smooth motion, each joint command passes through a dead-zone filter that suppresses changes below a per-joint threshold, and then through interpolation into ten sub-steps. No explicit null-space projection is needed for 6-DoF systems. For the 7-DoF arm, the DLS solver is used both for inverse kinematics and for redundancy resolution, described as effectively projecting joint updates into the manipulability-maximizing subspace. Drift and latency control rely on four measures: encoder zeroing by manual adjustment to the midpoint of the encoder range before assembly, per-joint filtering thresholds, temporal interpolation, and periodic recalibration in which the follower returns to its known home pose and the leader is re-zeroed at the start of each demo (Zou et al., 2 Sep 2025).
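The dead-zone filter and the ten-step interpolation might look like the sketch below. The threshold value and function names are illustrative assumptions; the source specifies per-joint thresholds and ten sub-steps but this example does not reproduce its actual settings.

```python
def dead_zone(previous, new, threshold):
    """Keep the previous command when the change is smaller than the threshold."""
    return previous if abs(new - previous) < threshold else new

def interpolate(start, end, steps=10):
    """Split one command change into evenly spaced sub-steps ending at `end`."""
    return [start + (end - start) * k / steps for k in range(1, steps + 1)]

def smooth_command(previous, new, threshold, steps=10):
    """Apply the dead-zone, then return the sub-step sequence to send."""
    target = dead_zone(previous, new, threshold)
    if target == previous:
        return []  # jitter only: nothing to send
    return interpolate(previous, target, steps)

# Hypothetical single-joint values in degrees.
jitter = smooth_command(previous=10.0, new=10.3, threshold=0.5)
motion = smooth_command(previous=0.0, new=10.0, threshold=0.5)
```

Filtering before interpolating means small encoder noise never reaches the follower, while genuine motions are spread over several small commands instead of one jump.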
6. Evaluation, open-source artifacts, and limitations
The First-Imaging U-Arm is evaluated by registration accuracy, success rate, dosimetry, and clinically interpretable kinematic outputs. Its reported target registration error of 30, registration success rate of 31, and total dose of 32 are positioned in the source as enabling fast, accurate, and low-dose dynamic joint imaging for biomechanical research, precision diagnostics, and personalized orthopedic care (Tang et al., 22 Aug 2025). The stated advantages include gantry-free upright scanning under physiological load, customizable 3D trajectories that can avoid obstacles such as implants, and a unified platform extending from raw data to kinematic biomarkers. The mention of applications such as early cartilage wear and ligament imbalance is framed as quantitative biomechanics enabled by sub-voxel registration accuracy.
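The teleoperation U-ARM is evaluated against a standard Nintendo Joycon in five tabletop tasks that involve moving bottles, stacking cans, and retrieving blocks. Two metrics are defined over 50 trials per device: success rate,
$$
\mathrm{SR} = \frac{N_{\mathrm{success}}}{N_{\mathrm{total}}} \times 100\%,
$$
and average collection time $\bar{t}$. Data-collection efficiency is defined as inverse time per demonstration, or relatively as
$$
\eta_{\mathrm{rel}} = \frac{\bar{t}_{\mathrm{Joycon}} - \bar{t}_{\mathrm{U\text{-}ARM}}}{\bar{t}_{\mathrm{Joycon}}} \times 100\%.
$$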
The reported task-wise results are as follows:
| Task | U-ARM mean time (s) / success rate (%) | Joycon mean time (s) / success rate (%) |
|---|---|---|
| Fanta-from-shelf-2 | 14.43 / 88.8 | 27.85 / 94.0 |
| Oreo-from-shelf-1 | 11.28 / 88.5 | 22.23 / 100.0 |
| Fanta-to-shelf-2 | 19.88 / 72.2 | 31.90 / 60.0 |
| Can-stacking | 20.93 / 39.6 | 31.35 / 64.0 |
| Block-from-litterbox | 21.99 / 90.0 | 31.89 / 96.0 |
| Average | 17.70 / 75.8 | 29.04 / 83.0 |
Overall, U-ARM reduced average demonstration time by approximately 39% (17.70 s versus 29.04 s), with an absolute drop in success rate of about 7 percentage points (75.8% versus 83.0%), attributed primarily to very fine-grasp tasks (Zou et al., 2 Sep 2025). The abstract correspondingly describes 39% higher data collection efficiency and comparable task success rates across multiple manipulation scenarios.
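The summary figures can be re-derived from the per-task rows of the table. The sketch below averages the reported times and success rates and computes the relative time reduction. Treating "efficiency gain" as relative time reduction is an assumption that happens to match the 39% figure, and the variable names are illustrative.

```python
# (U-ARM time s, U-ARM success %, Joycon time s, Joycon success %) per task,
# copied from the table above.
RESULTS = {
    "Fanta-from-shelf-2": (14.43, 88.8, 27.85, 94.0),
    "Oreo-from-shelf-1": (11.28, 88.5, 22.23, 100.0),
    "Fanta-to-shelf-2": (19.88, 72.2, 31.90, 60.0),
    "Can-stacking": (20.93, 39.6, 31.35, 64.0),
    "Block-from-litterbox": (21.99, 90.0, 31.89, 96.0),
}

def column_mean(column):
    """Mean of one column of RESULTS across all tasks."""
    values = [row[column] for row in RESULTS.values()]
    return sum(values) / len(values)

uarm_time = column_mean(0)
uarm_success = column_mean(1)
joycon_time = column_mean(2)
joycon_success = column_mean(3)

# Relative reduction in mean demonstration time, in percent.
time_reduction = (joycon_time - uarm_time) / joycon_time * 100.0
# Absolute difference in mean success rate, in percentage points.
success_drop = joycon_success - uarm_success
```

The recomputed means agree with the table's Average row to within a few tenths, and the time reduction rounds to the 39% quoted in the abstract.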
Open-source availability is a defining feature of the teleoperation system. The repository at https://github.com/MINT-SJTU/LeRobot-Anything-U-Arm includes CAD files for all three configurations, firmware on STM32, simulation support with SAPIEN and ManiSkill examples for UR5, Panda, XArm, and SO100, and a dataset folder with more than 200 real-world teleoperation recordings containing joint trajectories and RGB-D streams. The paper also states that real-world manipulation data collected with U-ARM have been open-sourced (Zou et al., 2 Sep 2025).
Known limitations are explicitly enumerated for the teleoperation platform: connector looseness over time, no active gravity compensation, and a workspace and joint ranges tailored to tabletop tasks only. Planned extensions include low-cost IMUs or strain gauges for richer haptic feedback and automatic gravity balancing, automated calibration using vision-based pose estimation, dual-arm leader devices for bimanual tasks, and real-time adaptive scaling for differing commercial follower geometries. For the imaging platform, the available text does not present an analogous limitations list, but its framing around upright load-bearing acquisition, low dose, and implant-aware trajectory customization suggests a design targeted at clinical constraints rather than conventional gantry geometries.
Taken together, the 2025 usage of “U-Arm” covers two distinct technical lineages: one in robotic medical imaging and one in robot teleoperation. The shared label does not imply shared methodology. In one case, the central problem is 4D reconstruction from CBCT and fluoroscopy under dose and motion constraints; in the other, it is low-cost embodiment of leader-follower control for scalable manipulation data collection.