
U-Arm: Imaging & Teleoperation Platforms

Updated 4 July 2026
  • U-Arm is a term encompassing two distinct systems: a dual-robotic arm imaging platform for dynamic 4D joint assessment and a 3D-printed leader-follower teleoperation interface.
  • The imaging system combines deep learning-enhanced 2D-3D registration, reported as sub-voxel accurate, with a low-dose acquisition protocol, while the teleoperation system uses cost-effective hardware with direct joint-angle mapping.
  • Both systems illustrate cross-domain innovation, sharing nomenclature despite their distinct hardware configurations, control objectives, and evaluation metrics.

U-Arm is a term used in 2025 arXiv literature for two technically unrelated systems. In medical imaging, the First-Imaging U-Arm denotes a dual-robotic-arm cone-beam CT platform for upright, load-bearing, four-dimensional joint assessment via uni-plane X-ray and 2D-3D registration (Tang et al., 22 Aug 2025). In robot manipulation, U-ARM denotes an ultra low-cost, rapidly adaptable leader-follower teleoperation interface built around 3D-printed leader arms for commercially available robot manipulators (Zou et al., 2 Sep 2025). A common misconception is that the name refers to a single platform; in the cited literature, it instead spans two separate research programs with distinct hardware, kinematics, control objectives, and evaluation criteria.

1. Nomenclature and domain usage

The term appears in two principal forms: First-Imaging U-Arm, centered on dynamic musculoskeletal imaging, and U-ARM, centered on robot teleoperation and manipulation data collection. The overlap is lexical rather than architectural or methodological.

| System | Domain | Core function |
|---|---|---|
| First-Imaging U-Arm | CBCT and fluoroscopy | 4D joint analysis under physiological load |
| U-ARM | Robot teleoperation | Low-cost leader-follower manipulation interface |

The imaging system is defined by a dual robotic arm CBCT configuration, deep learning-based preprocessing, simulated 3D-to-2D projection, iterative registration, and clinical kinematic biomarkers such as tibial plateau motion and medial-lateral variance (Tang et al., 22 Aug 2025). The teleoperation system is defined by three 3D-printed leader-arm configurations, direct joint-angle mapping, low-cost electronics, and comparative experiments against Joycon in tabletop manipulation tasks (Zou et al., 2 Sep 2025).

This distribution of meanings suggests that “U-Arm” functions as a cross-domain label rather than a stable technical standard. In practice, disambiguation depends on surrounding terms such as CBCT, 2D-3D registration, teleoperation, or leader-follower.

2. First-Imaging U-Arm: hardware architecture and acquisition geometry

The First-Imaging U-Arm is an integrated 4D joint analysis platform intended to overcome the limitations of conventional CT for dynamic, weight-bearing joint motion. Its hardware core comprises two ultra-lightweight robotic arms, each approximately $16\,\mathrm{kg}$, mounted to a rigid floor base; one arm carries the X-ray source and the other the flat-panel detector (Tang et al., 22 Aug 2025).

Each arm includes three rotary degrees of freedom at its shoulder (pan, tilt, and roll), together with one linear "z-lift" for vertical motion and one rotary bearing about the vertical axis. The total workspace in $z$ is $0\ldots180\,\mathrm{cm}$, and the detector width is $61\,\mathrm{cm}$. By independently controlling each arm, an arbitrary gantry-free trajectory around a patient standing on the load-bearing platform can be programmed. The patient-centred coordinate system is right-handed, with $X$ directed to the patient's left, $Y$ along the posterior-anterior beam central ray, and $Z$ in the inferior-superior vertical direction.

The source focal spot is $0.3\,\mathrm{mm}$ nominal. The detector is an amorphous-silicon flat panel with $1024\times1024$ pixels and $0.3\,\mathrm{mm}$ pixel pitch, corresponding to a […] field of view. Source and detector are fixed to the endpoints of their respective robotic arms, and optical encoders on each joint report the instantaneous 3D pose of source and detector as extrinsic calibration.

The static 3D CBCT trajectory is a […] reverse spiral around the patient, parameterized by

[…]

with scan duration […] and vertical step size […], reported as […] per projection. Dynamic 2D fluoroscopy uses single-plane lateral projections acquired at high frame rate, for example […], while the subject performs controlled knee flexion; the angular span is limited to approximately […] of arm swing to maximize temporal resolution.

The acquisition sequence is explicitly two-stage. Step 1 acquires the static CBCT at […], […], and […] per projection, with approximately 400 projections over […], for a total skin dose of approximately […]. Step 2 immediately records a fluoroscopy sequence at […] for […], about 90 frames, with each frame using […], […], and […], adding approximately […]. The total radiation burden per 4D study is reported as […], roughly one-third of a conventional diagnostic knee CT at approximately […] and also stated as less than […] of a standard knee CT (Tang et al., 22 Aug 2025).

3. First-Imaging U-Arm: hybrid 4D reconstruction and kinematic quantification

The imaging pipeline combines static 3D CBCT with dynamic 2D X-rays through deep learning-based preprocessing, projection simulation, and iterative registration. In 2D, segmentation is performed by an adapted nnU-Net that enforces temporal consistency by propagating the manual segmentation of frame 1 through subsequent frames. In 3D, TotalSegmentator automatically labels the femur, patella, and tibia-fibula complex (Tang et al., 22 Aug 2025).

For simulated 3D-to-2D projection, the platform employs DeepDRR. Let the CBCT voxel volume be $V$. The pipeline decomposes the volume into material attenuations with a CNN, performs spectrum-aware ray tracing along each detector ray $\ell(\mathbf{u})$, and augments the forward model with scatter and noise:

$$I(\mathbf{u}) = \int I_0(E)\,\exp\!\Big(-\sum_{m}\mu_m(E)\int_{\ell(\mathbf{u})}\rho_m(\mathbf{x})\,\mathrm{d}s\Big)\,\mathrm{d}E \;+\; S(\mathbf{u}) \;+\; N(\mathbf{u})$$

where $I_0(E)$ is the source spectrum, $\mu_m(E)$ and $\rho_m$ are the attenuation coefficient and density of material $m$ obtained from the decomposition of $V$, $S$ is scatter estimated by a second CNN, and $N$ is Poisson plus electronic noise. The output is a digitally reconstructed radiograph (DRR) intended to match the real fluoroscopy distribution.
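The forward model can be illustrated with a deliberately simplified sketch. It is not DeepDRR and does not call that library: it assumes a monoenergetic beam, parallel rays (one per row of a 2D attenuation map), a constant scatter term in place of the CNN estimate, and Gaussian noise in place of Poisson plus electronic noise. The function name `simulate_projection` and all of its parameters are hypothetical.

```python
import math
import random


def simulate_projection(mu_rows, step_mm, i0=1.0, scatter=0.0,
                        noise_sigma=0.0, rng=None):
    """Toy parallel-beam projection: one ray per row of a 2D attenuation map.

    mu_rows     : list of rows of linear attenuation values (1/mm)
    step_mm     : path length through each voxel along the ray (mm)
    i0          : unattenuated intensity (monoenergetic, an assumption)
    scatter     : constant additive term standing in for the scatter estimate
    noise_sigma : std. dev. of additive Gaussian noise standing in for the
                  noise term (an assumption, not the Poisson model)
    """
    rng = rng or random.Random(0)
    detector = []
    for row in mu_rows:
        line_integral = sum(row) * step_mm        # discrete ray integral
        primary = i0 * math.exp(-line_integral)   # Beer-Lambert attenuation
        noise = rng.gauss(0.0, noise_sigma) if noise_sigma > 0 else 0.0
        detector.append(primary + scatter + noise)
    return detector


# Two rays: one through air, one through 2 mm of attenuating material.
phantom = [[0.0, 0.0], [0.1, 0.1]]
print(simulate_projection(phantom, step_mm=1.0, scatter=0.05))
```

A spectrum-aware version would repeat the exponential for each energy bin and each material and sum the results weighted by the source spectrum.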

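Registration treats each bone $b$ as a rigid body with 6-DoF transform

$$T_b = \begin{bmatrix} R_b & \mathbf{t}_b \\ \mathbf{0}^\top & 1 \end{bmatrix}, \qquad R_b \in SO(3),\ \mathbf{t}_b \in \mathbb{R}^3$$

For each frame $t$, the objective is to find the transforms $T_{b,t}$ that maximize similarity between the real segmented X-ray $I_t$ and the corresponding DRR $D_t$. Similarity is measured with normalized cross-correlation:

$$\mathrm{NCC}(I_t, D_t) = \frac{\sum_{\mathbf{u}} \big(I_t(\mathbf{u}) - \bar{I}_t\big)\big(D_t(\mathbf{u}) - \bar{D}_t\big)}{\sqrt{\sum_{\mathbf{u}} \big(I_t(\mathbf{u}) - \bar{I}_t\big)^2}\ \sqrt{\sum_{\mathbf{u}} \big(D_t(\mathbf{u}) - \bar{D}_t\big)^2}}$$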
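Normalized cross-correlation between a real frame and a rendered DRR can be sketched in a few lines. This is a minimal illustration on nested lists, not the paper's implementation; the function name `ncc` and the choice to return 0.0 for a constant image are assumptions.

```python
import math


def ncc(image_a, image_b):
    """Normalized cross-correlation of two equally sized 2D images."""
    a = [p for row in image_a for p in row]
    b = [p for row in image_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]   # zero-mean intensities
    db = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = (math.sqrt(sum(x * x for x in da))
                   * math.sqrt(sum(y * y for y in db)))
    if denominator == 0.0:
        return 0.0  # a constant image carries no structure to correlate
    return numerator / denominator


frame = [[0.0, 1.0], [2.0, 3.0]]
brighter = [[5.0, 7.0], [9.0, 11.0]]   # same pattern, different gain/offset
print(ncc(frame, brighter))
```

Because the measure subtracts the mean and divides by the norms, it is insensitive to global gain and offset differences between the fluoroscopy frame and the simulated projection, which is one reason it suits DRR-based registration.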

The BoneAxis-Reg workflow proceeds in three stages. First, PCA-derived principal mechanical axes are used with differential evolution for global initialization. Second, a Kinematic Priority Module updates transforms according to

[…]

where […] encodes clinically plausible flexion-extension steps. Third, local refinement uses a hybrid Powell plus Nelder-Mead simplex search to maximize NCC.

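At time $t$, the reconstructed 4D volume is obtained by applying the recovered rigid transforms to static CBCT sub-volumes:

$$V_t(\mathbf{x}) = \sum_{b} V_b\big(T_{b,t}^{-1}\,\mathbf{x}\big)$$

where $V_b$ is the static CBCT sub-volume of bone $b$ and $T_{b,t}$ is its recovered rigid transform at time $t$.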

The reported simulation performance with the KPM enabled is a target registration error of […] and a registration success rate of […], with success defined as […]. The abstract characterizes this as sub-voxel accuracy and states that it outperforms conventional and state-of-the-art registration approaches (Tang et al., 22 Aug 2025).

Clinical quantification is organized around the tibial plateau (TP)-condyle distance, medial-lateral difference (MLD), and distance difference variance. For each frame, the femoral condyle lowest points are projected onto the tibial articular plane and the perpendicular distances $d_{\mathrm{M}}$ (medial) and $d_{\mathrm{L}}$ (lateral) are measured. The medial-lateral difference is

$$\mathrm{MLD} = d_{\mathrm{M}} - d_{\mathrm{L}}$$

and the distance difference variance is the temporal variance of MLD across frames. Trajectory analysis connects successive contact points $\mathbf{c}_t$ into a 2D path; deviations from an ideal arc indicate malalignment or instability. Clinical evaluation further demonstrated accurate quantification of tibial plateau motion and medial-lateral variance in post-total knee arthroplasty patients.
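The per-frame measurements described above reduce to a point-to-plane distance, a difference, and a variance. The sketch below shows that arithmetic under stated assumptions: the sign convention (medial minus lateral), the use of population variance, and all function names and sample values are hypothetical and not taken from the paper.

```python
import math
from statistics import pvariance


def point_plane_distance(point, plane_point, normal):
    """Perpendicular distance from a 3D point to a plane."""
    norm = math.sqrt(sum(n * n for n in normal))
    offset = [p - q for p, q in zip(point, plane_point)]
    return abs(sum(o * n for o, n in zip(offset, normal))) / norm


def medial_lateral_difference(d_medial, d_lateral):
    """MLD for one frame; medial minus lateral is an assumed convention."""
    return d_medial - d_lateral


def distance_difference_variance(medial_series, lateral_series):
    """Return per-frame MLD values and their temporal (population) variance."""
    mld = [medial_lateral_difference(m, l)
           for m, l in zip(medial_series, lateral_series)]
    return mld, pvariance(mld)


# Hypothetical condyle-to-plateau distances (mm) over three frames.
medial = [5.0, 5.5, 6.0]
lateral = [4.0, 5.0, 6.0]
mld, ddv = distance_difference_variance(medial, lateral)
print(mld, ddv)
```

A low variance indicates that the two compartments move together through flexion; whether sample or population variance is intended is not stated in the available text.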

4. U-ARM teleoperation system: mechanical configurations and cost structure

The teleoperation U-ARM is a low-cost and rapidly adaptable leader-follower framework designed to interface with most commercially available robotic arms. It supports teleoperation through three structurally distinct 3D-printed leader arms that share consistent control logic and standardized joint ordering, permitting compatibility with multiple commercial follower-arm families despite differing link lengths (Zou et al., 2 Sep 2025).

The three configurations correspond to common industrial 6-DoF and 7-DoF kinematic families. Config-1 is a 6-DoF XArm/Fanuc-style arm with revolute joints about […], […], […], […], […], and […]. Its link lengths are scaled to tabletop use, with upper arm approximately […], forearm approximately […], and wrist stack approximately […] total. The joint ranges are […], […], […], […], […], and […] degrees, and compatible robots include XArm6, Fanuc LR Mate, and KUKA LBR.

Config-2 is a 6-DoF UR-style arm with joint order […], stated to be identical to UR5 after swapping joints 5 and 6 in the CAD for operator comfort. Its link lengths are upper arm approximately […], forearm approximately […], and wrist approximately […]. Joint ranges are […], […], […], […], […], and […] degrees. Compatible robots include UR5, Dobot CR5, and AUBO i5.

Config-3 is a 7-DoF Franka-style arm with anthropomorphic sequence […]. The shoulder-elbow distance is approximately […], elbow-wrist approximately […], and wrist stack approximately […]. Joint ranges are […], […], […], […], […], […], and […] degrees. Compatible robots include Franka Emika Panda and Flexiv Rizon.

All structural parts are printed in PLA with at least […] wall thickness and dual-axis fixation plates at every joint for durability under repeated use. The 6-DoF bill of materials is itemized as six Zhongling […] servo encoders with gears removed at […], a wiring harness and JST connectors at […], and miscellaneous hardware including M2/M3 screws and bearings at […]. The 7-DoF version adds one extra servo encoder and extra screws, yielding […]. The abstract reports these as the BOM cost for the 6-DoF and 7-DoF versions, respectively (Zou et al., 2 Sep 2025).

5. U-ARM teleoperation system: kinematic model, mapping, and redundancy handling

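The kinematic model is expressed with standard Denavit-Hartenberg parameters. For a generic $n$-DoF anthropomorphic arm, with $n=6$ or $n=7$, and joint angles $\theta_1,\ldots,\theta_n$, the homogeneous transform from frame $i-1$ to frame $i$ is

$$T_{i-1}^{\,i} = \begin{bmatrix} c_{\theta_i} & -s_{\theta_i} c_{\alpha_i} & s_{\theta_i} s_{\alpha_i} & a_i c_{\theta_i} \\ s_{\theta_i} & c_{\theta_i} c_{\alpha_i} & -c_{\theta_i} s_{\alpha_i} & a_i s_{\theta_i} \\ 0 & s_{\alpha_i} & c_{\alpha_i} & d_i \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

where $c_{\theta_i}=\cos\theta_i$, $s_{\theta_i}=\sin\theta_i$, and likewise for $\alpha_i$. The end-effector pose is

$$T_0^{\,n}(\theta) = \prod_{i=1}^{n} T_{i-1}^{\,i}(\theta_i)$$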

An example DH table is given for Config-1, including […], […], […], and angle offsets such as […] and a terminal […] (Zou et al., 2 Sep 2025).
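Forward kinematics from a Denavit-Hartenberg table can be sketched as a chain of 4×4 matrix products. The code below uses the standard DH convention; the two-link planar table is a hypothetical example chosen so the result is easy to check by hand, and is not the Config-1 table from the paper.

```python
import math


def dh_transform(theta, d, a, alpha):
    """Standard DH homogeneous transform from frame i-1 to frame i."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def forward_kinematics(dh_rows, joint_angles):
    """Chain the per-joint transforms; each row is (offset, d, a, alpha)."""
    pose = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    for (offset, d, a, alpha), q in zip(dh_rows, joint_angles):
        pose = matmul4(pose, dh_transform(q + offset, d, a, alpha))
    return pose


# Hypothetical two-link planar arm: link lengths 0.30 m and 0.25 m.
PLANAR_DH = [(0.0, 0.0, 0.30, 0.0), (0.0, 0.0, 0.25, 0.0)]
pose = forward_kinematics(PLANAR_DH, [math.pi / 2, 0.0])
print(pose[0][3], pose[1][3])  # end-effector x and y
```

The per-joint `offset` column plays the role of the angle offsets mentioned for the example DH table: it is added to the measured joint angle before the transform is built.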

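The spatial Jacobian $J(\theta)\in\mathbb{R}^{6\times n}$ maps joint rates to end-effector twist $\xi = [\mathbf{v}^\top,\ \boldsymbol{\omega}^\top]^\top$ by

$$\xi = J(\theta)\,\dot{\theta}$$

with columns

$$J_i = \begin{bmatrix} \mathbf{z}_i \times (\mathbf{p}_n - \mathbf{p}_i) \\ \mathbf{z}_i \end{bmatrix}$$

where $\mathbf{z}_i$ is the axis unit vector of joint $i$ in the base frame and $\mathbf{p}_i$ are link origins obtained from forward kinematics. For 6-DoF arms, inverse kinematics admit the closed-form Pieper solution, first solving the wrist position from the first three joints and then using spherical-wrist decomposition for the last three. For the 7-DoF configuration, the system applies a damped-least-squares solver,

$$\dot{\theta} = J^\top \big(J J^\top + \lambda^2 I\big)^{-1} \xi$$

with $\lambda$ tuned to […] to avoid singularities.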
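A damped-least-squares update can be illustrated on a small redundant arm. The sketch below uses a hypothetical planar three-link arm with a 2D position task, so the damped matrix is 2×2 and can be inverted in closed form; the link lengths, damping value, step gain, and function names are all assumptions, and the real system works with 6D twists on a 7-DoF arm.

```python
import math

LINKS = (0.30, 0.25, 0.10)  # hypothetical link lengths (m)


def fk(q):
    """End-effector (x, y) of the planar arm."""
    x = y = angle = 0.0
    for length, qi in zip(LINKS, q):
        angle += qi
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y


def jacobian(q):
    """2x3 position Jacobian of the planar arm."""
    cumulative, angle = [], 0.0
    for qi in q:
        angle += qi
        cumulative.append(angle)
    n = len(q)
    row_x = [-sum(LINKS[k] * math.sin(cumulative[k]) for k in range(i, n))
             for i in range(n)]
    row_y = [sum(LINKS[k] * math.cos(cumulative[k]) for k in range(i, n))
             for i in range(n)]
    return [row_x, row_y]


def dls_step(q, target, damping=0.05, gain=0.5):
    """One update q += gain * J^T (J J^T + damping^2 I)^-1 * error."""
    x, y = fk(q)
    ex, ey = target[0] - x, target[1] - y
    jx, jy = jacobian(q)
    a11 = sum(v * v for v in jx) + damping ** 2
    a22 = sum(v * v for v in jy) + damping ** 2
    a12 = sum(u * v for u, v in zip(jx, jy))
    det = a11 * a22 - a12 * a12          # positive whenever damping > 0
    wx = (a22 * ex - a12 * ey) / det     # solve the 2x2 system in closed form
    wy = (-a12 * ex + a11 * ey) / det
    return [qi + gain * (jx[i] * wx + jy[i] * wy) for i, qi in enumerate(q)]


def solve(q, target, iterations=200):
    for _ in range(iterations):
        q = dls_step(q, target)
    return q


q = solve([0.3, 0.3, 0.3], (0.4, 0.2))
print(fk(q))
```

The damping term keeps the matrix invertible even when the arm is fully stretched, where the undamped pseudo-inverse would blow up; the price is a slightly slower approach to the target.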

Control is based on direct joint-angle mapping. After calibrating leader and follower to nominal home poses $\theta^{L}_{0}$ and $\theta^{F}_{0}$, each sampled leader angle $\theta^{L}_{i}$ is offset and scaled into a follower command:

$$\Delta\theta_i = \theta^{L}_{i} - \theta^{L}_{0,i}$$

$$\theta^{F}_{i} = \theta^{F}_{0,i} + s_i\,\Delta\theta_i$$

where $s_i$ compensates for different joint limit spans. Commands are sent at […] over a low-latency USB link.
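The offset-and-scale mapping can be sketched as a per-joint affine function. The function name, the sample values, and in particular the clamping to follower joint limits are assumptions added for illustration; the source describes only the offset and the scale factor.

```python
def map_leader_to_follower(leader_angles, leader_home, follower_home,
                           scales, follower_limits):
    """Map sampled leader joint angles (degrees) to follower commands.

    Each joint is offset by its calibrated home pose and scaled; the final
    clamp to follower limits is an assumed safety step, not from the source.
    """
    commands = []
    for q_l, h_l, h_f, s, (low, high) in zip(
            leader_angles, leader_home, follower_home, scales,
            follower_limits):
        q_f = h_f + s * (q_l - h_l)              # offset, then scale
        commands.append(min(max(q_f, low), high))
    return commands


# Hypothetical two-joint example.
print(map_leader_to_follower(
    leader_angles=[10.0, -20.0],
    leader_home=[0.0, 0.0],
    follower_home=[90.0, 0.0],
    scales=[1.0, 2.0],
    follower_limits=[(-180.0, 180.0), (-30.0, 30.0)],
))
```

Because the mapping is joint-to-joint, no inverse kinematics is needed in the control loop as long as leader and follower share joint ordering.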

To mitigate encoder jitter and ensure smooth motion, each joint command passes through a dead-zone filter that suppresses changes smaller than […], and then through interpolation into ten sub-steps. No explicit null-space projection is needed for 6-DoF systems. For the 7-DoF arm, the DLS solver is used both for inverse kinematics and for redundancy resolution, described as effectively projecting joint updates into the manipulability-maximizing subspace. Drift and latency control rely on encoder zeroing by manual adjustment to a […] midpoint before assembly, per-joint filtering thresholds, temporal interpolation, and periodic recalibration in which the follower returns to its known home pose and the leader is re-zeroed at the start of each demo (Zou et al., 2 Sep 2025).
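The dead-zone filter and ten-step interpolation can be sketched for a single joint as follows. The threshold value, the choice of linear interpolation, and the function names are assumptions; the source states only that small changes are suppressed and that commands are split into ten sub-steps.

```python
def dead_zone(previous, sample, threshold):
    """Hold the previous command unless the change reaches the threshold."""
    return previous if abs(sample - previous) < threshold else sample


def interpolate(start, end, steps=10):
    """Linear sub-steps from start (exclusive) to end (inclusive)."""
    return [start + (end - start) * k / steps for k in range(1, steps + 1)]


def smooth_stream(samples, threshold, steps=10):
    """Filter a stream of joint samples and expand accepted moves."""
    held = samples[0]
    commands = []
    for sample in samples[1:]:
        target = dead_zone(held, sample, threshold)
        if target != held:
            commands.extend(interpolate(held, target, steps))
            held = target
    return commands


# Hypothetical encoder trace (degrees): jitter of 0.1, then a real 5-degree move.
print(smooth_stream([0.0, 0.1, 5.0], threshold=0.5))
```

The jitter sample produces no command at all, while the genuine move is delivered to the follower as ten evenly spaced sub-steps.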

6. Evaluation, open-source artifacts, and limitations

The First-Imaging U-Arm is evaluated by registration accuracy, success rate, dosimetry, and clinically interpretable kinematic outputs. Its reported target registration error of […], registration success rate of […], and total dose of […] are positioned in the source as enabling fast, accurate, and low-dose dynamic joint imaging for biomechanical research, precision diagnostics, and personalized orthopedic care (Tang et al., 22 Aug 2025). The stated advantages include gantry-free upright scanning under physiological load, customizable 3D trajectories that can avoid obstacles such as implants, and a unified platform extending from raw data to kinematic biomarkers. The mention of applications such as early cartilage wear and ligament imbalance is framed as quantitative biomechanics enabled by sub-voxel registration accuracy.

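The teleoperation U-ARM is evaluated against a standard Nintendo Joycon in five tabletop tasks spanning moving bottles, stacking cans, and retrieving blocks. Two metrics are defined over 50 trials per device: success rate,

$$\mathrm{SR} = \frac{N_{\mathrm{success}}}{N_{\mathrm{total}}} \times 100\%$$

and average collection time $\bar{T}$. Data-collection efficiency is defined as inverse time per demonstration, or relatively as

$$\eta = \frac{\bar{T}_{\mathrm{Joycon}} - \bar{T}_{\mathrm{U\text{-}ARM}}}{\bar{T}_{\mathrm{Joycon}}}$$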

The reported task-wise results are as follows:

| Task | U-ARM time (s) | U-ARM success (%) | Joycon time (s) | Joycon success (%) |
|---|---|---|---|---|
| Fanta-from-shelf-2 | 14.43 | 88.8 | 27.85 | 94.0 |
| Oreo-from-shelf-1 | 11.28 | 88.5 | 22.23 | 100.0 |
| Fanta-to-shelf-2 | 19.88 | 72.2 | 31.90 | 60.0 |
| Can-stacking | 20.93 | 39.6 | 31.35 | 64.0 |
| Block-from-litterbox | 21.99 | 90.0 | 31.89 | 96.0 |
| Average | 17.70 | 75.8 | 29.04 | 83.0 |

Overall, U-ARM reduced average demonstration time by approximately 39% (17.70 s versus 29.04 s), with a drop in average success rate of about 7 percentage points (75.8% versus 83.0%), attributed primarily to very fine-grasp tasks (Zou et al., 2 Sep 2025). The abstract correspondingly describes 39% higher data collection efficiency and comparable task success rates across multiple manipulation scenarios.
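The summary figures can be recomputed from the task-wise rows. The sketch below copies the five task rows from the table and derives the averages and the relative time reduction; the dictionary layout and function names are hypothetical, and treating the relative measure as a fractional reduction in mean time is an assumption consistent with the reported 39%.

```python
# (time in s, success in %) for U-ARM and Joycon, copied from the table rows.
RESULTS = {
    "Fanta-from-shelf-2": ((14.43, 88.8), (27.85, 94.0)),
    "Oreo-from-shelf-1": ((11.28, 88.5), (22.23, 100.0)),
    "Fanta-to-shelf-2": ((19.88, 72.2), (31.90, 60.0)),
    "Can-stacking": ((20.93, 39.6), (31.35, 64.0)),
    "Block-from-litterbox": ((21.99, 90.0), (31.89, 96.0)),
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def summarize(results):
    """Average time and success per device, plus relative time reduction."""
    uarm_time = mean(u[0] for u, _ in results.values())
    joycon_time = mean(j[0] for _, j in results.values())
    return {
        "uarm_time": uarm_time,
        "joycon_time": joycon_time,
        "uarm_success": mean(u[1] for u, _ in results.values()),
        "joycon_success": mean(j[1] for _, j in results.values()),
        "time_reduction": (joycon_time - uarm_time) / joycon_time,
    }


for key, value in summarize(RESULTS).items():
    print(f"{key}: {value:.3f}")
```

Recomputing from the rows gives a time reduction of about 39%, matching the abstract. The mean Joycon success rate computed this way is 82.8%, slightly below the 83.0 shown in the Average row.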

Open-source availability is a defining feature of the teleoperation system. The repository at https://github.com/MINT-SJTU/LeRobot-Anything-U-Arm includes CAD files for all three configurations, firmware on STM32, simulation support with SAPIEN and ManiSkill examples for UR5, Panda, XArm, and SO100, and a dataset folder with more than 200 real-world teleoperation recordings containing joint trajectories and RGB-D streams. The paper also states that real-world manipulation data collected with U-ARM have been open-sourced (Zou et al., 2 Sep 2025).

Known limitations are explicitly enumerated for the teleoperation platform: connector looseness over time, no active gravity compensation, and limited workspace joint ranges tailored for tabletop tasks only. Planned extensions include low-cost IMUs or strain gauges for richer haptic feedback and automatic gravity balancing, automated calibration using vision-based pose estimation, dual-arm leader devices for bimanual tasks, and real-time adaptive scaling for differing commercial follower geometries. For the imaging platform, the available text does not present an analogous limitations list, but its framing around upright load-bearing acquisition, low dose, and implant-aware trajectory customization suggests a design targeted at clinical constraints rather than conventional gantry geometries.

Taken together, the 2025 usage of “U-Arm” covers two distinct technical lineages: one in robotic medical imaging and one in robot teleoperation. The shared label does not imply shared methodology. In one case, the central problem is 4D reconstruction from CBCT and fluoroscopy under dose and motion constraints; in the other, it is low-cost embodiment of leader-follower control for scalable manipulation data collection.
