Multi-Camera Asynchronous Ball Localization and Trajectory Prediction with Factor Graphs and Human Poses (2401.17185v1)

Published 30 Jan 2024 in cs.RO and cs.CV

Abstract: The rapid and precise localization and prediction of a ball are critical for developing agile robots in ball sports, particularly in sports like tennis characterized by high-speed ball movements and powerful spins. The Magnus effect induced by spin adds complexity to trajectory prediction during flight and bounce dynamics upon contact with the ground. In this study, we introduce an innovative approach that combines a multi-camera system with factor graphs for real-time and asynchronous 3D tennis ball localization. Additionally, we estimate hidden states like velocity and spin for trajectory prediction. Furthermore, to enhance spin inference early in the ball's flight, where limited observations are available, we integrate human pose data using a temporal convolutional network (TCN) to compute spin priors within the factor graph. This refinement provides more accurate spin priors at the beginning of the factor graph, leading to improved early-stage hidden state inference for prediction. Our result shows the trained TCN can predict the spin priors with RMSE of 5.27 Hz. Integrating TCN into the factor graph reduces the prediction error of landing positions by over 63.6% compared to a baseline method that utilized an adaptive extended Kalman filter.


Summary

The paper introduces a novel factor graph approach that fuses asynchronous multi-camera data with human pose-derived spin priors to enhance real-time tennis ball localization and trajectory prediction.
It combines physical dynamics in the factor graph with a temporal convolutional network (TCN), reducing landing position prediction error by over 63.6% relative to a baseline that uses an adaptive extended Kalman filter.
The study outlines limitations in current spin estimation from human pose data and suggests incorporating learnable bounce dynamics for further improvements.

Introduction

Object tracking is an established area of computer vision, and it is especially demanding when the objects are small and fast-moving, such as balls in sports. Agile robots that can track and intercept such objects are of growing interest, and tennis is a particularly hard case because of the ball's high speed and spin. Previous research has addressed the problem with synchronized multi-camera systems and temporal filters, with varying degrees of success. Such systems often struggle with precise localization and robust trajectory prediction, especially when the ball carries heavy spin.

Factor Graphs and Prediction

The paper combines a factor graph with a multi-camera system to perform real-time, asynchronous localization and trajectory prediction of a tennis ball. Factor graphs, widely used in robotics for problems such as SLAM, estimate hidden states by linking measurements (here, camera detections) to state variables at different points in time. The primary contribution is the real-time estimation of the ball's position, velocity, and spin.
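To make the asynchronous idea concrete, here is a minimal sketch, not the paper's factor graph. It assumes a drag-free, spin-free ballistic model and known pinhole camera matrices, and solves one batch least-squares problem in which every detection keeps its own timestamp. The camera poses, intrinsics, frame rates, and the names `camera_matrix`, `project`, and `fit_ballistic` are all illustrative assumptions.

```python
import numpy as np

G = np.array([0.0, 0.0, -9.81])  # gravity in a z-up world frame, m/s^2

def camera_matrix(K, R, center):
    """3x4 pinhole projection matrix P = K [R | t], with t = -R @ center."""
    t = -R @ center
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(P, X):
    """Project a 3D world point to pixel coordinates (u, v)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def fit_ballistic(detections):
    """Estimate initial position p0 and velocity v0 from asynchronous detections.

    detections: iterable of (t, P, (u, v)); each entry has its own timestamp
    and camera. Assumed model: X(t) = p0 + v0*t + 0.5*G*t^2 (no drag, no spin).
    """
    rows, rhs = [], []
    for t, P, (u, v) in detections:
        for a in (u * P[2] - P[0], v * P[2] - P[1]):
            # Reprojection constraint a[:3] . X(t) + a[3] = 0 is linear in (p0, v0).
            rows.append(np.concatenate([a[:3], t * a[:3]]))
            rhs.append(-a[3] - 0.5 * t * t * (a[:3] @ G))
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return theta[:3], theta[3:]

# Two hypothetical cameras with different viewpoints and unsynchronized clocks.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R_a = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])   # looks along world +y
R_b = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, -1.0], [-1.0, 0.0, 0.0]])  # looks along world -x
P_a = camera_matrix(K, R_a, np.array([0.0, -10.0, 1.0]))
P_b = camera_matrix(K, R_b, np.array([10.0, 0.0, 1.0]))

p0_true = np.array([0.0, 0.0, 1.0])
v0_true = np.array([2.0, 5.0, 3.0])

def ball(t):
    """Ground-truth ball position used to synthesize noise-free detections."""
    return p0_true + v0_true * t + 0.5 * G * t * t

times_a = np.arange(0.0, 0.5, 0.030)    # camera A frame times
times_b = np.arange(0.011, 0.5, 0.033)  # camera B: different offset and rate
detections = [(t, P_a, project(P_a, ball(t))) for t in times_a]
detections += [(t, P_b, project(P_b, ball(t))) for t in times_b]
detections.sort(key=lambda d: d[0])     # interleave by timestamp

p0_est, v0_est = fit_ballistic(detections)
print(p0_est, v0_est)
```

No two detections need to share a timestamp: each one constrains the trajectory at its own time. A factor graph generalizes this to nonlinear dynamics, measurement noise, and incremental updates.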

The factor graph incorporates physical dynamics: the aerodynamic forces acting on the ball during flight, including the Magnus effect caused by spin, and the bounce dynamics when the ball contacts the ground. The method estimates the hidden states without requiring camera synchronization. With spin priors derived from human pose data added at the beginning of the graph, the system reduces landing position prediction error by over 63.6% compared with a baseline that uses an adaptive extended Kalman filter.
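The role of spin in flight can be illustrated with a common textbook-style model: gravity plus quadratic drag plus a Magnus term proportional to the cross product of spin and velocity. The sketch below is an assumption-laden illustration rather than the paper's dynamics model. The lumped coefficients `K_DRAG` and `K_MAGNUS` are placeholder values, and bounces are not modeled (integration stops at first ground contact).

```python
import numpy as np

G = np.array([0.0, 0.0, -9.81])  # gravity, z-up world frame, m/s^2
K_DRAG = 0.02     # assumed lumped drag coefficient, 1/m
K_MAGNUS = 4e-4   # assumed lumped Magnus coefficient (acceleration = K * omega x v)

def acceleration(v, omega):
    """Gravity + quadratic drag + Magnus acceleration for velocity v and spin omega (rad/s)."""
    drag = -K_DRAG * np.linalg.norm(v) * v
    magnus = K_MAGNUS * np.cross(omega, v)
    return G + drag + magnus

def rk4_step(p, v, omega, dt):
    """One fourth-order Runge-Kutta step; spin is held constant during flight."""
    k1v = acceleration(v, omega)
    k2v = acceleration(v + 0.5 * dt * k1v, omega)
    k3v = acceleration(v + 0.5 * dt * k2v, omega)
    k4v = acceleration(v + dt * k3v, omega)
    k1p, k2p, k3p, k4p = v, v + 0.5 * dt * k1v, v + 0.5 * dt * k2v, v + dt * k3v
    p_new = p + dt / 6.0 * (k1p + 2 * k2p + 2 * k3p + k4p)
    v_new = v + dt / 6.0 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return p_new, v_new

def predict_landing(p, v, omega, dt=1e-3, max_time=10.0):
    """Integrate until the ball first reaches z = 0 and return the landing (x, y)."""
    p, v = np.asarray(p, float), np.asarray(v, float)
    for _ in range(int(max_time / dt)):
        p_next, v_next = rk4_step(p, v, omega, dt)
        if p_next[2] <= 0.0:
            s = p[2] / (p[2] - p_next[2])  # linear interpolation to the ground plane
            return (p + s * (p_next - p))[:2]
        p, v = p_next, v_next
    raise RuntimeError("ball did not land within max_time")

def hz_to_rad(spin_hz):
    """Convert a spin rate in Hz (revolutions per second) to rad/s."""
    return 2.0 * np.pi * spin_hz

p0 = np.array([0.0, 0.0, 1.0])
v0 = np.array([20.0, 0.0, 3.0])  # ball travelling along +x
# For motion along +x, positive spin about +y is topspin in this convention.
land_top = predict_landing(p0, v0, np.array([0.0, hz_to_rad(30.0), 0.0]))
land_none = predict_landing(p0, v0, np.zeros(3))
land_back = predict_landing(p0, v0, np.array([0.0, -hz_to_rad(30.0), 0.0]))
print(land_top, land_none, land_back)
```

With these placeholder coefficients, topspin shortens the flight and backspin lengthens it, which is why an error in the spin estimate shifts the predicted landing position.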

Spin Priors from Human Poses

A further contribution is the use of human poses to compute spin priors. A temporal convolutional network (TCN) estimates spin from the player's pose data, and the estimate is integrated as a prior at the beginning of the factor graph, where only a few ball observations are available. This links the player's stroke mechanics, which the cameras can observe, to the resulting ball spin, which is difficult to capture directly. The trained TCN predicts the spin priors with an RMSE of 5.27 Hz. The improved early spin estimate makes trajectory prediction more accurate, especially for predicting multiple bounce points, which is crucial for agile robotic responses in tennis.
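For readers unfamiliar with TCNs, the following is a minimal NumPy sketch of the core building block, a causal dilated 1D convolution, stacked into a tiny untrained network that maps a pose sequence to a 3D spin vector. The architecture is an assumption for illustration only: the layer sizes, dilations, the 17-joint pose format, and the omission of residual connections are not taken from the paper.

```python
import numpy as np

def causal_dilated_conv(x, w, b, dilation):
    """Causal dilated 1D convolution.

    x: (T, C_in) sequence; w: (K, C_in, C_out) kernel; b: (C_out,) bias.
    Output y[t] depends only on x[t], x[t - d], ..., x[t - (K-1)*d],
    with zeros assumed before the start of the sequence.
    """
    K = w.shape[0]
    T = x.shape[0]
    pad = (K - 1) * dilation
    xp = np.vstack([np.zeros((pad, x.shape[1])), x])  # left padding only
    y = np.zeros((T, w.shape[2]))
    for k in range(K):
        # Tap k looks back (K - 1 - k) * dilation frames.
        y += xp[k * dilation : k * dilation + T] @ w[k]
    return y + b

class TinyTCN:
    """Untrained toy TCN: stacked causal dilated convs with ReLU and a linear head."""

    def __init__(self, in_dim, hidden=16, kernel=3, dilations=(1, 2, 4), seed=0):
        rng = np.random.default_rng(seed)
        self.layers = []
        c_in = in_dim
        for d in dilations:
            w = rng.normal(0.0, 0.1, size=(kernel, c_in, hidden))
            self.layers.append((w, np.zeros(hidden), d))
            c_in = hidden
        self.head = rng.normal(0.0, 0.1, size=(hidden, 3))  # 3D spin vector

    def __call__(self, poses):
        h = poses
        for w, b, d in self.layers:
            h = np.maximum(causal_dilated_conv(h, w, b, d), 0.0)  # ReLU
        return h[-1] @ self.head  # read out at the final frame

rng = np.random.default_rng(1)
n_frames, n_joints = 30, 17                       # hypothetical pose format
poses = rng.normal(size=(n_frames, n_joints * 3))  # stand-in for real keypoints
tcn = TinyTCN(in_dim=n_joints * 3)
spin_prior = tcn(poses)
print(spin_prior.shape)
```

With kernel size 3 and dilations 1, 2, 4, the output at the last frame sees the previous 15 frames. In a pipeline like the one described, a trained network's output would serve as the mean of a spin prior at the start of the factor graph.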

Experimental Findings and Limitations

Experiments show that the factor graph technique, enhanced by human pose-derived spin priors, substantially outperforms the baseline. In the real-world setup, multiple cameras send detection data to a central computer, which runs the factor graph optimization in near real time.

However, accurately estimating the spin prior from human poses remains a challenge, particularly for professional-level spins. This is attributed to limited training data, which did not cover the racket's pose or the grip type, and these are identified as areas for future improvement. The paper also suggests that adding a learnable factor for bounce dynamics to the factor graph could make the system more robust in high-spin scenarios.
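As a rough illustration of what "learnable bounce dynamics" could mean, the sketch below fits a linear map from the pre-bounce state (velocity and spin) to the post-bounce velocity by least squares. Everything here is hypothetical: the linear form, the coefficients `E`, `MU`, and `C`, and the synthetic data are stand-ins, not the paper's proposal or measured values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" bounce law, used only to generate synthetic data:
# restitution E on the vertical axis, horizontal retention MU, spin coupling C.
E, MU, C = 0.75, 0.6, 0.01
A_true = np.array([
    [MU, 0.0, 0.0, 0.0, C, 0.0],    # vx_out = MU*vx + C*wy
    [0.0, MU, 0.0, -C, 0.0, 0.0],   # vy_out = MU*vy - C*wx
    [0.0, 0.0, -E, 0.0, 0.0, 0.0],  # vz_out = -E*vz
])

# Synthetic pre-bounce states: velocity (m/s) and spin (rad/s).
n = 200
v_in = rng.normal([15.0, 0.0, -5.0], [5.0, 5.0, 1.0], size=(n, 3))
omega = rng.normal(0.0, 100.0, size=(n, 3))
X = np.hstack([v_in, omega])
V_out = X @ A_true.T + rng.normal(0.0, 0.05, size=(n, 3))  # noisy observations

# "Learning" step: least-squares fit of the 3x6 bounce matrix.
W, *_ = np.linalg.lstsq(X, V_out, rcond=None)
A_fit = W.T

def bounce(v, w):
    """Predict post-bounce velocity with the fitted linear model."""
    return A_fit @ np.concatenate([v, w])

print(np.round(A_fit, 3))
```

A real bounce is nonlinear, particularly when the ball slides or grips the court, so a practical learnable factor would likely need a richer model than this linear fit.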

Conclusion

In summary, the paper presents a method that uses factor graphs and human pose data to estimate and predict the state of a tennis ball quickly and accurately, with a substantial improvement in landing position prediction over the adaptive extended Kalman filter baseline. It provides a foundation for further research, particularly on the limitations noted above, and is a meaningful contribution to robotics, object tracking, and sports analytics.

