AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility (2208.06888v1)

Published 14 Aug 2022 in cs.CV

Abstract: One of the key factors behind the recent success in visual tracking is the availability of dedicated benchmarks. While greatly benefiting tracking research, existing benchmarks do not pose the same difficulty as before, with recent trackers achieving higher performance mainly due to (i) the introduction of more sophisticated transformer-based methods and (ii) the lack of diverse scenarios with adverse visibility such as severe weather conditions, camouflage and imaging effects. We introduce AVisT, a dedicated benchmark for visual tracking in diverse scenarios with adverse visibility. AVisT comprises 120 challenging sequences with 80k annotated frames, spanning 18 diverse scenarios broadly grouped into five attributes with 42 object categories. The key contribution of AVisT is diverse and challenging scenarios covering severe weather conditions such as dense fog, heavy rain and sandstorm; obstruction effects including fire, sun glare and splashing water; adverse imaging effects such as low-light; and target effects including small targets and distractor objects along with camouflage. We further benchmark 17 popular and recent trackers on AVisT with detailed analysis of their tracking performance across attributes, demonstrating substantial room for improvement in performance. We believe that AVisT can greatly benefit the tracking community by complementing the existing benchmarks and supporting the development of new creative tracking solutions that continue pushing the boundaries of the state-of-the-art. Our dataset along with the complete tracking performance evaluation is available at: https://github.com/visionml/pytracking

Summary

The paper introduces AVisT, a benchmark with 120 video sequences and 80,000 frames that test trackers under adverse weather, obstruction, imaging, target, and camouflage challenges.
It evaluates 17 state-of-the-art algorithms, showing that even top models like MixFormerL-22k achieve only a 56.0% AUC in these complex scenarios.
The study advocates for innovative tracker designs and hybrid models that integrate complex data representations to enhance performance in real-world adverse conditions.

AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility

Progress in visual object tracking has been driven largely by the availability of dedicated benchmarks. Existing benchmarks, however, rarely cover the adverse visibility conditions found in real-world footage, and recent trackers already score highly on them. The paper introduces AVisT, a benchmark designed to close this gap by evaluating trackers under severe weather, obstruction effects, adverse imaging conditions, target effects, and camouflage.

Dataset Composition and Attributes

AVisT comprises 120 video sequences with roughly 80,000 annotated frames and 42 object categories. Its 18 scenarios are grouped under five attributes: weather conditions, obstruction effects, imaging effects, target effects, and camouflage. Weather conditions include dense fog, heavy rain, and sandstorms; obstruction effects include fire, sun glare, and splashing water; imaging effects include low-light footage; and target effects include small targets and distractor objects. The scenarios were chosen to remain difficult for current state-of-the-art trackers.
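The grouping of scenarios into attributes can be sketched as a small lookup table, which also makes per-attribute score aggregation straightforward. The sketch below is illustrative only: the mapping lists just the scenarios named in the abstract (not all 18), the function name `per_attribute_mean` is hypothetical, it assumes each sequence carries a single scenario label, and the scores in the usage lines are made-up placeholders rather than results from the paper.

```python
from collections import defaultdict
from statistics import mean

# Scenario -> attribute, restricted to the scenarios named in the abstract.
# The full benchmark has 18 scenarios; the remaining ones are not listed here.
SCENARIO_TO_ATTRIBUTE = {
    "dense fog": "weather conditions",
    "heavy rain": "weather conditions",
    "sandstorm": "weather conditions",
    "fire": "obstruction effects",
    "sun glare": "obstruction effects",
    "splashing water": "obstruction effects",
    "low-light": "imaging effects",
    "small targets": "target effects",
    "distractor objects": "target effects",
    "camouflage": "camouflage",
}


def per_attribute_mean(sequence_scores):
    """Average per-sequence scores within each attribute.

    sequence_scores: iterable of (scenario, score) pairs, one per sequence.
    Assumes (hypothetically) that each sequence has exactly one scenario label.
    """
    grouped = defaultdict(list)
    for scenario, score in sequence_scores:
        grouped[SCENARIO_TO_ATTRIBUTE[scenario]].append(score)
    return {attribute: mean(scores) for attribute, scores in grouped.items()}


if __name__ == "__main__":
    # Placeholder scores for illustration only; not taken from the paper.
    demo = [("dense fog", 40.0), ("heavy rain", 60.0), ("fire", 30.0)]
    print(per_attribute_mean(demo))
```

Averaging within an attribute, rather than over all sequences at once, keeps a tracker's weakness on a small group (for example camouflage) from being hidden by its scores on larger groups.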

Benchmark Evaluation

The authors evaluate 17 popular and recent trackers, covering Siamese networks, discriminative classifiers, and transformer-based methods. MixFormerL-22k is among the best-performing models, yet it reaches an AUC (area under the success curve) of only 56.0%, which shows how difficult AVisT is. The benchmark is meant to expose where tracker design still needs work: even the strongest trackers lose substantial accuracy under these adverse conditions.
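For readers unfamiliar with the AUC score quoted here, the following is a minimal sketch of the conventional success-plot computation used in tracking benchmarks: per-frame intersection-over-union (IoU) between predicted and ground-truth boxes, the fraction of frames whose IoU exceeds each threshold, and the mean of that curve. It assumes the common OTB-style setup with 21 thresholds from 0 to 1 and boxes in `(x, y, w, h)` format; the function names are hypothetical and this is not claimed to be the exact evaluation code used for AVisT.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(pred_boxes, gt_boxes, num_thresholds=21):
    """Area under the success curve, in percent.

    Assumed convention: thresholds are evenly spaced on [0, 1] and a frame
    counts as a success when its IoU is strictly greater than the threshold.
    """
    if len(pred_boxes) != len(gt_boxes) or not gt_boxes:
        raise ValueError("need equally many predicted and ground-truth boxes")
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    success = [
        sum(o > t for o in overlaps) / len(overlaps) for t in thresholds
    ]
    return 100.0 * sum(success) / len(success)


if __name__ == "__main__":
    # Toy boxes for illustration only; not data from the benchmark.
    gt = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0)]
    pred = [(0.0, 0.0, 10.0, 10.0), (5.0, 0.0, 10.0, 10.0)]
    print(f"success AUC: {success_auc(pred, gt):.1f}%")
```

Because the curve is averaged over all thresholds, the score rewards tight boxes as well as keeping the target in view, so a tracker that merely stays near the target in fog or glare still scores low.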

Implications and Future Directions

AVisT pushes benchmark design toward harder, more realistic conditions. It tests trackers outside the scenarios they are usually tuned for and gives researchers a concrete target for models that handle real-world tracking difficulties. The findings suggest that current methods may need richer data representations, and possibly auxiliary tasks such as visibility estimation, to become more robust.

Future developments could focus on the design of adaptive tracking frameworks, leveraging temporal constraints and contextual learning drawn from the AVisT benchmark data. Insight into handling such real-world scenarios might also spur the creation of hybrid tracker models that incorporate elements from multiple tracking paradigms to optimize performance under adverse conditions. Furthermore, extending AVisT with real-time updates and expanding its scope to include new adverse scenarios as they emerge would ensure its continued relevance and challenge to the visual tracking community.

In conclusion, AVisT provides an essential platform for guiding advancements in visual object tracking. It underscores the necessity for trackers that not only perform well under controlled conditions but are robust enough to manage the unpredictability and complexity inherent in real-world environments. As tracking applications become increasingly integrated into everyday technologies, benchmarks like AVisT will play an indispensable role in shaping the future of visual tracking systems.

GitHub

GitHub - visionml/pytracking: Visual tracking library based on PyTorch. (3,237 stars)