Learning Gait Representation from Massive Unlabelled Walking Videos: A Benchmark (2206.13964v2)

Published 28 Jun 2022 in cs.CV

Abstract: Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data that is costly and hard to satisfy. This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn the general gait representation from massive unlabelled walking videos for practical applications by offering informative walking priors and diverse real-world variations. Specifically, we collect a large-scale unlabelled gait dataset GaitLU-1M consisting of 1.02M walking sequences and propose a conceptually simple yet empirically powerful baseline model GaitSSB. Experimentally, we evaluate the pre-trained model on four widely-used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D, with or without transfer learning. The unsupervised results are comparable to or even better than the early model-based and GEI-based methods. After transfer learning, our method outperforms existing methods by a large margin in most cases. Theoretically, we discuss the critical issues for a gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks. The source code of GaitSSB will be integrated into OpenGait, which is available at https://github.com/ShiqiYu/OpenGait.

Citations (28)

Summary

  • The paper introduces a self-supervised gait recognition framework built on the GaitSSB model and the extensive GaitLU-1M dataset, with unsupervised results that match or even surpass early supervised methods.
  • It employs a contrastive learning approach with innovative silhouette-specific augmentations to capture spatial and temporal variations in walking patterns.
  • Empirical results demonstrate effective transfer learning, with the model outperforming traditional techniques across multiple real-world benchmarks.

Learning Gait Representation from Massive Unlabelled Walking Videos: A Benchmark

The paper presents a comprehensive self-supervised approach to gait recognition, targeting the limitations of existing methods dependent on costly, fully-annotated datasets. The authors introduce a new benchmark leveraging massive unlabelled video data through contrastive learning to achieve superior gait representation, ultimately facilitating effective transfer to various application scenarios.

Key Contributions

  1. GaitLU-1M Dataset: A central contribution of this work is the creation of the GaitLU-1M dataset, containing 1.02 million walking sequences extracted from public videos worldwide. The dataset is ten times larger than current leading datasets such as OU-MVLP and GREW, offering diverse capture conditions and subject attributes.
  2. GaitSSB Model: The proposed GaitSSB model learns from unlabelled data within a well-structured contrastive learning framework. It introduces silhouette-specific data augmentations covering spatial, intra-sequence, and sampling variations so that the model robustly captures intra-view and inter-view consistencies in walking patterns (a minimal sketch of this pre-training step appears after this list).
  3. Empirical Validation: The empirical results illustrate that GaitSSB, even without labelled data, performs comparably to or better than previous supervised methods like PoseGait and GEINet across multiple datasets such as CASIA-B, OU-MVLP, GREW, and Gait3D. The unsupervised pre-training shows particular strength in identity verification across differing viewpoints.
  4. Transfer Learning Superiority: Through fine-tuning, GaitSSB surpasses existing state-of-the-art models on diverse benchmarks, with especially large margins on datasets collected in real-world environments (e.g., GREW and Gait3D). This indicates its robustness and adaptability to practical conditions.
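
To make the pre-training recipe concrete, the following is a minimal PyTorch sketch of the contrastive step described in contribution 2: two silhouette-specific augmented views of each unlabelled sequence pass through a shared encoder and projection head, and an InfoNCE-style loss treats matching views as positives and the rest of the batch as negatives. The module names, dimensions, and temperature here are illustrative assumptions; the actual GaitSSB architecture and loss are defined in the paper and the OpenGait repository.

```python
# Minimal sketch of a SimCLR-style contrastive pre-training step on
# unlabelled gait sequences. Names and hyperparameters are illustrative;
# see the paper / OpenGait for the real GaitSSB design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    """MLP head mapping sequence embeddings into the contrastive space."""
    def __init__(self, in_dim=256, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, in_dim), nn.ReLU(inplace=True),
            nn.Linear(in_dim, out_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-length embeddings

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE over a batch: matching views are positives; every other
    sequence in the batch serves as a negative."""
    logits = z1 @ z2.t() / temperature            # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

def train_step(encoder, head, optimizer, seqs, augment):
    """One pre-training step on a batch of unlabelled silhouette sequences."""
    v1, v2 = augment(seqs), augment(seqs)         # two silhouette-specific views
    z1, z2 = head(encoder(v1)), head(encoder(v2))
    loss = info_nce(z1, z2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the positives come from the same walking sequence under different augmentations, the encoder is pushed to learn view- and appearance-invariant gait structure without any identity labels.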

Methodological Insights

  • Silhouette Augmentation: The augmentation pipeline applies operations such as affine transformations and morphological dilation to binary silhouettes, simulating realistic clothing and carrying variations that are fundamental for generalization (see the sketch after this list).
  • Contrastive Learning Design: Unlike many visual contrastive learning setups, GaitSSB relies on negative samples to separate sequences that merely share a similar viewpoint or appearance from truly distinct walking patterns, which is crucial for a fine-grained biometric task like gait recognition.
  • Pre-training Scale Impact: Analysis reveals performance gains as the pre-training dataset grows, though challenges persist in modeling certain complexities such as drastic clothing changes, motivating further exploration of augmentation strategies.
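
As a rough illustration of the silhouette augmentation bullet above, the sketch below applies a random affine warp plus morphological dilation or erosion to a single binary silhouette frame. The operator choices and parameter ranges are placeholder assumptions for illustration, not the authors' published settings.

```python
# Hedged sketch of silhouette-specific augmentation: random affine warp
# (viewpoint/posture jitter) plus dilation or erosion (clothing/carrying
# thickness changes). Parameter ranges are placeholders, not the paper's.
import cv2
import numpy as np

def augment_silhouette(sil: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Augment one binary silhouette frame (uint8, values 0/255)."""
    h, w = sil.shape
    # Random small rotation, scaling, and shear around the image centre.
    angle = rng.uniform(-10, 10)
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, rng.uniform(0.9, 1.1))
    M[0, 1] += rng.uniform(-0.1, 0.1)             # horizontal shear term
    warped = cv2.warpAffine(sil, M, (w, h), flags=cv2.INTER_NEAREST)
    # Randomly thicken or thin the silhouette boundary.
    kernel = np.ones((3, 3), np.uint8)
    if rng.random() < 0.5:
        warped = cv2.dilate(warped, kernel, iterations=1)
    else:
        warped = cv2.erode(warped, kernel, iterations=1)
    return warped
```

Applying the same sampled parameters to every frame of a sequence preserves temporal consistency, while resampling them per pass yields the two distinct views consumed by the contrastive loss sketched earlier.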

Implications and Future Directions

The work demonstrates significant progress in utilizing large-scale unlabelled data for biometric tasks, with clear potential for application in security and surveillance systems. The methodological framework also underscores the gap between simulated and real-world variations, advocating for continued research:

  • Enhanced data augmentation techniques reflecting authentic environmental and physical interactions.
  • Exploration of unsupervised methods in combination with minimal labelled guidance to fine-tune model generalization capabilities.
  • Extension of the approach to related areas of computer vision and biometrics where large data availability outpaces labelled annotations.

The paper is supported by rigorous empirical evaluation and thought-provoking discussion, pushing the boundaries of what unsupervised learning can achieve in the domain of human identification. With its scalable and adaptable framework, GaitSSB sets a promising precedent for future research in automated gait analysis.