
Continual Learning for Blind Image Quality Assessment (2102.09717v2)

Published 19 Feb 2021 in cs.CV and eess.IV

Abstract: The explosive growth of image data facilitates the fast development of image processing and computer vision methods for emerging visual applications, meanwhile introducing novel distortions to the processed images. This poses a grand challenge to existing blind image quality assessment (BIQA) models, failing to continually adapt to such subpopulation shift. Recent work suggests training BIQA methods on the combination of all available human-rated IQA datasets. However, this type of approach is not scalable to a large number of datasets, and is cumbersome to incorporate a newly created dataset as well. In this paper, we formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets, building on what was learned from previously seen data. We first identify five desiderata in the new setting with a measure to quantify the plasticity-stability trade-off. We then propose a simple yet effective method for learning BIQA models continually. Specifically, based on a shared backbone network, we add a prediction head for a new dataset, and enforce a regularizer to allow all prediction heads to evolve with new data while being resistant to catastrophic forgetting of old data. We compute the quality score by an adaptive weighted summation of estimates from all prediction heads. Extensive experiments demonstrate the promise of the proposed continual learning method in comparison to standard training techniques for BIQA. We made the code publicly available at https://github.com/zwx8981/BIQA_CL.

Authors (6)
  1. Weixia Zhang (19 papers)
  2. Dingquan Li (18 papers)
  3. Chao Ma (187 papers)
  4. Guangtao Zhai (231 papers)
  5. Xiaokang Yang (207 papers)
  6. Kede Ma (57 papers)
Citations (77)

Summary

The paper "Continual Learning for Blind Image Quality Assessment" presents a method designed to improve Blind Image Quality Assessment (BIQA) models' abilities in adapting to subpopulation shifts. This is achieved through the application of continual learning strategies, allowing models to evolve through streams of Image Quality Assessment (IQA) datasets sequentially. Rather than relying on combining multiple datasets for training, which is non-scalable and cumbersome, the proposed approach leverages continual learning to build upon previously acquired knowledge, thus offering improved adaptability and remedying catastrophic forgetting.

Methodological Insights

Central to the proposed framework are five desiderata that guide the continual learning process, ensuring practicality and effectiveness:

  1. Common Perceptual Scale: This assumes that each dataset's Mean Opinion Scores (MOSs) can be mapped onto a uniform perceptual scale, allowing for unified processing across varying datasets (a toy rescaling sketch follows this list).
  2. Robustness to Subpopulation Shift: Highlights the need for models that maintain performance levels despite variations in image distortions present across datasets.
  3. Limited Direct Access to Previous Data: Outlines constraints on revisiting previous datasets, necessitating mechanisms like knowledge distillation or memory consolidation for retention.
  4. No Test-Time Oracle: The method is designed to be deployment-ready without relying on dataset-specific information during inference.
  5. Bounded Memory Footprint: Ensures computational efficiency, particularly keeping backbone network parameters relatively fixed as new prediction heads are added.
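
To make the first desideratum concrete, the toy sketch below linearly rescales each dataset's MOSs onto a shared [0, 100] scale. It only illustrates what a common perceptual scale implies; the function name and ranges are hypothetical, and the paper itself avoids explicit rescaling by learning from relative quality data instead (see below).

```python
import numpy as np

def rescale_mos(mos, src_range, dst_range=(0.0, 100.0)):
    """Naive linear mapping of raw MOSs from a dataset-specific range onto a
    shared scale. Purely illustrative; the paper relies on learning-to-rank
    rather than explicit MOS realignment."""
    src_lo, src_hi = src_range
    dst_lo, dst_hi = dst_range
    mos = np.asarray(mos, dtype=float)
    return dst_lo + (mos - src_lo) / (src_hi - src_lo) * (dst_hi - dst_lo)

# Hypothetical example: one dataset rates quality on [1, 5], another on [0, 9].
print(rescale_mos([1.0, 3.0, 5.0], src_range=(1, 5)))   # -> [  0.  50. 100.]
print(rescale_mos([0.0, 4.5, 9.0], src_range=(0, 9)))   # -> [  0.  50. 100.]
```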

The continual learning model is based on a dual-stream network architecture, integrating a pre-trained VGG-like convolutional neural network (CNN) that serves as the stable branch alongside a modifiable ResNet-18 variant acting as the plastic branch. Predictions are made by multiple heads, one per dataset, and a weighted summation driven by a K-means gating strategy combines these estimates into a single quality score (see the sketch below). The setup is further strengthened by learning-to-rank from relative quality data, enabling robust training across heterogeneous dataset inputs.
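
The sketch below shows one way such a shared-backbone, multi-head design with adaptive weighting could be wired up in PyTorch. All class and method names are hypothetical, a learned softmax gate stands in for the paper's K-means-based gating, and the dual-stream (stable plus plastic) split is omitted for brevity; the authors' actual implementation is at https://github.com/zwx8981/BIQA_CL.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class MultiHeadBIQA(nn.Module):
    """Minimal sketch: a shared backbone, one quality-prediction head per seen
    dataset, and an adaptive weighted summation of all head outputs."""

    def __init__(self, feature_dim: int = 512):
        super().__init__()
        # Shared feature extractor; ResNet-18 is an illustrative choice here.
        backbone = models.resnet18()
        self.backbone = nn.Sequential(*list(backbone.children())[:-1])
        self.feature_dim = feature_dim
        self.heads = nn.ModuleList()   # one head is appended per dataset
        self.gate = None               # produces per-head mixing weights

    def add_head(self):
        """Attach a new prediction head when a new IQA dataset arrives."""
        self.heads.append(nn.Linear(self.feature_dim, 1))
        # A learned softmax gate stands in for the paper's K-means-based gating.
        self.gate = nn.Linear(self.feature_dim, len(self.heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.backbone(x).flatten(1)                        # (B, feature_dim)
        scores = torch.cat([h(feat) for h in self.heads], dim=1)  # (B, K) per-head estimates
        weights = torch.softmax(self.gate(feat), dim=1)           # (B, K) adaptive weights
        return (weights * scores).sum(dim=1)                      # final quality scores


# Usage sketch: one head per dataset in the stream, then score a batch of images.
model = MultiHeadBIQA()
for _ in range(3):                                # e.g. three IQA datasets seen so far
    model.add_head()
scores = model(torch.randn(2, 3, 224, 224))       # tensor of shape (2,)
```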

Empirical Evaluation

The paper details experimental evaluations across multiple IQA datasets, demonstrating superior adaptability and stability compared to standard training techniques for BIQA. Key performance metrics such as Spearman's rank correlation coefficient (SRCC) indicate substantial improvements, particularly under subpopulation shifts between synthetic and realistic distortions, which have historically challenged conventional BIQA methods.
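
For reference, SRCC can be computed directly with SciPy; the scores below are made-up numbers used only to show the call, and averaging per-dataset SRCCs after each task is one simple way to track the plasticity-stability trade-off mentioned in the abstract.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical predicted quality scores and ground-truth MOSs for one test set.
predicted = np.array([3.1, 2.4, 4.8, 1.9, 3.7])
mos       = np.array([3.0, 3.2, 4.9, 1.5, 2.8])

srcc, _ = spearmanr(predicted, mos)   # rank correlation; 1.0 = perfect monotonic agreement
print(f"SRCC: {srcc:.3f}")            # prints "SRCC: 0.600" for this toy example
```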

Moreover, the paper explores alternative continual learning configurations with and without experience replay. Regularization-based methods such as Elastic Weight Consolidation (EWC) and Synaptic Intelligence (SI) are tested alongside replay-based approaches such as iCaRL and the memory-buffer-based GDumb. The outcomes suggest that experience replay, when paired with sensible memory management, further boosts performance without requiring full access to historical datasets, reinforcing the robustness and practical utility of rehearsal strategies.
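
As a concrete example of the regularization family discussed above, the fragment below sketches a generic EWC-style quadratic penalty: parameters that were important for earlier datasets (as measured by a diagonal Fisher estimate) are discouraged from drifting. This is an illustration of EWC in general, not the regularizer the authors actually use; `fisher` and `old_params` are assumed to be dictionaries saved after finishing the previous task.

```python
import torch

def ewc_penalty(model: torch.nn.Module,
                fisher: dict,        # name -> diagonal Fisher estimate (same shape as param)
                old_params: dict,    # name -> parameter values after the previous task
                lam: float = 1.0) -> torch.Tensor:
    """Generic EWC-style penalty: quadratic cost on moving parameters that
    were important for previously learned IQA datasets."""
    penalty = torch.zeros(())
    for name, param in model.named_parameters():
        if name in fisher:
            penalty = penalty + (fisher[name] * (param - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

# During training on a new dataset, the penalty is simply added to the task loss:
# loss = quality_loss + ewc_penalty(model, fisher, old_params, lam=100.0)
```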

Implications and Future Work

The findings have practical implications for deploying BIQA systems that remain reliable as the distortion landscape of real-world applications evolves. The work addresses a gap left by previous BIQA studies, namely catastrophic forgetting in dynamic environments. Such adaptability is paramount when real-time processing requirements diverge from tightly controlled laboratory settings, laying groundwork for more autonomous systems in video processing and smartphone camera pipelines.

Future explorations could leverage multi-modal architectures or disentangled representations to facilitate more granular learning amid extensive subpopulation shifts. This would be ideal for environments where distortions are highly varied and dataset-generation methods evolve rapidly. Further extension into domain generalization techniques could complement continual learning strategies, offering elevated prediction reliability on unseen datasets.

In summary, this research underscores the viability of continual learning paradigms for BIQA, presenting a foundational framework that enhances model robustness and adaptability across varying perceptual landscapes.
