
Resolution Adaptive Networks for Efficient Inference (2003.07326v5)

Published 16 Mar 2020 in cs.CV

Abstract: Adaptive inference is an effective mechanism to achieve a dynamic tradeoff between accuracy and computational cost in deep networks. Existing works mainly exploit architecture redundancy in network depth or width. In this paper, we focus on spatial redundancy of input samples and propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs containing large objects with prototypical features, while only some "hard" samples need spatially detailed information. In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations, and those samples with high prediction confidence will exit early from the network without being further processed. Meanwhile, high-resolution paths in the network maintain the capability to recognize the "hard" samples. Therefore, RANet can effectively reduce the spatial redundancy involved in inferring high-resolution inputs. Empirically, we demonstrate the effectiveness of the proposed RANet on the CIFAR-10, CIFAR-100 and ImageNet datasets in both the anytime prediction setting and the budgeted batch classification setting.

Citations (204)

Summary

  • The paper introduces Resolution Adaptive Networks (RANet), a novel architecture for efficient inference that adaptively processes input images at varying resolutions based on their complexity.
  • RANet employs multiple sub-networks and exit points, allowing "easy" samples to be classified quickly at low resolution while routing "hard" samples through higher-resolution paths.
  • Experimental results show RANet outperforms competitive models on CIFAR and ImageNet datasets, achieving up to a 27% reduction in computational cost on ImageNet at equivalent accuracy.

Resolution Adaptive Networks for Efficient Inference

The paper "Resolution Adaptive Networks for Efficient Inference" introduces Resolution Adaptive Network (RANet), an architecture that addresses computational efficiency in deep neural networks through adaptive inference. The primary innovation lies in exploiting spatial redundancy within input samples: the resolution at which a sample is processed adapts to its complexity. This matters because low-resolution feature maps are often sufficient for classifying easy inputs, so expensive high-resolution processing can be reserved for the genuinely difficult cases.

Architecture and Methodology

The RANet architecture consists of multiple sub-networks that process input images at progressively higher resolutions. Inference begins by routing the input image to a lightweight low-resolution sub-network; "easy" samples—those yielding high prediction confidence—exit early, saving computational resources. The remaining "hard" samples are passed on to subsequent higher-resolution paths within the network. This adaptive framework departs from conventional techniques that target architectural redundancy in deep networks, such as depth or width pruning.

The adaptive inference in RANet is implemented by incorporating multiple exit points throughout the network, allowing a sample to stop processing as soon as a sufficiently confident prediction is produced. This design realizes a coarse-to-fine processing strategy, drawing on the observation that low-resolution features carry enough information to recognize the simpler patterns found in many inputs.
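The early-exit mechanism described above can be sketched as a confidence-thresholded loop over sub-networks. This is a minimal illustration, not the paper's implementation: the sub-networks here are arbitrary callables returning class logits, and a single fixed threshold stands in for RANet's per-exit thresholds.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def adaptive_inference(x, sub_networks, threshold=0.9):
    """Run sub-networks from low to high resolution; exit early
    once the top-1 softmax confidence clears the threshold.
    Returns (predicted class, index of the exit that fired)."""
    for exit_idx, net in enumerate(sub_networks):
        probs = softmax(net(x))
        if probs.max() >= threshold:
            return int(probs.argmax()), exit_idx  # "easy" sample: exit early
    # no exit was confident enough: fall back to the final prediction
    return int(probs.argmax()), len(sub_networks) - 1
```

Easy samples thus pay only for the cheap low-resolution sub-network, while hard ones traverse the full cascade.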

Experimental Validation and Results

The effectiveness of RANet was empirically validated on the CIFAR-10, CIFAR-100, and ImageNet benchmarks. In both the anytime prediction and budgeted batch classification settings, the proposed network consistently outperformed competitive baselines, including MSDNet, ResNet ensembles, and DenseNet variants.

On CIFAR-10, for instance, RANet achieved higher accuracy than MSDNet at every classifier (exit), with especially marked gains at low computational budgets. The ImageNet results substantiated these gains: RANet matched MSDNet's accuracy with up to 27% less computation.
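In the budgeted batch setting, exit thresholds are typically calibrated on a validation set so that overall computation meets the budget. The helper below is a hedged sketch of one common recipe (pick the threshold so a target fraction of samples exits at a given classifier); the paper's exact calibration procedure may differ.

```python
import numpy as np

def calibrate_exit_threshold(val_confidences, exit_fraction):
    """Return a confidence threshold such that roughly `exit_fraction`
    of the validation samples would exit at this classifier.
    `val_confidences`: top-1 softmax confidences collected at one exit."""
    conf = np.sort(np.asarray(val_confidences))[::-1]  # sort descending
    k = int(np.ceil(exit_fraction * len(conf)))        # number of samples to exit
    k = min(max(k, 1), len(conf))                      # clamp to [1, n]
    return conf[k - 1]
```

Sweeping `exit_fraction` trades accuracy against computation: smaller values push more samples through the high-resolution paths, tracing out the accuracy/FLOPs curves reported in the experiments.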

Implications and Future Directions

The introduction of RANet underscores a significant shift towards maximizing computational efficiency in neural networks without compromising accuracy, especially pertinent in resource-constrained environments such as mobile and edge devices. By adopting a resolution-adaptive approach, RANet pioneers a strategic direction that could influence future architectures focusing on minimizing inference cost through intelligent resource allocation.

This work opens several avenues for future research. One prospective development could be integrating this resolution adaptive framework with other forms of redundancy reduction techniques, potentially leading to hybrid models with further efficiency improvements. Additionally, exploring application areas beyond image classification, such as video processing or real-time data streams, can extend the utility of RANet, leveraging its computational benefits in diverse real-world tasks.

In conclusion, RANet sets a precedent in adaptive inference strategies, harnessing input sample characteristics to optimize network computation and pointing toward a promising paradigm for designing more efficient deep learning models.