Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Target Aware Network Architecture Search and Compression for Efficient Knowledge Transfer (2205.05967v2)

Published 12 May 2022 in cs.CV

Abstract: Transfer Learning enables Convolutional Neural Networks (CNN) to acquire knowledge from a source domain and transfer it to a target domain, where collecting large-scale annotated examples is time-consuming and expensive. Conventionally, while transferring the knowledge learned from one task to another task, the deeper layers of a pre-trained CNN are finetuned over the target dataset. However, these layers are originally designed for the source task which may be over-parameterized for the target task. Thus, finetuning these layers over the target dataset may affect the generalization ability of the CNN due to high network complexity. To tackle this problem, we propose a two-stage framework called TASCNet which enables efficient knowledge transfer. In the first stage, the configuration of the deeper layers is learned automatically and finetuned over the target dataset. Later, in the second stage, the redundant filters are pruned from the fine-tuned CNN to decrease the network's complexity for the target task while preserving the performance. This two-stage mechanism finds a compact version of the pre-trained CNN with optimal structure (number of filters in a convolutional layer, number of neurons in a dense layer, and so on) from the hypothesis space. The efficacy of the proposed method is evaluated using VGG-16, ResNet-50, and DenseNet-121 on CalTech-101, CalTech-256, and Stanford Dogs datasets. Similar to computer vision tasks, we have also conducted experiments on Movie Review Sentiment Analysis task. The proposed TASCNet reduces the computational complexity of pre-trained CNNs over the target task by reducing both trainable parameters and FLOPs which enables resource-efficient knowledge transfer. The source code is available at: https://github.com/Debapriya-Tula/TASCNet.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (4)
  1. S. H. Shabbeer Basha (8 papers)
  2. Debapriya Tula (4 papers)
  3. Sravan Kumar Vinakota (3 papers)
  4. Shiv Ram Dubey (55 papers)
Citations (1)

Summary

We haven't generated a summary for this paper yet.