Enhancing Scalability in Recommender Systems through Lottery Ticket Hypothesis and Knowledge Distillation-based Neural Network Pruning (2401.10484v1)
Abstract: This study introduces an approach for the efficient pruning of neural networks, with a particular focus on their deployment on edge devices. Our method integrates the Lottery Ticket Hypothesis (LTH) with the Knowledge Distillation (KD) framework, resulting in three distinct pruning models. These models address the scalability issue in recommender systems, where the complexity of deep learning models has hindered practical deployment. By applying the pruning techniques judiciously, we reduce power consumption and model size without compromising accuracy. Empirical evaluation was performed on two real-world datasets from diverse domains against two baselines. Our approaches yield a GPU computation-power reduction of up to 66.67%. Notably, our study contributes to the field of recommender systems by pioneering the joint application of LTH and KD.
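The abstract does not spell out the three pruning models, so the following is only a minimal, illustrative sketch of how LTH-style iterative magnitude pruning can be combined with a KD objective in PyTorch. The `MLPRecommender` toy model, the prune rate of 0.2, the loss weight `alpha`, and the training loop are hypothetical placeholders for exposition, not the authors' implementation.

```python
# Illustrative sketch only: LTH-style iterative magnitude pruning combined with
# a KD loss for a toy rating predictor. All sizes and hyperparameters below are
# assumptions, not values from the paper.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class MLPRecommender(nn.Module):
    """Toy rating predictor over concatenated user/item embeddings (assumed)."""
    def __init__(self, n_users=1000, n_items=1000, dim=32):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)
        self.mlp = nn.Sequential(nn.Linear(2 * dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, users, items):
        x = torch.cat([self.user_emb(users), self.item_emb(items)], dim=-1)
        return self.mlp(x).squeeze(-1)

def magnitude_masks(model, prune_rate=0.2, old_masks=None):
    """LTH step: mask out the smallest-magnitude surviving weights per layer."""
    masks = {}
    for name, p in model.named_parameters():
        if "mlp" not in name or p.dim() < 2:
            continue  # this sketch prunes only the MLP weight matrices
        prev = old_masks[name] if old_masks else torch.ones_like(p)
        alive = p.detach()[prev.bool()].abs()
        k = int(prune_rate * alive.numel())
        thresh = alive.kthvalue(max(k, 1)).values
        masks[name] = prev * (p.detach().abs() > thresh).float()
    return masks

def rewind(model, init_state, masks):
    """LTH rewind: reset surviving weights to their original initialization."""
    model.load_state_dict(init_state)
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])

def kd_train_step(student, teacher, masks, batch, opt, alpha=0.5):
    """One step: rating loss plus distillation toward the dense teacher."""
    users, items, ratings = batch
    s_pred = student(users, items)
    with torch.no_grad():
        t_pred = teacher(users, items)
    loss = alpha * F.mse_loss(s_pred, ratings) + (1 - alpha) * F.mse_loss(s_pred, t_pred)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():  # keep pruned weights at zero after the update
        for name, p in student.named_parameters():
            if name in masks:
                p.mul_(masks[name])
    return loss.item()

if __name__ == "__main__":
    teacher = MLPRecommender()  # assume this dense teacher is pre-trained
    student = MLPRecommender()
    init_state = copy.deepcopy(student.state_dict())
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    batch = (torch.randint(0, 1000, (64,)), torch.randint(0, 1000, (64,)),
             torch.rand(64) * 5)  # dummy (user, item, rating) batch
    masks = {}
    for _round in range(3):  # three pruning rounds, purely illustrative
        for _ in range(10):
            kd_train_step(student, teacher, masks, batch, opt)
        masks = magnitude_masks(student, 0.2, masks or None)
        rewind(student, init_state, masks)
```

The key design point, per the LTH, is rewinding surviving weights to their original initialization after each pruning round, while the KD term keeps the sparse student anchored to the dense teacher's predictions; how the paper's three models actually combine these steps is not specified in the abstract.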
Authors: Rajaram R, Manoj Bharadhwaj, Vasan VS, Nargis Pervin