
Optimizing LSTM Neural Networks for Resource-Constrained Retail Sales Forecasting: A Model Compression Study

Published 2 Jan 2026 in cs.LG and cs.AI | (2601.00525v1)

Abstract: Standard Long Short-Term Memory (LSTM) neural networks provide accurate sales forecasts for the retail industry but demand substantial computing resources, which can be prohibitive for small and medium-sized retailers. This paper examines LSTM model compression by progressively reducing the number of hidden units from 128 to 16. Using the Kaggle Store Item Demand Forecasting dataset, which contains 913,000 daily sales records from 10 stores and 50 items, we analyze the trade-off between model size and predictive accuracy. Experiments show that reducing the number of hidden LSTM units to 64 not only preserves accuracy but improves it: the mean absolute percentage error (MAPE) falls from 23.6% for the full 128-unit model to 12.4% for the 64-unit model. The optimized model is 73% smaller (from 280KB to 76KB) and 47% more accurate. These results show that larger models do not always achieve better results.

Summary

  • The paper finds that compressing LSTM models to 64 hidden units reduces MAPE from 23.6% to 12.4% while decreasing model size by 73%.
  • The methodology leverages systematic evaluation on a Kaggle dataset using time-series cross-validation and TensorFlow on a CPU to simulate resource-constrained settings.
  • The study highlights practical implications for SMEs, enabling advanced forecasting with reduced computational demands and cost-effective deployment.

Optimizing LSTM Neural Networks for Resource-Constrained Retail Sales Forecasting: A Model Compression Study

Introduction

This paper investigates the optimization of Long Short-Term Memory (LSTM) neural networks through model compression for retail sales forecasting, specifically targeting resource-constrained environments. Retail forecasting plays a crucial role in minimizing inventory-related losses, yet traditional LSTM models, with their substantial computational and memory requirements, pose challenges for small to medium-sized enterprises (SMEs) with limited IT budgets. The study explores the compression of LSTM models by reducing the number of hidden units, thereby aiming to enhance computational efficiency without compromising predictive accuracy.

Methodology

Dataset and Architecture

The research uses the Kaggle Store Item Demand Forecasting dataset, comprising approximately 913,000 daily sales records covering 10 stores and 50 items over a five-year period. LSTM architectures with hidden units ranging from 128 down to 16 are evaluated systematically to identify a configuration that balances accuracy with efficiency.
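As a concrete illustration, the sketch below loads the dataset and reshapes a single store-item series into supervised sequences. The file name train.csv and its columns (date, store, item, sales) follow the public Kaggle dataset; the 30-day lookback window is an assumption, as the paper does not report the window length used.

```python
import numpy as np
import pandas as pd

# Load the Kaggle Store Item Demand Forecasting data
# (columns: date, store, item, sales).
df = pd.read_csv("train.csv", parse_dates=["date"])
df = df.sort_values(["store", "item", "date"])

def make_sequences(sales: np.ndarray, window: int = 30):
    """Turn a 1-D daily sales series into (window -> next-day) pairs."""
    X, y = [], []
    for i in range(len(sales) - window):
        X.append(sales[i : i + window])
        y.append(sales[i + window])
    # Shape (samples, window, 1) for a univariate LSTM input.
    return np.asarray(X)[..., None], np.asarray(y)

# Example: build sequences for one store-item series.
series = df[(df.store == 1) & (df.item == 1)]["sales"].to_numpy(dtype="float32")
X, y = make_sequences(series)
```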

Model Configurations

Five different LSTM configurations—LSTM-128, LSTM-64, LSTM-48, LSTM-32, and LSTM-16—are examined. Each configuration maintains the same architectural framework apart from the number of hidden units. The methodology incorporates best practices for feature engineering in time-series forecasting and evaluates model performance based on Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), and efficiency metrics (model size, inference time, RAM usage).
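A minimal sketch of this configuration family follows, assuming a single LSTM layer with a dense regression head (consistent with the single-layer architecture the paper later names as a limitation); the optimizer and loss are illustrative defaults, not confirmed settings.

```python
import tensorflow as tf

def build_lstm(hidden_units: int, window: int = 30) -> tf.keras.Model:
    """One LSTM layer with a variable number of hidden units,
    followed by a single-value regression head."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(window, 1)),
        tf.keras.layers.LSTM(hidden_units),
        tf.keras.layers.Dense(1),
    ])
    model.compile(
        optimizer="adam",  # assumed; the paper does not state the optimizer
        loss="mse",
        metrics=[tf.keras.metrics.MeanAbsolutePercentageError(),
                 tf.keras.metrics.RootMeanSquaredError()],
    )
    return model

# The five configurations examined in the study.
models = {u: build_lstm(u) for u in (128, 64, 48, 32, 16)}
```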

Experimental Setup

Experiments are implemented on a standard CPU setup using TensorFlow, with no GPU, to faithfully simulate resource-constrained environments. Training uses an 80/20 temporal split, 30 epochs, and a batch size of 64, and performance is validated through time-series cross-validation.
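A sketch of this setup is shown below, reusing X, y, and build_lstm from the sketches above; hiding GPUs via tf.config is one way to reproduce the paper's CPU-only condition.

```python
import tensorflow as tf

# Hide any GPUs so training runs on CPU, as in the paper's setup.
tf.config.set_visible_devices([], "GPU")

# 80/20 temporal split: the final 20% of the timeline is held out.
# No shuffling, so future data never leaks into training.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = build_lstm(hidden_units=64)
model.fit(X_train, y_train,
          epochs=30,
          batch_size=64,
          validation_data=(X_test, y_test),
          verbose=2)
```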

Results

Accuracy and Computational Efficiency

The study finds that reducing the hidden units to 64 significantly improves the model's accuracy, achieving a MAPE of 12.4%, down from the 23.6% MAPE of the 128-unit baseline. This is accompanied by a 73% reduction in model size (from 280KB to 76KB). Inference times remain roughly constant across model variants because TensorFlow's fixed per-call overhead dominates, so the smaller models sacrifice neither speed nor memory efficiency (Figure 1).

Figure 1: (a) Prediction Error vs. Model Size, showing a U-shaped relationship between model size and prediction error. (b) Storage Requirements, showing that model size decreases linearly with the number of hidden units.
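To make these figures concrete, the following hypothetical evaluation snippet computes MAPE on the held-out split and measures on-disk model size, continuing from the training sketch above; it illustrates the metrics, not the authors' exact scripts.

```python
import os
import numpy as np

def mape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute percentage error, skipping zero-sales days
    to avoid division by zero."""
    mask = y_true != 0
    return 100.0 * np.mean(np.abs((y_true[mask] - y_pred[mask]) / y_true[mask]))

y_pred = model.predict(X_test).ravel()
print(f"MAPE: {mape(y_test, y_pred):.1f}%")

# On-disk size, comparable to the 280KB -> 76KB figures in the paper.
model.save("lstm64.keras")
print(f"Model size: {os.path.getsize('lstm64.keras') / 1024:.0f} KB")
```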

Predictive Performance

The compressed LSTM-64 model demonstrates strong alignment between predicted and actual sales over a 100-day evaluation period, verifying the model's robustness and practical applicability in real-world scenarios (Figure 2).

Figure 2: Sample predictions from LSTM-64 showing close alignment between predicted and actual sales over a 100-day period.

Comparative Analysis

Performance analysis, covering inference speed and RAM requirements, indicates that moderate compression (64 units) provides the best trade-off between model size and accuracy. Paired t-tests confirm the statistical significance of the improvement over the larger baseline models (Figure 3).

Figure 3: A comprehensive performance analysis showing (a) inference speed, (b) RAM requirements, (c) accuracy relative to the baseline, and (d) the compression-accuracy trade-off, identifying LSTM-64 as the best choice.
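A minimal sketch of such a paired t-test follows, using SciPy; the per-fold MAPE arrays are illustrative placeholders, since the paper does not publish per-fold scores.

```python
import numpy as np
from scipy import stats

# Hypothetical per-fold MAPE scores for two configurations, paired by
# cross-validation fold. These numbers are illustrative, not from the paper.
mape_128 = np.array([24.1, 23.2, 23.9, 23.5, 23.3])
mape_64 = np.array([12.8, 12.1, 12.6, 12.2, 12.3])

t_stat, p_value = stats.ttest_rel(mape_128, mape_64)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value indicates the accuracy gain of LSTM-64 over the
# 128-unit baseline is statistically significant.
```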

Discussion

The study challenges the conventional understanding that larger models yield superior performance. Notably, the LSTM-64 configuration optimally captures temporal dependencies in retail sales data with minimal computational demands. These findings align with the lottery ticket hypothesis and underscore the importance of selecting an appropriate model size.

Practical Implications

For SMEs, adopting the LSTM-64 model offers a practical solution for achieving high forecasting accuracy with significantly reduced computational costs. This enables broader accessibility to advanced predictive analytics without necessitating GPU-based infrastructures, thereby empowering a larger segment of the retail market to leverage sophisticated forecasting techniques.

Limitations and Future Work

The study's generalizability is limited to the specific Kaggle dataset and single-layer LSTM architectures. Future research could explore advanced compression methods such as pruning, quantization, and the integration of multi-layer or transformer architectures to further validate and expand upon these findings across different datasets and domains.

Conclusion

This research demonstrates the efficacy of LSTM model compression, showing that significant reductions in hidden units can improve both accuracy and efficiency in retail sales forecasting. The findings provide actionable insights for right-sizing LSTM architectures, facilitating broader deployment of AI-driven solutions in resource-constrained environments. The study invites further exploration of hybrid and transformer-based models to continue advancing time-series forecasting.
