Demand Prediction Using Machine Learning Methods and Stacked Generalization (2009.09756v2)

Published 21 Sep 2020 in cs.LG, cs.AI, and stat.ML

Abstract: Supply and demand are two fundamental concepts of sellers and customers. Predicting demand accurately is critical for organizations in order to be able to make plans. In this paper, we propose a new approach for demand prediction on an e-commerce web site. The proposed model differs from earlier models in several ways. The business model used in the e-commerce web site, for which the model is implemented, includes many sellers that sell the same product at the same time at different prices where the company operates a market place model. The demand prediction for such a model should consider the price of the same product sold by competing sellers along with the features of these sellers. In this study we first applied different regression algorithms for a specific set of products of one department of a company that is one of the most popular online e-commerce companies in Turkey. Then we used stacked generalization, also known as stacking ensemble learning, to predict demand. Finally, all the approaches are evaluated on a real world data set obtained from the e-commerce company. The experimental results show that some of the machine learning methods do produce almost as good results as the stacked generalization method.

Summary

The paper reports that combining multiple regression models via stacked generalization yields the lowest demand prediction error among the methods tested.
It describes a two-layer framework in which Linear Regression (LR), Decision Tree (DT), Random Forest (RF), and Gradient Boosted Tree (GBT) models feed a meta-learner that combines their outputs.
The gain over the best single model is small: Random Forest performs almost as well as the stacked ensemble and remains a competitive alternative.

Demand Prediction Using Machine Learning Methods and Stacked Generalization

The paper addresses the demand prediction problem in the e-commerce industry, specifically focusing on a marketplace model where multiple sellers offer the same product at different prices. Accurately forecasting demand is essential for minimizing surplus and maximizing revenue. The research proposes a novel approach employing stacked generalization, a type of ensemble learning mechanism, to improve predictive accuracy over individual regression models.

Methodology Overview

The authors explore several machine learning algorithms as baseline models for demand prediction: Linear Regression (LR), Decision Tree Regression (DT), Random Forest (RF), and Gradient Boosted Trees (GBT). Their approach then integrates these models within a stacked generalization framework structured in two layers. The first layer consists of the individual models trained on the data. The second layer is a meta-learner that takes the first-layer predictions as its inputs and combines them into a single, refined estimate of demand.
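The two-layer structure can be illustrated with a minimal, standard-library-only sketch. Everything specific in it is an assumption made for illustration: the data is synthetic, price is the only feature, a one-feature linear fit and a k-nearest-neighbour regressor stand in for the paper's LR, DT, RF, and GBT models, and the meta-learner is a plain least-squares combination trained on out-of-fold predictions. The summary does not state how the authors split the data between layers, so this is not their exact procedure.

```python
import random

def fit_linear(xs, ys):
    """First-layer learner 1: least-squares line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def fit_knn(xs, ys, k=3):
    """First-layer learner 2: k-NN regressor (a stand-in for tree-based models)."""
    points = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(points, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def fit_meta(level0_preds, ys):
    """Second layer: least-squares weights for y ~ w1*p1 + w2*p2 (no intercept)."""
    s11 = sum(p[0] * p[0] for p in level0_preds)
    s12 = sum(p[0] * p[1] for p in level0_preds)
    s22 = sum(p[1] * p[1] for p in level0_preds)
    t1 = sum(p[0] * y for p, y in zip(level0_preds, ys))
    t2 = sum(p[1] * y for p, y in zip(level0_preds, ys))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:  # base predictions are collinear: fall back to averaging
        return (0.5, 0.5)
    return ((t1 * s22 - t2 * s12) / det, (s11 * t2 - s12 * t1) / det)

def fit_stack(xs, ys, n_folds=5):
    """Train base learners, then a meta-learner on their out-of-fold predictions."""
    fitters = [fit_linear, fit_knn]
    n = len(xs)
    out_of_fold = [[0.0] * len(fitters) for _ in range(n)]
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        for j, fitter in enumerate(fitters):
            model = fitter([xs[i] for i in train], [ys[i] for i in train])
            for i in held_out:
                # Each row is predicted by a model that never saw that row.
                out_of_fold[i][j] = model(xs[i])
    weights = fit_meta(out_of_fold, ys)
    # Refit the base learners on all data for use at prediction time.
    models = [fitter(xs, ys) for fitter in fitters]
    predict = lambda x: sum(w * m(x) for w, m in zip(weights, models))
    return predict, weights

# Synthetic data (an assumption): demand falls linearly with price, plus noise.
random.seed(0)
prices = [random.uniform(5.0, 45.0) for _ in range(60)]
demand = [100.0 - 2.0 * p + random.gauss(0.0, 3.0) for p in prices]

stacked, weights = fit_stack(prices, demand)
pred_at_20 = stacked(20.0)
print("meta-learner weights:", weights)
print("predicted demand at price 20:", pred_at_20)
```

Training the meta-learner on out-of-fold predictions, rather than on predictions the base models make for their own training rows, keeps the second layer from simply rewarding whichever base model overfits most.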

Some of these base models, namely Random Forest and Gradient Boosted Trees, are themselves ensembles. The distinction in this research is that the base models are combined a second time through stacked generalization, which is intended to exploit the strengths and offset the weaknesses of the individual models by training a further model on their outputs.

Experimental Process and Results

The paper evaluates these models on a real-world dataset from a prominent Turkish e-commerce firm, after extensive preprocessing that includes aggregating sales by week and handling outliers. Model performance is measured with RMSE (Root Mean Squared Error). By this metric the stacked generalization method matched or outperformed each of the individual regression models.
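As a rough illustration of two steps mentioned here, the sketch below aggregates daily sales into weekly totals and computes RMSE. The record layout `(product_id, day, units)` and the function names are hypothetical, the sample rows are invented, and the outlier handling is omitted because the summary does not say how it was done.

```python
import math
from collections import defaultdict
from datetime import date

def weekly_totals(sales):
    """Sum units sold per (product_id, ISO year, ISO week)."""
    totals = defaultdict(int)
    for product_id, day, units in sales:
        iso_year, iso_week, _ = day.isocalendar()
        totals[(product_id, iso_year, iso_week)] += units
    return dict(totals)

def rmse(actual, predicted):
    """Root Mean Squared Error: sqrt(mean((actual - predicted)^2))."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Invented sample rows, for illustration only.
sales = [
    ("P1", date(2020, 9, 21), 3),
    ("P1", date(2020, 9, 23), 2),
    ("P1", date(2020, 9, 28), 4),
]
weekly = weekly_totals(sales)
print(weekly)
print(rmse([5, 4], [4, 6]))
```

Because RMSE squares each error before averaging, a few large misses raise it more than many small ones, which is one reason outlier handling matters before the metric is computed.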

The experiments reveal a noteworthy observation: although the stacked generalization approach produced the lowest RMSE, its difference from the best-performing single model, Random Forest, was not statistically significant. This has a practical consequence for model selection: when less data is available, Random Forest alone could be sufficient, given its comparable accuracy.
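One common way to check whether two models differ significantly is a paired t-test on their errors over the same evaluation folds. The sketch below is an assumption about how such a check could look; the summary does not say which test the authors used, and the per-fold RMSE values are made up for illustration and are not results from the paper.

```python
import math
import statistics

def paired_t_statistic(errors_a, errors_b):
    """t statistic for the mean of the paired differences errors_a - errors_b."""
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))

# Made-up per-fold RMSE values (NOT from the paper), five folds each.
rf_rmse = [1.92, 2.05, 1.88, 2.10, 1.97]
stack_rmse = [1.90, 2.01, 1.91, 2.06, 1.98]

# Two-sided critical value of Student's t at the 5% level with 4 degrees of freedom.
T_CRIT_DF4 = 2.776

t_stat = paired_t_statistic(rf_rmse, stack_rmse)
significant = abs(t_stat) > T_CRIT_DF4
print("t =", t_stat, "significant at 5%:", significant)
```

With these invented numbers the stacked model has the lower mean RMSE, yet the test does not reject equality, which is the kind of outcome the paragraph above describes.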

Theoretical and Practical Implications

From a theoretical standpoint, this paper engages with broader discussions on ensemble learning and its utility in complex prediction tasks. The comparative analysis of stacked generalization against individual models underscores its potential to achieve robust predictions by effectively amalgamating diverse methodologies. Practically, this approach offers significant benefits for e-commerce platforms operating marketplace models, where precise demand forecasting can streamline inventory management and optimize pricing strategies.

Speculations on Future Developments

As AI continues to evolve, further refinements to ensemble methods like stacking could include adaptive mechanisms that dynamically adjust weights assigned to different base learners, thereby improving responsiveness to changing market conditions. Additionally, hybrid models that incorporate temporal components or integrate external data sources (e.g., social media trends) may enhance predictive capabilities.
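To make the idea of dynamically adjusted weights concrete, here is a minimal sketch of one possible mechanism: an exponentially weighted update that shrinks a base learner's weight according to its most recent squared error. This is an illustration of the speculation above, not something the paper implements; the function name, the `learning_rate` parameter, and the sample numbers are all hypothetical.

```python
import math

def update_weights(weights, predictions, actual, learning_rate=0.5):
    """Down-weight each base learner by exp(-learning_rate * squared error)."""
    scaled = [
        w * math.exp(-learning_rate * (p - actual) ** 2)
        for w, p in zip(weights, predictions)
    ]
    total = sum(scaled)
    if total == 0.0:  # every term underflowed: fall back to uniform weights
        return [1.0 / len(weights)] * len(weights)
    return [s / total for s in scaled]

# Two hypothetical base learners start with equal weight.
weights = [0.5, 0.5]
# Learner 0 predicted 10 and learner 1 predicted 14 when actual demand was 11.
new_weights = update_weights(weights, predictions=[10.0, 14.0], actual=11.0)
combined = sum(w * p for w, p in zip(new_weights, [10.0, 14.0]))
print(new_weights, combined)
```

Applied after every observed week, an update of this kind would shift weight toward whichever base learner has been most accurate recently, at the cost of reacting to noise if `learning_rate` is set too high.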

In conclusion, this research reinforces the value of ensemble learning frameworks in demand prediction and sets the stage for subsequent studies to explore more efficient implementations and adaptations of stacked generalization in various commercial contexts.
