GreenFlow: A Computation Allocation Framework for Building Environmentally Sound Recommendation System (2312.16176v1)
Abstract: Given the enormous number of users and items, industrial cascade recommendation systems (RS) continue to grow in size and complexity in order to deliver relevant items, such as news, services, and commodities, to the appropriate users. In a real-world scenario with hundreds of thousands of requests per second, significant computation is required to infer personalized results for each request, resulting in massive energy consumption and carbon emissions that raise concern. This paper proposes GreenFlow, a practical computation allocation framework for RS that considers both accuracy and carbon emissions during inference. For each stage (e.g., recall, pre-ranking, ranking) of a cascade RS, when a user triggers a request, we define two actions that determine the computation: (1) which trained model instance is used, chosen among models of different computational complexity; and (2) the number of items to be inferred in the stage. We refer to a combination of actions across all stages as an action chain. A reward score is estimated for each action chain, followed by dynamic primal-dual optimization that considers both the reward and the computation budget. Extensive experiments verify the effectiveness of the framework, which reduces computation consumption by 41% in an industrial mobile application while maintaining commercial revenue. Moreover, the proposed framework saves approximately 5,000 kWh of electricity and reduces carbon emissions by 3 tons per day.
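As a rough illustration of the action-chain idea in the abstract, the following minimal sketch enumerates chains for a two-stage cascade, scores each chain as reward minus a dual price times cost, and adjusts the price with a plain subgradient step. Everything here is an assumption made for illustration: the stage names, per-item costs, and candidate counts are invented, the rewards are supplied by the caller rather than by a learned reward model, and the dual update is a generic textbook step rather than the paper's dynamic primal-dual procedure.

```python
from itertools import product

# Hypothetical per-stage action sets (not from the paper):
# model name -> assumed compute cost per item, plus candidate item counts.
STAGES = {
    "pre_ranking": {"models": {"small": 1.0, "large": 3.0}, "item_counts": [500, 1000]},
    "ranking": {"models": {"small": 10.0, "large": 40.0}, "item_counts": [50, 100]},
}


def enumerate_chains(stages):
    """Return every action chain as a tuple of (stage, model, n_items) actions."""
    per_stage = [
        [(name, model, n) for model in cfg["models"] for n in cfg["item_counts"]]
        for name, cfg in stages.items()
    ]
    return list(product(*per_stage))


def chain_cost(chain, stages):
    """Assumed cost model: per-item model cost times number of items, summed over stages."""
    return sum(stages[stage]["models"][model] * n for stage, model, n in chain)


def select_chain(rewards, costs, lam):
    """Return the index of the chain maximizing reward - lam * cost."""
    return max(range(len(rewards)), key=lambda j: rewards[j] - lam * costs[j])


def update_dual(lam, spent, budget, step=1e-5):
    """Generic dual subgradient step: raise the price when over budget, never below zero."""
    return max(0.0, lam + step * (spent - budget))


chains = enumerate_chains(STAGES)
costs = [chain_cost(c, STAGES) for c in chains]
# Toy stand-in for an estimated reward: diminishing returns in compute.
rewards = [cost ** 0.5 for cost in costs]

free_choice = select_chain(rewards, costs, lam=0.0)    # no price on compute
priced_choice = select_chain(rewards, costs, lam=1.0)  # compute is expensive
```

With a zero price the most expensive chain wins, and with a high price the cheapest one does; in between, the price acts as the knob that trades reward against the computation budget.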