Input Convex Lipschitz RNN: A Fast and Robust Approach for Engineering Tasks
Abstract: Computational efficiency and robustness are essential in process modeling, optimization, and control for real-world engineering applications. Although neural network-based approaches have gained significant attention in recent years, conventional neural networks often fail to deliver both properties together, and frequently fail to deliver either one. Inspired by natural physical systems, input convex architectures are known from the established literature to improve computational efficiency in optimization tasks, whereas Lipschitz-constrained architectures improve robustness. Combining the two properties in a single model requires care, because an inappropriate method for enforcing one property can undermine the other. To address this, we introduce a novel network architecture, termed the Input Convex Lipschitz Recurrent Neural Network (ICLRNN), which integrates convexity and Lipschitz continuity to enable fast and robust neural network-based modeling and optimization. The ICLRNN outperforms existing recurrent units in both computational efficiency and robustness. It has also been applied to practical engineering scenarios: modeling and control of a chemical process, and real-world solar irradiance prediction for solar PV system planning at LHT Holdings in Singapore. Source code is available at https://github.com/killingbear999/ICLRNN.
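To make the interaction between the two properties concrete, the following is a minimal sketch of a recurrent cell that is both convex in its input sequence and Lipschitz-bounded. It is an illustration under stated assumptions and not the authors' ICLRNN; the repository linked above holds the actual implementation. The sketch assumes a ReLU cell `h_t = relu(U x_t + W h_{t-1} + b)`, makes the recurrent weights `W` non-negative by clipping, and then divides each weight matrix by a positive scalar that upper-bounds its spectral norm. All function names (`clip_nonneg`, `rescale`, `make_cell`, `run`) are hypothetical.

```python
import math
import random


def relu(v):
    # ReLU is convex, non-decreasing, and 1-Lipschitz.
    return [max(0.0, x) for x in v]


def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def norm_bound(M):
    # sqrt(max abs row sum * max abs column sum) upper-bounds the spectral norm.
    row = max(sum(abs(m) for m in r) for r in M)
    col = max(sum(abs(r[j]) for r in M) for j in range(len(M[0])))
    return math.sqrt(row * col)


def clip_nonneg(M):
    # Non-negative recurrent weights keep each hidden unit convex in the inputs.
    return [[max(0.0, m) for m in r] for r in M]


def rescale(M):
    # Dividing by a positive scalar caps the spectral norm at 1 and
    # cannot turn a non-negative entry negative.
    s = max(1.0, norm_bound(M))
    return [[m / s for m in r] for r in M]


def make_cell(n_in, n_h, seed=0):
    # Hypothetical helper: random weights, then the two constraints in order
    # (clip first, rescale second).
    rng = random.Random(seed)
    U = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_h)]
    W = [[rng.gauss(0, 1) for _ in range(n_h)] for _ in range(n_h)]
    b = [rng.gauss(0, 1) for _ in range(n_h)]
    return rescale(U), rescale(clip_nonneg(W)), b


def run(U, W, b, xs):
    # Unroll h_t = relu(U x_t + W h_{t-1} + b) from h_0 = 0; return the final h.
    h = [0.0] * len(b)
    for x in xs:
        pre = [u + w + c for u, w, c in zip(matvec(U, x), matvec(W, h), b)]
        h = relu(pre)
    return h
```

The order of operations is the design point. Rescaling by a positive scalar after clipping preserves the sign pattern that convexity relies on, whereas a Lipschitz-enforcing step that can flip signs (for example, an orthonormalizing projection) would not. With both weight matrices bounded this way, the distance between two final hidden states is at most the sum of the per-step input distances.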