Enhancing Column Generation by Reinforcement Learning-Based Hyper-Heuristic for Vehicle Routing and Scheduling Problems (2310.09686v1)
Abstract: Column generation (CG) is a key method for solving large-scale problems by generating variables dynamically. It is widely applied in combinatorial optimization, for example in vehicle routing and scheduling problems, where each iteration requires solving an NP-hard constrained shortest path problem. Although some heuristic acceleration methods already exist, they are not versatile enough to handle different problems. In this work, we propose a reinforcement learning-based hyper-heuristic framework, dubbed RLHH, to enhance the performance of CG. RLHH is a selection module embedded in CG that accelerates convergence and yields better integer solutions. In each CG iteration, the RL agent selects a low-level heuristic to construct a reduced network containing only the edges that are more likely to be part of the optimal solution. In addition, we specialize RLHH to two typical combinatorial optimization problems: the Vehicle Routing Problem with Time Windows (VRPTW) and the Bus Driver Scheduling Problem (BDSP). In our tested scenarios, the total cost is reduced by up to 27.9% in VRPTW and 15.4% in BDSP compared to the best low-level heuristic, within equivalent or even less computational time. The proposed RLHH is the first RL-based CG method that outperforms traditional approaches in terms of solution quality, which can promote the application of CG in combinatorial optimization.
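The selection step described in the abstract (choose a low-level heuristic, then build a reduced network from it) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the abstract does not specify the agent, the heuristics, or the reward, so a simple epsilon-greedy value tracker stands in for the RL agent, and the hypothetical heuristic `keep_k_cheapest_per_node` prunes edges by reduced cost.

```python
import random


def keep_k_cheapest_per_node(edges, k):
    """Hypothetical low-level heuristic: for every tail node, keep only
    its k outgoing edges with the lowest reduced cost.

    edges maps (tail, head) -> reduced cost.
    """
    by_tail = {}
    for (i, j), cost in edges.items():
        by_tail.setdefault(i, []).append((cost, j))
    kept = {}
    for i, outgoing in by_tail.items():
        for cost, j in sorted(outgoing)[:k]:
            kept[(i, j)] = cost
    return kept


# Hypothetical action set: each action builds a differently sized reduced network.
HEURISTICS = {
    "k1": lambda edges: keep_k_cheapest_per_node(edges, 1),
    "k2": lambda edges: keep_k_cheapest_per_node(edges, 2),
    "full": lambda edges: dict(edges),  # no pruning
}


class EpsilonGreedySelector:
    """Stand-in for the RL agent (an assumption, not the paper's agent):
    tracks a running mean reward per heuristic and picks greedily,
    exploring with probability epsilon."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {a: 0.0 for a in self.actions}
        self.count = {a: 0 for a in self.actions}

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value[a])

    def update(self, action, reward):
        # Incremental mean of the rewards observed for this action.
        self.count[action] += 1
        self.value[action] += (reward - self.value[action]) / self.count[action]
```

In a CG loop, one would call `select()` at each iteration, apply the chosen heuristic to the current reduced-cost network, solve the pricing problem on the pruned network, and feed a reward (for example, the improvement in the restricted master objective, again an assumption) back through `update()`.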