
Learning Efficient and Fair Policies for Uncertainty-Aware Collaborative Human-Robot Order Picking (2404.08006v1)

Published 9 Apr 2024 in cs.RO, cs.AI, cs.LG, and math.OC

Abstract: In collaborative human-robot order picking systems, human pickers and Autonomous Mobile Robots (AMRs) travel independently through a warehouse and meet at pick locations where pickers load items onto the AMRs. In this paper, we consider an optimization problem in such systems where we allocate pickers to AMRs in a stochastic environment. We propose a novel multi-objective Deep Reinforcement Learning (DRL) approach to learn effective allocation policies that maximize pick efficiency while also improving workload fairness amongst human pickers. In our approach, we model the warehouse states using a graph, and define a neural network architecture that captures regional information and effectively extracts representations related to efficiency and workload. We develop a discrete-event simulation model, which we use to train and evaluate the proposed DRL approach. In the experiments, we demonstrate that our approach can find non-dominated policy sets that outline good trade-offs between the fairness and efficiency objectives. The trained policies outperform the benchmarks in terms of both efficiency and fairness. Moreover, they show good transferability properties when tested on scenarios with different warehouse sizes. The implementation of the simulation model, proposed approach, and experiments is published.
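
To illustrate what a "non-dominated policy set" means in a two-objective setting, the following is a minimal sketch and not the paper's implementation. It assumes each trained policy has been summarized by a hypothetical pair of scores, (efficiency, fairness), where higher is better for both; the function names and the numbers are invented for illustration only. A policy is kept if no other policy is at least as good on both objectives and strictly better on at least one.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a Pareto-dominates b (all objectives maximized)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def non_dominated(scores: List[Sequence[float]]) -> List[int]:
    """Return indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (efficiency, fairness) scores for five policies; not from the paper.
policy_scores = [
    (0.90, 0.40),
    (0.80, 0.70),
    (0.70, 0.60),  # dominated by (0.80, 0.70)
    (0.60, 0.90),
    (0.85, 0.40),  # dominated by (0.90, 0.40)
]

front = non_dominated(policy_scores)
print(front)  # → [0, 1, 3]
```

The surviving policies trade one objective against the other: moving along the front gains fairness only by giving up efficiency, which is the kind of trade-off the abstract refers to.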

