The paper "Agentic LLMs in the Supply Chain: Towards Autonomous Multi-Agent Consensus-Seeking" explores the use of LLM agents to automate consensus-seeking in Supply Chain Management (SCM). The paper posits that LLM agents can overcome limitations of traditional and software agent-based approaches to SCM coordination, particularly for Small and Medium-sized Enterprises (SMEs). The authors introduce a supply chain-specific consensus-seeking framework tailored for LLM agents and validate it through a case study in inventory management. The source code is released as open source to encourage further research.
The paper identifies key limitations in existing approaches to SCM, which include high entry barriers for SMEs, limited capabilities, and adaptability issues in complex scenarios. These limitations often lead to sub-optimal outcomes such as inefficient capacity utilization and information distortion, including the bullwhip effect and shortage gaming. The authors suggest that LLMs can address these challenges due to their ability to negotiate, reason, and plan, facilitating near-human-level consensus at scale with minimal entry barriers.
The paper introduces a modular communication framework for LLM-powered agents in the sequential supply chain, allowing tool usage and communication between neighboring agents. The authors conduct an empirical case study, experimenting across supply chain metrics and levels of framework sophistication. The results indicate that in some situations tool usage within the framework significantly enhances performance, while in others prioritizing sophisticated multi-agent communication is more effective.
The paper focuses on a subset of supply chain consensus-seeking problems that could benefit from automated handling using AI. Decisions on delivery quantities, order frequency, and capacity allocation are made frequently and often involve situations where end-to-end coordination would yield better outcomes. Without end-to-end consensus, small problems tend to aggregate and lead to sub-optimal outcomes, such as loss of efficiency and information distortion.
The related-work review highlights that traditional optimization methods cannot capture the complexity and vulnerabilities of real-world supply chains (SCs). The authors discuss how multi-agent systems can improve automated consensus-seeking and coordination in SCM research, the challenges of previous approaches, and the benefits of introducing generative AI technology, LLMs in particular. They also discuss why companies struggle with coordination and the role of technological support in achieving it. Finally, they review the benchmarking environments available in both supply chain management research and computer science research, from which they draw inspiration to build their frameworks.
The problem setting involves a multi-agent communication framework in an environment that simulates an end-to-end supply chain inventory management setting, based on prior work. The paper focuses on mitigating the bullwhip effect, a phenomenon where order variability increases upstream in a supply chain. The aggregate bullwhip effect is computed by multiplying the coefficients of variation of each echelon's demand. Each agent uses the Economic Order Quantity (EOQ) measure to optimize for the bullwhip effect, and the framework allows neighboring agents to agree on order amounts to help mitigate it. The formula for EOQ is:

$$\mathrm{EOQ} = \sqrt{\frac{2DK}{h}}$$

where:
- $D$ is the mean of historical demand data from the downstream agent.
- $K$ is the per-unit cost of ordering.
- $h$ is the per-unit cost of holding inventory.
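To make the two quantities above concrete, the following is a minimal sketch that assumes the standard EOQ form (the square root of twice the mean demand times the ordering cost, divided by the holding cost) and takes the aggregate bullwhip effect to be the product of per-echelon coefficients of variation, as described in this summary. The function names and the use of the population standard deviation are illustrative assumptions, not taken from the paper's code.

```python
import math
from statistics import mean, pstdev


def eoq(demand_history, order_cost, holding_cost):
    """EOQ under the standard square-root form (assumed here).

    demand_history: past demand observed from the downstream agent.
    order_cost, holding_cost: cost parameters as described in the text.
    """
    mean_demand = mean(demand_history)
    return math.sqrt(2 * mean_demand * order_cost / holding_cost)


def coefficient_of_variation(series):
    """Standard deviation divided by mean; 0.0 for a zero-mean series."""
    m = mean(series)
    return pstdev(series) / m if m else 0.0


def aggregate_bullwhip(demand_per_echelon):
    """Product of the coefficients of variation across echelons."""
    return math.prod(coefficient_of_variation(d) for d in demand_per_echelon)


if __name__ == "__main__":
    print(eoq([90, 100, 110], order_cost=8.0, holding_cost=1.0))
    print(aggregate_bullwhip([[8, 12], [5, 15]]))
```

Because the aggregate is a product, a single echelon with perfectly stable demand drives the whole measure to zero, which is worth keeping in mind when interpreting results.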
Global costs in the end-to-end supply chain (SC) build on local costs: the global cost is the sum of the local costs of all agents, accumulated across every step of the inventory management simulation. Each agent's local cost is the sum of four components: inventory costs, backlog costs (unmet demand from the agent's downstream neighbor), variable ordering costs, and fixed ordering costs.
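The cost structure can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the `CostParams` container, the per-step state tuple, and the rule that the fixed ordering cost is charged only when an order is actually placed are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class CostParams:
    holding: float         # cost per unit of inventory held in a step
    backlog: float         # cost per unit of unmet downstream demand in a step
    variable_order: float  # cost per unit ordered
    fixed_order: float     # charged once per placed order (assumption)


def local_cost(inventory, backlog, order_qty, p):
    """Sum of the four local cost components for one agent in one step."""
    cost = (p.holding * inventory
            + p.backlog * backlog
            + p.variable_order * order_qty)
    if order_qty > 0:
        cost += p.fixed_order
    return cost


def global_cost(history, p):
    """Sum local costs over all agents and all simulation steps.

    history: list of steps; each step is a list with one
    (inventory, backlog, order_qty) tuple per agent.
    """
    return sum(local_cost(*state, p) for step in history for state in step)


if __name__ == "__main__":
    params = CostParams(holding=1.0, backlog=2.0, variable_order=0.5, fixed_order=10.0)
    one_step = [[(5, 0, 4), (0, 3, 0)]]  # two agents, one step
    print(global_cost(one_step, params))
```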
The methodology section describes the LLM-powered consensus-seeking frameworks: standalone LLM-powered agents, information sharing between neighboring agents, standalone LLM-powered agents with tools, information sharing between neighboring agents with tools, and negotiation between neighboring agents. The agents use existing memory structures to store previous local observations, relying on context to create agent memory and facilitate consensus-seeking. A cognitive-inspired modular framework is developed, integrating perception, memory, and execution, in which each agent can save information to and retrieve it from memory, make a tentative decision, communicate with a neighboring agent, and then make a final decision. Communication is limited to neighboring agents to reflect the partial observability of sequential supply chains. Two types of communication framework are defined under this constraint: information sharing and negotiation.
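The perceive, remember, decide, communicate, and finalize cycle described above might look roughly like the sketch below. Everything in it is hypothetical: the `Agent` class, the `decide` callable standing in for an LLM call, and the averaging rule used to settle a negotiation are illustrative assumptions, not the paper's framework code.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Agent:
    name: str
    # Stand-in for an LLM call: maps the agent's memory to an order quantity.
    decide: Callable[[List[dict]], int]
    memory: List[dict] = field(default_factory=list)

    def perceive(self, observation: dict) -> None:
        """Store the latest local observation in memory."""
        self.memory.append(observation)

    def tentative_order(self) -> int:
        """Make a tentative decision from local memory only."""
        return self.decide(self.memory)


def negotiate(downstream_proposal: int, upstream_proposal: int) -> int:
    """Settle on the midpoint of the two proposals (one possible rule)."""
    return round((downstream_proposal + upstream_proposal) / 2)


def step(chain: List[Agent], observations: List[dict]) -> List[int]:
    """Run one round; chain is ordered from the retailer (index 0) upstream.

    Each agent only talks to its direct upstream neighbor, which mirrors
    the partial observability of a sequential supply chain.
    """
    for agent, obs in zip(chain, observations):
        agent.perceive(obs)
    tentative = [agent.tentative_order() for agent in chain]
    final = []
    for i, own in enumerate(tentative):
        if i + 1 < len(chain):
            final.append(negotiate(own, tentative[i + 1]))
        else:
            final.append(own)  # most upstream agent has no neighbor to consult
    return final


if __name__ == "__main__":
    echo_demand = lambda memory: memory[-1]["demand"]
    chain = [Agent(n, echo_demand) for n in ("retailer", "wholesaler", "factory")]
    print(step(chain, [{"demand": 10}, {"demand": 14}, {"demand": 20}]))
```

Swapping `negotiate` for a function that simply passes the neighbor's proposal into the next `decide` call would turn this into an information-sharing variant instead.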
The prompt engineering section emphasizes that the experiments used manual prompt engineering and that the prompts are not fully optimized. The benchmarking environment utilizes pre-trained LLMs and zero-shot learning. The prompts used in the inventory management setting comprise several parts, with details describing all of the observations and supply chain metrics each agent needs to make a decision. When an agent needs to make a decision based on a tool output, the importance of this tool is emphasized in the prompt. To minimize each agent's costs in the end-to-end supply chain, a demand forecasting tool based on linear regression is used to estimate the next order amount. To minimize the bullwhip effect of an individual agent in the sequential supply chain, a tool that calculates the EOQ formula is implemented, and the tool output is placed within the LLM prompt.
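As a rough illustration of such a forecasting tool, the sketch below fits an ordinary least-squares line to past demand and extrapolates one step ahead, then embeds the result in a prompt fragment. The function names, the clamping of negative forecasts to zero, and the prompt wording are assumptions made for this example; the paper's actual prompts and tool code may differ.

```python
def forecast_next_demand(history):
    """Fit demand = intercept + slope * t by least squares and predict t = n."""
    n = len(history)
    if n == 0:
        raise ValueError("history must contain at least one observation")
    if n == 1:
        return float(history[0])
    mean_t = (n - 1) / 2
    mean_d = sum(history) / n
    s_tt = sum((t - mean_t) ** 2 for t in range(n))
    s_td = sum((t - mean_t) * (d - mean_d) for t, d in enumerate(history))
    slope = s_td / s_tt
    intercept = mean_d - slope * mean_t
    # Demand cannot be negative, so clamp the extrapolation (assumption).
    return max(0.0, intercept + slope * n)


def tool_prompt_fragment(history):
    """Hypothetical prompt text that surfaces the tool output to the LLM."""
    forecast = forecast_next_demand(history)
    return (
        f"Forecasting tool output: expected next demand is {forecast:.1f} units. "
        "Give this forecast strong weight when choosing your order quantity."
    )


if __name__ == "__main__":
    print(tool_prompt_fragment([10, 12, 14, 16]))
```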
The experimental setup involved running a total of 24 experiments to compare communication frameworks on different optimization metrics and on LLMs of different sizes. The experiments were conducted with foundation models from the Gemini family, namely Gemini Flash and Gemini Pro, with a temperature of 0.1. The customer demand is based on the Merton Jump Diffusion Model. The parameters held fixed across all agents and experiments are the maximum order amount, inventory holding cost, backlog cost, variable order cost, and fixed order cost for each agent in the sequential supply chain. An (s, S) policy is used as a baseline.
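The two ingredients named above, jump-diffusion demand and an (s, S) baseline, can be sketched as follows. This is a simplified discretization for illustration only: the parameter names and default values are hypothetical, the jump compensator in the drift is omitted, and the reorder rule (order up to S when the inventory position is at or below s) is one common convention rather than a detail confirmed by the summary.

```python
import math
import random


def _poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def merton_demand(n_steps, d0=10.0, mu=0.0, sigma=0.1,
                  jump_rate=0.1, jump_mean=0.0, jump_std=0.2,
                  dt=1.0, seed=0):
    """Simulate integer demand following a discretized jump-diffusion path."""
    rng = random.Random(seed)
    level, path = d0, []
    for _ in range(n_steps):
        # Continuous part: geometric Brownian motion increment in log space.
        diffusion = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * rng.gauss(0, 1)
        # Jump part: a Poisson number of normally distributed log-jumps.
        n_jumps = _poisson(rng, jump_rate * dt)
        jumps = sum(rng.gauss(jump_mean, jump_std) for _ in range(n_jumps))
        level *= math.exp(diffusion + jumps)
        path.append(max(0, round(level)))
    return path


def s_S_order(inventory_position, s, S, max_order=None):
    """Order up to S when the position is at or below s; otherwise order nothing."""
    if inventory_position > s:
        return 0
    qty = S - inventory_position
    return qty if max_order is None else min(qty, max_order)


if __name__ == "__main__":
    demand = merton_demand(10)
    print(demand)
    print(s_S_order(3, s=5, S=20), s_S_order(8, s=5, S=20))
```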
In the results and discussion section, the experiments address two problem settings for inventory management in the sequential supply chain: global cost minimization and global bullwhip effect minimization. For Gemini Flash, costs fell gradually as demand forecasting tools were introduced and as the consensus-seeking framework became more sophisticated. The only framework that underperforms the weak baseline is the standalone LLM agent without tool usage. For Gemini Pro, the trend in cost reduction is similar to that obtained with Gemini Flash, but the larger model does not deliver a consistent performance improvement. For both models, the negotiation framework beats the hard baseline, which involves a tool-based restocking policy by the agent. In the bullwhip effect experiments, frameworks that included negotiation achieved the best performance. The authors find that, when not explicitly directed on which strategy to adopt for selecting order amounts, LLM-powered agents primarily use the average strategy, although they occasionally use others, such as going for one extreme of the negotiation interval or disagreeing altogether. Throughout the experiments, the authors observe that end-to-end supply chains can operate much better than under extant restocking policies when they adopt LLM-powered settings that include coordinated communication and tool usage.
The managerial implications section notes that automating consensus-seeking in low-level supply chain decisions using agent-based systems has long been proposed but has not been adopted by industry. In this paper, the authors developed and provided a suite of LLM-powered consensus-seeking frameworks for supply chains and tested them with a bullwhip effect simulator. The experiments yield managerial implications across four categories: performance, scalability, implementation, and reliability.
The limitations section states that, as a technology with limited benchmarking environments in the supply chain management community, LLM-powered agents are still at a very early stage in reaching their potential. A general weakness of foundation models, independent of the rules and frameworks that surround them, is that performance is still unstable when switching between different foundation models. The learning curve for the communication framework is expected to be quite steep for a non-specialist, especially when adapting the framework to other supply chain challenges. A further step to improve the explainability of this approach is to enhance the prompts with Chain-of-Thought reasoning.
In conclusion, the paper asserts that LLM agents offer an intuitive, scalable, and interoperable solution to automated consensus-seeking across a range of supply chain challenges, and presents a series of frameworks for LLM-powered consensus-seeking in the end-to-end supply chain, which were tested in a bullwhip effect setting. The experiments showed that such an LLM-powered setting can significantly reduce global costs and the global bullwhip effect compared to traditional restocking policies. The approach still requires a human in the loop, since the LLM agent outputs remain too inconsistent and unexplainable for real-world scenarios. Future work can extend the framework with automatic prompt optimization tools.