Exploring the Potential of Large Language Models in Public Transportation: San Antonio Case Study (2501.03904v1)

Published 7 Jan 2025 in cs.LG, cs.AI, and cs.IR

Abstract: The integration of LLMs into public transit systems presents a transformative opportunity to enhance urban mobility. This study explores the potential of LLMs to revolutionize public transportation management within the context of San Antonio's transit system. Leveraging the capabilities of LLMs in natural language processing and data analysis, we investigate their capabilities to optimize route planning, reduce wait times, and provide personalized travel assistance. By utilizing the General Transit Feed Specification (GTFS) and other relevant data, this research aims to demonstrate how LLMs can potentially improve resource allocation, elevate passenger satisfaction, and inform data-driven decision-making in transit operations. A comparative analysis of different ChatGPT models was conducted to assess their ability to understand transportation information, retrieve relevant data, and provide comprehensive responses. Findings from this study suggest that while LLMs hold immense promise for public transit, careful engineering and fine-tuning are essential to realizing their full potential. San Antonio serves as a case study to inform the development of LLM-powered transit systems in other urban environments.


The Integration of LLMs in Public Transportation: Insights from the San Antonio Case Study

The paper, titled "Exploring the Potential of Large Language Models in Public Transportation: San Antonio Case Study", examines how large language models (LLMs) can be applied to optimize urban public transit systems. It investigates the use of LLMs such as OpenAI's GPT series to support route planning, reduce wait times, improve passenger interactions, and improve resource allocation in San Antonio's transportation network. The paper compares different ChatGPT models to assess how well they interpret transportation data, specifically the General Transit Feed Specification (GTFS).

Study Design and Methodology

The research employs a thorough experimental approach to evaluate the capability of LLMs in two primary tasks: understanding public transportation data and retrieving transport-related information. A set of five experiments was conducted, encompassing 3,275 multiple-choice questions (MCQs) and 80 short-answer questions based on San Antonio’s public transit data. These experiments used OpenAI's ChatGPT models, mainly examining GPT-3.5-turbo and GPT-4, with maximum context lengths of 16,385 tokens and 128k tokens, respectively.
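A minimal sketch of how graded multiple-choice answers could be turned into per-category accuracy figures is shown here. The record layout, the function name `accuracy_by_category`, and the sample answers are assumptions made for illustration; they are not the paper's evaluation code or data.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return {category: fraction correct} for graded MCQ records.

    Each record is a (category, model_answer, correct_answer) tuple.
    This layout is hypothetical, chosen only for the example.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for category, model_answer, correct_answer in records:
        totals[category] += 1
        # Compare case-insensitively so "b" and "B" count as the same choice.
        if model_answer.strip().upper() == correct_answer.strip().upper():
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


# Made-up graded answers for illustration only (not the paper's data).
sample = [
    ("Term Definition", "A", "A"),
    ("Term Definition", "c", "C"),
    ("Categorical Mapping", "B", "D"),
    ("Categorical Mapping", "D", "D"),
]
scores = accuracy_by_category(sample)
```

Grouping by category rather than reporting one overall number matters here because the study reports results separately for each task category.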

The paper distinguishes between “understanding” tasks—gauging the LLMs' inherent comprehension of GTFS data—and “information retrieval” tasks, where LLMs are tested on their ability to extract and compile relevant data from provided sources. This dual examination offered comprehensive insights into the models' capabilities and limitations, as well as their pre-training data dependencies.
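The difference between the two task types can be sketched as two prompt builders: one that asks a question with no data attached, and one that supplies GTFS rows for the model to search. The prompt wording, the function names, and the stop values below are assumptions for illustration; only the `stops.txt` column names (`stop_id`, `stop_name`, `stop_lat`, `stop_lon`) come from the GTFS specification.

```python
import csv
import io

# Tiny hypothetical excerpt in the GTFS stops.txt layout (values invented).
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
1001,Example St & 1st Ave,29.4200,-98.4900
1002,Sample Plaza,29.4300,-98.5000
"""


def understanding_prompt(question):
    """Question only: the model must rely on what it learned in pre-training."""
    return f"Answer the question about GTFS.\nQuestion: {question}"


def retrieval_prompt(question, gtfs_text):
    """Question plus data: the model must extract the answer from the rows."""
    rows = list(csv.DictReader(io.StringIO(gtfs_text)))
    context = "\n".join(
        f"stop_id={r['stop_id']}, stop_name={r['stop_name']}, "
        f"lat={r['stop_lat']}, lon={r['stop_lon']}"
        for r in rows
    )
    return f"Use only the stops below.\n{context}\nQuestion: {question}"


p1 = understanding_prompt("What does stop_id identify in stops.txt?")
p2 = retrieval_prompt("What is the name of stop 1002?", STOPS_TXT)
```

In this sketch the retrieval prompt grows with the size of the feed, which is one reason the models' maximum context lengths are relevant to retrieval experiments.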

Key Findings

Significant findings from the investigation include:

Performance Variation in Understanding Tasks: LLM performance showed high variance across different categories of the understanding tasks, with accuracy ranging from approximately 48% to 98%. Tasks such as Term Definition and Attribute Mapping reported higher accuracies, suggesting robust model pre-training in these areas, whereas Categorical Mapping tasks showed lower accuracy, pointing toward insufficient pre-training in semantically rich categories.
Impact of Augmentation on LLM Performance: Augmenting the dataset with additional question variants caused a noticeable performance drop, with accuracy falling by approximately 10% on average. The same experiment showed that GPT-4 was more robust to the increased difficulty than GPT-3.5-turbo.
Information Retrieval Efficacy: In simpler retrieval tasks, LLMs achieved satisfactory performance, up to 90.48% accuracy, showcasing their competency in straightforward data retrieval. However, accuracy declined to approximately 64% in more complex tasks requiring intricate data integration, highlighting an area for future improvement in LLM capabilities.
Inconsistency and Contextual Challenges: The research identified challenges related to response consistency and context retention within LLMs, which can affect their deployment in dynamic environments like public transportation systems.
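One simple way to quantify the response-consistency issue noted above is to ask the same question several times and measure how often the answers agree. The sketch below uses agreement with the most common answer; this metric, the function name `consistency`, and the sample answers are assumptions, since the summary does not state how the paper measured consistency.

```python
from collections import Counter


def consistency(answers):
    """Fraction of repeated answers that match the most common answer."""
    if not answers:
        raise ValueError("need at least one answer")
    # Normalize whitespace and case so trivial formatting differences
    # are not counted as disagreement.
    normalized = [a.strip().lower() for a in answers]
    _, top_count = Counter(normalized).most_common(1)[0]
    return top_count / len(normalized)


# Made-up responses to one repeated question (illustration only).
runs = ["Route 7", "route 7", "Route 9", "Route 7"]
score = consistency(runs)
```

A score of 1.0 means every run gave the same answer; lower values flag questions where the model's output is unstable across runs.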

Implications and Future Directions

The paper provides a critical analysis of the applicability of LLMs in urban transit planning, emphasizing the need for advanced tuning and training to maximize efficiency and responsiveness. By employing LLMs, urban transit authorities can potentially revolutionize traffic management and elevate passenger satisfaction through real-time personalization and improved information accuracy. However, the paper warns of the necessity to address LLM inconsistency, data privacy issues, and the ethical use of AI technologies in public applications.

Looking ahead, the research indicates significant potential for further development, particularly in system robustness, data integration techniques, and the use of even larger datasets for pre-training LLMs. As artificial intelligence capabilities are refined, LLMs could become integral to designing smarter, more efficient transit systems, with broader implications for urban planning and public service delivery.

This paper serves as a valuable contribution to understanding how advancements in AI, particularly LLMs, can be harnessed to address the challenges faced by rapidly growing urban centers like San Antonio.


Authors (4)

Ramya Jonnala (2 papers)
Gongbo Liang (16 papers)
Jeong Yang (3 papers)
Izzat Alsmadi (17 papers)
