This paper introduces a novel approach called LLM-driven Particle Swarm Optimization (PSO) to accelerate the hyperparameter tuning process for deep learning (DL) models (Hameed et al., 19 Apr 2025). The core problem addressed is the computationally expensive and often manual nature of finding optimal DL architectures, specifically parameters such as the number of layers and neurons/filters. Traditional methods like grid search are exhaustive, while standard metaheuristics like PSO can converge slowly or become stuck in local optima.
The proposed solution integrates LLMs, specifically ChatGPT-3.5 and Llama3, into the standard PSO algorithm. The key idea is to leverage the pattern recognition and generation capabilities of LLMs to guide the PSO search more effectively. Instead of relying solely on the PSO update rules (Equations 1-4), the LLM-driven PSO periodically queries an LLM with the current state of the particle swarm (positions, velocities, and associated costs). The LLM then suggests potentially better particle positions and velocities.
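The paper's Equations 1-4 are not reproduced here, but the standard PSO update they build on has a well-known velocity/position form. The sketch below uses illustrative coefficient values (`w`, `c1`, `c2`), not the values from the paper's Table I:

```python
import numpy as np

def pso_step(pos, vel, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One standard PSO update step.

    pos, vel, pbest: arrays of shape (n_particles, n_dims)
    gbest: array of shape (n_dims,)
    w, c1, c2 are illustrative defaults, not the paper's settings.
    """
    if rng is None:
        rng = np.random.default_rng()
    r1 = rng.random(pos.shape)  # stochastic pull toward personal best
    r2 = rng.random(pos.shape)  # stochastic pull toward global best
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    return pos, vel
```

In the LLM-driven variant, this update is periodically supplemented (not replaced) by LLM-suggested positions and velocities.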
Methodology: LLM-Driven PSO
The methodology combines two phases with a replacement-and-iteration loop:
- Standard PSO Phase: Initially, a standard PSO algorithm (Algorithm 1) runs for a small number of iterations to explore the search space. This phase establishes initial personal best (pbest) and global best (gbest) positions.
- LLM-Enhanced Phase: After the initial PSO iterations, the system queries the LLM (Algorithm 2). The prompt (shown in the paper's text box labeled "Format of our LLM Prompt") provides the LLM with the current particle information (neurons/layers, velocities, cost). The LLM is asked to generate a new set of particle positions (neurons/layers) and velocities intended to reduce the cost function further.
- Particle Replacement: The LLM's suggestions are evaluated. The worst-performing particles from the current PSO swarm are replaced with the best suggestions provided by the LLM.
- Iteration: The PSO process continues, potentially making further calls to the LLM if the global best does not improve significantly or until a maximum iteration count is reached.
This process aims to reduce the number of computationally expensive DL model evaluations (each particle's cost/fitness in standard PSO requires a full model training run) by substituting cheaper LLM calls, potentially guiding the search toward the global optimum faster.
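The steps above can be sketched as a single loop. This is a minimal illustration, not the paper's code: `query_llm` is a hypothetical stand-in for the prompt/parse round trip described below, and the warmup length, replacement count, and PSO coefficients are assumptions:

```python
import numpy as np

def llm_driven_pso(cost_fn, query_llm, pos, vel, n_iter=10, warmup=2, n_replace=2):
    """Two-phase loop: plain PSO for `warmup` iterations, then PSO with
    periodic LLM suggestions replacing the worst particles."""
    pos, vel = pos.copy(), vel.copy()
    costs = np.array([cost_fn(p) for p in pos])
    pbest, pbest_cost = pos.copy(), costs.copy()
    gbest = pos[np.argmin(costs)].copy()
    for it in range(n_iter):
        if it >= warmup:  # LLM-enhanced phase
            llm_pos, llm_vel = query_llm(pos, vel, costs)
            llm_costs = np.array([cost_fn(p) for p in llm_pos])
            # replace the worst swarm particles with the best LLM suggestions
            worst = np.argsort(costs)[-n_replace:]
            best_llm = np.argsort(llm_costs)[:n_replace]
            pos[worst] = llm_pos[best_llm]
            vel[worst] = llm_vel[best_llm]
            costs[worst] = llm_costs[best_llm]
        # standard PSO velocity/position update (illustrative coefficients)
        r1 = np.random.rand(*pos.shape)
        r2 = np.random.rand(*pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        costs = np.array([cost_fn(p) for p in pos])
        improved = costs < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = costs[improved]
        gbest = pbest[np.argmin(pbest_cost)].copy()
    return gbest, float(pbest_cost.min())
```

In the actual hyperparameter-tuning setting, `cost_fn` is a full DL training run, which is why cutting iterations translates directly into fewer model calls.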
Implementation Details
- PSO Parameters: The standard PSO parameters such as population size, inertia weight (w), and acceleration coefficients (c1, c2) are used (Table I). The particle dimensions represent the hyperparameters being tuned (e.g., number of layers, number of neurons/filters).
- LLM Interaction: A specific prompt format was designed to ensure the LLM returns suggestions in a parsable format. The prompt includes the current best particle configurations and their costs, asking the LLM for new configurations likely to yield lower costs.
```python
# Simplified Prompt Structure Example
my_prompt = f"""
Below is the string showing the best number of neurons as the first entry
and best number of layers as the second entry of the DL model for {Npop}
particles with their corresponding cost as the fifth entry...
The first entry (Neurons) ranges from 2 to 200, while the second entry
(Layers) ranges from 2 to 5.
{particle_prompt_string}
Give me exactly {Npop} more number of neurons and layers for the same model
in order to reduce the cost further. Your response must be exactly in the
same format as input and must contain only values. Your response must not
contain the cost values.
"""
```
- Hyperparameter Ranges: Specific ranges were defined for the hyperparameters being optimized (e.g., layers: [2, 5], neurons: [2, 200]).
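Since the prompt instructs the LLM to reply "only values" in the input format, the reply must be parsed back into particles and kept within the defined ranges. The paper specifies only the ranges; the parsing details below are assumptions, and `parse_llm_response` is a hypothetical helper:

```python
def parse_llm_response(text, n_particles, bounds=((2, 200), (2, 5))):
    """Parse 'neurons layers' pairs from an LLM reply and clamp them to the
    paper's ranges (neurons: [2, 200], layers: [2, 5]).

    The exact reply layout (one particle per line, whitespace- or
    comma-separated) is an assumption about the LLM's output format.
    """
    particles = []
    for line in text.strip().splitlines()[:n_particles]:
        values = [int(float(tok)) for tok in line.replace(",", " ").split()[:2]]
        # clamp each value into its valid hyperparameter range
        clamped = [max(lo, min(hi, v)) for v, (lo, hi) in zip(values, bounds)]
        particles.append(clamped)
    return particles
```

Clamping matters because LLMs occasionally suggest out-of-range values despite the prompt constraints.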
Experimental Evaluation
The LLM-driven PSO was evaluated in three scenarios:
- Rastrigin Function Optimization: A standard mathematical benchmark function (Equation 5) used to test optimization algorithms. LLM-driven PSO showed modest improvements in convergence speed (fewer iterations) compared to standard PSO, especially with Llama3 (Table II vs. Table IV, Figure 5 vs. Figure 7). Llama3 achieved reductions of 2.94% to 8.50% in iterations, while ChatGPT-3.5 showed improvements mainly for larger particle sizes (up to 4.25%).
- LSTM Hyperparameter Tuning for Regression: Optimizing the number of layers and neurons for an LSTM model predicting Air Quality Index (AQI) based on sensor data (Figure 4). The goal was to minimize Root Mean Squared Error (RMSE). Standard PSO (5 particles, 10 iterations) required 50 model calls.
- LLM-driven PSO with ChatGPT-3.5 achieved comparable RMSE with only 20 model calls (4 PSO iterations total), a 60% reduction (Table VI, Figure 8).
- LLM-driven PSO with Llama3 required 30-40 model calls (6-8 PSO iterations total), a 20%-40% reduction (Table VI, Figure 8).
- The final RMSE values were statistically similar across all methods (Figure 9).
- CNN Hyperparameter Tuning for Classification: Optimizing the number of layers and filters for a CNN classifying images as recyclable or organic materials (Figure 3). The goal was to maximize classification accuracy. Standard PSO (5 particles, 10 iterations) required 50 model calls.
- LLM-driven PSO with both ChatGPT-3.5 and Llama3 achieved comparable accuracy with only 20 model calls (4 PSO iterations total), a 60% reduction (Table VII, Figure 10).
- Final accuracy was statistically similar across methods (Figure 11).
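The Rastrigin benchmark from the first scenario has a standard closed form (assumed here to match the paper's Equation 5), which makes it a cheap test harness for any PSO variant before spending real model calls:

```python
import numpy as np

def rastrigin(x):
    """Standard Rastrigin function:
    f(x) = 10*n + sum(x_i^2 - 10*cos(2*pi*x_i)),
    with global minimum f(0) = 0 and many local optima that trap
    gradient-free search.
    """
    x = np.asarray(x, dtype=float)
    return 10 * x.size + float(np.sum(x**2 - 10 * np.cos(2 * np.pi * x)))
```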
Key Findings and Practical Implications
- Reduced Computational Cost: The primary benefit is a significant reduction (20%-60%) in the number of expensive DL model training runs (model calls) needed to find good hyperparameters, while maintaining comparable model performance (RMSE/accuracy).
- Faster Convergence: LLMs can effectively guide the PSO search, replacing poorly performing particles and accelerating convergence towards optimal hyperparameter configurations.
- Efficiency: Using a small number of particles (e.g., 5) combined with LLM guidance proved effective for DL hyperparameter tuning, minimizing the overhead of both PSO and DL model evaluations.
- LLM Choice: ChatGPT-3.5 generally required fewer iterations/calls than Llama3 in the DL tasks, although Llama3 showed slightly better performance on the Rastrigin function.
- Prompt Engineering: The effectiveness relies on well-structured prompts that elicit useful and correctly formatted suggestions from the LLM.
- Applicability: The method offers a practical way to speed up hyperparameter optimization, particularly valuable in resource-constrained environments or when model training is time-consuming. It can potentially be adapted for other metaheuristic algorithms like Genetic Algorithms.
The paper demonstrates a practical and efficient method for leveraging LLMs to enhance a well-established optimization technique (PSO) specifically for the common challenge of DL hyperparameter tuning. The reduction in model evaluations translates directly to savings in computation time and resources.