
Firefly Algorithm-Guided Analogy-Based Estimation (FAABE)

Updated 6 December 2025
  • The paper demonstrates that FAABE significantly reduces estimation errors, achieving up to an 80% decrease in MMRE compared to conventional ABE.
  • FAABE employs a Firefly Algorithm to optimize feature-weight vectors and utilizes Pearson correlation for feature selection, enhancing prediction precision.
  • The model effectively handles noisy data in small-to-medium datasets, though it requires careful tuning of metaheuristic parameters to manage its computational overhead.

Firefly Algorithm-Guided Analogy-Based Estimation (FAABE) is a model for software effort estimation that integrates a nature-inspired metaheuristic—the Firefly Algorithm (FA)—with Analogy-Based Estimation (ABE). FAABE directly addresses several limitations inherent in traditional ABE by employing metaheuristic weight optimization and feature selection, thereby systematically improving estimation accuracy on benchmark datasets through reduction of prediction error metrics such as MMRE, MAE, MSE, and RMSE (Chintada et al., 29 Nov 2025).

1. Firefly Algorithm Metaheuristic

The Firefly Algorithm is a population-based, nature-inspired global search method. It operates by modeling a swarm of fireflies in which each firefly represents a candidate solution, specifically a feature-weight vector $w = (w_1, \dots, w_d) \in [0,1]^d$. The brightness $I$ of each firefly reflects the solution quality, here inversely related to prediction error.

  • The Euclidean distance between two fireflies $i$ and $j$ is:

$$r_{ij} = \|x_i - x_j\|_2 = \sqrt{\sum_{k=1}^{d} (x_{i,k} - x_{j,k})^2}$$

  • Light intensity at distance $r$ is modeled as:

$$I(r) = I_0 e^{-\gamma r^2} \quad \text{or} \quad I(r) = \frac{I_0}{1 + \gamma r^2}$$

where $I_0$ is the base intensity and $\gamma$ is the light absorption coefficient.

  • Attractiveness function:

$$\beta(r) = \beta_0 e^{-\gamma r^2}$$

where $\beta_0$ is the attractiveness at zero distance.

  • Movement update rule:

$$x_i \leftarrow x_i + \beta_0 e^{-\gamma r_{ij}^2}(x_j - x_i) + \alpha\left(\mathrm{rand} - \tfrac{1}{2}\right)$$

where $\alpha$ modulates the random perturbation and $\mathrm{rand} \in [0,1]^d$ is a vector of uniform random draws.

In the FAABE context, each firefly’s position is interpreted as a candidate weighting vector for the features in the ABE similarity function, and the FA’s optimization dynamics iteratively seek weight assignments that reduce estimation error.
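
To make the dynamics concrete, the following is a minimal sketch of a single movement step for a weight vector in $[0,1]^d$. The function name `move_firefly`, the clipping to $[0,1]^d$, and the default hyperparameter values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def move_firefly(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.2, rng=None):
    """One FA movement step: pull firefly i toward a brighter firefly j."""
    if rng is None:
        rng = np.random.default_rng()
    r2 = np.sum((x_i - x_j) ** 2)              # squared Euclidean distance r_ij^2
    attraction = beta0 * np.exp(-gamma * r2)   # beta(r) = beta0 * exp(-gamma * r^2)
    step = attraction * (x_j - x_i) + alpha * (rng.random(x_i.shape) - 0.5)
    return np.clip(x_i + step, 0.0, 1.0)       # keep the weight vector in [0, 1]^d
```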

2. Analogy-Based Estimation Fundamentals

Analogy-Based Estimation (ABE) predicts software project effort by referencing similar historical projects. The estimation process involves retrieval of analog cases via a similarity function and aggregation of their known efforts.

  • Per-feature distance:

$$\mathrm{Dis}(a_i, a_i') = \begin{cases} |a_i - a_i'| & \text{if numeric or ordinal} \\ 0 & \text{if nominal and } a_i = a_i' \\ 1 & \text{if nominal and } a_i \ne a_i' \end{cases}$$

  • Weighted similarity:

$$\mathrm{Sim}(p, p') = \frac{1}{\sqrt{\sum_{i=1}^{d} w_i\, \mathrm{Dis}(a_i, a_i') + \delta}}$$

with smoothing term $\delta = 10^{-4}$.

  • Aggregated effort as a similarity-weighted mean over the $S$ nearest analogies:

$$\hat{C}_p = \sum_{k=1}^{S} \frac{\mathrm{Sim}(p, p_k)}{\sum_{i=1}^{S} \mathrm{Sim}(p, p_i)}\, C_{p_k}$$

ABE’s efficacy depends heavily on the chosen similarity metric and feature weighting.
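
As an illustration of the retrieval-aggregation pipeline, here is a minimal sketch for purely numeric features; `abe_estimate` and its argument names are assumptions introduced for this example, not the paper's code.

```python
import numpy as np

def abe_estimate(query, cases, efforts, w, S=3, delta=1e-4):
    """Predict effort for `query` from its S most similar historical cases."""
    dis = np.abs(cases - query)            # per-feature distance |a_i - a_i'| (numeric case)
    sim = 1.0 / np.sqrt(dis @ w + delta)   # weighted similarity with smoothing term delta
    top = np.argsort(sim)[-S:]             # indices of the S nearest analogies
    norm = sim[top] / sim[top].sum()       # normalized similarity weights
    return float(norm @ efforts[top])      # similarity-weighted mean of known efforts
```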

3. FAABE Integration Architecture

FAABE links the global search capabilities of FA with the retrieval-aggregation steps of ABE. The key points of integration include:

  • Optimization of the feature-weight vector $w$ within the similarity metric.
  • Preliminary feature selection through Pearson correlation: features with absolute correlation to effort below 0.5 are removed, and FA then re-weights the remaining features.
  • Objective functions (to minimize) include:

$$\mathrm{MMRE}(w) = \frac{1}{N} \sum_{j=1}^{N} \left| \frac{\mathrm{Actual}_j - \mathrm{Predicted}_j(w)}{\mathrm{Actual}_j} \right|$$

Additionally, $\mathrm{MAE}(w)$, $\mathrm{MSE}(w)$, and $\mathrm{RMSE}(w)$ are used for secondary evaluation.

  • The FA loop:
    1. Population initialization: $\{w^{(i)}\}_{i=1}^{N}$.
    2. Brightness assignment: $I_i \propto 1/\mathrm{Err}(w^{(i)})$.
    3. Position update via the movement rule.
    4. Iteration until convergence to the best weight vector $w^*$.

This architecture enables adaptive determination of relevant features and their contributions.
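
A minimal sketch of this loop, reusing `move_firefly` and `abe_estimate` from the earlier sketches, might look as follows; the hold-out interface and hyperparameter defaults are assumptions for illustration.

```python
import numpy as np

def faabe_search(train_X, train_y, val_X, val_y,
                 n_fireflies=20, T=100, beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Search for the weight vector w* minimizing MMRE on held-out projects."""
    rng = np.random.default_rng(seed)
    pop = rng.random((n_fireflies, train_X.shape[1]))  # weight vectors in [0,1]^d

    def mmre(w):  # objective Err(w); brightness I_i is proportional to 1/Err(w)
        preds = np.array([abe_estimate(x, train_X, train_y, w) for x in val_X])
        return float(np.mean(np.abs((val_y - preds) / val_y)))

    err = np.array([mmre(w) for w in pop])
    for _ in range(T):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if err[j] < err[i]:        # firefly j is brighter: move i toward it
                    pop[i] = move_firefly(pop[i], pop[j], beta0, gamma, alpha, rng)
                    err[i] = mmre(pop[i])
    best = int(np.argmin(err))
    return pop[best], err[best]            # best weights w* and their MMRE
```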

4. Experimental Datasets and Results

FAABE was tested across six public datasets, each varying in size and feature dimensionality:

Dataset      Projects   Features
COCOMO81        64          16
Desharnais      81          12
China          499          14
Kemerer         15           7
Maxwell         62          27
Albrecht        24           8

Processing steps:

  • Handling missing data and normalization of continuous features.
  • Pearson correlation-based feature selection ($|r| \geq 0.5$); a sketch of this filter follows the list.
  • Training/testing split: 67% training, 33% testing per dataset.
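
A minimal sketch of the correlation filter, assuming normalized numeric features; the name `select_features` is illustrative.

```python
import numpy as np

def select_features(X, effort, threshold=0.5):
    """Keep columns whose absolute Pearson correlation with effort meets the threshold."""
    r = np.array([np.corrcoef(X[:, k], effort)[0, 1] for k in range(X.shape[1])])
    keep = np.abs(r) >= threshold          # boolean mask of retained features
    return X[:, keep], keep
```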

Evaluation metrics (computed on the test set; a sketch follows the list):

  • $\mathrm{MMRE} = \frac{1}{k}\sum_{i} \left|\frac{A_i - P_i}{A_i}\right|$
  • $\mathrm{MAE} = \frac{1}{k}\sum_{i} |A_i - P_i|$
  • $\mathrm{MSE} = \frac{1}{k}\sum_{i} (A_i - P_i)^2$
  • $\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$
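
A minimal sketch of these computations, with `A` as actual and `P` as predicted efforts (names assumed for illustration):

```python
import numpy as np

def evaluate(A, P):
    """Compute MMRE, MAE, MSE, and RMSE over a test set."""
    A, P = np.asarray(A, dtype=float), np.asarray(P, dtype=float)
    mse = float(np.mean((A - P) ** 2))
    return {
        "MMRE": float(np.mean(np.abs((A - P) / A))),
        "MAE":  float(np.mean(np.abs(A - P))),
        "MSE":  mse,
        "RMSE": float(np.sqrt(mse)),
    }
```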

Selected results for ABE vs. FAABE:

Dataset      Metric   ABE       FAABE
COCOMO81     MMRE     3.2072    0.7188
             MAE      723.70    62.24
             MSE      5.04e6    3.63e6
             RMSE     2233.3    1905.8
Desharnais   MMRE     0.7264    0.4270
             MAE      1938.1    1283.6
             MSE      2.70e7    2.79e6
             RMSE     2629.5    1670.2
China        MMRE     1.5647    1.2546
Kemerer      MMRE     0.3497    0.1371
Maxwell      MMRE     0.7859    0.2397
FAABE produced substantial reductions in MMRE (up to approximately 80%), together with improvements in MAE and RMSE, performing consistently better than conventional ABE across all examined datasets.

5. Computational and Practical Considerations

The FAABE model is best suited to datasets with noisy or irrelevant features, where the adaptive weight optimization refines the similarity computation. For small-to-medium-sized repositories, FA convergence rates are adequate. Model performance is sensitive to the FA parameters: population size $N$, absorption coefficient $\gamma$, randomness weight $\alpha$, and number of iterations $T$.

In terms of complexity, FAABE incurs $O(T \cdot N^2 \cdot d)$ cost per dataset due to the pairwise firefly distance and position updates. This overhead can increase substantially for high-dimensional spaces or large population sizes.

A plausible implication is that for very large datasets or feature sets, computational requirements may surpass those of conventional ABE, necessitating trade-offs between search precision and efficiency.

6. Summary and Application Scope

FAABE systematizes feature selection and weighting in effort estimation by embedding FA-based metaheuristic optimization within the structure of ABE. By mapping estimation error to the "brightness" of solutions and driving firefly movement accordingly, FAABE consistently yields lower error rates on standard benchmarks. Its principal contributions include robust performance in the presence of noisy features and greater reliability for atypical projects compared to standard ABE, at the cost of moderate metaheuristic search overhead and FA parameter tuning (Chintada et al., 29 Nov 2025).
