
MH-FSF Framework: Reproducible Feature Selection

Updated 16 July 2025
  • The MH-FSF framework is a modular and extensible platform that improves reproducibility and benchmarking in feature selection research for Android malware detection.
  • It integrates 17 classical and domain-specific feature selection techniques within a standardized workflow including data preprocessing, model training, and rigorous cross-validation.
  • The open design facilitates seamless integration of new methods and datasets, advancing comparability and transparency in security analytics research.

The MH-FSF framework is a comprehensive, modular, and extensible platform designed to address reproducibility and benchmarking limitations prevalent in feature selection research, with a particular emphasis on Android malware detection. Developed through collaborative research, MH-FSF implements a wide range of classical and domain-specific feature selection techniques and provides systematic evaluation across numerous publicly available datasets. The framework is intended to foster methodological consistency, transparency, and rigorous comparison within the feature selection research community (Rocha et al., 11 Jul 2025).

1. Framework Architecture and Workflow

MH-FSF utilizes a modular architecture organized into four sequential stages:

  1. Data Manipulation: Raw data undergoes preprocessing, which includes duplicate and NaN value removal, class balancing (when necessary), sampling, and subsampling. These steps are designed to ensure the robustness and comparability of subsequent evaluations, with particular attention paid to the challenges posed by imbalanced datasets.
  2. Feature Selection Methods: A diverse suite of 17 algorithms is integrated into the framework, comprising 11 classical/statistical methods and 6 domain-specific methods tailored for Android malware datasets. Each algorithm is implemented with a standardized interface and a pluggable directory/code structure.
  3. Machine Learning Model Training and Evaluation: After feature selection, the reduced datasets are used to train supervised learning models such as Support Vector Machines (SVM), Random Forests, and K-Nearest Neighbors (KNN). Model performance is assessed using a range of metrics including accuracy, precision, recall, F1 score, ROC-AUC, and the Matthews Correlation Coefficient (MCC), with stratified 5-fold cross-validation (K = 5) applied consistently to address class asymmetry and ensure method comparability.
  4. Results Visualization: The framework includes visualization support with bar charts, boxplots, confusion matrices, radar charts, and heatmaps, allowing for both summary and detailed exploration of method performance across different datasets and metrics.

The modular nature of MH-FSF facilitates extensibility, enabling researchers to introduce new feature selection techniques, datasets, and classifiers with minimal overhead.
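The four-stage workflow can be sketched with scikit-learn; this is an illustrative outline under assumed data, not the framework's actual API (the arrays, the chosen method, and the classifier are placeholders):

```python
# Illustrative sketch of the four-stage MH-FSF workflow using scikit-learn
# (data and parameter choices here are hypothetical, not the framework's code).
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stage 1: data manipulation -- e.g., drop rows containing NaN values.
X = rng.integers(0, 2, size=(200, 50)).astype(float)
y = rng.integers(0, 2, size=200)
mask = ~np.isnan(X).any(axis=1)
X, y = X[mask], y[mask]

# Stage 2: feature selection (Chi-Square, one of the 11 classical methods).
X_sel = SelectKBest(chi2, k=10).fit_transform(X, y)

# Stage 3: model training with stratified 5-fold cross-validation.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(SVC(), X_sel, y, cv=cv, scoring="f1")

# Stage 4: results summary (the framework would additionally plot these).
print(f"F1 per fold: {np.round(scores, 3)}; mean = {scores.mean():.3f}")
```

Each stage maps to one module of the pipeline, so swapping the selector or the classifier leaves the other stages untouched.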

2. Implemented Feature Selection Methods

MH-FSF provides direct implementations for 17 feature selection techniques, divided into classical and domain-specific categories:

| Classical Methods (11) | Type | Domain-Specific Methods (6) | Type |
| --- | --- | --- | --- |
| Artificial Bee Colony (ABC) | Subset | JOWMDroid | Subset |
| ANOVA | Subset | Multi-Tiered (MT) | Subset |
| Chi-Square | Ordering | RFG | Subset |
| Information Gain (IG) | Ordering | SemiDroid | Subset |
| LASSO | Subset | SigAPI | Ordering (API-based) |
| Linear Regression (LR) | Subset | SigPID | Ordering (permission-based) |
| Mean Absolute Deviation | Ordering | | |
| PCA | Subset | | |
| Pearson Correlation Coefficient | Ordering | | |
| ReliefF | Ordering | | |
| Recursive Feature Elimination (RFE) | Ordering | | |

Classical techniques span filter-based (e.g., Chi-Square, Information Gain), wrapper-based (e.g., RFE), embedded (e.g., LASSO), and dimensionality-reduction (e.g., PCA) approaches. Domain-specific methods are built around Android-specific features such as permissions and API calls, and incorporate behavioral and hybrid analyses. Each method follows a uniform coding interface, specified by the inclusion of a method directory, a description file, and a run function.
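A minimal sketch of what such a uniform interface might look like in Python follows; the class names, attributes, and the example method body are hypothetical illustrations of the directory/description/run convention, not MH-FSF's actual code:

```python
# Hypothetical sketch of the uniform method interface: each method has a
# name (its directory), a description (its description file), and a run()
# entry point. Names here are illustrative, not MH-FSF's actual identifiers.
from typing import Protocol
import numpy as np

class FeatureSelectionMethod(Protocol):
    name: str           # matches the method's directory name
    description: str    # contents of the description file

    def run(self, X: np.ndarray, y: np.ndarray, k: int) -> np.ndarray:
        """Return the indices of the k selected features."""
        ...

class MeanAbsoluteDeviation:
    """Ordering-type method: rank features by dispersion."""
    name = "mad"
    description = "Ranks features by mean absolute deviation."

    def run(self, X, y, k):
        mad = np.mean(np.abs(X - X.mean(axis=0)), axis=0)
        return np.argsort(mad)[::-1][:k]   # top-k most dispersed features

X = np.array([[0.0, 1.0, 5.0], [0.0, 3.0, 1.0], [0.0, 2.0, 9.0]])
y = np.array([0, 1, 0])
print(MeanAbsoluteDeviation().run(X, y, k=2))  # constant column 0 is never selected
```

Because every method exposes the same `run` signature, the pipeline can iterate over all 17 implementations without method-specific glue code.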

3. Datasets and Evaluation Methodology

Ten publicly available Android malware datasets are integrated into MH-FSF, encompassing various feature types (API calls, permissions, intents, operation codes) and a mix of benign/malicious samples. Examples include Adroit, AndroCrawl, Android Permissions, DefenseDroid PI/A, Drebin-215, and KronoDroid R/E.

Key evaluation protocols include:

  • Preprocessing routines: Class balancing (via stratified sampling) and removal of anomalies.
  • Cross-validation: Stratified K-fold cross-validation (K = 5), preserving class distribution within each fold for reliable assessment, especially critical in imbalanced settings.
  • Metric reporting: Standard classification metrics, with an emphasis on MCC:

$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$

This metric is particularly justified for cases where class imbalance might obscure the performance of competing methods.
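The formula can be computed directly from confusion-matrix counts; the counts below are made up for illustration:

```python
# Computing MCC from confusion-matrix counts, following the formula above.
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Imbalanced test set: 90 negatives, 10 positives.
print(mcc(tp=8, tn=85, fp=5, fn=2))   # a reasonable classifier
print(mcc(tp=0, tn=90, fp=0, fn=10))  # all-negative predictor: 90% accuracy, MCC 0.0
```

The second call shows why MCC is preferred here: a degenerate predictor that never flags malware scores 90% accuracy on this split but 0.0 MCC.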

4. Experimental Results and Performance Analysis

Performance evaluation demonstrates considerable variation among feature selection methods across datasets, particularly when considering data balance. Key findings include:

  • Methods such as LASSO and RFE deliver high F1 and recall scores (often exceeding 0.90) with low standard deviation, indicating strong robustness to class imbalance.
  • PCA, ReliefF, and SigPID can underperform on datasets where their targeted features lack discriminative power or where imbalance is introduced, suggesting sensitivity to both feature and data distribution characteristics.
  • Systematic data balancing leads to more stable and generalizable results, as observed in compact boxplots and MCC heatmaps.
  • Domain-specific methods like SigAPI achieve high performance on datasets dominated by API calls, whereas permission-based techniques (e.g., SigPID) are effective only when permissions are a major class discriminator.

Visualization tools allow for detailed comparative analysis, supporting reproducibility and informed method selection.

5. Reproducibility, Benchmarking, and Transparency

One of the principal contributions of MH-FSF is addressing long-standing reproducibility barriers in feature selection research:

  • All 17 methods and 10 datasets are implemented and maintained under standardized protocols within a unified repository.
  • Proprietary datasets are eschewed in favor of public sources, allowing for independent verification and meaningful benchmarking of novel methods.
  • Experimental workflows, from data ingestion to model evaluation and result logging, are fully automated and support parallel execution with robust error handling.
  • The complete framework—including code, data, and automation scripts—is made publicly available, ensuring transparency and facilitating the reproducibility of published results.
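The automated, parallel execution with error handling described above might be structured as follows; this is a minimal sketch using Python's standard library, with placeholder method/dataset names and a stubbed evaluation step rather than MH-FSF's real job runner:

```python
# Minimal sketch of a parallel method-by-dataset evaluation loop with
# per-job error handling (names and the evaluate() stub are placeholders,
# not MH-FSF's actual identifiers).
from concurrent.futures import ThreadPoolExecutor, as_completed

def evaluate(method: str, dataset: str) -> tuple[str, str, float]:
    if not method or not dataset:
        raise ValueError("missing method or dataset")
    # ... real pipeline: preprocess, select features, train, score ...
    return method, dataset, 0.9  # placeholder score

jobs = [("lasso", "drebin215"), ("rfe", "kronodroid"), ("", "adroit")]
results, failures = [], []
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = {pool.submit(evaluate, m, d): (m, d) for m, d in jobs}
    for fut in as_completed(futures):
        try:
            results.append(fut.result())
        except Exception as exc:  # log and continue instead of aborting the run
            failures.append((futures[fut], str(exc)))

print(f"{len(results)} succeeded, {len(failures)} failed")
```

Isolating each (method, dataset) pair as a job means one malformed input fails in isolation while the rest of the benchmark completes.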

6. Research Implications and Future Directions

MH-FSF paves the way for sustained methodological advances and wider adoption of reproducible benchmarking in feature selection:

  • The framework is amenable to integration of additional algorithms, including emerging techniques and methods from adjacent domains.
  • Regular incorporation of contemporary malware datasets allows for dynamic updating of benchmarks as new threat patterns emerge.
  • The prospect of integrating explainable AI (XAI) components could enhance the interpretability of feature selection decisions.
  • Real-time and streaming data readiness is a planned extension, aiming to reflect deployment conditions in live malware detection settings.
  • Investigation into the resilience of feature selection methods against adversarial attacks forms a prospective research axis.

MH-FSF thus establishes both a methodological foundation and a collaborative infrastructure for evaluating and advancing feature selection techniques in Android malware detection.

7. Significance in Feature Selection Research

MH-FSF represents a concerted effort toward achieving methodological consistency, transparency, and depth in the evaluation of feature selection approaches. By encompassing a full suite of classical and context-driven techniques, applying rigorous, reproducible evaluation pipelines, and providing automated, extensible workflows, MH-FSF addresses significant gaps in prior research practices. Its comprehensive design enables both incremental improvements and substantive comparisons across methods, datasets, and application domains within the broader field of feature selection and security analytics.
