AdversariaLib: Adversarial ML Library
- AdversariaLib is an open-source Python library for assessing ML algorithm security by simulating adversarial attacks using gradient methods.
- It features a modular design that integrates core attack algorithms, classifier interfaces via scikit-learn and FANN, and Matlab support for comprehensive experiments.
- Optimized with C/C++ and multi-processing, the library enables efficient large-scale evaluations and straightforward extensions to novel attack strategies.
AdversariaLib is an open-source Python library developed for the security evaluation of machine learning algorithms when subjected to adversarial attacks. It enables the simulation of targeted adversarial scenarios, supports a broad spectrum of machine learning models, and is designed for extensibility, performance, and integration with standard ML workflows. AdversariaLib serves as a foundational tool for research and experimentation within adversarial machine learning.
1. Design Principles and Architecture
AdversariaLib is structured around modularity and extensibility, supporting rapid experimentation and integration of novel attacks and defenses. Its codebase is subdivided into distinct modules:
- advlib: Core attack algorithms and evaluation tools for classifiers under adversarial manipulation.
- prlib: Management of datasets, measurement of sample distances, training, and evaluation of classifiers, including wrappers for both scikit-learn and FANN backends.
- util: Persistent storage of data, configurations, trained models, and experiment results, supplemented by logging and format control.
The library wraps high-performance implementations of learning algorithms (particularly scikit-learn and the Fast Artificial Neural Network library, FANN), with C/C++ optimizations ensuring efficient evaluation and response to attacks. AdversariaLib is compatible across major operating systems and enables multiprocessing to parallelize experiments.
Experiments are orchestrated via a command-line interface and API scripts, supporting configuration of datasets, classifiers, and attack parameters through setup files. Matlab integration is facilitated with a dedicated wrapper, enabling experiment management and result extraction from Matlab environments.
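To make the setup-file idea concrete, here is a minimal sketch of what such a configuration might look like as a plain Python dictionary. The key names and schema below are illustrative assumptions, not AdversariaLib's actual setup format.

```python
# Hypothetical sketch of an AdversariaLib-style setup file: a plain Python
# dictionary naming the dataset, classifiers, and attack parameters.
# All key names here are illustrative, not the library's actual schema.
setup = {
    "dataset": {"path": "data/mnist_3v7.csv", "train_split": 0.5},
    "classifiers": [
        {"type": "svm", "kernel": "linear", "C": 1.0},
        {"type": "mlp", "hidden_units": 32},  # e.g., via a FANN wrapper
    ],
    "attack": {"name": "gradient_evasion", "step_size": 0.1, "max_iter": 500},
    "output_dir": "results/",
}

def validate_setup(cfg):
    """Check that the required top-level sections are present."""
    required = {"dataset", "classifiers", "attack"}
    missing = required - cfg.keys()
    if missing:
        raise ValueError(f"setup file missing sections: {sorted(missing)}")
    return True
```

A runner script would load such a file, validate it, and then iterate over the classifier/attack combinations it declares.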
2. Supported Machine Learning Models and Frameworks
AdversariaLib is intentionally agnostic to underlying learning algorithms:
- scikit-learn: Provides access to a wide suite of classifiers (e.g., SVMs, decision trees), with all wrapped models benefiting from compiled C/C++ backends for computational efficiency.
- FANN (Fast Artificial Neural Network Library): Neural networks are supported through custom Python/C wrappers, covering cases where scikit-learn's functionality is insufficient.

- Extensibility: Researchers can introduce novel algorithms with minimal changes, typically by conforming to the established classifier API and adding wrappers in prlib.
This design allows comparative evaluation of attack impact across disparate model classes with unified experimental control.
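The wrapper pattern can be illustrated with a toy classifier exposing the familiar scikit-learn-style fit/predict interface. The class and method names below are assumptions for illustration, not AdversariaLib's actual API.

```python
# Illustrative sketch of wrapping a new classifier behind a scikit-learn-style
# fit/predict interface, the pattern a prlib-style wrapper would follow.
# Names are illustrative, not AdversariaLib's actual API.
class NearestCentroid:
    """Toy classifier: assigns each sample to the closest class centroid."""

    def fit(self, X, y):
        self.centroids_ = {}
        for label in set(y):
            rows = [x for x, yi in zip(X, y) if yi == label]
            n = len(rows)
            # centroid = per-feature mean of this class's samples
            self.centroids_[label] = [sum(col) / n for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        # label of the nearest centroid, per sample
        return [min(self.centroids_, key=lambda c: dist2(x, self.centroids_[c]))
                for x in X]
```

Because any model exposing this interface can be swapped in, the same attack and evaluation code runs unchanged across model classes.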
3. Attack Methods and Adversarial Evaluation
The current release of AdversariaLib implements gradient-based evasion attacks. Evasion (test-time) attacks iteratively perturb an input sample to cross the decision boundary of a trained model, causing misclassification. At each iteration the sample is updated by gradient descent on the classifier's discriminant function:

x^(i+1) = x^(i) − t ∇g(x^(i))

where g(x) is the discriminant function (e.g., g(x) = w·x + b for linear SVMs, whose gradient is simply w), and t is the step size. The attack proceeds until an adversarial stopping criterion is met, typically g(x) < 0 for SVMs, denoting successful evasion.
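The gradient-descent evasion step for the linear case can be sketched in a few lines of pure Python. The function below is a minimal illustration against a fixed linear discriminant, with illustrative parameter names; it is not the library's implementation.

```python
# Minimal sketch of gradient-descent evasion against a fixed linear
# discriminant g(x) = w.x + b (the linear-SVM case, where the gradient of g
# is simply w). Parameter names are illustrative.
def evade_linear(x, w, b, step=0.1, max_iter=1000):
    """Perturb x along -grad g(x) until g(x) < 0 (crosses the boundary)."""
    g = lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b
    for _ in range(max_iter):
        if g(x) < 0:  # stopping criterion: sample is now misclassified
            return x
        # gradient step: x <- x - t * w, since grad g = w for a linear model
        x = [xi - step * wi for xi, wi in zip(x, w)]
    return x

# Example: a point with g(x) > 0 is pushed across the decision boundary.
x_adv = evade_linear([1.0, 1.0], w=[1.0, 1.0], b=0.0, step=0.1)
```

For non-linear models the same loop applies with the appropriate gradient of g; in practice a constraint on perturbation size is usually added so the adversarial sample stays close to the original.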
The architecture is designed to accommodate additional attack methodologies with minimal friction—such as poisoning attacks that corrupt training data, as well as surrogate model strategies compatible with limited black-box information—by implementing new attack modules under advlib.
4. Experimental Workflow and Use Cases
The canonical workflow consists of the following steps, automated via the runexp script and parameterized through setup files:
- Dataset Preparation: Training and test data are randomly split or provided.
- Model Training: One or more classifiers are trained per training set.
- Attack Simulation: Attacks are systematically launched on each trained model.
- Result Logging: Performance metrics, perturbed samples, and logs are stored for post hoc analysis.
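The result-logging step above might be sketched as a small JSON-based recorder. The file layout and field names are assumptions for illustration, not AdversariaLib's actual storage format.

```python
# Illustrative sketch of the result-logging step: persist metrics and the
# perturbed samples to disk for post hoc analysis. File layout and field
# names are assumptions, not AdversariaLib's actual storage format.
import json
import os
import tempfile

def log_results(out_dir, run_id, metrics, adversarial_samples):
    """Write one experiment run's metrics and samples as a JSON record."""
    record = {"run": run_id, "metrics": metrics, "samples": adversarial_samples}
    path = os.path.join(out_dir, f"run_{run_id}.json")
    with open(path, "w") as fh:
        json.dump(record, fh)
    return path

# usage: log one run's evasion rate and its adversarial samples
out_dir = tempfile.mkdtemp()
record_path = log_results(out_dir, 1, {"evasion_rate": 0.93}, [[0.1, 0.2]])
```

Keeping one self-describing record per run makes post hoc aggregation (e.g., averaging over splits) a simple matter of reading the files back.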
A typical use case is the evaluation of linear SVM vulnerability on the MNIST dataset (e.g., discriminating digits "3" vs "7"). Here, the evasion attack iteratively adjusts pixel values to drive the sample across the decision boundary, often producing adversarial samples that remain visually similar to the source digit.
Surrogate models can be trained in adversarial settings with restricted access to the victim model, mimicking practical black-box attack scenarios.
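The surrogate idea can be illustrated with toy one-dimensional models: the attacker queries a black-box "victim" for labels, fits a local substitute on those labels, and then attacks the substitute. Both models below are hypothetical stand-ins, not part of AdversariaLib.

```python
# Sketch of the surrogate-model idea: query a black-box "victim" for labels,
# then fit a substitute on those labels. Both models here are toy 1-D
# threshold classifiers used purely for illustration.
def victim(x):
    """Black-box classifier the attacker can only query for labels."""
    return 1 if 2.0 * x - 1.0 > 0 else 0  # true boundary at x = 0.5

def fit_threshold_surrogate(queries):
    """Estimate the decision threshold of a 1-D surrogate from queries."""
    labeled = [(x, victim(x)) for x in queries]
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    # place the surrogate boundary midway between the closest opposing samples
    return (min(pos) + max(neg)) / 2

threshold = fit_threshold_surrogate([i / 10 for i in range(11)])
```

Adversarial samples crafted against the surrogate often transfer to the victim because the two decision boundaries are close, which is what makes the black-box scenario practical.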
Batch experiments are facilitated by multi-processing, enabling extensive parametric sweeps and statistical validation (e.g., averaging attack success over random data splits).
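Such a parametric sweep can be sketched with the standard library's process pool, running independent trials in parallel and averaging the outcomes. The `trial` function below is a stand-in for one full experiment (split, train, attack), not library code.

```python
# Hedged sketch of a batch experiment: run independent attack trials in
# parallel with multiprocessing and average the success rate over random
# splits. trial() is a stand-in for one full experiment, not library code.
import random
from multiprocessing import Pool

def trial(seed):
    """One experiment on one random split; returns 1 on attack success."""
    rng = random.Random(seed)
    # placeholder for: split data, train classifier, run evasion attack;
    # here the outcome is simulated with a fixed 80% success probability
    return 1 if rng.random() < 0.8 else 0

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(trial, range(20))
    success_rate = sum(results) / len(results)
```

Because each trial is independent, the speedup is close to linear in the number of worker processes, which is what makes large sweeps over seeds and parameters tractable.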
5. Performance Optimization and Scalability
Performance is underpinned by:
- C/C++ Implementations: Via scikit-learn and FANN.
- Multi-processing: Permits parallel execution of independent experiments or multiple attack variations.
- Data Storage and Logging: Modular, ensuring scalability as experiment complexity or dataset size increases.
These optimizations permit the evaluation of both attack efficacy and model robustness over large-scale and computationally intensive experimental designs.
6. Extensibility and Integration
AdversariaLib is intended to be straightforward to extend:
- Attacks: New attack strategies are implemented by conforming to a general interface and populating the corresponding directory in advlib.
- Classifiers: New models, including robust or custom ML algorithms, can be integrated with minimal code, as evidenced by the FANN neural network wrapper.
- Interface Compatibility: Matlab wrappers enable broader usability for research teams leveraging Matlab-based pipelines, with export capabilities to standard formats (PDF).
The project documentation provides in-depth usage examples and extension guidelines to accelerate community adoption and innovation.
7. Licensing, Availability, and Documentation
AdversariaLib is distributed under the GNU General Public License v3 (GPLv3), ensuring freedom to modify and redistribute within open-source guidelines. Source code, comprehensive documentation, and additional resources are available:
- Download: http://sourceforge.net/projects/adversarialib
- Documentation: http://comsec.diee.unica.it/adversarialib
Summary Table
| Aspect | Details |
|---|---|
| Attacks Supported | Gradient-based evasion (test-time); extensible for other classes |
| Algorithms Supported | scikit-learn (SVM, etc.), FANN (neural nets), extensible to new classifiers |
| Architecture | advlib (attacks), prlib (ML/classifiers), util (data/logging) |
| Performance | C/C++-optimized backend; multi-processing |
| Integration | scikit-learn, FANN, Matlab (API wrapper) |
| License | GPLv3 |
AdversariaLib constitutes a robust, flexible, and performance-oriented framework for adversarial machine learning research, facilitating reproducible security evaluations and supporting methodological innovation in the assessment of ML algorithm robustness under adversarial conditions (Corona et al., 2016).