Auditory Modelling Toolbox (AMT)

Updated 10 April 2026
  • Auditory Modelling Toolbox (AMT) is an open-source MATLAB/Octave framework that supports physiologically detailed and perceptually motivated auditory models.
  • It features a modular API that enables standardized design, execution, and comparative analysis of models like Gammatone, Transmission-Line, and CARFAC.
  • The toolbox facilitates batch processing, parameter sweeps, and differentiable model integration, making it valuable for auditory neuroscience, engineering, and psychoacoustics.

The Auditory Modelling Toolbox (AMT) is an open-source MATLAB and Octave framework for the standardized design, execution, and comparative analysis of a diverse collection of computational auditory models. It provides a modular infrastructure supporting both physiologically detailed and perceptually motivated auditory models, enabling multi-model comparisons, parameter sweeps, batch processing, and extensible integration of new algorithms within a common API. AMT serves as a foundational toolkit in auditory neuroscience, engineering, and psychoacoustics for simulating and testing monaural auditory signal processing from acoustic input to various stages of neural and subcortical representation (Lyon et al., 2024, Vecchi et al., 2021).

1. Architecture and Model Families

AMT implements a unifying API and data structure convention for "design" and "run" workflows, streamlining model usage and comparison (Lyon et al., 2024). Upon startup, AMT scans for available model pairs in its cache, registering models and supporting user additions as submodules with designated design and run functions and help files. This organizational principle allows consistent scripting across algorithmically and structurally diverse models.

The main model families in AMT, as of the latest releases, include:

  • Gammatone Filterbank (GT): Linear bandpass filterbanks, ERB-spaced, serving as widely used auditory-nerve front ends.
  • Transmission-Line (TL) Models: Multi-stage mechanical models of the cochlear partition, accurately capturing traveling-wave delay and cochlear frequency-dependent impedance.
  • CARFAC (Cascade of Asymmetric Resonators with Fast-Acting Compression): Physically motivated cascades of two-pole/two-zero IIR stages, each featuring outer hair cell (OHC) compression, automatic gain control (AGC), and producing basilar-membrane motion, local undamping, and neural activity pattern (NAP) outputs.

A comprehensive comparative study lists eight key AMT models spanning biophysical, phenomenological, and effective (perceptual) approaches: dau1997, zilany2014, verhulst2015, verhulst2018, bruce2018, king2019, relanoiborra2019, and osses2021 (Vecchi et al., 2021).

2. Processing Pipeline and Mathematical Foundations

All models in AMT are constructed around a canonical modular pipeline: outer/middle ear → cochlear filter bank → inner hair cell (IHC) transduction → auditory nerve synapse (or equivalent) → central auditory (e.g., modulation filter bank) processing (Vecchi et al., 2021). The granularity and biophysical realism of processing stages vary by model family.
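The canonical pipeline above can be sketched as a chain of stage functions that map a signal array to the next representation. This is a toy illustration only; the stage names and internals are hypothetical placeholders, not actual AMT function names:

```python
import numpy as np

def outer_middle_ear(x, fs):
    # Placeholder: first-order highpass standing in for ear-canal/middle-ear filtering.
    alpha = np.exp(-2 * np.pi * 1000 / fs)
    y = np.empty_like(x)
    prev_x = prev_y = 0.0
    for n, xn in enumerate(x):
        y[n] = xn - prev_x + alpha * prev_y
        prev_x, prev_y = xn, y[n]
    return y

def cochlear_filterbank(x, n_ch):
    # Placeholder: the filter bank adds the channel dimension;
    # a real model would apply n_ch bandpass filters here.
    return np.tile(x, (n_ch, 1))

def ihc_transduction(bm):
    # Half-wave rectification, a common effective IHC stage.
    return np.maximum(bm, 0.0)

def run_pipeline(x, fs, n_ch=31):
    y = outer_middle_ear(x, fs)
    bm = cochlear_filterbank(y, n_ch)
    return ihc_transduction(bm)   # shape: (n_ch, len(x))
```

The point is the composition: each stage consumes the previous stage's output, so model families differ only in which blocks they substitute at each position.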

Gammatone and Chirp-Gammatone Banks: Linear or nonlinear bandpass filtering, typically implemented with difference equations parameterized by ERB-based bandwidth formulas (Glasberg & Moore 1990, Shera et al. 2002).
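The ERB bandwidth and ERB-rate formulas of Glasberg & Moore (1990) referenced above can be written out directly; the following minimal sketch computes the bandwidth at a given frequency and a set of center frequencies equally spaced on the ERB-number scale (function names are illustrative, not AMT's):

```python
import numpy as np

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore, 1990)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_number(f_hz):
    """ERB-rate (ERB-number) scale value, in ERBs."""
    return 21.4 * np.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_space(f_lo, f_hi, n_ch):
    """n_ch center frequencies equally spaced on the ERB-number scale."""
    e = np.linspace(erb_number(f_lo), erb_number(f_hi), n_ch)
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37
```

For example, erb_bandwidth(1000) is about 132.6 Hz, and erb_space(100, 8000, 31) yields the kind of ≥30-channel ERB spacing recommended later for broadband stimuli.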

Transmission-Line Cochlea: Multi-section models calibrated to human otoacoustic-emission data to emulate fluid and membrane mechanics, providing detailed spatiotemporal cochlear responses.

CARFAC Model Formulas (Lyon et al., 2024):

  • The CAR stage realizes a two-pole/two-zero IIR cascade:

y[n] = b_0 x[n] + b_1 x[n-1] + b_2 x[n-2] - a_1 y[n-1] - a_2 y[n-2]

with frequency-dependent and compression-modulated coefficients:

a_1 = -2r\cos\bigl(2\pi f_c/F_s\bigr), \quad a_2 = r^2, \quad r = \exp(-2\pi BW/F_s)

and local gain G = G_0\,g_{OHC}(env), where g_{OHC}(E) is parameterized by a compression nonlinearity:

g(E) = g_{min} + \frac{g_{max}-g_{min}}{1 + (E/\epsilon)^p}

  • OHC health is controlled per channel via a vector h_i \in [0,1]:

g_{OHC,i} = h_i \cdot g_{healthy}(E_i)
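The CARFAC relations above translate almost line-for-line into code. The sketch below computes the pole coefficients of one CAR stage and the compression-modulated OHC gain; the numeric parameter values (g_min, g_max, ε, p) are illustrative assumptions, not CARFAC defaults:

```python
import numpy as np

def car_pole_coeffs(fc, bw, fs):
    """Denominator coefficients a1, a2 of one two-pole CAR stage."""
    r = np.exp(-2.0 * np.pi * bw / fs)          # pole radius from bandwidth
    a1 = -2.0 * r * np.cos(2.0 * np.pi * fc / fs)
    a2 = r ** 2
    return a1, a2

def compression_gain(E, g_min=0.2, g_max=1.0, eps=1e-3, p=2.0):
    """g(E): interpolates from g_max at low drive to g_min at high drive."""
    return g_min + (g_max - g_min) / (1.0 + (E / eps) ** p)

def ohc_gain(E, h):
    """Per-channel gain scaled by the OHC health value h_i in [0, 1]."""
    return h * compression_gain(E)
```

Note how the health vector simply multiplies the healthy gain, so h_i = 0 silences a channel's active undamping while h_i = 1 leaves it intact.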

Other Families: Effective models use functional blocks (rectification, lowpass, adaptation loops), while phenomenological and biophysical models implement multi-stage diffusive adaptation and synaptic dynamics.

3. Model Integration, Extension, and Data Structures

Integration of new models into AMT requires adherence to function and data-structure conventions: a model-specific design function (e.g., design_carfac.m), a standard run function, parameter and output structs with defined field names, and parseable help documentation. Standardization enables the toolbox dispatcher to enumerate, initialize, and correctly interoperate with models (Lyon et al., 2024).

Output arrays maintain an [n_{ch} \times N] structure (channels × signal length), and parameter structs encapsulate model-specific metadata and tunables such as filter spacing, OHC/IHC properties, and AGC time constants. For cross-language implementations (MATLAB, Python/NumPy, JAX), these data structures are mirrored using equivalent dictionaries or pytrees, ensuring reproducibility and facilitating validation suites (e.g., using pytest).
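In Python, such a parameter struct maps naturally onto a dictionary. The following sketch mirrors the conventions described above with hypothetical field names and a placeholder run function; it is not the actual lyon2024 API:

```python
import numpy as np

def design_model(n_ch=71, fs=22050):
    # Hypothetical mirror of an AMT-style parameter struct as a dict.
    return {
        "fs": fs,
        "n_ch": n_ch,
        "ohc_health": np.ones(n_ch),              # per-channel OHC health in [0, 1]
        "agc_time_constants": [0.002, 0.008, 0.032, 0.128],
    }

def run_model(params, x):
    """Placeholder run function returning an [n_ch x N] output array."""
    n_ch = params["n_ch"]
    # Toy forward pass: per-channel gain applied to the input signal.
    return np.tile(x, (n_ch, 1)) * params["ohc_health"][:, None]
```

Because the struct is plain data, simulating impairment is just an in-place edit, e.g. params["ohc_health"][50:] = 0.3, before calling the run function again.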

Version control within AMT is achieved via module prefixes (e.g., lyon2011, lyon2024 for CARFAC releases).

4. CARFAC v2 Integration and Differentiable Hearing Loss Modeling

The integration of CARFAC v2 introduces several algorithmic enhancements (Lyon et al., 2024):

  • DC Quadratic Distortion Correction: The 20 Hz highpass AC-coupler, moved from IHC to the CAR cascade, resolves DC leakage issues in basilar-membrane outputs.
  • Reduced High-Frequency Neural Synchrony: Reformulation of the IHC block (two-capacitor cascade, 200 μs and 80 μs time constants) yields more physiologically accurate phase-locking cutoff behavior.
  • Impairment Modeling: The ohc_health vector modulates OHC gain per channel, thus enabling simulation of frequency-specific hearing loss.

In differentiable environments (JAX), all model weights, including ohc_health, become trainable via automatic differentiation (e.g., optimizing to match an individual's audiogram). This enables gradient-based fitting to physiological or behavioral targets, with direct applicability to hearing-loss compensation and patient-specific model adaptation.
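The fitting idea can be illustrated with a deliberately simplified toy: instead of differentiating through the full CARFAC model with JAX, the "model" below is a per-channel linear gain, so the gradient of a squared-error loss is analytic. All names and values are illustrative:

```python
import numpy as np

def fit_ohc_health(target_gain, base_gain, lr=0.1, steps=500):
    """Gradient descent on sum((h * base_gain - target_gain)^2) over h."""
    h = np.ones_like(target_gain)                     # start from healthy OHCs
    for _ in range(steps):
        pred = h * base_gain                          # toy forward model
        grad = 2.0 * (pred - target_gain) * base_gain # analytic d(loss)/dh
        h = np.clip(h - lr * grad, 0.0, 1.0)          # keep h_i in [0, 1]
    return h

base = np.full(8, 0.8)                                # healthy per-channel gain
target = np.array([0.8, 0.8, 0.8, 0.6, 0.4, 0.3, 0.2, 0.2])  # sloping loss
h_fit = fit_ohc_health(target, base)
```

In the real setting the forward model is the full cochlear simulation and JAX's automatic differentiation supplies the gradient, but the structure of the loop — predict, compare to the audiogram-derived target, update ohc_health — is the same.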

Representative parameter adjustment and model-invocation examples:

Platform   Parameter Function   Run Function   OHC Health Modification
MATLAB     lyon2024Design       lyon2024Run    params.ohc_health(51:end) = 0.3
NumPy      lyon2024.design      .run           params['ohc_health'][50:] = 0.3
JAX        design_and_init_*    carfac_step    JAX gradient-based updates

5. Comparative Evaluation and Best Practices

A systematic comparison of AMT models demonstrates that all major approaches follow the modular pipeline but exhibit divergent behavior in nonlinearity, phase-locking, adaptation, bandwidth, and computational efficiency (Vecchi et al., 2021).

Evaluation highlights:

  • Phenomenological and biophysical models (e.g., verhulst, zilany, bruce) capture nonlinear cochlear compression and auditory-nerve adaptation, but at higher computational cost (0.3–0.8 s/channel, up to 2.5 s/channel for spiking models).
  • Effective models (e.g., dau1997, king2019) offer rapid simulation (0.02–0.06 s/channel) but fewer physiological details; limited or fixed adaptation/compression.
  • At low and high SPLs, Q-factor behavior and IHC rectification differ across models.
  • All models display decreasing IHC phase-locking (AC/DC) above 1 kHz, though roll-off rates vary.
  • Synchrony capture, subcortical transformation, and output polarity provide distinctive signatures with direct implications for auditory physiology and psychoacoustics.

Best practices emphasize:

  • Selecting model type (biophysical, phenomenological, effective) based on application (physiology, psychoacoustics, coding).
  • Configuring channel density (≥30 ERB channels for broadband stimuli).
  • Input calibration to preserve model validity and reproducibility.
  • Validation against source implementations, especially for machine-learned approximations.
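The input-calibration point above can be made concrete. This sketch scales a signal to a target dB SPL assuming amplitudes are in pascals relative to 20 µPa; individual models use different full-scale conventions, so the reference convention here is an assumption to check against each model's documentation:

```python
import numpy as np

def dbspl(x, p_ref=20e-6):
    """Level of x in dB SPL, assuming x is in pascals re 20 uPa."""
    return 20.0 * np.log10(np.sqrt(np.mean(x ** 2)) / p_ref)

def set_dbspl(x, target_db, p_ref=20e-6):
    """Scale x so its RMS level equals target_db dB SPL."""
    rms = np.sqrt(np.mean(x ** 2))
    target_rms = p_ref * 10.0 ** (target_db / 20.0)
    return x * (target_rms / rms)
```

Calibrating every stimulus through one such helper keeps simulated levels comparable across models whose nonlinearities (compression, adaptation) are level-dependent.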

6. Applications and Significance

AMT underpins comparative modeling studies, development of individualized auditory front ends, and parameter sweeps for psychoacoustic and clinical research. The integration of differentiable structures, notably in CARFAC v2 with JAX, enables joint optimization within machine-learning pipelines and provides a pathway for research on personalizing hearing-loss models.

AMT's harmonization of diverse modeling approaches—ranging from physiologically explicit cochlear mechanics to perceptual front ends—makes it a reference framework for computational auditory modelling across neuroscience, engineering, and psychoacoustics.
