witheFlow System: Real-Time Biosignal Audio Effects
- The witheFlow system is a real-time biosignal-informed audio effects modulation framework that integrates physiological data with digital audio workstations.
- It employs EEG and ECG sensors alongside a CNN-based audio regression model to generate continuous valence-arousal estimates for dynamic effect control.
- Its transparent, customizable mixing logic uses interpretable YAML rules and MIDI communication to enable expressive, adaptive sound manipulation during live performances.
The witheFlow system is a real-time, biosignal-informed audio effects modulation framework designed to enhance live music performance by automatically adjusting audio effects in response to both physiological (EEG and ECG) and audio (valence-arousal) features. Developed as a lightweight, locally executable, and open-source platform, witheFlow integrates seamlessly with standard Digital Audio Workstation (DAW) environments, utilizing standard audio routing and MIDI control to provide dynamic, emotion-sensitive sound processing without requiring substantial alteration to a performer’s existing workflow (Dervakos et al., 2 Oct 2025).
1. System Architecture
witheFlow’s architecture comprises three principal modules: (1) biosignal-based emotional state feature extraction, (2) audio-based emotion regression, and (3) a rule-based mixing logic module interfaced with a DAW. The biosignal module uses EEG and ECG sensors to estimate attention, relaxation, and stress indices. The audio-based module employs a modified convolutional neural network (PANNs CNN10) to estimate the emotional state of the dry audio signal in continuous valence-arousal (VA) space. These features are jointly consumed by a mixing logic engine, which uses a configurable set of rules—expressed in interpretable YAML files—to adjust the relative gain of multiple effects channels within the DAW. Communication with the DAW is accomplished via MIDI messages and the routing of multiple parallel effects chains through a virtual audio device, enabling remote and low-latency control of the sound’s spatial and timbral properties.
2. Biosignal and Audio Emotion Feature Extraction
EEG and ECG Processing
witheFlow’s biosignal component employs commercial-grade EEG and ECG sensors with the following configuration:
- EEG:
- Four electrodes (O1, O2, T3, T4) sample at 250 Hz.
- EEG features are extracted in overlapping windows of 1,000 samples (~4 sec).
- Alpha (8–13 Hz) and Beta (13–30 Hz) band powers are calculated, yielding band-power-ratio indices: attention ∝ P_β / P_α and relaxation ∝ P_α / P_β.
- ECG:
- Samples at 1,000 Hz.
- Stress is determined via the Baevsky Stress Index over 15-sec sliding windows (updated every 0.5 sec):

  SI = AMo / (2 · Mo · MxDMn),

  where AMo is the modal bin percentage of RR intervals, Mo is the most frequent RR interval, and MxDMn is the RR variation range.
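As a concrete sketch of these feature computations, the snippet below computes band powers with a plain FFT periodogram and the standard Baevsky formula; the function names and the choice of ratio for the attention/relaxation indices are illustrative, not taken from the witheFlow codebase.

```python
import numpy as np

def band_power(window, fs, lo, hi):
    """Mean spectral power of `window` in the [lo, hi) Hz band (plain FFT periodogram)."""
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(window)) ** 2
    mask = (freqs >= lo) & (freqs < hi)
    return psd[mask].mean()

def eeg_indices(window, fs=250):
    """Attention/relaxation as Alpha-Beta band-power ratios (one plausible choice)."""
    alpha = band_power(window, fs, 8, 13)
    beta = band_power(window, fs, 13, 30)
    return {"attention": beta / alpha, "relaxation": alpha / beta}

def baevsky_si(rr_ms, bin_ms=50):
    """Baevsky Stress Index SI = AMo / (2 * Mo * MxDMn) over RR intervals in ms."""
    rr = np.asarray(rr_ms, dtype=float)
    edges = np.arange(rr.min(), rr.max() + bin_ms, bin_ms)
    counts, _ = np.histogram(rr, bins=edges)
    amo = 100.0 * counts.max() / len(rr)          # modal bin percentage (%)
    mo = edges[counts.argmax()] + bin_ms / 2.0    # most frequent RR interval (ms)
    mxdmn = rr.max() - rr.min()                   # RR variation range (ms)
    return amo / (2.0 * (mo / 1000.0) * (mxdmn / 1000.0))
```

With 1,000-sample windows at 250 Hz, each call processes the ~4 s of EEG described above.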
Audio-Based Emotion Regression
The system applies a modified PANNs CNN10 deep neural architecture to 5-second windows of audio (downsampled to 30 kHz). Rather than predicting discrete emotion classes, the regression head outputs real-valued coordinates in VA space according to Russell's continuous circumplex model. These estimates complement the biosignal features for the downstream mixing logic.
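The windowed inference loop can be sketched as follows; `toy_regressor` is a hypothetical stand-in for the pretrained PANNs CNN10 head, included only to make the interface concrete.

```python
import numpy as np

SR = 30_000          # sample rate after downsampling (Hz)
WIN = 5 * SR         # 5-second analysis window

def va_windows(audio, regressor, hop=WIN):
    """Slide a 5 s window over `audio` and collect (valence, arousal) estimates.
    `regressor` stands in for the pretrained VA regression model."""
    out = []
    for start in range(0, len(audio) - WIN + 1, hop):
        v, a = regressor(audio[start:start + WIN])
        # Russell's circumplex model: clamp to the continuous VA square.
        out.append((float(np.clip(v, -1, 1)), float(np.clip(a, -1, 1))))
    return out

def toy_regressor(window):
    """Hypothetical stand-in: map RMS energy to both VA coordinates."""
    rms = np.sqrt(np.mean(window ** 2))
    return 2 * rms - 1, 2 * rms - 1
```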
3. Dynamic Audio Effects Modulation and Mixing Logic
witheFlow modulates audio effects by controlling the gain of several parallel DAW effects channels in real time. The dry input audio is routed along with the processed signals through a virtual device, accessible to the Python-based logic module managing MIDI control.
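At the wire level, this kind of gain control reduces to MIDI Control Change messages. The helper below is an illustrative sketch (the controller number and gain mapping are assumptions, not witheFlow's actual assignments):

```python
def midi_cc(channel, controller, value):
    """Build a 3-byte MIDI Control Change message: status 0xB0 | channel,
    then 7-bit controller number and 7-bit value."""
    assert 0 <= channel < 16 and 0 <= controller < 128 and 0 <= value < 128
    return bytes([0xB0 | channel, controller, value])

def gain_to_cc(gain):
    """Map a linear gain in [0, 1] to a 7-bit MIDI value."""
    return round(max(0.0, min(1.0, gain)) * 127)
```

In practice, a library such as mido or python-rtmidi would transmit these bytes to the DAW's control surface.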
The core mixing logic is expressed as a set of piecewise-defined, user-editable rules. Each rule determines gain adjustments for effects channels, conditioned on the state vectors comprising stress, attention, and VA distances between dry and processed audio:
- Example rule: If stress is high and attention low, boost channels far from the dry signal in VA space.
- Rules are formalized as piecewise functions: for a feature domain D partitioned into regions D_1, …, D_k with gain functions g_1, …, g_k, the applied gain is g(x) = g_i(x) for x ∈ D_i.
Rules are stored in YAML files, supporting both transparency (traceable and visualizable) and rapid customization by performers.
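A minimal sketch of how such rules might be evaluated, assuming a rule schema like the one below (the field names, thresholds, and first-match semantics are illustrative, not the paper's exact YAML format):

```python
import operator

# Rules as they might look after yaml.safe_load(); field names are illustrative.
RULES = [
    {"if": {"stress": ">0.7", "attention": "<0.3"},
     "then": {"boost": "far_from_dry", "amount_db": 4.0}},
    {"if": {"stress": "<=0.7"},
     "then": {"boost": "near_dry", "amount_db": 2.0}},
]

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}

def matches(cond, state):
    """Check every 'feature: "<op><threshold>"' condition against the state vector."""
    for feat, expr in cond.items():
        op = expr[:2] if expr[:2] in OPS else expr[:1]
        if not OPS[op](state[feat], float(expr[len(op):])):
            return False
    return True

def channel_gains(state, channels, rules):
    """First matching rule wins; boost the channel farthest from (or nearest to)
    the dry signal in VA space, per the rule's `boost` field."""
    for rule in rules:
        if matches(rule["if"], state):
            far = rule["then"]["boost"] == "far_from_dry"
            db = rule["then"]["amount_db"]
            ranked = sorted(channels, key=lambda c: c["va_dist"], reverse=far)
            return {c["name"]: (db if c is ranked[0] else 0.0) for c in channels}
    return {c["name"]: 0.0 for c in channels}
```

This realizes the piecewise structure above: each rule's conditions carve out a region of the feature domain, and its `then` clause is the gain function applied there.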
4. Technical Requirements and Implementation
witheFlow maintains strict low-latency requirements suitable for live acoustic and electronic performance:
| Hardware | Software | DAW Integration |
|---|---|---|
| Laptop/portable computer | Python environment | Audio routing via DAW |
| 4-channel EEG sensor | python-sounddevice, MIDI | Multiple effect channels |
| 1-channel ECG sensor | Pre-trained PANNs CNN10 | MIDI control surface |
| Audio interface | YAML rule configuration | Virtual audio device |
Additional modules manage real-time artifact detection and signal quality assurance. Processing is entirely local, preserving privacy and ensuring rapid response.
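A minimal sketch of the kind of signal-quality gate such a module might apply (the thresholds and criteria here are assumptions, not witheFlow's actual checks):

```python
import numpy as np

def signal_quality(window, flat_tol=1e-6, clip_frac=0.02, full_scale=1.0):
    """Crude per-window quality gate: flag flatlined windows (sensor dropped)
    and clipped windows (motion or contact artifacts)."""
    w = np.asarray(window, dtype=float)
    flat = np.ptp(w) < flat_tol                        # peak-to-peak ~ 0 -> flatline
    clipped = np.mean(np.abs(w) >= full_scale) > clip_frac
    return {"flatline": bool(flat), "clipped": bool(clipped),
            "ok": not (flat or clipped)}
```

Windows failing the gate would be skipped, so the mixing logic holds its last state rather than reacting to artifacts.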
5. Operational Status and Extension Pathways
witheFlow has been validated as a proof of concept and demonstrated in live, improvisational settings. Its architecture supports actionable performer feedback and controllability (e.g., via a MIDI footpedal override). However, the system currently relies on rule-based logic and on models pre-trained on existing datasets (e.g., DEAM), with a recognized need for further tuning, particularly for solo-instrument scenarios.
Current limitations include dependence on the robustness and reliability of biosignal quality; ongoing research targets improved artifact detection and adaptive sensor management. The rule-based mixing logic, while transparent and customizable, is identified as a natural site for machine learning extensions, potentially decision trees or reinforcement learning that optimize modulation from empirical performance data. Broader generalizability and expressivity will require expanded annotated biosignal and solo-performance datasets.
6. Significance for Expressivity and Human–Machine Collaboration
witheFlow’s primary contribution lies in its facilitation of closed-loop, embodied expressivity: connecting quantifiable physiological and audio features directly to the modulation of audio effects enables performers to externalize and shape their emotional and attentional states through sound processing. This tight coupling gives rise to augmented expressive possibilities, allowing the performer’s internal state to manifest sonically in nuanced and contextually sensitive ways.
From a systems perspective, by maintaining interpretable, rule-based logic and clear mappings between inputs and effect changes, witheFlow supports trust and performer understanding. The system is architected not to displace human agency, but to supplement and react to it in a responsive and adaptive fashion. This approach offers a template for future emotion-aware, collaborative music technologies and raises new questions at the intersection of physiological computing, artificial intelligence, and creative practice.
7. Mathematical Formulation and System Diagrams
Key mathematical formalizations include:
- EEG-based indices: band-power ratios of Alpha (8–13 Hz) and Beta (13–30 Hz) power, e.g. attention ∝ P_β / P_α and relaxation ∝ P_α / P_β.
- ECG-based stress index: SI = AMo / (2 · Mo · MxDMn).
- Mixing logic as domain-partitioned gain assignment: g(x) = g_i(x) for x ∈ D_i, over a partition D_1, …, D_k of the feature domain D.
Diagrams presented in the source illustrate (1) the relationship between audio VA coordinates and effects, and (2) the complete signal flow from biosignals and audio input to dynamic DAW control output.
witheFlow exemplifies a transparent, extensible, and technically robust approach to fusing biosignal analytics with audio feature extraction for real-time, emotion-driven effects modulation in live music (Dervakos et al., 2 Oct 2025). Its proof-of-concept status, already demonstrated in operational contexts, invites further advances in adaptive logic techniques and dataset diversity to maximize its potential for affective human–machine musical collaboration.