Event Probability Mask (EPM)
- Event Probability Mask (EPM) is a probabilistic framework for labeling pixel-wise, spatio-temporal data in neuromorphic camera streams by leveraging APS intensity frames and IMU measurements.
- It computes soft event labels by combining APS intensity gradients with IMU-derived image motion, thereby providing an objective benchmark for evaluating event denoising.
- EPM enables the training of event denoising networks and the calibration of sensor parameters; on the DVSNOISE20 benchmark, a network trained on EPM labels showed improved noise robustness.
The Event Probability Mask (EPM) is a probabilistic framework for labeling pixel-wise spatio-temporal data in neuromorphic, or event-based, camera streams. It computes the likelihood that a real event—registered by a dynamic vision sensor (DVS)—should have occurred at each pixel within a specified temporal window, based on the underlying intensity dynamics captured by synchronous active pixel sensor (APS) frames, camera intrinsics, and inertial measurement unit (IMU) data. EPM enables objective benchmarking of event denoising, provides soft-label ground truth for training event denoising networks, and permits principled calibration of neuromorphic sensor parameters (Baldwin et al., 2020).
1. Formal Definition of Event Probability Mask
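Consider an APS intensity video $I(X, t)$ representing pixel radiance at spatial coordinate $X$ and time $t$. The DVS log-intensity is
$$J(X, t) = \log\big(a\,I(X, t) + b\big),$$
where $a$ and $b$ are gain and offset parameters, respectively. Ideal (noise-free) DVS pixels trigger events when the log-intensity undergoes a change of at least $\varepsilon$:
$$\big|J(X, t_i) - J(X, t_{i-1})\big| \ge \varepsilon,$$
where $t_{i-1}$ and $t_i$ are consecutive event times at pixel $X$.
For any real (possibly noisy) event at location $X$ and time $t_i$, the goal is to assign a soft label $M(X) \in [0, 1]$ indicating the probability that an event "should" have occurred in the APS exposure window $[t, t + \tau)$. The binary event indicator function is defined by:
$$E(X) = \begin{cases} 1 & \text{if } \exists\, i : t_i(X) \in [t, t + \tau), \\ 0 & \text{otherwise.} \end{cases}$$
Under pure rotational motion and piecewise-constant illumination, the EPM is given by:
$$M(X) = \Pr[E(X) = 1] = \min\!\left(1, \frac{\tau\,|J_t(X)|}{\varepsilon}\right),$$
where $J_t(X) \approx -\nabla J(X) \cdot V(X)$ is the temporal derivative of the log-intensity. Here,
- $V(X)$ is the per-pixel image velocity, computed from IMU angular velocity $\omega$ and camera intrinsics $K$ via $V(X) = \Pi\big(K [\omega]_\times K^{-1} \tilde{X}\big)$, with $[\omega]_\times$ the skew-symmetric matrix of $\omega$, $\tilde{X} = (X^\top, 1)^\top$ the homogeneous pixel coordinate, and $\Pi$ the mapping of the homogeneous velocity to the image plane.
- $\nabla J(X) = \dfrac{\nabla I(X)}{I(X) + O}$, where $I(X)$ is the APS measurement and $O = b/a$ the combined offset.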
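One way to see where a closed form of this kind comes from is the following sketch. It rests on an added assumption and is not a restatement of the original proof. Write $\varepsilon$ for the contrast threshold, $\tau$ for the length of the exposure window, $J_t(X)$ for the temporal derivative of the log-intensity at pixel $X$, and $E(X)$ for the event indicator. Suppose the log-intensity change $\delta$ accumulated since the pixel's last event is uniformly distributed on $[0, \varepsilon)$ at the start of the window, and that the log-intensity drifts linearly during the window. An event fires when the accumulated change reaches $\varepsilon$, so

$$\Pr[E(X) = 1] = \Pr\big[\delta + \tau\,|J_t(X)| \ge \varepsilon\big] = \min\!\left(1, \frac{\tau\,|J_t(X)|}{\varepsilon}\right).$$

As a numerical illustration with hypothetical values $\tau = 10\ \text{ms}$, $|J_t(X)| = 5\ \text{s}^{-1}$, and $\varepsilon = 0.2$:

$$\frac{\tau\,|J_t(X)|}{\varepsilon} = \frac{0.01 \times 5}{0.2} = 0.25.$$

A fivefold faster log-intensity change would give $1.25$, which saturates to a probability of $1$.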
2. EPM Computation Algorithm
The calculation of the EPM proceeds as follows:
- Input data:
  - Continuous DVS event stream $\{(X_i, t_i, p_i)\}$ (pixel, timestamp, polarity)
  - APS frame sequence $I_k(X)$ for each frame index $k$ and exposure time $\tau$
  - Synchronized IMU gyroscope $\omega(t)$
  - Camera intrinsics $K$
  - Parameters: contrast threshold $\varepsilon$, APS-to-DVS offset $O$
- Procedure:
  - Preprocess APS: compute spatial gradients $\nabla I_k(X)$ for all frames $k$.
  - For each APS frame $I_k$ at time $t_k$:
    - Interpolate $\omega(t)$ over $[t_k, t_k + \tau)$.
    - Compute $V(X)$ using camera intrinsics and IMU.
    - Compute $\nabla J(X) = \nabla I_k(X) / (I_k(X) + O)$.
    - Compute $J_t(X) \approx -\nabla J(X) \cdot V(X)$.
    - Compute $M_k(X)$ using the EPM closed-form expression.
  - Store $M_k(X)$ for all pixels and frames as the event probability masks.
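The per-frame steps can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The helper names (`skew`, `image_velocity`, `epm_frame`) are hypothetical. The gyroscope reading is taken as already interpolated to a single angular velocity `omega` for the frame, `K` is assumed to have last row `[0, 0, 1]`, and the mask is taken to be `min(1, tau * |J_t| / eps)` with `J_t` approximated as `-grad(J) . V`.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix [w]_x, so that skew(w) @ r equals np.cross(w, r)."""
    wx, wy, wz = w
    return np.array([[0.0, -wz,  wy],
                     [ wz, 0.0, -wx],
                     [-wy,  wx, 0.0]])

def image_velocity(K, omega, shape):
    """Per-pixel image velocity V(X) in pixels/s for a purely rotating camera.

    Assumption: each viewing ray r = K^-1 [x, y, 1]^T moves as r' = omega x r,
    and V is the time derivative of the projection of K r. The sign convention
    depends on the IMU frame; only |J_t| is used later, so it does not matter here.
    """
    h, w = shape
    xs, ys = np.meshgrid(np.arange(w, dtype=float), np.arange(h, dtype=float))
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (h, w, 3) homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                       # K^-1 x for every pixel
    hom_vel = (rays @ skew(omega).T) @ K.T                # K [omega]_x K^-1 x
    # Derivative of (u/s, v/s) evaluated at s = 1: (u', v') - (u, v) * s'
    return hom_vel[..., :2] - pix[..., :2] * hom_vel[..., 2:3]

def epm_frame(I, K, omega, tau, eps, O):
    """Event probability mask for one APS frame I (2-D array)."""
    I = np.asarray(I, dtype=float)
    gy, gx = np.gradient(I)                               # derivatives along rows (y), columns (x)
    grad_J = np.stack([gx, gy], axis=-1) / (I + O)[..., None]
    V = image_velocity(K, omega, I.shape)
    J_t = -np.sum(grad_J * V, axis=-1)                    # brightness-constancy approximation
    return np.minimum(1.0, tau * np.abs(J_t) / eps)

# Toy usage: a horizontal intensity ramp viewed by a camera rotating about its y axis.
K = np.array([[200.0, 0.0, 4.0], [0.0, 200.0, 3.0], [0.0, 0.0, 1.0]])
I = 100.0 + np.tile(np.arange(9.0), (7, 1))
M = epm_frame(I, K, omega=(0.0, 0.5, 0.0), tau=0.01, eps=0.2, O=0.0)
```

The intrinsics, frame size, and parameter values in the usage lines are made up for illustration. With real data, `I` would be an APS frame and `omega` the gyroscope reading interpolated over that frame's exposure.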
3. Benchmarking and Denoising Applications
EPM enables several key applications in neuromorphic vision processing:
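- Objective Denoising Benchmark (RPMD): For any denoised event indicator $\hat{E}(X) \in \{0, 1\}$, its log-probability under the EPM is:
  $$\log \Pr[\hat{E}] = \sum_X \hat{E}(X) \log M(X) + \big(1 - \hat{E}(X)\big) \log\big(1 - M(X)\big).$$
  The oracle classifier, which maximizes this quantity, is $E_{\text{opt}}(X) = 1$ if $M(X) > 0.5$, else $0$. The Relative Plausibility Measure of Denoising (RPMD) quantifies denoising performance as:
  $$\text{RPMD} = \frac{1}{N} \log \frac{\Pr[E_{\text{opt}}]}{\Pr[\hat{E}]},$$
  where $N$ is the number of pixels evaluated. Lower RPMD indicates better denoising, with 0 being optimal.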
- Training Event Denoising CNN (EDnCNN): EPM supplies soft ground truth masks for supervised learning of pixel-wise binary classifiers $\Phi(X)$. Training can minimize one of three loss functions that are equivalent for large datasets: maximum likelihood, a […] loss, or classification error relative to the oracle classifier $E_{\text{opt}}$.
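- Internal Parameter Calibration: The optimal contrast threshold $\varepsilon$ and offset $O$ are estimated by maximizing the likelihood of observed raw DVS events under the EPM, i.e.,
  $$(\hat{\varepsilon}, \hat{O}) = \arg\max_{\varepsilon,\, O} \log \Pr[E_{\text{raw}}; \varepsilon, O],$$
  where $E_{\text{raw}}$ is the indicator of the raw (undenoised) events. A two-stage one-dimensional search (over one parameter, then the other) is sufficient for practical purposes.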
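The benchmark score and the calibration search can be sketched together, since both rely on the same log-probability of an event indicator under the mask. This is a hedged illustration with hypothetical helper names. Pixels are treated as independent Bernoulli variables, the RPMD is normalized per pixel, and the mask is clipped away from 0 and 1 to keep the logarithms finite. The mask is taken as `min(1, A / ((I + O) * eps))` with `A = tau * |grad(I) . V|` precomputed. The search order (offset first, then threshold) and the use of a grid are choices made for this sketch, not details taken from the source.

```python
import numpy as np

def log_prob(E, M, clip=1e-6):
    """Log-probability of indicator E under mask M (independent Bernoulli pixels)."""
    M = np.clip(M, clip, 1.0 - clip)  # assumption: avoid log(0) at saturated pixels
    return float(np.sum(E * np.log(M) + (1.0 - E) * np.log(1.0 - M)))

def oracle(M):
    """Most plausible indicator under M: an event wherever M exceeds 0.5."""
    return (M > 0.5).astype(float)

def rpmd(E, M):
    """Per-pixel log-likelihood gap between the oracle and E; 0 is optimal."""
    return (log_prob(oracle(M), M) - log_prob(E, M)) / M.size

def mask(A, I, eps, O):
    """EPM from A = tau * |grad(I) . V|, APS intensity I, threshold eps, offset O."""
    return np.minimum(1.0, A / ((I + O) * eps))

def calibrate(E_raw, A, I, eps_grid, O_grid, eps0):
    """Two-stage 1-D grid search: offset at a fixed initial eps0, then threshold."""
    score = lambda eps, O: log_prob(E_raw, mask(A, I, eps, O))
    O_hat = max(O_grid, key=lambda O: score(eps0, O))
    eps_hat = max(eps_grid, key=lambda eps: score(eps, O_hat))
    return eps_hat, O_hat

# Toy usage: an all-events indicator scored against a 2x2 mask.
M = np.array([[0.9, 0.2], [0.6, 0.1]])
score_all_on = rpmd(np.ones_like(M), M)   # positive: worse than the oracle
score_oracle = rpmd(oracle(M), M)         # zero by construction
```

In practice `E_raw` would be the binary indicator of raw DVS events per pixel and frame, and the scores would be accumulated over all frames before taking the maximum.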
4. Experimental Evaluation and DVSNOISE20
The DVSNOISE20 benchmark dataset underpins empirical analyses of EPM and EDnCNN:
- Platform: DAVIS346 (346×260 DVS pixels, 40 fps APS, 6-axis IMU)
- Scenes: 16 static indoor/outdoor scenes captured under 2-axis gimbal rotation, each recorded for […] s
- APS settings: 41–56 fps, […]
- Labels: EPM-generated “soft” masks across all pixels/frames serve as denoising ground truth
- Volume: 48 sequences, approximately […] events in total
Experimental Results
- Real-world Denoising: EDnCNN trained on EPM achieved an average RPMD gain of 148 points over raw data in leave-one-scene-out tests, outperforming eight other denoising filters in 12 out of 16 scenes ([…], Wilcoxon signed-rank test).
- Simulated Noise Robustness: In ESIM-simulated data with injected background-activity noise, raw-data RPMD scaled linearly with noise rate, whereas EDnCNN maintained near-optimal performance; other methods degraded more quickly.
- Generalization: EDnCNN demonstrated qualitative denoising efficacy on DVSFLOW16 and IROS18 datasets, preserving edges amid translational motion and multiple moving objects.
5. Limitations and Future Directions
Several constraints underlie the current EPM formulation:
- Assumes pure rotational motion and static scenes during APS exposures to avoid occlusions.
- Neglects rapid illumination flicker and requires constant scene brightness.
- EPM values diminish at low velocities (low event rates), reducing sensitivity for noise discrimination.
Potential extensions include accommodating translation-induced depth variation, explicitly segmenting dynamic foreground/background components, and addressing flickering illumination. Integrating EPM-based principles into unsupervised loss formulations could further reduce reliance on synchronized APS and IMU signals.
6. Summary Table: EPM Key Components and Applications
| Component/Application | Description | Formula/Method |
|---|---|---|
| Event Probability Mask | Soft label: $M(X) = \Pr[E(X) = 1]$ under null hypothesis of event occurrence | $M(X)$ closed-form as above |
| Denoising Benchmark (RPMD) | Relative plausibility measure for denoising | $\frac{1}{N} \log \frac{\Pr[E_{\text{opt}}]}{\Pr[\hat{E}]}$ |
| Training EDnCNN | EPM as ground truth for supervised learning | Minimize maximum-likelihood, […], or classification loss |
| Parameter Calibration | Estimate $(\varepsilon, O)$ maximizing event likelihood under EPM | 2-stage 1D search via $\arg\max_{\varepsilon,\, O} \log \Pr[E_{\text{raw}}]$ |
EPM constitutes a mathematically principled framework for probabilistic event labeling in neuromorphic vision, enabling standardized objective benchmarks and effective supervised learning paradigms for event denoising (Baldwin et al., 2020).