FireNet: Lightweight & Real-Time Detection
- FireNet denotes a family of architectures spanning lightweight CNNs, U-Net-inspired segmentation, and event-based reconstruction for efficient fire and smoke detection.
- These models rely on strategies such as heavy dropout regularization, temporal augmentation, and inverted residual blocks to balance high accuracy against low computational cost.
- Applications span IoT safety systems, aerial disaster response, and bio-inspired cyber defense, with an emphasis on robust performance in operationally critical scenarios.
FireNet refers to a family of architectures, methodologies, and systems spanning lightweight fire and smoke detection for edge IoT deployments, real-time fire perimeter segmentation for aerial disaster response, event-driven video reconstruction, and cyber defense protocols inspired by biological regulatory networks. Common to all references is an emphasis on efficiency, robust real-time performance, and deployment in resource-constrained or operationally critical scenarios.
1. Lightweight Fire and Smoke Detection for Embedded IoT Platforms
FireNet, as introduced in "FireNet: A Specialized Lightweight Fire & Smoke Detection Model for Real-Time IoT Applications" (Jadon et al., 2019), is a convolutional neural network (CNN) designed from scratch to optimize both model size and detection accuracy for real-time fire safety systems deployable on edge devices such as Raspberry Pi.
- Architecture: The network is shallow (14 layers), featuring three convolutional layers (3×3 kernels, filter counts doubling across layers), each paired with pooling and dropout (0.5 in the convolutional blocks, 0.2 in the dense layers), followed by a flatten layer and two dense layers (256 and 128 neurons), culminating in a softmax output for binary classification ("fire" vs. "non-fire"). The input image size is 64×64×3; the total parameter count is 646,818 and the disk footprint is 7.45 MB. A minimal architectural sketch follows this list.
- Activation and Regularization: ReLU activation is employed throughout the feature extraction layers; dropout regularization is used at rates higher than typical, a configuration found to yield improved generalization and control overfitting in the reported experiments.
- Performance: Evaluated on both a custom real-world dataset and the Foggia fire/smoke dataset, FireNet achieved 93.91% accuracy (precision 97%, recall 94%, F-measure 95%) on the custom set and 96.53% accuracy on Foggia's dataset, with real-time throughput above 24 fps on a Raspberry Pi 3B (1.2 GHz CPU, 1 GB RAM). False positive and false negative rates remain low, in the roughly 2–4% range.
- Deployment and Integration: FireNet is designed for IoT integration, supporting rapid alerting via AWS S3 (cloud media upload) and Twilio (SMS/MMS notifications), and interfacing with onboard sensors to differentiate fire from smoke events (an illustrative alerting sketch also follows this list). The hardware stack includes a camera module, a smoke sensor, distinct sound alarms, and a microcontroller-based ADC; future iterations target replacing the microcontroller with components such as the MCP3008 ADC to further simplify the hardware pipeline.
- Innovations: The architecture targets edge device constraints natively without retrofitting larger models for portability, using dataset diversity (web-mined, public, and self-shot real-world samples) to overcome overfitting encountered in homogeneous datasets.
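As a concrete reference, the following Keras sketch is one plausible realization of the layer specification above; the filter progression (16→32→64) and the use of average pooling are assumptions, chosen because they reproduce the reported 646,818-parameter total:

```python
from tensorflow.keras import layers, models

def build_firenet(input_shape=(64, 64, 3), num_classes=2):
    """Shallow FireNet-style classifier: three conv blocks, then two dense layers."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        # Conv blocks with doubling filter counts, each paired with pooling + dropout.
        layers.Conv2D(16, (3, 3), activation="relu"),
        layers.AveragePooling2D(),
        layers.Dropout(0.5),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.AveragePooling2D(),
        layers.Dropout(0.5),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.AveragePooling2D(),
        layers.Dropout(0.5),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(num_classes, activation="softmax"),  # "fire" vs. "non-fire"
    ])

model = build_firenet()
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()  # parameter count comes out at the reported 646,818
```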
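The cloud alerting path can likewise be sketched briefly. The helper below is hypothetical; the bucket name, credentials, and phone numbers are placeholders, and only the boto3 and Twilio client calls themselves reflect real APIs:

```python
import boto3
from twilio.rest import Client

def send_fire_alert(frame_path: str, bucket: str, key: str,
                    account_sid: str, auth_token: str,
                    from_number: str, to_number: str) -> None:
    # Upload the detection frame to S3 so responders can view it.
    s3 = boto3.client("s3")
    s3.upload_file(frame_path, bucket, key)
    media_url = f"https://{bucket}.s3.amazonaws.com/{key}"  # assumes public-read access

    # Send an MMS carrying the hosted image.
    Client(account_sid, auth_token).messages.create(
        to=to_number,
        from_=from_number,
        body="FireNet alert: possible fire detected.",
        media_url=[media_url],
    )
```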
2. Real-Time Fire Perimeter Segmentation in Disaster Response Scenarios
In "FireNet: Real-time Segmentation of Fire Perimeter from Aerial Video" (Doshi et al., 2019), FireNet refers to an encoder–decoder segmentation model tailored for rapid wildfire perimeter detection from aerial infrared video streams.
- Model Design: The architecture is U-Net-inspired, with an encoder of ResNet blocks, batch normalization, and downsampling layers (pruned for efficiency), coupled to a decoder with deconvolution and skip connections for spatial fidelity.
- Temporal Consistency: Segmentation uses temporal augmentation: the model's predictions from earlier frames (at t−1, t−3, and t−5) are fed back as additional inputs, stabilizing per-frame predictions across time.
- Loss Function: Training minimizes a loss based on a continuous Dice similarity coefficient (implemented in the sketch after this list):

$$L_{\text{Dice}} = 1 - \frac{2\sum_i p_i\, g_i}{\sum_i p_i + \sum_i g_i},$$

where $p_i$ is the continuous model output and $g_i$ the ground truth for pixel $i$.
- Data Annotation: The system utilizes 400,000 frames annotated with expert guidance, maintaining high quality despite real-world class imbalance (active fire in ~100,000 frames).
- Runtime: The pruned model achieves 20 fps on a standard Nvidia K80 GPU, with an F1-score of 92; higher scores (up to 95) are achievable at lower throughput (3–5 fps).
- Operational Impact: FireNet is deployed in environments requiring rapid situational awareness, providing geolocated fire boundaries for analysts and responders and improving public safety outcomes relative to previous manual annotation workflows.
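As a concrete reference, here is a minimal PyTorch sketch of the continuous Dice loss above; the epsilon smoothing term is an added assumption for numerical stability:

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Continuous Dice: predictions stay soft (post-sigmoid), not thresholded.
    p = pred.flatten(1)    # (batch, pixels), continuous outputs p_i in [0, 1]
    g = target.flatten(1)  # (batch, pixels), ground-truth mask g_i
    intersection = (p * g).sum(dim=1)
    dsc = (2 * intersection + eps) / (p.sum(dim=1) + g.sum(dim=1) + eps)
    return (1 - dsc).mean()

# Example: logits from a segmentation head against a sparse fire-perimeter mask.
logits = torch.randn(4, 1, 128, 128, requires_grad=True)
mask = (torch.rand(4, 1, 128, 128) > 0.9).float()
loss = dice_loss(torch.sigmoid(logits), mask)
loss.backward()  # differentiable end to end
```

Keeping the predictions continuous rather than thresholded is what makes the coefficient differentiable, and hence usable as a training loss.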
3. Portable Fire Recognition and Architectural Comparisons
"KutralNet: A Portable Deep Learning Model for Fire Recognition" (Ayala et al., 2020) systematically benchmarks FireNet against KutralNet, focusing on model compactness and computational efficiency.
- Architectural Differences: FireNet employs standard convolutions; KutralNet introduces inverted residual blocks (as in MobileNetV2), depth-wise convolutions, and octave convolution, which partitions feature extraction into high- and low-frequency components (see the block sketch after this list).
- Parameter Reduction: KutralNet variants have up to 71% fewer parameters than FireNet (e.g., 139K–185K vs. 646K), with correspondingly lower compute (e.g., the Mobile Octave variant requires only 24.6M FLOPs).
- Performance: On the FireNet dataset, FireNet achieves AUROC as high as 0.96 and test accuracy around 89%; KutralNet records comparable accuracy and AUROC (~0.92–0.96) at a fraction of the resource consumption.
- Generalization: The addition of black-image augmentation—enriching the "no-fire" class—boosts KutralNet’s test accuracy and generalization, approaching or exceeding deeper models.
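To make the architectural contrast concrete, here is a minimal PyTorch sketch of a MobileNetV2-style inverted residual block of the kind KutralNet adopts; the channel counts and expansion factor are illustrative, not KutralNet's exact configuration:

```python
import torch
from torch import nn

class InvertedResidual(nn.Module):
    """1x1 expansion -> 3x3 depth-wise conv -> 1x1 projection, with a skip
    connection when input and output shapes match."""
    def __init__(self, in_ch: int, out_ch: int, expand: int = 6, stride: int = 1):
        super().__init__()
        hidden = in_ch * expand
        self.use_residual = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False),             # expand channels
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride=stride, padding=1,
                      groups=hidden, bias=False),                # depth-wise conv
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, out_ch, 1, bias=False),            # project back down
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.block(x)
        return x + out if self.use_residual else out

x = torch.randn(1, 32, 56, 56)
print(InvertedResidual(32, 32)(x).shape)  # torch.Size([1, 32, 56, 56])
```

Because the expensive 3×3 convolution is applied depth-wise (one filter per channel), the block's parameter count and FLOPs drop sharply relative to a standard convolution of the same width.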
4. Event-Based Frame Reconstruction for Vision Applications
In traffic sign detection contexts, FireNet denotes a compact convolutional (and, in some variants, recurrent) network tasked with reconstructing video frames from sparse event-camera data (Wzorek et al., 2022; Jeziorek et al., 2022).
- Processing Pipeline: FireNet converts asynchronous event streams (Dynamic Vision Sensor outputs encoding pixel location, timestamp, and polarity) into frame-based images suitable for conventional detectors such as YOLOv4 (a toy conversion sketch follows this list).
- Architectural Simplification: The network is often structured as a streamlined U-Net variant; in (Jeziorek et al., 2022), FireNet is reduced to six blocks with up/downscaling layers omitted, leading to approximately 280× fewer parameters and much faster computation compared to E2VID.
- Operational Parameters: Reconstruction time scales linearly with the number of events processed, with empirical times ranging from 19.15 ms to 64.06 ms across the event counts tested.
- Performance Caveats: While enabling legacy detection pipelines, the frames FireNet reconstructs are relatively blurred, yielding lower detection metrics (e.g., 72.67% mAP@0.5) than direct event-based representation fusion (up to 89.9% mAP@0.5).
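To illustrate the input/output contract only (not FireNet's learned reconstruction), the following toy sketch converts a window of DVS events into a frame by per-pixel polarity accumulation; the sensor resolution and event layout are assumptions:

```python
import numpy as np

def events_to_frame(events: np.ndarray, height: int, width: int) -> np.ndarray:
    """events: (N, 4) array of (x, y, timestamp, polarity in {-1, +1})."""
    frame = np.zeros((height, width), dtype=np.float32)
    xs = events[:, 0].astype(int)
    ys = events[:, 1].astype(int)
    np.add.at(frame, (ys, xs), events[:, 3])       # accumulate polarity per pixel
    frame = (frame - frame.min()) / (np.ptp(frame) + 1e-9)
    return (frame * 255).astype(np.uint8)          # image a detector like YOLOv4 can consume

# 10,000 synthetic events on an assumed 260x346 sensor.
rng = np.random.default_rng(0)
ev = np.column_stack([
    rng.integers(0, 346, 10_000),     # x coordinate
    rng.integers(0, 260, 10_000),     # y coordinate
    np.sort(rng.random(10_000)),      # timestamps
    rng.choice([-1.0, 1.0], 10_000),  # polarity
]).astype(np.float32)
img = events_to_frame(ev, 260, 346)
```

FireNet replaces this naive accumulation with a learned recurrent reconstruction, which is what makes its output frames usable by conventional detectors, at the cost of some blur.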
5. Cyber Defense: Firewall Regulatory Networks ("FireNet" as Autonomous Protocol)
In autonomous cyber defense, "FireNet" as realized in Firewall Regulatory Networks (FRN) (Duan et al., 2025) refers to a distributed, bio-inspired firewall architecture:
- Distributed Decision Engines: Each firewall device is equipped with a local engine, communicating via activation/inhibition signals encoding policy changes in response to dynamic risks and utility measurements.
- Access Control Vector (ACV): The network state is captured as

$$\mathrm{ACV} = \big(a_{1,1}, \ldots, a_{1,m}, \ldots, a_{n,1}, \ldots, a_{n,m}\big), \qquad a_{i,j} \in \{0, 1\},$$

for $n$ firewalls and $m$ rules, where $a_{i,j}$ indicates whether rule $j$ is active on firewall $i$ (see the encoding sketch after this list).
- Feedback Regulation and Cascades: Borrowing from Biological Regulatory Networks, local configuration changes propagate through regulatory cascades with well-defined priorities (global/local risk > global/local utility).
- Utility-Risk Constraints: Policy synthesis ensures that service reachability (utility) and security exposure (risk) satisfy mission-level constraints, with both quantified as aggregate functions of the ACV state and checked against global thresholds.
- Protocol Efficiency: The FRN protocol synthesizes local interaction steps via constraint solving (e.g., SMT), converging to ACV states matching global risk/utility thresholds even in large, heterogeneous firewall networks.
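The following Python sketch is an illustrative, not authoritative, encoding of the ACV state with stand-in risk/utility scoring; the per-rule weights, thresholds, and scoring functions are assumptions, since the actual protocol derives feasible states via constraint solving (e.g., SMT):

```python
import numpy as np

# ACV as an n x m binary matrix: acv[i, j] = 1 means rule j is enabled on firewall i.
n_firewalls, n_rules = 4, 6
acv = np.zeros((n_firewalls, n_rules), dtype=np.int8)
acv[0, :3] = 1   # firewall 0 enables rules 0-2
acv[2, 1:5] = 1  # firewall 2 enables rules 1-4

# Hypothetical per-rule weights: enabling a rule adds reachability (utility)
# but also exposure (risk).
utility_w = np.array([0.9, 0.7, 0.5, 0.6, 0.4, 0.3])
risk_w    = np.array([0.2, 0.1, 0.4, 0.3, 0.5, 0.2])

global_utility = float((acv * utility_w).sum())
global_risk    = float((acv * risk_w).sum())

U_MIN, R_MAX = 3.0, 2.5  # illustrative mission-level thresholds
feasible = global_utility >= U_MIN and global_risk <= R_MAX
print(f"utility={global_utility:.2f}, risk={global_risk:.2f}, feasible={feasible}")
```

In the actual FRN protocol, activation/inhibition signals between neighboring engines flip individual $a_{i,j}$ entries, and convergence means the resulting ACV satisfies the global thresholds.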
6. Impact, Limitations, and Future Directions
FireNet architectures across these domains consistently target the convergence of compactness, speed, and accuracy for real-world deployment. The use of dropout, pruning, novel convolutional blocks, temporal strategies, and biological inspiration addresses domain-specific resource constraints and robustness needs.
A plausible implication is that FireNet-style models are poised for broad adoption in embedded vision for safety-critical applications, autonomous disaster response, and cyber defense. However, challenges remain: maintaining high-fidelity inference under hardware limitations (especially in event-driven reconstruction and multiclass detection tasks), adapting protocols for larger heterogeneous firewall networks, and further reducing false positives and negatives without expanding computational cost.
The cross-domain advances reflected by FireNet underscore a trend toward design paradigms balancing minimal computation and maximal operational utility, with open-sourcing of datasets/models (e.g., CTFilm20K, FIReNet) likely to accelerate progress in each area.