YOLO-FireAD: Efficient Fire Detection via Attention-Guided Inverted Residual Learning and Dual-Pooling Feature Preservation (2505.20884v1)
Abstract: Fire detection in dynamic environments faces continuous challenges, including interference from illumination changes, frequent false or missed detections, and the difficulty of achieving both efficiency and accuracy. To address the feature-extraction limitations and information loss of existing YOLO-based models, this study proposes You Only Look Once for Fire Detection with Attention-guided Inverted Residual and Dual-pooling Downscale Fusion (YOLO-FireAD), with two core innovations: (1) the Attention-guided Inverted Residual Block (AIR) integrates hybrid channel-spatial attention with inverted residuals to adaptively enhance fire features and suppress environmental noise; (2) the Dual Pool Downscale Fusion Block (DPDF) preserves multi-scale fire patterns through learnable fusion of max- and average-pooling outputs, mitigating small-fire detection failures. Extensive evaluation on two public datasets shows the efficient performance of our model. Our proposed model keeps parameters low (1.45M, 51.8% lower than YOLOv8n) and computation low (4.6 GFLOPs, 43.2% lower than YOLOv8n), while its mAP75 exceeds the mainstream real-time object detection models YOLOv8n, YOLOv9t, YOLOv10n, YOLO11n, YOLOv12n and other YOLOv8 variants by 1.3–5.5%.
Explain it Like I'm 14
Overview: What is this paper about?
This paper introduces a new computer vision model called YOLO-FireAD that can spot fires and smoke in videos or images quickly and accurately. The goal is to help detect fires early—even when lighting changes, the background is busy, or flames are small—while keeping the model fast and small so it can run on cameras or edge devices.
What questions were the researchers asking?
The researchers focused on three simple questions:
- How can we reduce false alarms (like confusing bright lights for fire) and missed detections (especially tiny, early-stage fires)?
- How can we keep the model both fast and accurate, so it works in real time?
- How can we avoid losing important details when shrinking images inside the model (which often hurts small-fire detection)?
How did they do it? Methods explained simply
Think of a fire detector model like a team with three parts: a “Backbone” that collects features, a “Neck” that mixes them together, and a “Head” that decides where the fire is and how likely it is.
The paper improves this team with two key ideas:
- Attention-guided Inverted Residual Block (AIR)
- Everyday idea: Attention is like a smart spotlight that highlights important parts of an image (the flame) and dims distractions (like reflections or bright windows).
- What it does: AIR combines “attention” with an efficient building block called an “inverted residual” (which expands the features to a richer space, processes them, then compresses them back). Within this block, it uses attention in two ways:
- Spatial attention: focuses on where the fire is in the image (positions).
- Channel attention: focuses on what kinds of color/texture patterns look like fire.
- Why it helps: It makes the model better at telling real flames apart from look-alikes without slowing it down. It also reduces the number of “parameters” (the model’s internal settings) by about 39% compared to a baseline.
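The two attention "spotlights" above can be sketched in a few lines. This is a toy, parameter-free illustration of hybrid channel + spatial gating, not the paper's AIR implementation (which learns the gates inside an inverted residual with trained weights):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hybrid_attention(feat):
    """Toy channel + spatial attention over a (C, H, W) feature map.

    Channel gate: global average pool per channel -> sigmoid
    (emphasizes *what* patterns look like fire).
    Spatial gate: mean across channels per pixel -> sigmoid
    (emphasizes *where* the fire is).
    The real AIR block learns these gates; here they are
    parameter-free for illustration only.
    """
    chan_gate = sigmoid(feat.mean(axis=(1, 2)))   # shape (C,)
    feat = feat * chan_gate[:, None, None]
    spat_gate = sigmoid(feat.mean(axis=0))        # shape (H, W)
    return feat * spat_gate[None, :, :]

x = np.random.rand(8, 16, 16)   # 8 channels, 16x16 feature map
y = hybrid_attention(x)
print(y.shape)                  # same shape as input: (8, 16, 16)
```

Because both gates lie in (0, 1), the block can only dampen features, never amplify them; the learned version in the paper additionally rescales and mixes channels so useful fire features are boosted, not just preserved.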
- Dual Pool Downscale Fusion Block (DPDF)
- Everyday idea: When you shrink an image, you can lose detail. DPDF keeps two kinds of information:
- Max pooling: grabs the strongest signals (like sharp, bright flame edges).
- Average pooling: keeps smooth patterns (like smoke spreading).
- The model learns how much of each to use and blends them. This helps it keep important small-fire details that are usually lost during downscaling.
- It uses “partial convolution” to cut computation while keeping accuracy high.
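The blend of max and average pooling can be sketched as follows. This is a minimal 2x downscale with a fixed scalar mixing weight; in the paper's DPDF block the weight is learned and the result feeds a partial convolution:

```python
import numpy as np

def dual_pool_downscale(feat, alpha=0.5):
    """Toy dual-pooling 2x downscale of a (H, W) feature map.

    Max pooling keeps the strongest responses (sharp flame edges);
    average pooling keeps smooth context (spreading smoke).
    alpha blends the two; the paper learns this weight, here it is
    a fixed scalar for illustration.
    """
    h, w = feat.shape
    # Group pixels into non-overlapping 2x2 blocks.
    blocks = feat[: h - h % 2, : w - w % 2].reshape(h // 2, 2, w // 2, 2)
    max_pool = blocks.max(axis=(1, 3))
    avg_pool = blocks.mean(axis=(1, 3))
    return alpha * max_pool + (1 - alpha) * avg_pool

x = np.arange(16, dtype=float).reshape(4, 4)
y = dual_pool_downscale(x, alpha=0.5)
print(y)   # 2x2 map: each cell is half max + half mean of a 2x2 block
```

With alpha near 1 the output behaves like pure max pooling (keeps peaks, loses context); near 0 it behaves like average pooling (keeps context, blurs peaks). Learning alpha lets the model pick the trade-off per feature.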
Other helpful ideas explained simply:
- YOLO (“You Only Look Once”): A popular fast detector that finds objects in one pass over the image.
- Precision: Of the things you said were fire, how many really were?
- Recall: Of all the real fires, how many did you find?
- mAP (mean Average Precision): A summary score of accuracy across many test conditions. mAP50-95 averages this score over overlap (IoU) thresholds from 50% to 95%, so higher thresholds demand tighter boxes.
- IoU (Intersection over Union): How well your predicted box covers the real fire box (higher is better).
- Parameters and GFLOPs: Roughly, how big and how “computationally heavy” the model is. Lower is faster and lighter.
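The IoU metric above is simple enough to compute by hand. A small self-contained sketch (boxes are given as corner coordinates, an assumption for illustration):

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2).

    Intersection over Union: overlap area divided by combined area.
    1.0 means a perfect match, 0.0 means no overlap. mAP50 counts a
    detection correct if IoU >= 0.5; mAP75 demands IoU >= 0.75.
    """
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A 10x10 prediction shifted by 5 px from a 10x10 ground-truth box
# overlaps only a 5x5 corner: IoU = 25 / 175 ~ 0.14, so it fails
# both the mAP50 and mAP75 thresholds.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

This illustrates why mAP75 is a stricter test than mAP50: a detection can find the fire but still be rejected if its box is loose.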
What did they find, and why does it matter?
Across two public fire datasets, YOLO-FireAD was both more accurate and more efficient than well-known fast detectors (like YOLOv8n, YOLOv9t, YOLOv10n, YOLO11n, YOLOv12n):
Key results on the main dataset:
- Higher overall accuracy: mAP50-95 of 34.6% (1.8% better than YOLOv8n).
- Better at stricter matching: mAP75 was higher than mainstream YOLO models by about 1.3–5.5%.
- Fewer false alarms, more correct fire finds: Precision rose to 75.3% (about +10.7% vs YOLOv8n).
- Much lighter and faster:
- Parameters: 1.45 million (about 51.8% fewer than YOLOv8n).
- Compute (GFLOPs): 4.6G (about 43.2% fewer).
- Model size: about 3.3 MB.
On another dataset (to test generalization):
- YOLO-FireAD had the best mAP50-95 among all compared models, showing it handles different scenes well.
Ablation study (testing the two modules separately):
- AIR alone reduced size and improved precision.
- DPDF alone improved detection of small fires and overall accuracy.
- Using both together gave the best performance.
Why it matters:
- Early and reliable detection of small or partially hidden fires in tough conditions (changing light, smoke, crowded backgrounds) can help prevent disasters.
- Being small and efficient means it can run on edge devices like cameras, drones, or small computers without needing a big server.
What is the impact, and what’s next?
Impact:
- Safer buildings, forests, and cities: Faster, more reliable fire spotting can trigger alarms or alerts sooner.
- Real-time use: Because the model is lightweight, it’s practical for on-site systems (security cameras, robots, drones) that need quick decisions.
- Lower cost: Smaller models require less powerful hardware.
Limitations and future directions:
- In extreme lighting (very bright reflections or glare) or very fast-moving flames/smoke, the model can still make mistakes.
- The authors plan to:
- Combine infrared with normal video to see better through smoke and handle tricky lighting.
- Optimize deployment on different edge devices so it runs smoothly everywhere.
In short, YOLO-FireAD is a smart, small, and fast fire detector that keeps more useful details and focuses on the right parts of the image, making it better at catching real fires and ignoring look-alikes.