
YOLOv8-s: Lightweight Real-Time Detector

Updated 25 December 2025
  • YOLOv8-s is a lightweight, anchor-free object detector optimized for high throughput and real-time detection.
  • It employs a modified CSPDarknet backbone with C2f modules and a PAN-style neck to achieve a balance between computational efficiency and accuracy.
  • Widely applied in intelligent transportation systems (ITS), barcode recognition, UAV inspection, and industrial monitoring, it provides practical deployment benefits on edge devices.

YOLOv8 Small Variant

The YOLOv8 Small variant (YOLOv8-s or YOLOv8s) is a lightweight, single-stage, anchor-free object detector engineered for high throughput and solid accuracy in real-time detection scenarios, notably in edge deployment contexts. As a canonical trade-off model within the YOLOv8 family, it integrates Ultralytics’ C2f modules in a streamlined CSPDarknet backbone, a PAN-style neck, and decoupled prediction heads, achieving competitive results across benchmarks while maintaining a modest parameter and computational footprint. This variant is extensively applied in domains requiring fast inference, such as intelligent transportation systems, barcode/QR code recognition, UAV-based small-object detection, and civil infrastructure inspection (Amin et al., 18 Dec 2025, Pandya et al., 28 Nov 2025, Taffese et al., 12 Jan 2025, Khalili et al., 8 Aug 2024, Chen et al., 28 Jul 2025, Chen et al., 26 Sep 2025, Yaseen, 28 Aug 2024, Hussain, 3 Jul 2024, Zhang, 6 Mar 2025).

1. Architectural Characteristics and Core Modules

YOLOv8-s follows a modular convolutional design built for parameter efficiency and multi-scale feature extraction.

  • Backbone: The network backbone is a modified CSPDarknet whose C2f modules (Ultralytics' faster CSP bottleneck with two convolutions) split and re-merge feature channels to enrich gradient flow at low cost. The backbone stages downsample through stride-2 3×3 convolutions, yielding feature maps at progressively coarser spatial scales (Hussain, 3 Jul 2024, Amin et al., 18 Dec 2025).
  • SPPF Layer: A Spatial Pyramid Pooling-Fast (SPPF) layer bridges the backbone and the neck, enlarging the receptive field with little computational overhead (Amin et al., 18 Dec 2025, Pandya et al., 28 Nov 2025).
  • Neck: Features from different backbone stages are fused via a PAN-style (Path Aggregation Network) or FPN/PAN hybrid neck, which, in the standard design, generates three main output scales (stride 8, 16, 32). Some derivative works replace PAN with BiFPN or Hierarchical feature fusion for enhanced multi-scale performance, particularly for small-object scenarios (Chen et al., 28 Jul 2025, Chen et al., 26 Sep 2025).
  • Detection Head: The prediction head is anchor-free and operates at three scales, with decoupled regression and classification branches. For a 640×640 input the heads predict on 80×80, 40×40, and 20×20 grids (strides 8, 16, 32); each cell outputs C class scores plus a distribution-based box regression (DFL bins), replacing the (C+5) anchor-based encoding of earlier YOLO versions (Amin et al., 18 Dec 2025, Hussain, 3 Jul 2024).
  • Activation & Normalization: SiLU activation (also known as Swish) is used throughout convolutional layers for smoother gradients, coupled with BatchNorm (Hussain, 3 Jul 2024).
  • Parameter and FLOP Profile: The model comprises 11.1–11.2 million parameters and 28.6 GFLOPs per 640×640 image (Amin et al., 18 Dec 2025, Hussain, 3 Jul 2024, Taffese et al., 12 Jan 2025).
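The multi-scale head geometry above follows directly from the stride arithmetic; a minimal pure-Python sketch (the function names are illustrative, not part of any YOLOv8 API):

```python
def head_grid_shapes(img_size, strides=(8, 16, 32)):
    # Each detection head predicts on an (img_size/stride) x (img_size/stride) grid.
    return [(img_size // s, img_size // s) for s in strides]

def num_candidate_locations(img_size, strides=(8, 16, 32)):
    # Total number of prediction cells across all three heads.
    return sum((img_size // s) ** 2 for s in strides)

print(head_grid_shapes(640))         # [(80, 80), (40, 40), (20, 20)]
print(num_candidate_locations(640))  # 8400
```

For the standard 640×640 input this yields the 80×80, 40×40, and 20×20 maps cited above, i.e. 8,400 candidate locations per image before post-processing.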

2. Model Complexity and Scaling Trade-offs

YOLOv8-s stands as an intermediate point in the YOLOv8 family, optimized for cases where nano models underperform and medium/large models are resource-prohibitive.

| Variant | Params (M) | FLOPs (G) | mAP@0.5 | Precision | Recall | File Size (MB) |
|---|---|---|---|---|---|---|
| YOLOv8-nano | 3.0–3.2 | 8.7 | 0.811–0.918 | 0.964 (LPR) | 0.876– | ~12 |
| YOLOv8-small | 11.1–11.2 | 28.6 | 0.846–0.933 | 0.945 (LPR) | 0.874 (LPR) | ~45 |
| YOLOv8-medium | 25.9 | 78.9 | 0.85–0.94 | 0.946 (LPR) | 0.912 (LPR) | ~90 |

Values compiled from (Amin et al., 18 Dec 2025, Taffese et al., 12 Jan 2025, Hussain, 3 Jul 2024). mAP@0.5 and precision/recall are dataset-dependent; LPR denotes license plate recognition results.

YOLOv8-s achieves a significant increase in accuracy and recall over nano models for a ~3× computational and parameter cost, while medium models provide marginal accuracy gains at more than double the cost again. On tasks such as license plate recognition and crack detection, YOLOv8-s consistently achieves a favorable accuracy-speed-complexity balance (Amin et al., 18 Dec 2025, Taffese et al., 12 Jan 2025).
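The trade-off can be made concrete by computing cost ratios directly from the table values (a throwaway sketch; the numbers are the ones compiled above):

```python
# Parameter and FLOP figures from the comparison table above.
variants = {
    "nano":   {"params_m": 3.2,  "flops_g": 8.7},
    "small":  {"params_m": 11.2, "flops_g": 28.6},
    "medium": {"params_m": 25.9, "flops_g": 78.9},
}

def cost_ratio(src, dst):
    # Multiplicative parameter/FLOP cost of moving from one variant to another.
    return (variants[dst]["params_m"] / variants[src]["params_m"],
            variants[dst]["flops_g"] / variants[src]["flops_g"])

print(cost_ratio("nano", "small"))    # ~3.5x params, ~3.3x FLOPs
print(cost_ratio("small", "medium"))  # ~2.3x params, ~2.8x FLOPs
```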

3. Training Configuration and Loss Functions

  • Optimizers: Training may use AdamW or SGD with momentum; typical schedules pair a short warmup with cosine or linear learning-rate decay (the Ultralytics reference implementation defaults to an initial learning rate of 0.01 with momentum 0.937).
  • Data Augmentation: Typical augmentations include mosaic, mixup, color jitter (HSV), random flips, rotation, blur, cropping, and online geometric transforms, though specifics are dataset-dependent (Pandya et al., 28 Nov 2025, Taffese et al., 12 Jan 2025, Hussain, 3 Jul 2024).
  • Loss Terms: The objective sums classification and box-regression terms; being anchor-free, YOLOv8 has no separate objectness branch:
    • L_total = L_cls + L_box + L_dfl
    • Classification: Binary cross-entropy over class scores.
    • Box regression: CIoU combined with Distribution Focal Loss (DFL), with dynamic (task-aligned) label assignment. In some derivative works CIoU is replaced with PIoU to mitigate anchor-box enlargement artifacts (Amin et al., 18 Dec 2025, Khalili et al., 8 Aug 2024).
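The CIoU box term can be sketched in plain Python for axis-aligned (x1, y1, x2, y2) boxes with positive width and height; this mirrors the standard CIoU definition (IoU penalized by normalized center distance and an aspect-ratio consistency term), not any particular framework's implementation:

```python
import math

def ciou_loss(b1, b2, eps=1e-9):
    """CIoU loss = 1 - IoU + rho^2/c^2 + alpha*v for boxes (x1, y1, x2, y2)."""
    # Intersection-over-union
    iw = max(0.0, min(b1[2], b2[2]) - max(b1[0], b2[0]))
    ih = max(0.0, min(b1[3], b2[3]) - max(b1[1], b2[1]))
    inter = iw * ih
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    iou = inter / (a1 + a2 - inter + eps)
    # Squared center distance over squared enclosing-box diagonal
    rho2 = (((b1[0] + b1[2]) - (b2[0] + b2[2])) ** 2
            + ((b1[1] + b1[3]) - (b2[1] + b2[3])) ** 2) / 4.0
    cw = max(b1[2], b2[2]) - min(b1[0], b2[0])
    ch = max(b1[3], b2[3]) - min(b1[1], b2[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term
    w1, h1 = b1[2] - b1[0], b1[3] - b1[1]
    w2, h2 = b2[2] - b2[0], b2[3] - b2[1]
    v = (4.0 / math.pi ** 2) * (math.atan(w2 / h2) - math.atan(w1 / h1)) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return 1.0 - iou + rho2 / c2 + alpha * v
```

Identical boxes give a loss of 0; non-overlapping boxes exceed 1 because the center-distance penalty keeps the gradient informative even at zero IoU, which is the practical motivation for CIoU over plain IoU or MSE.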

4. Detection Performance and Benchmarks

YOLOv8-s delivers real-time or near-real-time inference across multiple tasks and datasets, providing competitive accuracy.

Performance is robust relative to model size, and the mAP gap between small and larger YOLOv8 models is modest (often ≤5 pp) given the reduction in parameters and latency.
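The mAP@0.5 figures quoted throughout are per-class average precisions (at an IoU threshold of 0.5) averaged over classes; a minimal all-point AP computation for one class, illustrative rather than the exact COCO evaluator:

```python
def average_precision(detections, num_gt):
    """AP for one class. detections: list of (confidence, is_true_positive)
    pairs; num_gt: number of ground-truth boxes for that class."""
    dets = sorted(detections, key=lambda d: -d[0])
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in dets:
        tp += 1 if is_tp else 0
        fp += 0 if is_tp else 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Precision envelope: make precision monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Integrate precision over recall.
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_r) * p
        prev_r = r
    return ap

# A perfect detector scores AP = 1.0:
print(average_precision([(0.9, True), (0.8, True)], num_gt=2))  # 1.0
```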

5. Application-Specific Variants and Small-Object Extensions

Multiple works derive from YOLOv8-s to target small-object detection and efficiency for edge scenarios.

  • SOD-YOLOv8: Adds a fourth high-resolution detection head (stride 2) and replaces PANet with GFPN. Incorporates C2f-EMA attention and PIoU loss; measured recall rises from 40.1% to 43.9%, precision from 51.2% to 53.9%, and mAP@0.5 from 40.6% to 45.1% (Khalili et al., 8 Aug 2024).
  • YOLOv8s-p2: Integrates BiFPN with learnable weights and incorporates a stride-4 detection head, raising recall for tiny rice spikelets and improving mAP@0.5 by 3.1% over baseline (Chen et al., 28 Jul 2025).
  • HierLight-YOLO-S: Substitutes C2f with IRDCB, standard downsampling with LDown, and replaces the PANet neck with a hierarchical feature fusion (HEPAN). Adds a P2 (160×160) detection head, reducing parameter count by ~30% and increasing small-object AP by +3.3 points (Chen et al., 26 Sep 2025).
  • FDM-YOLO: Removes the largest detection head and adds a high-resolution P2 head, introduces Fast-C2f modules (PConv-based), dynamic upsampling (Dysample), and lightweight EMA attention, reducing parameter count by 38% and improving mAP@0.5 from 38.4% to 42.5% on VisDrone (Zhang, 6 Mar 2025).

These modifications typically target high recall and AP for objects <32 px, crucial in UAV, traffic, and field monitoring.
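Several of these variants (e.g. the BiFPN replacement in YOLOv8s-p2) rely on learnable weighted feature fusion. The "fast normalized fusion" rule can be sketched in plain Python, with flat lists standing in for feature maps (in the real networks these are tensors of equal shape after resizing, and the weights are learned parameters):

```python
def weighted_fusion(features, weights, eps=1e-4):
    # BiFPN-style fast normalized fusion:
    #   out = sum_i(relu(w_i) * f_i) / (sum_i relu(w_i) + eps)
    w = [max(0.0, wi) for wi in weights]  # ReLU keeps weights non-negative
    s = sum(w) + eps                      # eps avoids division by zero
    return [sum(wi * f[k] for wi, f in zip(w, features)) / s
            for k in range(len(features[0]))]

# Equal weights reduce to (almost exactly) an element-wise average:
print(weighted_fusion([[2.0, 4.0], [4.0, 8.0]], [1.0, 1.0]))  # ~[3.0, 6.0]
```

Because the weights are normalized per fusion node, the network can learn to emphasize whichever input scale is most informative, which is why such necks help on small-object benchmarks.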

6. Edge Deployment and Practical Implications

With a parameter count of 11.1–11.2 M, a compute requirement of 28.6 GFLOPs, and a compact architecture, YOLOv8-s is well-suited to contemporary high-end and mid-tier edge devices.

  • Latency and Throughput: YOLOv8-s achieves sub-millisecond to millisecond per-image inference on modern GPUs (e.g., A100, RTX 3090), and 15–60 FPS on Jetson Xavier/Orin, depending on task and optimizations (Amin et al., 18 Dec 2025, Hussain, 3 Jul 2024, Taffese et al., 12 Jan 2025).
  • Resource Profile: File size ≈45 MB (FP32); quantization or pruning can reduce this further.
  • Suitability: Recommended for deployment scenarios balancing moderate-to-high accuracy with strict latency and resource constraints—including ITS (Intelligent Transportation Systems), mobile device vision, video analytics, real-time industrial inspection, and on-device inference pipelines.

A notable use case involves a pipeline where YOLOv8-nano detects candidate regions (e.g., license plates), passing the region to YOLOv8-s for fine-grained tasks such as character or small-object localization, leveraging the strengths of both model sizes (Amin et al., 18 Dec 2025).
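The coordinate bookkeeping in such a cascade is easy to get wrong; below is a sketch of the two helper steps (the function names and the 10% context margin are illustrative assumptions, not taken from the cited pipeline):

```python
def expand_and_clamp(box, frame_w, frame_h, margin=0.1):
    """Pad a stage-1 (nano) detection by a relative margin and clamp it to
    the frame, so the stage-2 (small) model sees context around the region."""
    x1, y1, x2, y2 = box
    mw, mh = (x2 - x1) * margin, (y2 - y1) * margin
    return (max(0.0, x1 - mw), max(0.0, y1 - mh),
            min(frame_w, x2 + mw), min(frame_h, y2 + mh))

def crop_to_frame(box_in_crop, crop_origin):
    """Map a stage-2 detection from crop coordinates back to frame coordinates."""
    ox, oy = crop_origin
    x1, y1, x2, y2 = box_in_crop
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)

# Stage-1 plate box -> padded crop -> stage-2 character box mapped back:
crop = expand_and_clamp((100, 100, 200, 150), 640, 480)  # (90.0, 95.0, 210.0, 155.0)
char = crop_to_frame((5, 5, 20, 15), crop[:2])           # (95.0, 100.0, 110.0, 110.0)
```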

7. Limitations and Comparative Positioning

YOLOv8-s, while effective, does exhibit certain limitations:

  • Extremely Small Objects/Occlusions: Although competitive, baseline YOLOv8-s sometimes underperforms with extremely small or heavily occluded objects unless explicitly modified via high-resolution heads or advanced multi-scale fusion (Pandya et al., 28 Nov 2025, Khalili et al., 8 Aug 2024, Chen et al., 28 Jul 2025).
  • Inference Speed Reporting: Some evaluation papers omit direct FPS throughput; figures are then generally extrapolated from FLOP counts or hardware reports.
  • Further Compression: For ultra-constrained TinyML microcontrollers, YOLOv8-nano or special pruned/quantized variants are favored, though YOLOv8-s remains more robust for complex multi-class scenarios (Elshamy et al., 21 Oct 2024).
  • Absolute Accuracy Ceilings: Larger YOLOv8 variants (medium, large) can yield slightly higher mAP at the cost of increased latency and size, suggesting use-case dependent model selection (Taffese et al., 12 Jan 2025, Hussain, 3 Jul 2024).
