
YOLOv8 Nano Variant

Updated 25 December 2025
  • YOLOv8 Nano is a lightweight object detection model that employs aggressive depth and width multipliers, a CSP-derived backbone, PAN-FPN neck, and an anchor-free head for efficient real-time performance.
  • It has 3.2M parameters and 8.7 GFLOPs at 640×640 resolution and reaches up to 91.8% mAP@0.5 in specialized tasks such as license plate recognition and barcode detection.
  • Advanced adaptations like depthwise separable convolutions and optimized feature fusion enable YOLOv8 Nano to maintain competitive accuracy while operating on resource-limited devices.

The YOLOv8 Nano variant is the smallest configuration of the YOLOv8 object detection family, engineered to deliver real-time inference performance and high parameter efficiency for deployment on resource-constrained devices. It employs a depth and width scaling strategy within the core Ultralytics YOLOv8 architecture, featuring a CSPDarknet-derived backbone, PAN/FPN-based neck, and a decoupled anchor-free detection head. Parameter counts are reduced through aggressive scaling, lightweight module replacements, and explicit width-multiplier strategies, while maintaining adequate accuracy on standard detection benchmarks and several real-world tasks, including license plate and barcode recognition, civil infrastructure crack detection, and TinyML edge scenarios.

1. Architecture and Scaling Principles

YOLOv8 Nano employs the following architectural strategies to minimize model size while retaining detection capability:

  • Depth/Width Multipliers: Depth and width multipliers of 0.33 and 0.25, respectively, are applied to the standard YOLOv8 "Extra Large" base architecture, yielding 3.2 million parameters and 8.7 GFLOPs at a 640×640 input resolution (Hussain, 3 Jul 2024); see the scaling sketch after this list.
  • C2f Backbone: The CSPNet-inspired C2f (cross-stage partial) modules provide channel-wise feature fusion and efficient gradient flow, but are instantiated with minimal channels per layer and reduced block repetition.
  • PAN-FPN Neck: The Small and Nano variants retain a path aggregation network (PAN-FPN) for multi-scale feature fusion but scale down channel dimensions and the number of blocks to fit tight memory and compute constraints.
  • Anchor-free Head: Prediction heads are anchor-free and decoupled, directly regressing bounding boxes and class probabilities without pre-defined anchor grids (Yaseen, 28 Aug 2024).
  • Depthwise and Partial Convolution: In many Nano-style YOLOv8 derivatives, standard convolutions are replaced by depthwise-separable convolutions or partial-conv (PConv) mechanisms for further reduction of FLOPs (Elshamy et al., 21 Oct 2024, Zhang, 6 Mar 2025).
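
As a concrete illustration of the multiplier arithmetic, the sketch below applies the Nano depth/width factors to a hypothetical base stage specification. The make_divisible rounding and the base channel/repeat values are illustrative assumptions, not the exact Ultralytics configuration.

```python
import math

def make_divisible(x: float, divisor: int = 8) -> int:
    """Round a channel count up to the nearest multiple of `divisor`."""
    return max(divisor, int(math.ceil(x / divisor) * divisor))

def scale_stage(base_channels: int, base_repeats: int,
                width_mult: float = 0.25, depth_mult: float = 0.33):
    """Apply YOLOv8-style width/depth multipliers to one backbone stage."""
    channels = make_divisible(base_channels * width_mult)
    repeats = max(1, round(base_repeats * depth_mult))
    return channels, repeats

# Hypothetical base stage spec: (output channels, number of C2f repeats).
base_stages = [(128, 3), (256, 6), (512, 6), (1024, 3)]

for c, n in base_stages:
    print(scale_stage(c, n))   # Nano scaling: width x0.25, depth x0.33
```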

A minimal classification-only Nano variant (P-YOLOv8-small) achieves an even greater reduction using a 0.5× width multiplier, trimmed C2f blocks, channel pruning, and quantization-aware training, yielding only 1,451,098 parameters and a 2.84 MB binary (Elshamy et al., 21 Oct 2024).

2. Quantitative Model Characteristics

The following table summarizes core characteristics of YOLOv8 Nano as reported across several sources:

Model            Params (M)   FLOPs (G)   mAP (%)                    Inference Latency (ms)
YOLOv8-n         3.2          8.7         37.3 (COCO mAP@0.5:0.95)   80.4 (CPU, ONNX)
YOLOv8-n (LPR)   3.2          –           91.8 (LPR, mAP@0.5)        –
P-YOLOv8-n-cls   1.45         2.1*        – (cls. accuracy 99.5)     0.28 (A100, cls)
FDM-YOLO-n**     ≈2.2         11.7        35.8                       –

*  At 224×224 for classification, not detection.
** HierLight-YOLO-Nano variant with a similar parameter target (Chen et al., 26 Sep 2025).

On domain-specific tasks (license plate detection, barcode scanning, etc.), the Nano variant consistently approaches or exceeds 90% mAP@0.5 (Amin et al., 18 Dec 2025, Pandya et al., 28 Nov 2025), confirming utility for lightweight, real-time detection pipelines.
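
The parameter and GFLOP figures above can be reproduced approximately with the Ultralytics API; a minimal sketch, assuming the ultralytics package and the pretrained yolov8n.pt checkpoint are available:

```python
from ultralytics import YOLO

# Load the pretrained Nano detection checkpoint (downloads on first use).
model = YOLO("yolov8n.pt")

# Prints layer count, parameter count, gradients, and GFLOPs.
model.info(detailed=False)

# Optional: evaluate mAP on a COCO-formatted dataset (dataset path is illustrative).
# metrics = model.val(data="coco128.yaml", imgsz=640)
# print(metrics.box.map50, metrics.box.map)   # mAP@0.5 and mAP@0.5:0.95
```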

3. Key Module Adaptations for Lightweighting

  • Depthwise Separable Convolution: Many YOLOv8 Nano derivatives incorporate DWConv in both the backbone and neck to minimize both parameter count and multiply-accumulate cost (Elshamy et al., 21 Oct 2024, Chen et al., 26 Sep 2025); a block sketch follows this list.
  • Lite Fusion and Bottlenecks: Some variants implement Fast-C2f blocks, composed of partial convolution and channel split/concat with groupwise or depthwise operations, reducing FLOPs vs. standard C2f while retaining essential connectivity (Zhang, 6 Mar 2025).
  • Feature Fusion Optimization: Advanced nano-oriented networks often employ thin BiFPN or HEPAN-style necks, with learnable edge weights and channel compression to limit redundancy and computational branching while retaining multi-scale context (Chen et al., 28 Jul 2025, Chen et al., 26 Sep 2025).
  • Head Configuration: Standard Nano variants use three detection heads (P3: 80×80, P4: 40×40, P5: 20×20 at 640 input), but specialized small-object detectors add a fourth head at P2 (160×160) to enhance small target recall at modest compute cost (Chen et al., 28 Jul 2025, Khalili et al., 8 Aug 2024).
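
A depthwise-separable replacement for a standard 3×3 convolution can be sketched in PyTorch as follows; this is an illustrative module, not the exact block used in any cited derivative.

```python
import torch
import torch.nn as nn

class DWSeparableConv(nn.Module):
    """Depthwise 3x3 + pointwise 1x1 convolution, a lightweight drop-in
    for a standard Conv-BN-SiLU block (illustrative only)."""

    def __init__(self, c_in: int, c_out: int, stride: int = 1):
        super().__init__()
        self.dw = nn.Conv2d(c_in, c_in, 3, stride, 1, groups=c_in, bias=False)
        self.pw = nn.Conv2d(c_in, c_out, 1, 1, 0, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pw(self.dw(x))))

# Parameter comparison against a standard 3x3 convolution.
std = nn.Conv2d(64, 128, 3, 1, 1, bias=False)
dws = DWSeparableConv(64, 128)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(std), count(dws))   # ~73.7k vs ~9k parameters
```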

4. Training Strategy and Hyperparameterization

YOLOv8 Nano models are trained with recipes suited to resource-constrained regimes; a representative training call is sketched after the list below. Key points include:

  • Optimizers: Stochastic Gradient Descent (SGD) with momentum or AdamW, with typical initial learning rates of 0.01–0.001 and a cosine-annealing or fixed schedule (Taffese et al., 12 Jan 2025).
  • Batch Sizes: Often 32–64, maximizing throughput under memory constraints.
  • Augmentations: Mosaic and MixUp, geometric transforms (rotation, flip, scale jitter), and HSV/color jitter, applied to enhance generalization in data-sparse or low-resolution settings (Taffese et al., 12 Jan 2025).
  • Loss Functions: CIoU or vanilla MSE for regression, BCE for class/objectness, sometimes replaced by custom low-overhead losses (e.g., Powerful-IoU) in ultralight variants (Khalili et al., 8 Aug 2024).
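
A representative Nano training run with the Ultralytics trainer might look like the sketch below; the dataset file my_dataset.yaml is a placeholder, and the hyperparameter values simply mirror the ranges listed above.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")   # or "yolov8n.yaml" to train from scratch

# Hyperparameters mirror the ranges discussed above (SGD, lr0 ~0.01,
# cosine schedule, Mosaic/MixUp, HSV jitter); "my_dataset.yaml" is a placeholder.
model.train(
    data="my_dataset.yaml",
    imgsz=640,
    epochs=100,
    batch=32,
    optimizer="SGD",
    lr0=0.01,
    cos_lr=True,
    mosaic=1.0,
    mixup=0.1,
    hsv_h=0.015, hsv_s=0.7, hsv_v=0.4,
    fliplr=0.5,
)
```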

Quantization-aware training and channel pruning further reduce model size for microcontroller deployment, sometimes reducing memory footprint below 3MB while sustaining high accuracy (Elshamy et al., 21 Oct 2024).
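
The cited work applies quantization-aware training and channel pruning; the sketch below shows a simpler post-training approximation (structured pruning followed by INT8 export) using standard PyTorch and Ultralytics utilities, with the pruning ratio and calibration dataset as illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Structured (channel-wise) pruning of convolution weights; 30% is illustrative.
# Note: this zeroes 30% of output channels, it does not physically shrink tensors.
for m in model.model.modules():
    if isinstance(m, nn.Conv2d):
        prune.ln_structured(m, name="weight", amount=0.3, n=2, dim=0)
        prune.remove(m, "weight")   # make the pruned weights permanent

# INT8 export for microcontroller/edge runtimes (calibration dataset is a placeholder).
model.export(format="tflite", int8=True, data="coco128.yaml", imgsz=640)
```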

5. Empirical Results Across Benchmarks and Applications

YOLOv8 Nano has demonstrated reliability in several real-world low-resource detection scenarios:

  • Barcode/QR Code Detection: Achieved 88.95% accuracy on custom barcode/QR datasets (416×416), confirming suitability for scanning applications in embedded vision systems (Pandya et al., 28 Nov 2025).
  • License Plate Recognition: Precision of 0.964 and mAP@0.5 of 0.918 reported for license plate recognition on real-world LPR datasets, enabling high-accuracy, low-latency LPR pipelines (Amin et al., 18 Dec 2025).
  • Crack Detection (Civil Infrastructure): mAP@0.5 of 80.9% at ≈120 FPS, highlighting utility for mobile or edge inspection targets (Taffese et al., 12 Jan 2025).
  • TinyML Classification: 99.5% accuracy on the State Farm Distracted Driver dataset (P-YOLOv8-small), with sub-3MB flash size and inference rates of 20–45 FPS on embedded ARM-class hardware (Elshamy et al., 21 Oct 2024).

On general detection benchmarks (COCO, Roboflow100), YOLOv8-n yields mAP in the 33–37% range, trading off accuracy for speed and deployment simplicity (Hussain, 3 Jul 2024, Yaseen, 28 Aug 2024).

6. Model Scaling and Trade-Offs

The YOLOv8 Nano variant's main advantage lies in the Pareto-optimal balance of latency, memory, and mAP for edge scenarios. Direct scaling of the depth/width multipliers produces an accuracy/latency gradient within the YOLOv8 family:

Model      Params (M)   FLOPs (B)   COCO AP (%)   CPU (ms/img)
YOLOv8-n   3.2          8.7         37.3          80.4
YOLOv8-s   11.2         28.6        44.9          128.4
YOLOv8-m   25.9         78.9        50.2          234.7

Compared to YOLOv8-small, the Nano variant reduces parameter count by 3.5× and FLOPs by >3×, at a cost of 7.6 percentage points AP on COCO; in task-specific settings this gap may collapse or reverse, particularly under tight power or memory budgets (Hussain, 3 Jul 2024).
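
These reduction factors follow directly from the table; a quick arithmetic check:

```python
# Parameter, FLOPs, and AP deltas between YOLOv8-n and YOLOv8-s (from the table above).
params_n, params_s = 3.2, 11.2
flops_n, flops_s = 8.7, 28.6
ap_n, ap_s = 37.3, 44.9

print(f"param reduction: {params_s / params_n:.1f}x")   # 3.5x
print(f"FLOPs reduction: {flops_s / flops_n:.1f}x")      # 3.3x
print(f"AP gap: {ap_s - ap_n:.1f} points")               # 7.6 points
```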

7. Domain-Specific Enhancements and Limitations

Recent nano-scale YOLOv8 derivatives specialize further for small object detection or real-time field deployment:

  • Small-Object Bias: Models such as HierLight-YOLO-N introduce an additional high-resolution detection head at P2 (stride 4) and apply hierarchical multi-scale feature fusion with IRDCB and LDown modules, yielding substantial (~2–2.5 point mAP@0.5) improvements on UAV and VisDrone small-target benchmarks (Chen et al., 26 Sep 2025).
  • Real-World Limitations: While Nano variants excel on resource-bound platforms, absolute recall and performance under dense occlusion, extreme class imbalance, or visually ambiguous scenes remain below those of the medium and large counterparts. Applications requiring ultra-high recall or ultrafine delineation may necessitate scaling up or domain-specific customization (Amin et al., 18 Dec 2025, Zhang, 6 Mar 2025).

There is a persistent trade-off between model size, computational efficiency, and average precision. Many nano-scale deployments leverage quantization, pruning, or fusion to compress models further while accepting small absolute losses in mAP.

