FasterSeg: Searching for Faster Real-time Semantic Segmentation (1912.10917v2)

Published 23 Dec 2019 in cs.CV and cs.LG

Abstract: We present FasterSeg, an automatically designed semantic segmentation network with not only state-of-the-art performance but also faster speed than current methods. Utilizing neural architecture search (NAS), FasterSeg is discovered from a novel and broader search space integrating multi-resolution branches, that has been recently found to be vital in manually designed segmentation models. To better calibrate the balance between the goals of high accuracy and low latency, we propose a decoupled and fine-grained latency regularization, that effectively overcomes our observed phenomenons that the searched networks are prone to "collapsing" to low-latency yet poor-accuracy models. Moreover, we seamlessly extend FasterSeg to a new collaborative search (co-searching) framework, simultaneously searching for a teacher and a student network in the same single run. The teacher-student distillation further boosts the student model's accuracy. Experiments on popular segmentation benchmarks demonstrate the competency of FasterSeg. For example, FasterSeg can run over 30% faster than the closest manually designed competitor on Cityscapes, while maintaining comparable accuracy.

Citations (185)

Summary

Overview of FasterSeg: Searching for Faster Real-time Semantic Segmentation

The paper presents FasterSeg, a semantic segmentation network discovered automatically with neural architecture search (NAS) that aims to deliver state-of-the-art accuracy at real-time speed. FasterSeg is searched within an expanded search space that integrates multi-resolution branches, a design element identified as critical in manually designed segmentation models. To manage the trade-off between high accuracy and low latency, the authors propose a decoupled, fine-grained latency regularization that mitigates the tendency of searched networks to collapse toward low-latency but low-accuracy architectures. The paper also introduces a collaborative search (co-searching) framework that discovers a teacher and a student network in a single run, with teacher-student distillation further improving the student's accuracy. Experiments on several segmentation benchmarks show that FasterSeg runs over 30% faster than the closest manually designed competitor while maintaining comparable accuracy.

Technical Contributions

The paper's significant contributions include:

  1. Multi-resolution Branching: FasterSeg's NAS search space lets the network search over and aggregate multi-resolution branches. This flexibility preserves high-resolution detail while aggregating the context needed for accurate real-time segmentation (a minimal fusion sketch follows this list).
  2. Latency Regularization: The paper introduces a decoupled, fine-grained latency regularization to counteract "architecture collapse," in which the search drifts toward low-latency but poor-accuracy architectures. Separate regularization strengths for different searchable factors, such as operators and expansion ratios, allow a better calibration between latency and accuracy (see the loss sketch after this list).
  3. Collaborative Teacher-Student Search: The framework extends seamlessly to teacher-student co-searching, discovering a complex teacher network and a lightweight student network in the same run; distilling knowledge from the teacher further improves the student's accuracy (a generic distillation-loss sketch follows this list).

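The multi-resolution aggregation in item 1 can be pictured with a minimal sketch. The module name, channel counts, and fusion-by-concatenation choice below are illustrative assumptions, not the cells actually searched in the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchFusion(nn.Module):
    """Fuse a high-resolution (detail) branch with a low-resolution
    (context) branch via upsampling and a 1x1 projection."""
    def __init__(self, ch_high: int, ch_low: int, ch_out: int):
        super().__init__()
        self.proj = nn.Conv2d(ch_high + ch_low, ch_out, kernel_size=1)

    def forward(self, feat_high: torch.Tensor, feat_low: torch.Tensor) -> torch.Tensor:
        # Bring the context branch up to the detail branch's spatial size,
        # then concatenate and project back to the desired channel count.
        feat_low = F.interpolate(feat_low, size=feat_high.shape[-2:],
                                 mode="bilinear", align_corners=False)
        return self.proj(torch.cat([feat_high, feat_low], dim=1))

# Example: a 1/8-resolution detail branch fused with a 1/32-resolution context branch.
fused = BranchFusion(64, 128, 64)(torch.randn(1, 64, 128, 256),
                                  torch.randn(1, 128, 32, 64))
```
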
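The decoupled latency term of item 2 can be sketched as follows, assuming a differentiable-NAS setup where expected latency is estimated from an offline lookup table. The table values, coefficient values, and function names are placeholders rather than the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

# Illustrative latency lookup tables (ms), measured offline on the target GPU.
OP_LATENCY = torch.tensor([0.8, 1.3, 2.1, 3.4])   # one entry per candidate operator
RATIO_LATENCY = torch.tensor([0.5, 1.0, 1.7])     # one entry per expansion ratio

def expected_latency(alpha_ops: torch.Tensor, alpha_ratios: torch.Tensor):
    """Differentiable latency estimate: architecture weights (softmax over the
    candidates in each searchable cell) times the measured per-candidate
    latencies, summed over all cells."""
    lat_ops = (F.softmax(alpha_ops, dim=-1) * OP_LATENCY).sum()
    lat_ratios = (F.softmax(alpha_ratios, dim=-1) * RATIO_LATENCY).sum()
    return lat_ops, lat_ratios

def search_loss(seg_loss, alpha_ops, alpha_ratios, lam_op=1e-2, lam_ratio=1e-3):
    """Decoupled regularization: operator and expansion-ratio latencies get
    separate coefficients, so no single fine-grained term can drag the search
    toward trivially fast but inaccurate architectures."""
    lat_ops, lat_ratios = expected_latency(alpha_ops, alpha_ratios)
    return seg_loss + lam_op * lat_ops + lam_ratio * lat_ratios
```
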
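The teacher-to-student transfer in item 3 can be illustrated with a generic pixel-wise distillation loss. The temperature, weighting, and ignore_index=255 (common for Cityscapes labels) are illustrative choices, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def student_loss(student_logits, teacher_logits, labels, T=1.0, beta=1.0):
    """Cross-entropy against the ground truth plus a KL term that pulls the
    student's soft per-pixel predictions toward the (frozen) teacher's."""
    ce = F.cross_entropy(student_logits, labels, ignore_index=255)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    p_teacher = F.softmax(teacher_logits.detach() / T, dim=1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
    return ce + beta * kl
```
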
Empirical Results and Implications

FasterSeg achieves high-speed inference on challenging benchmarks such as Cityscapes, CamVid, and BDD while remaining highly competitive in accuracy. The paper reports FasterSeg running at 163.9 FPS with 71.5% mIoU on the Cityscapes test set at full 1024x2048 resolution, a notable improvement over previous real-time designs. On CamVid and BDD it maintains high speed while achieving leading or comparable accuracy, suggesting robustness and versatility across different environments.

These empirical results underscore a potential shift toward automated network design for real-time applications. The approach balances computational efficiency with practical applicability in high-stakes settings such as autonomous driving, where both segmentation latency and precision are critical.

Future Directions

The research sets a precedent for further exploration of NAS combined with multi-resolution design in semantic segmentation. Future work might expand the search space to accommodate broader application constraints, extend the collaborative search to larger and more general network families, or refine the latency regularization to adapt dynamically to real-world deployment conditions. Additionally, optimizing NAS frameworks for specific hardware targets may yield further deployment benefits, bringing searched architectures closer to practical hardware implementations.

In summary, FasterSeg presents a compelling case study of leveraging NAS for semantic segmentation, motivating continued work on model efficiency and accuracy while opening pathways for principled automation of network architectures under real-time constraints.
