FastCaps: A Design Methodology for Accelerating Capsule Network on Field Programmable Gate Arrays (2509.03103v1)

Published 3 Sep 2025 in cs.AR

Abstract: Capsule Network (CapsNet) has shown significant improvement in understanding variation in images, along with better generalization ability, compared to the traditional Convolutional Neural Network (CNN). CapsNet preserves spatial relationships among extracted features and applies dynamic routing to efficiently learn the internal connections between capsules. However, due to the capsule structure and the complexity of the routing mechanism, it is non-trivial to accelerate CapsNet in its original form on a Field Programmable Gate Array (FPGA). Most existing works on CapsNet have achieved limited acceleration because they implement only the dynamic routing algorithm on FPGA, whereas considering all the processing steps synergistically is important for real-world applications of Capsule Networks. To this end, we propose a novel two-step approach that deploys a full-fledged CapsNet on FPGA. First, we prune the network using a novel Look-Ahead Kernel Pruning (LAKP) methodology that uses the sum of look-ahead scores of the model parameters. Next, we simplify the nonlinear operations, reorder loops, and parallelize operations of the routing algorithm to reduce CapsNet hardware complexity. To the best of our knowledge, this is the first work accelerating a full-fledged CapsNet on FPGA. Experimental results on the MNIST and F-MNIST datasets (standard in the Capsule Network community) show that the proposed LAKP approach achieves effective compression rates of 99.26% and 98.84%, and throughputs of 82 FPS and 48 FPS on the Xilinx PYNQ-Z1 FPGA, respectively. Furthermore, reducing the hardware complexity of the routing algorithm increases the throughput to 1351 FPS and 934 FPS, respectively. As corroborated by our results, this work enables highly performance-efficient deployment of CapsNets on low-cost FPGAs that are popular in modern edge devices.
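
For orientation, the routing loop that the abstract refers to can be sketched as follows. This is the standard routing-by-agreement formulation from the original CapsNet literature, not the paper's optimized FPGA design. The function names (`squash`, `softmax`, `dynamic_routing`), the nested-list layout of `u_hat`, and the default of three iterations are assumptions made for illustration.

```python
import math

def squash(s):
    """Squash nonlinearity: keep the direction of s, map its length into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0 for _ in s]
    # v = (|s|^2 / (1 + |s|^2)) * (s / |s|)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]

def softmax(logits):
    """Numerically stable softmax over a list of routing logits."""
    m = max(logits)
    exps = [math.exp(b - m) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, num_iters=3):
    """Route predictions u_hat[i][j] (input capsule i -> output capsule j).

    Returns one output vector per output capsule.
    Assumed layout: u_hat is a num_in x num_out x dim nested list.
    """
    num_in, num_out = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    b = [[0.0] * num_out for _ in range(num_in)]  # routing logits
    v = []
    for _ in range(num_iters):
        # Coupling coefficients: softmax over output capsules for each input.
        c = [softmax(row) for row in b]
        # Weighted sum of predictions, then squash.
        v = []
        for j in range(num_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Agreement update: dot product of each prediction with the output.
        for i in range(num_in):
            for j in range(num_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v
```

In this form, the exponential in `softmax` and the square root and division in `squash` are the nonlinear operations, and the nested loops over input capsules, output capsules, and vector dimensions are the ones a hardware design would reorder or parallelize. The abstract states that these are simplified and restructured but does not say how, so the sketch leaves them in their textbook form.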
