Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash 97 tok/s
Gemini 2.5 Pro 53 tok/s Pro
GPT-5 Medium 36 tok/s
GPT-5 High 34 tok/s Pro
GPT-4o 91 tok/s
GPT OSS 120B 462 tok/s Pro
Kimi K2 217 tok/s Pro
2000 character limit reached

LearningGroup: A Real-Time Sparse Training on FPGA via Learnable Weight Grouping for Multi-Agent Reinforcement Learning (2210.16624v1)

Published 29 Oct 2022 in cs.AR and cs.LG

Abstract: Multi-agent reinforcement learning (MARL) is a powerful technology to construct interactive artificial intelligent systems in various applications such as multi-robot control and self-driving cars. Unlike supervised model or single-agent reinforcement learning, which actively exploits network pruning, it is obscure that how pruning will work in multi-agent reinforcement learning with its cooperative and interactive characteristics. \par In this paper, we present a real-time sparse training acceleration system named LearningGroup, which adopts network pruning on the training of MARL for the first time with an algorithm/architecture co-design approach. We create sparsity using a weight grouping algorithm and propose on-chip sparse data encoding loop (OSEL) that enables fast encoding with efficient implementation. Based on the OSEL's encoding format, LearningGroup performs efficient weight compression and computation workload allocation to multiple cores, where each core handles multiple sparse rows of the weight matrix simultaneously with vector processing units. As a result, LearningGroup system minimizes the cycle time and memory footprint for sparse data generation up to 5.72x and 6.81x. Its FPGA accelerator shows 257.40-3629.48 GFLOPS throughput and 7.10-100.12 GFLOPS/W energy efficiency for various conditions in MARL, which are 7.13x higher and 12.43x more energy efficient than Nvidia Titan RTX GPU, thanks to the fully on-chip training and highly optimized dataflow/data format provided by FPGA. Most importantly, the accelerator shows speedup up to 12.52x for processing sparse data over the dense case, which is the highest among state-of-the-art sparse training accelerators.

Citations (2)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Ai Generate Text Spark Streamline Icon: https://streamlinehq.com

Paper Prompts

Sign up for free to create and run prompts on this paper using GPT-5.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube