
NetSmith: An Optimization Framework for Machine-Discovered Network Topologies (2404.02357v1)

Published 2 Apr 2024 in cs.AR and cs.DC

Abstract: Over the past few decades, network topology design for general-purpose, shared-memory multicores has been driven primarily by human experts, who use their insights to arrive at network designs that balance the competing goals of performance requirements (e.g., latency, bandwidth) and cost constraints (e.g., router radix, router counts). On the other hand, there have been automatic NoC synthesis methods for SoCs that optimize for application-specific communication and objectives such as resource usage or power. Unfortunately, these techniques do not lend themselves to the general-purpose context: applying them directly yields poor results, even worse than expert-designed networks. We design and develop an automatic network design methodology, NetSmith, to design networks for general-purpose, shared-memory multicores that comprehensively outperform expert-designed networks. We employ NetSmith in the context of interposer networks for chiplet-based systems, where there has been significant recent work on network topology design (e.g., Kite, Butter Donut, Double Butterfly). NetSmith-generated topologies achieve significantly higher throughput (50% to 75% higher) and lower average hop count (8% to 13.5% lower) than previous expert-designed and synthesized networks. Full-system simulations using PARSEC benchmarks demonstrate that the improved network performance translates to improved application performance, with up to 11% mean speedup over previous network-on-interposer (NoI) topologies.
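The average hop count that the abstract reports can be made concrete with a short sketch. The code below is not NetSmith's implementation; it only illustrates, under stated assumptions, how the metric might be computed for a candidate topology by running a breadth-first search from every router. The function names `average_hop_count` and `mesh`, and the 4x4 mesh used as input, are hypothetical and chosen purely for illustration.

```python
from collections import deque


def average_hop_count(adj):
    """Mean shortest-path length, in hops, over all ordered pairs of
    distinct routers. `adj` maps each router to a list of neighbours."""
    nodes = list(adj)
    total_hops = 0
    pair_count = 0
    for src in nodes:
        # Breadth-first search gives shortest hop distances from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(nodes):
            raise ValueError("topology is not connected")
        total_hops += sum(dist.values())
        pair_count += len(nodes) - 1
    return total_hops / pair_count


def mesh(rows, cols):
    """Build a rows x cols 2D mesh as an undirected adjacency list
    (an illustrative baseline, not a topology from the paper)."""
    adj = {(r, c): [] for r in range(rows) for c in range(cols)}
    for r, c in adj:
        for dr, dc in ((0, 1), (1, 0)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].append(nb)
                adj[nb].append((r, c))
    return adj


if __name__ == "__main__":
    baseline = mesh(4, 4)
    print(f"4x4 mesh average hop count: {average_hop_count(baseline):.3f}")

    # Adding one long link between opposite corners (a hypothetical
    # modification) can only shorten paths, so the average does not rise.
    baseline[(0, 0)].append((3, 3))
    baseline[(3, 3)].append((0, 0))
    print(f"with corner-to-corner link: {average_hop_count(baseline):.3f}")
```

Comparing the value for a candidate topology against a baseline in this way gives the kind of relative hop-count reduction quoted above. Throughput requires a traffic and routing model and is not captured by this sketch.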

