The Path to Fault- and Intrusion-Resilient Manycore Systems on a Chip (2307.01783v1)

Published 4 Jul 2023 in cs.CR, cs.AR, and cs.DC

Abstract: The hardware computing landscape is changing. What used to be distributed systems can now be found on a chip with highly configurable, diverse, specialized and general-purpose units. Such Systems-on-a-Chip (SoC) are used to control today's cyber-physical systems, being the building blocks of critical infrastructures. They are deployed in harsh environments and are connected to cyberspace, which exposes them to both accidental faults and targeted cyberattacks. This is in addition to the changing fault landscape that continued technology scaling, emerging devices and novel application scenarios will bring. In this paper, we discuss how the very features (distributed, parallelized, reconfigurable, heterogeneous) that cause many of the imminent and emerging security and resilience challenges also open avenues for their cure through SoC replication, diversity, rejuvenation, adaptation, and hybridization. We show how to leverage these techniques at different levels across the entire SoC hardware/software stack, calling for more research on the topic.
