On the Optimal Design of Triple Modular Redundancy Logic for SRAM-based FPGAs (0710.4688v1)

Published 25 Oct 2007 in cs.AR

Abstract: Triple Modular Redundancy (TMR) is a suitable fault tolerant technique for SRAM-based FPGA. However, one of the main challenges in achieving 100% robustness in designs protected by TMR running on programmable platforms is to prevent upsets in the routing from provoking undesirable connections between signals from distinct redundant logic parts, which can generate an error in the output. This paper investigates the optimal design of the TMR logic (e.g., by cleverly inserting voters) to ensure robustness. Four different versions of a TMR digital filter were analyzed by fault injection. Faults were randomly inserted straight into the bitstream of the FPGA. The experimental results presented in this paper demonstrate that the number and placement of voters in the TMR design can directly affect the fault tolerance, ranging from 4.03% to 0.98% the number of upsets in the routing able to cause an error in the TMR circuit.

Citations (254)


Summary

The paper investigates how optimizing the placement and number of majority voters in Triple Modular Redundancy (TMR) logic enhances fault tolerance in SRAM-based FPGAs against routing-induced Single Event Upsets (SEUs).
Fault injection experiments showed that a medium-scale partitioning strategy (TMR_p2) offered the best trade-off: only 0.98% of routing upsets caused an error, compared with 4.03% for the finest partitioning (TMR_p1).
Different TMR configurations result in unique trade-offs concerning hardware resource usage (area, bitstream size) and performance (clock frequency), which are critical factors for reliable FPGA designs.

Optimal Design of Triple Modular Redundancy Logic for SRAM-Based FPGAs

This paper addresses the optimization of Triple Modular Redundancy (TMR) in SRAM-based Field Programmable Gate Arrays (FPGAs). The primary focus is on how the number and placement of majority voters affect tolerance to Single Event Upsets (SEUs), specifically upsets in the FPGA routing.

SRAM-based FPGAs are susceptible to SEUs because their logic and routing are defined by configuration bits held in SRAM cells, so an upset can change the implemented circuit itself. TMR mitigates such faults by triplicating the logic and passing the three outputs through majority voters. In practice, however, achieving 100% robustness on a programmable platform remains difficult: an upset in the routing can create an undesirable connection between signals from distinct redundant parts. Such a fault can corrupt more than one redundant copy at once, so the voter may no longer outvote the error and the circuit output can be wrong.
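To make the voting argument concrete, the following is a minimal Python sketch, not taken from the paper. The function name `majority` and the 8-bit test word are illustrative assumptions. It shows a bitwise 2-of-3 voter masking an upset confined to one redundant copy, and failing when two copies are corrupted in the same bit, which is the kind of effect an unwanted connection between redundant signals could produce.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant words."""
    return (a & b) | (a & c) | (b & c)


golden = 0b1011_0010   # fault-free value (arbitrary, for illustration only)
upset = 0b0000_0100    # a single flipped bit

# Upset confined to one redundant copy: the voter masks it.
single_fault = majority(golden ^ upset, golden, golden)

# Same bit corrupted in two copies: the faulty copies outvote the good one.
double_fault = majority(golden ^ upset, golden ^ upset, golden)

print(single_fault == golden)  # True
print(double_fault == golden)  # False
```

The sketch models only the voter itself; it says nothing about how likely a given routing upset is to affect two copies on a real device.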

The paper investigates how different configurations of TMR logic influence robustness, using a digital filter as a case study. Four versions of the TMR filter were analyzed. They differ primarily in how the logic is partitioned, with the number of inserted majority voters determined by the extent of partitioning:

TMR_p1 involves maximum logic partitioning with frequent voter insertion after each combinational block.
TMR_p2 utilizes medium-scale partitioning.
TMR_p3 places voters solely at the final output of the design.
TMR_p3_nv excludes voter implementation in register sections.
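The effect of partitioning can be illustrated with a toy model, sketched below under stated assumptions. The stage function, the fault masks, and the names `run_tmr` and `voter_after` are hypothetical and are not the paper's filter or its voter placements. Three redundant copies flow through a chain of stages, and voters are inserted after a chosen subset of stages; the final output is always voted.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)


def stage(x: int) -> int:
    """Arbitrary 8-bit combinational block (illustrative only)."""
    return (3 * x + 1) & 0xFF


def golden(x: int, n_stages: int) -> int:
    """Fault-free reference output."""
    for _ in range(n_stages):
        x = stage(x)
    return x


def run_tmr(x: int, n_stages: int, voter_after: set, faults: dict) -> int:
    """Run three copies; faults maps (stage, copy) -> XOR mask on that output."""
    copies = [x, x, x]
    for s in range(n_stages):
        copies = [stage(v) ^ faults.get((s, d), 0) for d, v in enumerate(copies)]
        if s in voter_after:
            # Voters between partitions resynchronize all three copies.
            copies = [majority(*copies)] * 3
    return majority(*copies)


# Two upsets, in different copies and in different stages.
faults = {(0, 0): 0x01, (2, 1): 0x01}

fine = run_tmr(5, 3, {0, 1}, faults)       # voter after every stage
output_only = run_tmr(5, 3, set(), faults)  # voter only at the output

print(fine == golden(5, 3))         # True
print(output_only == golden(5, 3))  # False
```

In this model, intermediate voters stop faults in different copies from accumulating across partitions. It deliberately ignores routing upsets that connect signals of different copies, so it does not reproduce the ranking measured in the paper.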

The robustness of each version was measured by fault injection: faults were inserted at random directly into the FPGA configuration bitstream to emulate SEUs. The results show that the number and placement of voters directly affect robustness against routing upsets. In particular, TMR_p2 gave the best trade-off between area cost and fault tolerance, with only 0.98% of routing upsets causing an error. This is roughly a fourfold improvement over the finest partitioning, TMR_p1, in which about 4.03% of routing upsets caused an error.
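The shape of such a fault injection campaign can be sketched in Python as follows. This is a hedged illustration, not the authors' tool: `flip_bit`, `campaign`, and `toy_output` are hypothetical names, the "bitstream" is a plain byte string, and `toy_output` stands in for loading a configuration onto the device and comparing the filter output against a reference.

```python
import random


def flip_bit(bitstream: bytes, bit_index: int) -> bytes:
    """Return a copy of the bitstream with one configuration bit inverted."""
    corrupted = bytearray(bitstream)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def campaign(bitstream: bytes, output_of, n_injections: int, seed: int = 0) -> float:
    """Inject single random bit flips; return the percentage that change the output."""
    rng = random.Random(seed)
    reference = output_of(bitstream)
    errors = 0
    for _ in range(n_injections):
        bit = rng.randrange(len(bitstream) * 8)
        if output_of(flip_bit(bitstream, bit)) != reference:
            errors += 1
    return 100.0 * errors / n_injections


def toy_output(bitstream: bytes) -> int:
    """Hypothetical design under test: only the first byte affects the output."""
    return bitstream[0]


rate = campaign(bytes(64), toy_output, n_injections=2000)
print(f"{rate:.2f}% of injected upsets caused an error")

# Ratio between the two error rates quoted above.
print(f"{4.03 / 0.98:.2f}")  # 4.11
```

Each injection starts again from the unmodified bitstream, so every run contains exactly one upset; whether the original experiments were organized this way is an assumption of the sketch.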

The paper also examines the hardware cost of the different TMR configurations, reporting area, bitstream size, and performance (clock frequency). Each design version shows a different trade-off among logic usage, routing, and achievable clock frequency.

The implications of these findings are substantive for applications necessitating high reliability and resilience against SEUs in FPGAs, potentially influencing future design methodologies. By establishing an optimal logic partition, designers can significantly mitigate the probability of faults translating into system-level failures, thus contributing to robust FPGA-based systems.

Future work could analyze how these partitioning strategies affect power dissipation and explore other mitigation techniques that better protect against routing-induced vulnerabilities. Combining TMR logic partitioning with targeted FPGA floorplanning could also provide further resilience against SEUs.
