Description
Adopting modern commercial-grade SRAM-based FPGAs, which are unencumbered by export restrictions, lets space applications take advantage of their growing computing power and reduced cost, volume, mass and power consumption. These applications range from control tasks to signal processing and from software-defined radio to machine learning, in both “traditional space” and “new space”.
The use of SRAM enables new execution models, such as switching FPGA tasks during the mission through partial reconfiguration, and new applications, such as future-proof upgradeable modules that follow technical advances in algorithms and changes in the environment. However, even in the more relaxed new-space scenario, the lack of proper qualification, reliability evaluation and single-event upset (SEU) mitigation may jeopardize the mission objectives.
Testing FPGA systems in their operational environment in space is costly and slow. Fault injection is an alternative that can be applied from the earliest stages of engineering, which reduces the cost of correcting the design. It is a powerful tool for reliability and fault tolerance analysis, supporting rapid prototyping, reliability-aware design space exploration and selective hardening.
Emulation-based fault injection runs on the real FPGA hardware and reuses the existing circuitry for device test and configuration to emulate the radiation effects of the space environment. Our implementation uses Xilinx’s Internal Configuration Access Port (ICAP) to manipulate the FPGA configuration memory (CRAM), the contents of the memory blocks (BRAM) and the flip-flops of the configurable logic blocks (CLB). The emulation-based approach does not require complex facilities, has a low cost, allows the design to be tested near its nominal speed, and allows the test to be focused on selected design modules.
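As an illustration of the basic operation behind this kind of injection, the following minimal sketch models a read-modify-write bit flip on a frame-organized configuration memory. It is a software stand-in only, not the implementation described here: `ConfigMemoryModel`, `inject_bit_flip` and `random_target` are hypothetical names, no real ICAP access is performed, and the frame length of 101 32-bit words is the 7 Series value taken as an assumption for the model.

```python
import random

# Assumption for this model: a 7 Series configuration frame holds
# 101 words of 32 bits. Other device families use other frame lengths.
WORDS_PER_FRAME = 101
BITS_PER_WORD = 32


class ConfigMemoryModel:
    """Hypothetical software stand-in for CRAM, accessed one frame at a time."""

    def __init__(self, num_frames):
        self.frames = [[0] * WORDS_PER_FRAME for _ in range(num_frames)]

    def read_frame(self, frame_index):
        # Return a copy, as a readback through a configuration port would.
        return list(self.frames[frame_index])

    def write_frame(self, frame_index, words):
        if len(words) != WORDS_PER_FRAME:
            raise ValueError("a frame must be written whole")
        self.frames[frame_index] = list(words)


def inject_bit_flip(memory, frame_index, word_index, bit_index):
    """Emulate a single-bit upset: read the frame, flip one bit, write it back."""
    words = memory.read_frame(frame_index)
    words[word_index] ^= 1 << bit_index
    memory.write_frame(frame_index, words)


def random_target(num_frames, rng):
    """Pick a uniformly random (frame, word, bit) location to inject into."""
    return (
        rng.randrange(num_frames),
        rng.randrange(WORDS_PER_FRAME),
        rng.randrange(BITS_PER_WORD),
    )
```

Injecting the same location twice restores the original contents, which is how a campaign in this model would remove a fault before moving on to the next target.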
To support engineering effectively, fault injection must be cheap and fast, so that several alternative design solutions can be evaluated, and it must be consistent with radiation effects, so that it drives the engineering in the right direction. Moreover, as fault tolerance and mitigation techniques are introduced in the FPGA design, fault injection becomes increasingly complex and requires more knowledge about the device and its behavior under radiation.
We present results from laser cartography, heavy-ion micro-beam scanning, and static tests on Xilinx 7 Series FPGAs under heavy ions, fast neutrons and thermal neutrons. Laser cartography and micro-beam scanning were used to understand the organization of the FPGA memory, while static tests were used to derive statistical profiles of the occurrence of single- and multiple-bit SEUs (SBU, MBU) in memory. This information was used to enhance the fault injector and devise new fault injection methodologies: improving the consistency between fault injection and radiation tests, interoperating with memory scrubbing and other fault tolerance and mitigation techniques, increasing the fault injection speed to shorten fault injection campaigns, and extending the fault injector to the Xilinx UltraScale+ device family. These results are supported by case-study applications, including modules generated by high-level synthesis, softcore microprocessors and convolutional neural networks, using different fault mitigation techniques.
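One way a statistical SBU/MBU profile could drive an injector is sketched below: each injected event draws its size (number of upset bits) from the profile and then flips that many bits. This is only an illustration under stated assumptions. The probabilities in `EVENT_PROFILE` are placeholders, not measured results, the function names are hypothetical, and the upset bits are assumed to be consecutive within one frame, whereas real MBU shapes depend on the physical memory layout.

```python
import random

# PLACEHOLDER values for illustration only, not measured data:
# probability that one event upsets 1, 2, 3 or 4 bits.
EVENT_PROFILE = {1: 0.90, 2: 0.07, 3: 0.02, 4: 0.01}


def sample_event_size(profile, rng):
    """Draw the number of bits upset by one event, weighted by the profile."""
    sizes = list(profile)
    weights = [profile[size] for size in sizes]
    return rng.choices(sizes, weights=weights, k=1)[0]


def sample_event(profile, frame_bits, rng):
    """Return the bit offsets within one frame to flip for a single event.

    Assumption: the bits of a multiple-bit upset are consecutive in the
    frame. A real injector would map the event onto the physical layout.
    """
    size = sample_event_size(profile, rng)
    start = rng.randrange(frame_bits - size + 1)
    return [start + offset for offset in range(size)]
```

For example, `sample_event(EVENT_PROFILE, 3232, random.Random(0))` returns a list of one to four consecutive bit offsets inside a 3232-bit frame.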