Dr
Gerard Rauwerda
(Recore Systems)
**Architecture**
The multi-core DSP sub-system comprises the following key building blocks:
• The Xentium® is a programmable, high-performance DSP processor core that is efficient and offers high precision;
• Network-on-Chip (NoC) technology provides the bandwidth, flexibility and predictability that are required for interconnecting DSP cores and I/O interfaces in streaming DSP applications.
The presented multi-core DSP sub-system consists of programmable fixed-point (and floating-point in the future) DSP cores that are connected by a NoC. After initialization by the host processor, the multi-core DSP sub-system will autonomously run compute-intensive DSP functions.
**Network-on-Chip**
The NoC provides the bandwidth and flexibility required for streaming DSP applications. The communication bandwidth in a NoC scales with the number of cores. In conventional bus architectures, additional processors share the original bandwidth and eventually create a bottleneck. A NoC ensures predictable performance through its point-to-point connections, in contrast to the unpredictability of a shared bus. Moreover, NoCs allow inactive parts of the network to be disabled, which is essential for energy efficiency and dependability.
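The scaling argument can be illustrated with a deliberately idealized model. The sketch below is not taken from the presented architecture: the function names and the bandwidth figures in the usage note are hypothetical, arbitration and routing overhead are ignored, and the NoC is assumed to add one link of fixed capacity per core.

```c
#include <assert.h>

/* Shared bus (idealized): all cores divide one fixed bandwidth,
 * so the share per core shrinks as cores are added. */
unsigned bus_bandwidth_per_core(unsigned bus_mbps, unsigned cores)
{
    assert(cores > 0);
    return bus_mbps / cores;
}

/* NoC (idealized assumption): every core brings its own link,
 * so the aggregate capacity grows with the number of cores. */
unsigned noc_aggregate_bandwidth(unsigned link_mbps, unsigned cores)
{
    return link_mbps * cores;
}
```

With a hypothetical 1600 Mbit/s bus, going from 4 to 8 cores halves the per-core share from 400 to 200 Mbit/s, whereas under the stated assumption the aggregate NoC capacity doubles.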
Transparent I/O interfaces make it possible to extend the NoC across chip boundaries. Several I/O interfaces are available on the multi-core DSP architecture, such as SpaceWire bridge interfaces and bridges to external Analog-to-Digital Converter (ADC) and Digital-to-Analog Converter (DAC) devices. All NoC interfaces employ memory-mapped communication.
**Xentium DSP**
The Xentium is a programmable high-performance 32/40-bit fixed-point DSP core for inclusion in multi-core systems-on-chip. High performance is achieved by exploiting instruction-level parallelism using parallel execution slots. The Very Long Instruction Word (VLIW) architecture of the Xentium features 10 parallel execution slots and includes support for Single Instruction Multiple Data (SIMD) and zero-overhead loops. The Xentium is designed to meet the following objectives: high performance, an optimized energy profile, easy programmability and memory-mapped I/O.
**Xentium DSP – Datapath**
The Xentium datapath contains parallel execution units and register files. The different execution units can all perform 32-bit scalar and vector operations. For vector operations, the operands are interpreted as 2-element vectors whose elements are the low and high half-word (16-bit) parts of a 32-bit word. In addition, several units can perform 40-bit scalar operations for improved accuracy. All operations can be executed conditionally. The Xentium datapath provides powerful processing performance: four 16-bit MACs, two 32-bit MACs or two 16-bit complex MACs per processor clock cycle.
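The 2-element vector interpretation can be modelled in portable C. The following is a minimal sketch and not Xentium code: all names (`pack2x16`, `lo16`, `hi16`, `wrap40`, `mac2x16`, `acc2_t`) are hypothetical, and the wrap-around behaviour of the 40-bit accumulator as well as its use for 16-bit products are assumptions, since overflow handling is not specified here.

```c
#include <stdint.h>

/* Pack two signed 16-bit elements into one 32-bit word (low, high half-word). */
uint32_t pack2x16(int16_t lo, int16_t hi)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* Sign-extend a 16-bit field without relying on implementation-defined casts. */
static int16_t sext16(uint32_t field)
{
    int32_t v = (int32_t)(field & 0xFFFFu);
    return (int16_t)(v >= 0x8000 ? v - 0x10000 : v);
}

int16_t lo16(uint32_t w) { return sext16(w); }
int16_t hi16(uint32_t w) { return sext16(w >> 16); }

/* Reduce a value to a signed 40-bit range.
 * ASSUMPTION: wrap-around on overflow; the real hardware may saturate. */
int64_t wrap40(int64_t x)
{
    uint64_t u = (uint64_t)x & 0xFFFFFFFFFFull;        /* keep low 40 bits */
    if (u & 0x8000000000ull)                           /* bit 39 = sign    */
        return (int64_t)u - (int64_t)0x10000000000ll;  /* subtract 2^40    */
    return (int64_t)u;
}

/* One accumulator per vector element, each modelled as a 40-bit value. */
typedef struct {
    int64_t lo;
    int64_t hi;
} acc2_t;

/* Element-wise multiply-accumulate on two packed 2x16-bit operands. */
void mac2x16(acc2_t *acc, uint32_t a, uint32_t b)
{
    acc->lo = wrap40(acc->lo + (int32_t)lo16(a) * (int32_t)lo16(b));
    acc->hi = wrap40(acc->hi + (int32_t)hi16(a) * (int32_t)hi16(b));
}
```

In this model, a single call to `mac2x16` performs two 16-bit MACs on one pair of 32-bit words. A product of two 16-bit values can reach a magnitude of 2^30, so a 32-bit accumulator can overflow after only two worst-case products, while a 40-bit accumulator leaves eight additional bits of headroom.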
Currently, the fixed-point Xentium datapath is being upgraded to support floating-point operations as well.
**Xentium DSP – Tightly-coupled Data Memory**
Private local memory is available at the Xentium DSP. The tightly-coupled data memory is organized in parallel memory banks to allow simultaneous access by different resources. The data memory can be accessed simultaneously by the Xentium core and by other cores connected through the NoC. By default, the data memory in the Xentium tile is organized in 4 banks of 4 kBytes each, implemented using SRAM cells. The size of the memory banks is parametrizable at design time.
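The benefit of banking can be sketched with the default configuration of 4 banks of 4 kBytes. The mapping below is an assumption made for illustration: it places the banks contiguously in the address space, whereas the actual address-to-bank mapping of the Xentium tile is not described here. The names `bank_of` and `same_bank` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    NUM_BANKS  = 4,     /* default number of banks           */
    BANK_BYTES = 4096   /* default bank size: 4 kBytes each  */
};

/* Bank index for a byte offset into the data memory.
 * ASSUMPTION: banks are laid out contiguously, one after another. */
unsigned bank_of(uint32_t offset)
{
    return (unsigned)((offset / BANK_BYTES) % NUM_BANKS);
}

/* Two accesses can only proceed in parallel if they target different banks. */
bool same_bank(uint32_t offset_a, uint32_t offset_b)
{
    return bank_of(offset_a) == bank_of(offset_b);
}
```

Under this assumed layout, a buffer that the core is processing at offset 0 and a buffer that another core fills over the NoC at offset 8192 reside in different banks, so both can be accessed in the same cycle.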
**Software Development and Debugging**
The software development for the Xentium is supported by a C compiler, an assembler, a linker, a simulator, a debugger, and a number of utilities. The compiler translates C source code into Xentium assembly language source code.
To ease software development on the multi-core DSP architecture, the architecture has been equipped with a multi-core DSP debug infrastructure. The Xentium DSP cores have integrated hardware debug support to intrusively debug all registers in the Xentium datapath. In addition, a cross-trigger unit allows multiple Xentium cores to be debugged in parallel. The debug infrastructure interfaces with standard GDB debug tools.
**Summary**
Next-generation digital signal processors for space applications have to be programmable, high-performance and low-power. Reconfigurability of digital signal processors for spacecraft, such as instantly changing the payload processing functionality while the spacecraft is operational, is becoming important. As an example, the functionality of a spacecraft can be updated in space to increase its operational lifetime.
We present a multi-core DSP architecture for streaming digital signal processing in on-board payload data processing (OBPDP) applications. In the Massively Parallel Processor Breadboarding (MPPB) study and in the Scalable Sensor Data Processor (SSDP), the Network-on-Chip (NoC) based multi-core DSP sub-system is integrated with a conventional general-purpose processor (LEONx) sub-system. Generally, the LEONx sub-system acts as the host processor, initializing and controlling the multi-core DSP sub-system.
Dr
Gerard Rauwerda
(Recore Systems)
Mr
Jordy Potman
(Recore Systems)
Dr
Kim Sunesen
(Recore Systems)
Dr
Tom Bruintjes
(Recore Systems)
Dr
Tung Hoang Thanh
(Recore Systems)