

#### **Dyplo in Space**

Mike Looijmans




Embedded in your future



- Introduction
- What is Dyplo?
- Status of space-enabled Dyplo
  - Radiation hardened FPGA support
  - RTEMS support
- Questions





#### System Expert at TOPIC Embedded Products



## Topic today

- Real Embedded company; 170 employees
  - 125+ embedded software developers
  - 25+ FPGA designers
  - 10+ board designers
- Founded in 1996, privately owned
- 3 Business units:
  - Since 1996: Consultancy: the Netherlands
  - Since 2006: Project execution: Europe and North America
  - Since 2014: Product development and sales: worldwide



ALLIANCE PROGRAM PREMIER MEMBER 1 of 10 worldwide











- Dynamic loading of processes on the FPGA using partial reconfiguration
  - And static assignment
- Infrastructure for streaming data between these processes
  - With run-time routing
- Interfaces for streaming data between FPGA logic and CPU programs
  - latency, throughput, resource usage







- Use the FPGA as accelerator
  - Offload work to the FPGA
- Acquire data from external sources
  - Real-time
- Transfer data to external targets
  - Output and control





#### What is Dyplo






- Create Dyplo-enabled project
  - Or add Dyplo to existing project
- Configure Dyplo
  - Resources, streams, nodes, bandwidth
- Create "static" design
  - Connect I/O nodes to external components
- Add node functionality
  - C/C++ code or VHDL/Verilog
- More changes







## Dyplo – on Zynq series

#### Carrier board: Topic Florida-GEN







#### Dyplo – on Kintex series - 1













#### Dyplo – on Virtex 5 (ML506)

















- Available for Linux, Windows and RTEMS
- Driver exposes the Dyplo infrastructure through a POSIX file interface
- Directly usable from the command shell
  - Example: stream the content of a file directly to a data stream on the backplane: `cat file > /dev/dyplow0`
- C++ API available for applications
  - APIs for other languages can be made (using ioctl)
- Open source (GitHub)
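
Because the driver presents streams as files, the shell example above can in principle be reproduced with ordinary file I/O from any language. The following is a minimal sketch in Python, not the Dyplo C++ API. The function name `stream_file_to_node` and the chunk size are illustrative assumptions; only the device path `/dev/dyplow0` comes from the slide.

```python
def stream_file_to_node(src_path, dev_path="/dev/dyplow0", chunk_size=64 * 1024):
    """Copy src_path into dev_path with plain file I/O; return bytes written.

    Hypothetical helper: it only assumes that the target accepts ordinary
    write() calls, as the shell example `cat file > /dev/dyplow0` suggests.
    """
    total = 0
    # buffering=0 gives unbuffered writes, so data is handed to the driver
    # as soon as each chunk has been read.
    with open(src_path, "rb") as src, open(dev_path, "wb", buffering=0) as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            view = memoryview(chunk)
            # An unbuffered write may accept fewer bytes than offered,
            # so keep writing until the whole chunk has been consumed.
            while view:
                written = dst.write(view)
                view = view[written:]
                total += written
    return total
```

Calling `stream_file_to_node("file")` on a system with the driver loaded would be the equivalent of the `cat` command. On a machine without Dyplo hardware, any writable path can be passed as `dev_path` to try the function out.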







- Make Dyplo IP compatible with radiation hardened Xilinx Virtex 5QV
- Add RTEMS support
- Demonstrate
  - Running demo
  - LEON softcore
  - Image processing
  - SpaceWire







#### Xilinx Evaluation board: ML506






- Dyplo driver (and C++ lib) for RTEMS
- Partial reconfiguration on Virtex 5 with Dyplo concept
- But: ISE 13.2 is required for the Virtex 5QV
  - No longer supported by Xilinx
  - Immature with respect to partial reconfiguration
  - Not all Dyplo functionality works
    - Differences between Windows and Linux
    - Combining functionality in fixed and partial nodes







- Creating a SoC with LEON2-FT uses >15% of the logic
- Wrapper overhead
  - 400 LUTs / 1 block RAM is relatively high on our smaller test board (1/3 of the logic of the QV)







#### Working with ESA on Dyplo

- Dyplo driver for RTEMS available mid-2017
- Dyplo support for Virtex 5QV (radiation hardened) available mid-2017
- Ideas for Dyplo FT (Fault Tolerant)



## Dyplo performance

- Dyplo infrastructure
  - Transports data between nodes
  - "Backplane"
- Node <-> Dyplo backplane bandwidth
  - Data from backplane to node and vice versa
- CPU <-> Dyplo backplane bandwidth
  - From CPU/DMA node to CPU (or memory)



## Dyplo performance - infra

- Dyplo infrastructure bandwidth
- Mostly determined by clock
  - 100 MHz on low-end fabric (e.g. Artix)
  - Can be over 200 MHz (e.g. Kintex)
- One 32-bit word per clock per lane
  - 1..4 lanes
  - Lowest: 400 Mbyte/s (3.2 Gbit/s) for Zynq 7xxx at 100 MHz
  - Highest: 1600 Mbyte/s (12.8 Gbit/s) for Zynq 7xxx at 100 MHz
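
The lowest and highest figures follow from the rule of one 32-bit word per clock per lane: 1 lane versus 4 lanes at the same 100 MHz clock. A small sketch of that arithmetic is shown here; the function name `backplane_bandwidth` is an illustrative assumption and not part of any Dyplo tool.

```python
def backplane_bandwidth(clock_hz, lanes, word_bytes=4):
    """Return (Mbyte/s, Gbit/s) for a backplane moving one word per clock per lane."""
    if not 1 <= lanes <= 4:
        raise ValueError("the slides describe 1..4 lanes")
    bytes_per_second = clock_hz * word_bytes * lanes
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e9


if __name__ == "__main__":
    for lanes in (1, 4):
        mbyte, gbit = backplane_bandwidth(100_000_000, lanes)
        print(f"{lanes} lane(s) at 100 MHz: {mbyte:.0f} Mbyte/s ({gbit:.1f} Gbit/s)")
```

The same formula with a single lane at 200 MHz gives 800 Mbyte/s (6.4 Gbit/s).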





## Dyplo performance - node

- Node <-> Dyplo backplane bandwidth
- Clock speed
  - Same as backplane clock
- All streams share the bandwidth
  - 400 Mbyte/s (3.2 Gbit/s), e.g. Zynq 7015 at 100 MHz
  - 800 Mbyte/s (6.4 Gbit/s), e.g. Zynq 7030 at 200 MHz
- Input and output simultaneous



## Dyplo performance - CPU

- CPU <-> Dyplo backplane bandwidth
- Using memory mapped FIFO
  - On Zynq 7xxx: 25 MB/s (0.2 Gbit/s)
  - On PCIe card: 3 MB/s read, 20 MB/s write (due to bus latency)
- Using DMA
  - On Zynq 7xxx: 600 MB/s (limited by HP port)
    - 1200 MB/s through ACP when targeting L2 or L1 cache
  - On PCIe: 800 MB/s (limited by design)
    - Theoretical: 2.5 GB/s for 4-lane gen2
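
To give a feel for what these rates mean in practice, the sketch below estimates how long a transfer of a given size would take over each path. The rates are the figures listed above; the 100 MB payload, the dictionary `RATES_MB_S` and the function `transfer_time_s` are illustrative assumptions, and real transfers will also depend on setup overhead that is not modelled here.

```python
# Sustained rates in MB/s, taken from the bullets above.
RATES_MB_S = {
    "Zynq 7xxx, memory-mapped FIFO": 25,
    "PCIe card, memory-mapped FIFO read": 3,
    "PCIe card, memory-mapped FIFO write": 20,
    "Zynq 7xxx, DMA via HP port": 600,
    "Zynq 7xxx, DMA via ACP": 1200,
    "PCIe, DMA": 800,
}


def transfer_time_s(size_mb, rate_mb_s):
    """Idealised transfer time in seconds: size divided by sustained rate."""
    if rate_mb_s <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_s


if __name__ == "__main__":
    payload_mb = 100  # hypothetical payload size
    for path, rate in RATES_MB_S.items():
        print(f"{path}: {transfer_time_s(payload_mb, rate):.2f} s for {payload_mb} MB")
```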







- FPGA resource usage depends on user configuration
- CPU interface
  - ~4000 registers (157200 available on XC7Z030, ≤2.6%)
  - ~2500 LUT6 (78600 available on XC7Z030, ≤3.1%)
  - 16 BRAM36 (265 available on XC7Z030, ≤6.3%)
- Node wrappers
  - ~75-300 registers per node (≤0.2%)
  - ~50-200 LUT6 per node (≤0.3%)
  - 1 BRAM36 per input stream per node (≤0.4%)
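
Since usage depends on the configuration, a rough budget for a given number of nodes can be sketched from the figures above. This is an illustrative upper-bound estimate only: it uses the approximate counts from the slide (so computed percentages may differ slightly from the rounded ones shown), takes the high end of each node wrapper range, and excludes the logic inside the nodes themselves. The names `estimate_usage`, `CPU_INTERFACE` and `NODE_WRAPPER_MAX` are assumptions made for this example.

```python
# Approximate figures from the slide, for an XC7Z030.
AVAILABLE = {"registers": 157200, "LUT6": 78600, "BRAM36": 265}
CPU_INTERFACE = {"registers": 4000, "LUT6": 2500, "BRAM36": 16}
NODE_WRAPPER_MAX = {"registers": 300, "LUT6": 200}  # high end of each range


def estimate_usage(nodes, input_streams_per_node=1):
    """Return {resource: (count, percent of device)} for the Dyplo infrastructure.

    Counts only the CPU interface and node wrappers, not user logic.
    """
    used = {
        "registers": CPU_INTERFACE["registers"] + nodes * NODE_WRAPPER_MAX["registers"],
        "LUT6": CPU_INTERFACE["LUT6"] + nodes * NODE_WRAPPER_MAX["LUT6"],
        # One BRAM36 per input stream of every node.
        "BRAM36": CPU_INTERFACE["BRAM36"] + nodes * input_streams_per_node,
    }
    return {name: (count, 100.0 * count / AVAILABLE[name]) for name, count in used.items()}


if __name__ == "__main__":
    for name, (count, percent) in estimate_usage(nodes=8).items():
        print(f"{name}: {count} ({percent:.1f}% of XC7Z030)")
```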











