Accelerating Millions of Short Reads Mapping on a Heterogeneous Architecture with FPGA Accelerator

Wen TANG, Wendi WANG, Bo DUAN, Chunming ZHANG, Guangming TAN, Peiheng ZHANG, Ninghui SUN
High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences


Abstract

The explosion of Next Generation Sequencing (NGS) data, with over one billion reads per run, poses a great challenge to the capability of current computing systems. Recent advances in FPGA technology have brought a resurgence of research on specialized system designs that leverage FPGAs to accelerate large-scale scientific applications. In this paper, we propose a CPU-FPGA heterogeneous architecture for accelerating a short-read mapping algorithm built on a hash index. To overcome the abundant irregular hash table accesses and the huge memory footprint of the original algorithm, we propose several optimizations that reorder hash table accesses and compress empty hash buckets. In particular, by extracting the most time-consuming basic operations and mapping them to specialized processing elements (PEs), our new algorithm becomes well suited to efficient acceleration on FPGAs. The proposed architecture is implemented and evaluated on a customized FPGA accelerator card carrying a Xilinx Virtex LX330 FPGA. Limited by the available data transfer bandwidth, our current accelerator, which operates at 175 MHz, integrates up to 100 PEs. Compared to a six-core Intel CPU, our accelerator achieves speedups ranging from 22.2 to 42.9 times.
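The two software-level optimizations named above, compressing empty hash buckets and reordering hash table accesses, can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not model the FPGA processing elements. The function names, the 2-bit base encoding, the non-overlapping seed scheme, and the sorted-key layout are all assumptions made for illustration.

```python
from collections import defaultdict

BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding


def encode_kmer(kmer):
    """Pack a k-mer into an integer bucket id, 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | BASE_CODE[base]
    return value


def build_compressed_index(reference, k):
    """Index every k-mer of the reference, storing only non-empty buckets.

    Returns (keys, offsets, positions): keys holds the sorted non-empty
    bucket ids, and positions[offsets[i]:offsets[i + 1]] holds the
    reference positions of keys[i].
    """
    buckets = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        buckets[encode_kmer(reference[pos:pos + k])].append(pos)
    keys = sorted(buckets)
    offsets, positions = [0], []
    for key in keys:
        positions.extend(buckets[key])
        offsets.append(len(positions))
    return keys, offsets, positions


def lookup_reordered(index, reads, k):
    """Look up the seeds of all reads in ascending bucket order.

    Sorting the seeds first turns scattered table probes into a single
    forward sweep over the index (hypothetical reordering scheme).
    """
    keys, offsets, positions = index
    seeds = []
    for read_id, read in enumerate(reads):
        for off in range(0, len(read) - k + 1, k):  # non-overlapping seeds
            seeds.append((encode_kmer(read[off:off + k]), read_id, off))
    seeds.sort()

    hits = defaultdict(list)
    slot = 0
    for key, read_id, off in seeds:
        while slot < len(keys) and keys[slot] < key:
            slot += 1
        if slot < len(keys) and keys[slot] == key:
            for pos in positions[offsets[slot]:offsets[slot + 1]]:
                hits[read_id].append(pos - off)  # candidate read start
    return dict(hits)


if __name__ == "__main__":
    index = build_compressed_index("ACGTACGTTTGA", 4)
    print(lookup_reordered(index, ["ACGTTTGA", "GGGG"], 4))
```

With k = 4 there are 256 possible buckets, but the toy reference fills only 8 of them, so the compressed layout stores 8 keys instead of 256 mostly empty slots. A candidate start that several seeds of the same read agree on (here, position 4 for the first read) is the natural one to pass on to a verification step.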