Impulse Accelerated Technologies and Pico Computing teamed up to speed up a common Genome Analysis Toolkit (GATK) workflow by 16x compared with the original software-only implementation. GATK, which is used in bioinformatics research, is a software library for building genomic sequencing workflows. It is also a collection of tools used for translational medicine in projects such as The Cancer Genome Atlas (TCGA).
Working from GATK code originally developed in Java at the Broad Institute of MIT and Harvard, the Impulse/Pico team parallelized the framework and improved its performance. The team built the proof of concept on an inexpensive Pico Computing M-501 FPGA-based acceleration module. The M-501 is a small board, about the size of a business card, that connects over x8 Gen2 PCIe and uses a Xilinx Virtex-6 FPGA. The accelerator provides millions of available gates.
Using this architecture, the GATK framework was parallelized to run as twelve individual streaming processes. The improvement in parallel throughput was achieved within the Impulse C toolset, and the average number of clock cycles per base was reduced to 25. The optimized C code was compiled with the Impulse C tools into synthesizable VHDL, which was then exported to Xilinx ISE and mapped to the Virtex-6 for place and route.
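To put the cycles-per-base figure in context, the following minimal Python sketch converts it into a rough throughput estimate. It rests on assumptions that are not in the article: the function name bases_per_second is hypothetical, the 100 MHz clock is an illustrative placeholder rather than the actual design clock, and the 25-cycle figure is treated as applying to each of the twelve streaming processes. If the figure is already an aggregate average, pass num_processes=1 instead.

```python
def bases_per_second(clock_hz, cycles_per_base, num_processes=1):
    """Estimate throughput in bases per second.

    Assumes every process runs at the same clock and spends the same
    average number of cycles on each base (a simplification).
    """
    if clock_hz <= 0 or cycles_per_base <= 0 or num_processes <= 0:
        raise ValueError("all arguments must be positive")
    return num_processes * clock_hz / cycles_per_base


# Illustrative only: the 100 MHz clock is an assumed value, not a
# measured figure from the Impulse/Pico design.
estimate = bases_per_second(clock_hz=100_000_000,
                            cycles_per_base=25,
                            num_processes=12)
print(f"{estimate:,.0f} bases per second (under the stated assumptions)")
```

The estimate scales linearly with both the clock rate and the number of processes, so it is best read as an upper bound that ignores PCIe transfer and memory overheads.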
An Impulse C model was developed for the GATK framework, together with an example application based on the coverage sample included in the GATK. The coverage sample classifies the data available at each location as callable, poor quality, low coverage, or no coverage. The Impulse C model was implemented on the M-501 PCIe board from Pico Computing. This board includes a Virtex-6 LX240T with 512 MB of on-board memory.
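The four-way classification performed by the coverage sample can be illustrated with a short Python sketch. This is not the GATK or Impulse C implementation: the function name classify_locus, the thresholds (min_depth and min_quality), and the use of per-read quality scores as input are all assumptions chosen for illustration.

```python
def classify_locus(qualities, min_depth=4, min_quality=20):
    """Classify one genomic location from the quality scores of the
    reads that cover it.

    qualities   -- one quality score per read covering the location
    min_depth   -- hypothetical minimum number of reads required
    min_quality -- hypothetical minimum score for a read to count as good
    """
    depth = len(qualities)
    if depth == 0:
        return "no coverage"
    if depth < min_depth:
        return "low coverage"
    # Enough reads overall; check how many are of usable quality.
    good = sum(1 for q in qualities if q >= min_quality)
    if good < min_depth:
        return "poor quality"
    return "callable"


# Each inner list holds the (made-up) read qualities at one location.
loci = [[], [30, 31], [5, 6, 7, 8, 30], [30, 32, 35, 40]]
for position, quals in enumerate(loci):
    print(position, classify_locus(quals))
```

Because each location is classified independently of its neighbours, work of this kind splits naturally across parallel streaming processes.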