Imec announces whole genome data analysis available in just a few hours
imec, a world-leading research and innovation hub in nanoelectronics and digital technologies, announced elPrep5, the newest version of its software platform for DNA analysis. While producing identical results, elPrep5 is eight to 16 times faster than the Genome Analysis Toolkit (GATK), the widely accepted reference standard. The imec platform covers the full analysis pipeline, from data preparation to variant calling, on similar hardware infrastructure, opening new opportunities and efficiency gains for hospitals and medical practitioners.
“This is the breakthrough we have been anticipating for years. Finally, we can run the entire DNA analysis pipeline with a single software platform solution, and faster than ever,” said imec researcher Dr. Charlotte Herzeel. “Because variant calling is the most complex step, gathering results up to 16 times faster than the previous method has resulted in a four- to nine-fold reduction in time, all while retaining GATK4-identical results. For the medical sector, this allows massive efficiency gains because the time between sampling and diagnosis dramatically decreases and doctors can run analyses overnight. Moreover, since many hospitals run their analyses via rented cloud solutions, the reduced throughput times can immediately result in a cost reduction per analysis.”
Performing this analysis is a computationally heavy challenge. Despite substantial cost reductions for DNA analysis over the past decade, runtimes of up to two to three days for a whole genome still left room for improvement. Now, imec’s elPrep5 can perform a whole genome analysis within a few hours without compromising the quality of the output. Extensive validations show outputs completely identical to those of its industry counterparts GATK, SAMtools and Picard.
By taking advantage of its parallel execution framework, elPrep5 performs the complete analysis in a single pass through the data. This architecture avoids repeatedly reading and writing fragments of data in and out of memory. elPrep5 is written in Go, an open-source programming language developed by Google, and can be run on standard servers that most hospitals have locally or in the cloud. elPrep5 extends and improves the functionality and performance of elPrep4 by including variant calling as the final step, so that it encompasses the whole DNA analysis pipeline, and by realizing additional efficiency gains in the process.
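The single-pass, parallel idea described above can be illustrated with a minimal Go sketch. This is not elPrep5's code or API: the Read type, the two placeholder steps, and the SinglePass function are hypothetical names invented for illustration, and real reads carry far more information. The sketch only shows the general pattern of applying all steps to each record in memory, across parallel workers, without writing intermediate results between steps.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Read is a hypothetical, heavily simplified stand-in for one aligned read.
type Read struct {
	Name     string
	Position int
	Mapped   bool
}

// Step is one in-memory transformation; returning false drops the read.
type Step func(r *Read) bool

// Placeholder steps for illustration only (not real elPrep5 filters).
func dropUnmapped(r *Read) bool  { return r.Mapped }
func upperCaseName(r *Read) bool { r.Name = strings.ToUpper(r.Name); return true }

// SinglePass applies every step to each read in one traversal of the input.
// The input is split into chunks that are processed by parallel workers, and
// no intermediate result is written out between steps.
func SinglePass(reads []Read, workers int, steps ...Step) []Read {
	if workers < 1 {
		workers = 1
	}
	out := make([]Read, len(reads))
	keep := make([]bool, len(reads))
	chunk := (len(reads) + workers - 1) / workers

	var wg sync.WaitGroup
	for start := 0; start < len(reads); start += chunk {
		end := start + chunk
		if end > len(reads) {
			end = len(reads)
		}
		wg.Add(1)
		go func(start, end int) {
			defer wg.Done()
			for i := start; i < end; i++ {
				r := reads[i] // work on a copy; the input stays untouched
				ok := true
				for _, step := range steps {
					if ok = step(&r); !ok {
						break
					}
				}
				// Each worker writes only to its own index range.
				out[i], keep[i] = r, ok
			}
		}(start, end)
	}
	wg.Wait()

	// Collect surviving reads in their original order.
	result := make([]Read, 0, len(reads))
	for i, k := range keep {
		if k {
			result = append(result, out[i])
		}
	}
	return result
}

func main() {
	reads := []Read{
		{Name: "r1", Position: 100, Mapped: true},
		{Name: "r2", Position: 250, Mapped: false},
		{Name: "r3", Position: 400, Mapped: true},
	}
	for _, r := range SinglePass(reads, 2, dropUnmapped, upperCaseName) {
		fmt.Println(r.Name, r.Position)
	}
}
```

A multi-pass design would instead run each step as a separate tool and store the full intermediate data set between steps; the sketch chains the steps per record so the data is traversed only once.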
elPrep5 targets users in the pharmaceutical industry, scientific research, medical laboratories, sequencing service providers, sequencing vendors and hospitals. The speedups brought by elPrep5 enable these industries to move from research runs into clinical practice and further scale their operations. Several industrial partners have already expressed interest in integrating elPrep5 into their daily operations.