IJBBB 2016 Vol.6(2): 59-67 ISSN: 2010-3638
doi: 10.17706/ijbbb.2016.6.2.59-67
Disk Partition Techniques Assesment and Analysis Applied to Genomic Assemblers Based on Bruijn Graphs
Nelson Enrique Vera-Parra, Ruben Javier Medina-Daza, Cristian Alejandro Rojas-Quintero
Abstract—This paper assesses several de novo genome assemblers based on de Bruijn graphs in order to measure the impact of disk partitioning techniques on computational requirements, and to give bioinformatics researchers a framework for identifying the advantages, disadvantages, bottlenecks and challenges of assemblers that use those techniques.
The assessed assemblers that use disk partitioning were Minia and EPGA; those that do not were ABySS and SOAPdenovo2. The parameters measured were RAM usage, processing time, parallelization, and disk read and write access. The dataset consisted of 36,504,800 short reads from human chromosome 14. The assessment was run for two k-mer sizes: 31 and 55. The results were as follows: the tools based on disk partitioning used the least RAM, had the highest I/O transfer intensity, and achieved the most parallelization.
Index Terms—Assemblers, assembly, bioinformatics, kmer count, minimizers.
Nelson Enrique Vera-Parra and Cristian Alejandro Rojas-Quintero are with GICOGE Research Group, Distrital University Francisco José de Caldas, Carrera 7 No. 40B – 53, Bogotá D.C., Colombia (e-mail: nelsonenriquevera@gmail.com).
Ruben Javier Medina-Daza is with NIDE-GEFEM Research Group, Distrital University Francisco José de Caldas, Carrera 7 No. 40B – 53, Bogotá D.C., Colombia.
Cite: Nelson Enrique Vera-Parra, Ruben Javier Medina-Daza, Cristian Alejandro Rojas-Quintero, "Disk Partition Techniques Assesment and Analysis Applied to Genomic Assemblers Based on Bruijn Graphs," International Journal of Bioscience, Biochemistry and Bioinformatics vol. 6, no. 2, pp. 59-67, 2016.