Russian versionEnglish version   
Volume 12   Issue 1   Year 2017
Parallel algorithm for global alignment of long aminoacid and nucleotide sequences

Ruslan K. Tetuev, Maxim I. Pyatkov, Anton N. Pankratov

Institute of Mathematical Problems of Biology RAS - the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences

 
Abstract. A parallel algorithm is developed for the global alignment of long DNA and protein sequences. The algorithm uses an arbitrary substitution matrix. An affine internal and end gap penalty systems might be set up separately for each of the two sequences. The possibility to control the choice of an optimal alignment from the set of alternative ones is implemented. The two parameters of the parallel algorithm which are called grid steps were introduced. They are used to split the global alignment matrix into blocks of specified size. The analysis of these parameters was performed in order to obtain optimal values that reduce either time complexity or memory complexity of the algorithm.  It is shown that when using the block sizes that provide lowest memory complexity, the algorithm allows to align two long sequences of length L using the memory of size O(L4/3). It is shown that the algorithm is well scaled on multi-core systems, reaching superlinear execution time acceleration. The algorithm is implemented as a high-performance parallel web application in JavaScript and is freely available at http://sbars.impb.ru/aligner.html.
 
Key words: global alignment, affine penalty system, terminal inserts, parallel computations, superlinear acceleration.
Table of Contents Original Article
Math. Biol. Bioinf.
2017;12(1):137-150
doi: 10.17537/2017.12.137
published in Russian

Abstract (rus.)
Abstract (eng.)
Full text (rus., pdf)
References

 

  Copyright IMPB RAS © 2005-2018