Parallel Pairwise Sequence Alignment Algorithm Based on Longest Common Subsequence
Abstract
Computing is moving toward parallelism at every level, and multi-core processors are fast becoming the norm in modern computers. The resulting performance gains could speed up fundamental procedures in molecular biology, such as the alignment of DNA and protein sequences, paving the way for more efficient multiple-genome comparison. Harnessing the full power of multi-core processors, however, requires effective parallel algorithms. This work aimed to develop a suitable parallel longest common subsequence (LCS) algorithm for pairwise sequence alignment. Measured by median run-time, the proposed parallel LCS (PLCS) performed approximately 23-30% better than the traditional serial LCS.
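The abstract does not say how PLCS divides the work, so the following is only an illustrative sketch and not the authors' method. It shows the textbook serial LCS recurrence alongside an anti-diagonal (wavefront) traversal, a common basis for parallel LCS variants because all cells on one anti-diagonal are independent of each other. The function names `lcs_length_serial` and `lcs_length_wavefront` are hypothetical, and the inner loop is run sequentially here; a real multi-core implementation would distribute it across workers.

```python
def lcs_length_serial(a: str, b: str) -> int:
    """Textbook row-by-row dynamic programming for LCS length."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


def lcs_length_wavefront(a: str, b: str) -> int:
    """Same recurrence, filled one anti-diagonal at a time.

    Assumption for illustration: cells with i + j == d depend only on
    diagonals d - 1 and d - 2, so the inner loop has no dependencies
    between iterations and could be split across cores.
    """
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for d in range(2, m + n + 1):
        first_row = max(1, d - n)  # keeps j = d - i at most n
        last_row = min(m, d - 1)   # keeps j = d - i at least 1
        for i in range(first_row, last_row + 1):  # independent iterations
            j = d - i
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


if __name__ == "__main__":
    x, y = "AGGTAB", "GXTXAYB"
    print(lcs_length_serial(x, y), lcs_length_wavefront(x, y))
```

Both functions fill the same table and differ only in visiting order, so they return the same length. Any speedup from the wavefront form depends on how the diagonal cells are scheduled, which this sketch does not attempt to model.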
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.