Alignment of protein-coding sequences with frameshift extension penalties (1508.04783v1)

Published 19 Aug 2015 in cs.DS, cs.CE, and q-bio.GN

Abstract: We introduce an algorithm for the alignment of protein-coding sequences accounting for frameshifts. The main specificity of this algorithm as compared to previously published protein-coding sequence alignment methods is the introduction of a penalty cost for frameshift extensions. Previous algorithms have only used constant frameshift penalties. This is similar to the use of scoring schemes with affine gap penalties in classical sequence alignment algorithms. However, the overall penalty of a frameshift portion in an alignment cannot be formulated as an affine function, because it should also incorporate varying codon substitution scores. The second specificity of the algorithm is its search space being the set of all possible alignments between two coding sequences, under the classical definition of an alignment between two DNA sequences. Previous algorithms have introduced constraints on the length of the alignments, and additional symbols for the representation of frameshift openings in an alignment. The algorithm has the same asymptotic space and time complexity as the classical Needleman-Wunsch algorithm.
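For readers unfamiliar with the classical scheme the abstract compares against, the following is a minimal sketch of global alignment scoring with affine gap penalties (the standard three-matrix recurrence usually attributed to Gotoh). It is not the frameshift-aware algorithm of the paper: it has no notion of codons or frameshifts, and the function name `affine_gap_score` and all score values are illustrative assumptions. It only shows how an opening cost and a per-position extension cost combine, and that the dynamic program fills tables of size (n+1) x (m+1), the same quadratic complexity as Needleman-Wunsch.

```python
NEG_INF = float("-inf")


def affine_gap_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Global alignment score with affine gaps (illustrative parameters).

    A gap of length k costs gap_open + k * gap_extend.
    """
    n, m = len(a), len(b)
    # M[i][j]: best score with a[i-1] aligned to b[j-1]
    # X[i][j]: best score with a[i-1] aligned to a gap
    # Y[i][j]: best score with b[j-1] aligned to a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Extending an existing gap pays only gap_extend;
            # starting a new one pays gap_open + gap_extend.
            X[i][j] = max(
                M[i - 1][j] + gap_open + gap_extend,
                X[i - 1][j] + gap_extend,
                Y[i - 1][j] + gap_open + gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
            )
    return max(M[n][m], X[n][m], Y[n][m])
```

With these assumed parameters, aligning `"ACGT"` against `"AT"` is best served by one gap of length two rather than two separate gaps, which is the behaviour affine penalties are meant to encourage. According to the abstract, the penalty of a frameshift portion cannot be written in this affine form, because it also has to incorporate varying codon substitution scores.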
