472,982 Members | 2,487 Online

# Looping inside Needleman-Wunsch algorithm & good values for theSimilairity Matrix

Hi everyone,
Concerning the Needleman-Wunsch algorithm (cf.
http://en.wikipedia.org/wiki/Needleman-Wunsch_algorithm) I have
noticed a possible loop.

Inside the algorithm there is an important decision making mechanism.
Its a "if, else if, else if" structure like:

if(ScoreValue == DiagonalValue + SimilarityValue(i, j)
{
blah;
}
else if(Score == Left + d)
{
blah;
}
else if(Score == Up + d)
{
blah;
}
I've been playing with value of the similarity matrix and for certain
values I can get the algorithms to stick as there is no else clause to
offer an out if none of the above conditions are met. Now I know that
I could introuduce an else statement and avoid the loop but it does
not address the fundamental probalm of constructing a useful
Similarity matrix.

It's on this point issue that I want to ask for help. The similaity
Matrix that I built for my task is ultimately causing this problem. So
i need to make it better. My question is: what properties should a
good Similarity matrix have to avoid this loop? How does one go about
constructing an effective Similarity matrix? What properties should it
have? I'm not referring to the simple structure that score 1 for a
match and 0 for a mismatch. i'm referring to more complex
structures...

Thanking you,
Al.
Oct 9 '08 #1
1 3057
On Thu, 09 Oct 2008 03:33:17 -0700, al*****@altavista.com
<al*****@altavista.comwrote:
[...]
if(ScoreValue == DiagonalValue + SimilarityValue(i, j)
{
blah;
}
else if(Score == Left + d)
{
blah;
}
else if(Score == Up + d)
{
blah;
}
I've been playing with value of the similarity matrix and for certain
values I can get the algorithms to stick as there is no else clause to
offer an out if none of the above conditions are met. Now I know that
I could introuduce an else statement and avoid the loop but it does
not address the fundamental probalm of constructing a useful
Similarity matrix.
of those three conditions, at least one _must_ be true. This is because
of the way the result matrix is calculated in the first place. Each
element in the matrix is always one of those three, and so when you
compare it again later, one of those three conditions must hold.

Are you saying that you're using a similarity matrix that somehow violates
that assumption in the algorithm? Can you explain in more detail how that
happens?
It's on this point issue that I want to ask for help. The similaity
Matrix that I built for my task is ultimately causing this problem. So
i need to make it better. My question is: what properties should a
good Similarity matrix have to avoid this loop? How does one go about
constructing an effective Similarity matrix? What properties should it
have?
I would be surprised if there is much, if any, practical experience with
this algorithm here in this newsgroup. If you have questions that are
specific to its implementation in C#, then this is a great newsgroup to
ask questions. But for this kind of question -- how to create a
similarity matrix for use in this particular algorithm -- I think you need
to find people who actually have first-hand experience with the algorithm.

For what it's worth, as near as I can tell from my admittedly cursory look
at the algorithm over the course of these threads is that the only real
requirement of the matrix is that it should be symmetric, with positive
values for matches and negative values for non-matches. (It appears that
0 is also valid, though whether that's valid for matches I am not sure).

As long as the matrix meets those conditions (and possibly even broader
conditions...I'm not sure even symmetry is required, though you may get
weird output if it isn't), I think the algorithm should work as stated.

I could be mistaken, of course. That's why it'd be better for you to
track down someone who's actually used this algorithm before. The rest of
us may not have the time to do a sufficient review of the algorithm to be
able to answer the question intelligently. :)

Pete
Oct 9 '08 #2

This thread has been closed and replies have been disabled. Please start a new discussion.