Serious Bug System.Collections Sort

There is a longer article about this subject here:
http://www.codeproject.com/useritems/SortedList_Bug.asp
See the main article and the reply thread started by Robert Rohde.

Alternatively look at this code:

ArrayList a=new ArrayList();

string s1 = "-0.67:-0.33:0.33";
string s2 = "0.67:-0.33:0.33";
string s3 = "-0.67:0.33:-0.33";

a.Add(s1);
a.Add(s2);
a.Add(s3);

a.Sort();
for (int i=0; i<3; i++) Console.WriteLine( a[i] );

Console.WriteLine();

a.Clear();
a.Add(s1);
a.Add(s3);
a.Add(s2);

a.Sort();
for (int i=0; i<3; i++) Console.WriteLine( a[i] );

This code produces the following six lines of output:

-0.67:0.33:-0.33
0.67:-0.33:0.33
-0.67:-0.33:0.33

-0.67:-0.33:0.33
-0.67:0.33:-0.33
0.67:-0.33:0.33

Note that the .Sort produces different outputs depending on the order
the strings are added.

It looks like the Sort algorithm is ignoring the "-" mark.

This is a very serious Bug impacting the System.Collections Array,
SortedList etc.

Nov 9 '06 #1
This appears to relate to the culture-specific comparer, which is presumably
ignoring symbols (one of the CompareOptions flags); perhaps switch to
ordinal comparison, which resolves this. You may be able to use
StringComparer.Ordinal; can't remember if that exists in 1.1, but if not
something like this should do (and use it in the Sort() calls).

// Needs: using System.Collections;
// Compares strings by the numeric value of each char (no culture rules).
public class OrdinalStringComparer : IComparer
{
    public static readonly OrdinalStringComparer Singleton =
        new OrdinalStringComparer();

    private OrdinalStringComparer() { }

    public int Compare(object x, object y)
    {
        return string.CompareOrdinal((string) x, (string) y);
    }
}
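
For anyone on .NET 2.0 or later, here is a minimal sketch of passing an ordinal comparer to the Sort() calls, assuming StringComparer.Ordinal is available (it is not in 1.1, where the class above would be passed instead). The names OrdinalSortDemo and SortOrdinal are made up for illustration.

```csharp
using System;
using System.Collections;

public static class OrdinalSortDemo
{
    // Hypothetical helper: copies the items into an ArrayList and sorts
    // them by char code instead of by the culture-sensitive default.
    public static ArrayList SortOrdinal(params string[] items)
    {
        ArrayList list = new ArrayList(items);
        list.Sort(StringComparer.Ordinal);
        return list;
    }

    public static void Main()
    {
        string s1 = "-0.67:-0.33:0.33";
        string s2 = "0.67:-0.33:0.33";
        string s3 = "-0.67:0.33:-0.33";

        // Same two insertion orders as in the original post.
        foreach (string s in SortOrdinal(s1, s2, s3)) Console.WriteLine(s);
        Console.WriteLine();
        foreach (string s in SortOrdinal(s1, s3, s2)) Console.WriteLine(s);
    }
}
```

Because '-' has a lower char code than '0', both groups should come out in the same order (s1, s3, s2) regardless of insertion order.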

Marc
Nov 9 '06 #2
I don't agree....

string s2 = "0.67:-0.33:0.33";
string s3 = "-0.67:0.33:-0.33";

Console.WriteLine( String.Compare(s2,s3));
Console.WriteLine( String.Compare(s3,s2));

returns -1 and 1 showing that the String.Compare function is working as
expected.

I don't think this is a culture issue and writing an IComparer for
every System.Collections string comparison is a nightmare.

Nov 9 '06 #3
Well... it would appear that the problem is a dodgy comparer; try this
(using your previous values for s1, s2, s3):

Console.WriteLine(Comparer.Default.Compare(s1, s2));
Console.WriteLine(Comparer.Default.Compare(s2, s3));
Console.WriteLine(Comparer.Default.Compare(s3, s1));

Yields 1, 1, 1 meaning there is a comparer loop. Oops! Anybody [MS?] want to
comment on whether this is intentional or a fubar? It would appear to
violate the "transitive" rule of comparers, which should be enforced for
non-zero results (0 being transitive as long as it agrees in each
direction).
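
A rough way to test for such a loop is to brute-force every triple in a small sample. This is only a sketch: ComparerCheck and HasCycle are names invented here, and what Comparer.Default reports will depend on the runtime and the current culture, so no result is claimed for it.

```csharp
using System;
using System.Collections;

public static class ComparerCheck
{
    // Returns true if some triple satisfies a > b, b > c and c > a,
    // i.e. the comparer contains a loop and is not transitive.
    public static bool HasCycle(IComparer comparer, string[] items)
    {
        foreach (string a in items)
            foreach (string b in items)
                foreach (string c in items)
                    if (comparer.Compare(a, b) > 0 &&
                        comparer.Compare(b, c) > 0 &&
                        comparer.Compare(c, a) > 0)
                        return true;
        return false;
    }

    public static void Main()
    {
        string[] items =
        {
            "-0.67:-0.33:0.33",
            "0.67:-0.33:0.33",
            "-0.67:0.33:-0.33"
        };

        // Culture-sensitive default comparer: result varies by platform.
        Console.WriteLine("Comparer.Default loop: " +
            HasCycle(Comparer.Default, items));

        // Ordinal comparison is a total order, so this cannot find a loop.
        Console.WriteLine("Ordinal loop:          " +
            HasCycle(StringComparer.Ordinal, items));
    }
}
```

The check is cubic in the number of strings, so it is only meant for a handful of suspect values, not for a whole collection.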

Actually, it's quite lucky that this returns at all! Perhaps there is a
panic "oops, shouldn't possibly have taken more than n^2 iterations...".

Anyway, the point of my post was that this can be avoided using an ordinal
comparer. And you can re-use the same one each time... no need to write
anything more.

Marc
Nov 9 '06 #4
> wi************@gmail.com wrote:
> I don't agree....
>
> string s2 = "0.67:-0.33:0.33";
> string s3 = "-0.67:0.33:-0.33";
>
> Console.WriteLine( String.Compare(s2,s3));
> Console.WriteLine( String.Compare(s3,s2));
>
> returns -1 and 1 showing that the String.Compare function is working as
> expected.
>
> I don't think this is a culture issue and writing an IComparer for
> every System.Collections string comparison is a nightmare.

From the docs, note how it mentions that the hyphen might be given a
low weight so that similar words will sort together:

"The .NET Framework supports word, string, and ordinal sort rules. A
word sort performs a culture-sensitive comparison of strings in which
certain nonalphanumeric Unicode characters might have special weights
assigned to them. For example, the hyphen ("-") might have a very small
weight assigned to it so that "coop" and "co-op" appear next to each
other in a sorted list. A string sort is similar to a word sort, except
that there are no special cases and all nonalphanumeric symbols come
before all alphanumeric Unicode characters. An ordinal sort compares
strings based on the numeric value of each Char in the string. For more
information about word, string, and ordinal sort rules, see the
System.Globalization.CompareOptions topic.

Comparison and search procedures are case-sensitive by default and use
the culture associated with the current thread unless specified
otherwise. By definition, any string, including the empty string (""),
compares greater than a null reference, and two null references compare
equal to each other.

If your application makes security decisions based on the result of a
comparison or case change operation, then the operation should use the
invariant culture to ensure the result is not affected by the value of
the current culture. For more information, see the
CultureInfo.InvariantCulture topic."

I don't know if this is what is causing the behavior you reported, but
at least it's worth looking into.
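
To see the three rules from the quoted docs side by side, something like the following could be tried. It is a sketch only: SortRulesDemo and Sign are illustrative names, the invariant culture is an arbitrary choice, and the word and string results depend on the platform's collation data, so only the ordinal line is predictable.

```csharp
using System;
using System.Globalization;

public static class SortRulesDemo
{
    // Hypothetical helper: compares a and b under the given options and
    // reduces the result to -1, 0 or 1.
    public static int Sign(string a, string b, CompareOptions options)
    {
        CompareInfo ci = CultureInfo.InvariantCulture.CompareInfo;
        return Math.Sign(ci.Compare(a, b, options));
    }

    public static void Main()
    {
        string a = "co-op";
        string b = "coop";

        // Word sort: the default culture-sensitive rule.
        Console.WriteLine("word:    " + Sign(a, b, CompareOptions.None));

        // String sort: symbols are meant to come before alphanumerics.
        Console.WriteLine("string:  " + Sign(a, b, CompareOptions.StringSort));

        // Ordinal: plain char codes; '-' (45) is below 'o' (111).
        Console.WriteLine("ordinal: " + Sign(a, b, CompareOptions.Ordinal));
    }
}
```

Swapping in the three strings from the original post for a and b is an easy way to check which rule produces the loop on a given machine.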

Chris

Nov 9 '06 #5
Yes, Marc, sorry you are right.

The code

string s1 = "-0.67:-0.33:0.33";
string s2 = "0.67:-0.33:0.33";
string s3 = "-0.67:0.33:-0.33";

Console.WriteLine( String.Compare(s1,s2));
Console.WriteLine( String.Compare(s2,s3));
Console.WriteLine( String.Compare(s3,s1));

Console.WriteLine();

Console.WriteLine( String.CompareOrdinal(s1, s2));
Console.WriteLine( String.CompareOrdinal(s2, s3));
Console.WriteLine( String.CompareOrdinal(s3, s1));

returns

1
1
1

-3
3
3

I.e. String.Compare cannot handle the "-" marks but
String.CompareOrdinal can.

Is there not some regional setting you can put once in the code so
String.Compare and ArrayList.Sort all work that way from then on? Would
much prefer this...

----

Note it annoys me that the documentation writes:

"The .NET Framework supports word, string, and ordinal sort
rules....For example, the hyphen ("-") might have a very small weight
assigned to it so that "coop" and "co-op" appear next to each other in
a sorted list...."

But in fact the hyphen is being completely ignored rather than given a
low weight.

I think this behaviour by default is very undesirable.

Nov 9 '06 #6
And FWIW, I didn't see this issue officially reported on the feedback
site. You may wish to post it there.

Nov 9 '06 #7
Well, it isn't being ignored. If it was being ignored I would expect the
result to be 0,0,0.

> I think this behaviour by default is very undesirable.

I think it is buggy, but it isn't in System.Collections - it is in
String.CompareTo. I can't think of a valid occasion in a well-ordered system
when a < b < c < a or a > b > c > a. It just makes no logical sense unless
you are Maurits Escher.

Now, a = b = c = a = 0 I could live with (i.e. hyphens completely ignored).

Marc
Nov 9 '06 #8
If you play around with examples like this:

s1 = "0.67:0.33:-0.33";
s2 = "0.67:-0.33:0.33";
s3 = "-0.67:0.33:0.33";

Console.WriteLine( String.Compare(s1,s2));
Console.WriteLine( String.Compare(s2,s3));
Console.WriteLine( String.Compare(s3,s1));

It all looks OK. But when you put two hyphens into the lines the
String.Compare gets confused and gives silly results.

OK the String.CompareOrdinal function fixes the problem but this is not
behaviour by design surely?

We need a comment from our lord and master MS....

Nov 9 '06 #9
> Chris Dunaway wrote:
>
> And FWIW, I didn't see this issue officially reported on the feedback
> site. You may wish to post it there.

Chris, what is the feedback site? I would like to post it there. It's
really annoying and you never know, they might take an interest.

Nov 9 '06 #11
Thanks very much that's great. I will keep an eye on it.

Nov 9 '06 #12
See my previous post; since you didn't seem familiar with "connect" I posted
it as a bug. Feel free to go in and click "validate"... and vote!
http://connect.microsoft.com/VisualS...dbackID=236900

And note: I wasn't trying to steal your thunder... just to get it logged
with the least fuss...

Marc
Nov 9 '06 #13
Still no MS viewpoint?

For ref, I think this is actually quite important, as it could (as
illustrated on the now-deleted CodeProject link) cause a whole range of
sort-critical operations to fail... SortedList etc, or any custom
collections that assume that .Sort() might actually work, and then
trust the results.

Any thoughts?

Marc

Nov 13 '06 #14
Yes absolutely, it cost me lots of time in the SortedList failing. I
think it's to do with the hyphen algorithm which is designed to sort
words with hyphens in a smart way. Obviously with two hyphens in the
word it fails. Anyone who puts two hyphens in strings is taking a huge
risk - yet using hyphens in strings is common enough. A really
dangerous bug I would say, but I guess Microsoft haven't seen that
yet. You could mention on the article that it causes problems with
SortedList and you think it's a very serious problem. I think we have
discovered a real corker and it deserves to be given a lot of attention.
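
As a possible workaround for the SortedList case, the list can be given an ordinal comparer when it is constructed. This is a sketch under assumptions: StringComparer.Ordinal needs .NET 2.0 or later, and SortedListDemo and Build are names invented for the example.

```csharp
using System;
using System.Collections;

public static class SortedListDemo
{
    // Hypothetical helper: builds a SortedList whose keys are ordered by
    // char code rather than by the culture-sensitive default comparer.
    public static SortedList Build(params string[] keys)
    {
        SortedList list = new SortedList(StringComparer.Ordinal);
        foreach (string key in keys)
            list.Add(key, key.Length); // the value is arbitrary here
        return list;
    }

    public static void Main()
    {
        string s1 = "-0.67:-0.33:0.33";
        string s2 = "0.67:-0.33:0.33";
        string s3 = "-0.67:0.33:-0.33";

        SortedList list = Build(s1, s2, s3);

        // Keys come back in ordinal order, and every key can be found.
        for (int i = 0; i < list.Count; i++)
            Console.WriteLine(list.GetKey(i));
        Console.WriteLine(list.ContainsKey(s1) &&
                          list.ContainsKey(s2) &&
                          list.ContainsKey(s3));
    }
}
```

The trade-off is that ordinal order is not a linguistic order, so this suits keys like the numeric strings in this thread better than text shown to users.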

Nov 13 '06 #15
