C# String Comparison, IndexOf and Related
Hi Everyone,
I've been looking through these .NET groups and can't find the exact
answer I want, so I'm asking.
Can someone let me know the best way (you feel) to search a C# string
for an occurrence of a CASE INSENSITIVE substring, returning the found
position? I'm speaking of larger strings to search as well, ~50K-500K.
Here's what I have so far:
* ToUpper/ToLower and IndexOf would be quite slow, right? as strings
are immutable and these search strings are larger to begin with.
* RegEx could be the answer, but I'm not sure pattern matching would
be the right solution for this problem
* Any unsafe code, Boyer-Moore using pointers or inline assembly (if
that's possible), would seem the best, but well, it's unsafe code
* I've found a MapTable example here in the C# nj (thanks maptable
person), and think this might be the best solution
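For reference, the first two options in the list above could look roughly like this. It is only a sketch, assuming a current .NET runtime; the class and method names are made up for illustration and are not from any library. It relies on ToUpperInvariant keeping the string length unchanged, so the index found in the upper-cased copy also applies to the original.

```csharp
using System;
using System.Text.RegularExpressions;

static class CandidateSearches
{
    // Option 1: fold both strings to upper case, then search ordinally.
    // This allocates a full upper-cased copy of the haystack on every call.
    public static int IndexOfViaUpper(string haystack, string needle)
    {
        return haystack.ToUpperInvariant()
                       .IndexOf(needle.ToUpperInvariant(), StringComparison.Ordinal);
    }

    // Option 2: a regular expression. Regex.Escape makes the needle match
    // literally, so characters such as '.' or '*' are not treated as pattern syntax.
    // Returns -1 when there is no match, to mirror IndexOf.
    public static int IndexOfViaRegex(string haystack, string needle)
    {
        Match m = Regex.Match(haystack, Regex.Escape(needle),
                              RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
        return m.Success ? m.Index : -1;
    }
}
```

Calling either method with "this is a TEST" and "test" should give the same position; which one is faster on 50K-500K strings is something to measure rather than assume.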
Any help is appreciated, thanks in advance!
BILL
re: C# String Comparison, IndexOf and Related
Bill,
I did some tests. I created a 5 MB file, read it with a StreamReader,
and assigned all of the text from the file to a string object. I called
ToLower followed by IndexOf, and it returned the index of the specified
substring immediately. I also used the globalization classes, which let you
call IndexOf with an IgnoreCase option. That also returned the index
immediately. I don't have any timing numbers, but during debugging it
literally stepped over the line of code doing the comparison with no pause
whatsoever.
Here is the globalization code. I used a very simple text comparison below.
CultureInfo culture = new CultureInfo("en-US");
int index = culture.CompareInfo.IndexOf("this is a TEST", "test", System.Globalization.CompareOptions.IgnoreCase);
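To put rough numbers on a test like the one described above, something along these lines could be used. It is a sketch under assumptions: the haystack is generated in memory instead of being read from a file, the SearchTiming class and its method names are invented for illustration, and a current .NET runtime is assumed. It times ToLower plus IndexOf against CompareInfo.IndexOf with IgnoreCase and returns both indices so they can be checked against each other.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.Text;

static class SearchTiming
{
    // Builds filler text of at least `length` characters and appends `marker`
    // at the very end, so a search has to scan the whole string to find it.
    public static string BuildHaystack(int length, string marker)
    {
        var sb = new StringBuilder(length + marker.Length + 32);
        while (sb.Length < length) sb.Append("lorem ipsum dolor sit amet ");
        sb.Append(marker);
        return sb.ToString();
    }

    // Runs both searches once, prints elapsed times, and returns
    // { index via ToLower + IndexOf, index via CompareInfo.IndexOf }.
    public static int[] Compare(string haystack, string needle, CultureInfo culture)
    {
        var sw = Stopwatch.StartNew();
        int viaLower = haystack.ToLower(culture)
                               .IndexOf(needle.ToLower(culture), StringComparison.Ordinal);
        long lowerMs = sw.ElapsedMilliseconds;

        sw.Restart();
        int viaCompareInfo = culture.CompareInfo.IndexOf(
            haystack, needle, CompareOptions.IgnoreCase);
        long compareInfoMs = sw.ElapsedMilliseconds;

        Console.WriteLine("ToLower + IndexOf:   index {0}, {1} ms", viaLower, lowerMs);
        Console.WriteLine("CompareInfo.IndexOf: index {0}, {1} ms", viaCompareInfo, compareInfoMs);
        return new[] { viaLower, viaCompareInfo };
    }
}
```

For a 5 MB case, one could call SearchTiming.Compare(SearchTiming.BuildHaystack(5000000, "NEEDLE"), "needle", new CultureInfo("en-US")). A single run includes JIT compilation time, so repeating the call a few times gives a fairer picture.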
HTH
--
Lateralus [MCAD]
re: C# String Comparison, IndexOf and Related
Thanks Lateralus - although I was a bit skeptical of the results,
after doing similar tests, I think I've changed my thinking on the
matter. I ran some IndexOf/ToUpper and related code on a few older
boxes I have here (e.g., 500 MHz AMD, 512 MB) and didn't see any real
performance degradation either.
So - here's my question to everyone- if I'm not looking to do
heavy-duty work with these strings I think I'm best off using .NET
methods. The original question might have resulted from my being
trained as an anal-C++-guy, if so ... sorry all :)
re: C# String Comparison, IndexOf and Related
Bill,
I can understand where you're coming from. Whenever our applications needed
heavy string manipulation on large amounts of data, we would always write the
DLL in C++. There is nothing scientific about my next statement, because I
never ran any "true" tests. We had a C++ DLL that manipulated large
strings up to 10 MB in size. When it was rewritten in C#, we didn't notice any
degradation in speed, apart from the first execution, when the code gets
JIT-compiled. So basically I found that, on the systems I've worked on, there
is no need to turn to C++ as there was in the past. Of course there are going
to be times when you will need to, but for this one I think you're
OK with C#.
--
Lateralus [MCAD]
re: C# String Comparison, IndexOf and Related
Lateralus - Thanks! It's hard to leave my C++/MASM behind, but you're
->absolutely<- right, I'll attack these problems when needed now. Any
different opinions on this thread are always welcome, but I think I've
found my answer...
BILL
re: C# String Comparison, IndexOf and Related
BILL <titirein@yahoo.com> wrote:
> I've been looking through these .NET groups and can't find the exact
> answer I want, so I'm asking.
>
> Can someone let me know the best way (you feel) to search a C# string
> for an occurrence of a CASE INSENSITIVE substring, returning the found
> position. I'm speaking of larger strings to search as well ~50K-500K.
> Here's what I have so far:
<snip>
In addition to the previous comments, you may wish to consider using
CompareInfo.IndexOf (source, value, CompareOptions.IgnoreCase)
You can get a CompareInfo reference from a CultureInfo - you could use
the current culture (CultureInfo.CurrentCulture) or the invariant one
(CultureInfo.InvariantCulture).
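A small wrapper around that call might look like the following. This is a sketch, not anything from the framework: the CaseInsensitiveSearch class and its method names are hypothetical. The last method uses the string.IndexOf overload that takes a StringComparison, which is not mentioned in this thread but is an alternative when culture rules are not wanted.

```csharp
using System;
using System.Globalization;

static class CaseInsensitiveSearch
{
    // Culture-aware search: the given culture's casing rules decide what matches.
    // Returns -1 when value is not found.
    public static int Find(string source, string value, CultureInfo culture)
    {
        return culture.CompareInfo.IndexOf(source, value, CompareOptions.IgnoreCase);
    }

    // Same search with the invariant culture, so results do not depend on
    // the machine's regional settings.
    public static int FindInvariant(string source, string value)
    {
        return Find(source, value, CultureInfo.InvariantCulture);
    }

    // Alternative not discussed in the thread: an ordinal comparison that
    // ignores case and applies no linguistic rules at all.
    public static int FindOrdinal(string source, string value)
    {
        return source.IndexOf(value, StringComparison.OrdinalIgnoreCase);
    }
}
```

Passing CultureInfo.CurrentCulture to Find gives the behaviour the user's locale expects, while the invariant and ordinal variants are the safer choice for identifiers, file contents, and other non-linguistic data.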
--
Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet
If replying to the group, please do not mail me too