473,385 Members | 1,829 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

High speed string processing

Hi

Does anyone know a good starting point about high speed string processing in
C#?

What I need is a very fast routine for a case insensitive "contains". e ==
E == é == ë == ...

Kind regards

Alexander
Nov 16 '05 #1
6 4696
hi Alexander
you need to be using the StringBuilder class , you can read more about it
on these links
http://msdn.microsoft.com/library/de...us/cpref/html/
frlrfsystemtextstringbuilderclasstopic.asp
http://www.csharpfriends.com/Article...x?articleID=13
Mohamed Mahfouz
MEA Developer Support Center
ITworx on behalf of Microsoft EMEA GTSC

Nov 16 '05 #2
Simplest is to use already available methods on string.

1. Do a ToLower
2. Replace all occurences of "E", "é", "ë" with "e" in both the base and
matching string.
3. Use the Contains method on the base string.

Or write yourself one. Convert the string to a character array and then
walk thru the characters one by one.

HTH

-vJ

"Alexander Muylaert" <Re********@pandora.be> wrote in message
news:%2****************@TK2MSFTNGP10.phx.gbl...
Hi

Does anyone know a good starting point about high speed string processing
in
C#?

What I need is a very fast routine for a case insensitive "contains". e
==
E == é == ë == ...

Kind regards

Alexander

Nov 16 '05 #3
"Vijaye Raji" <no*************@hotmail.com> wrote in news:uJYQV5OQEHA.4000
@TK2MSFTNGP10.phx.gbl:
1. Do a ToLower
2. Replace all occurences of "E", "?", "?" with "e" in both the base and
matching string.
3. Use the Contains method on the base string.


This causes string copies as string are immutable. Unless you are doing
single operations, its anything but "high speed" as he requested. And Im not
even sure StringBuilder will match his needs - it depends on what he means by
high speed.

For our needs we have a lot of ASCII text that we have to manage (its
sockets, thus its always ascii text, never will be other) so we have written
our own class. It's fully managed and quite fast. And if its not fast enough
we can compile it unmanaged (its Delphi, we can compile either way) and
export a managed interface.
--
Chad Z. Hower (a.k.a. Kudzu) - http://www.hower.org/Kudzu/
"Programming is an art form that fights back"

Empower ASP.NET with IntraWeb
http://www.atozed.com/IntraWeb/
Nov 16 '05 #4
You're right. Using "string" for these would be horrible.

I was suggesting the algorithm to use, should have been more careful about
the use of the word "string"

-vJ

"Chad Z. Hower aka Kudzu" <cp**@hower.org> wrote in message
news:Xn******************@127.0.0.1...
"Vijaye Raji" <no*************@hotmail.com> wrote in news:uJYQV5OQEHA.4000
@TK2MSFTNGP10.phx.gbl:
1. Do a ToLower
2. Replace all occurences of "E", "?", "?" with "e" in both the base and
matching string.
3. Use the Contains method on the base string.


This causes string copies as string are immutable. Unless you are doing
single operations, its anything but "high speed" as he requested. And Im
not
even sure StringBuilder will match his needs - it depends on what he means
by
high speed.

For our needs we have a lot of ASCII text that we have to manage (its
sockets, thus its always ascii text, never will be other) so we have
written
our own class. It's fully managed and quite fast. And if its not fast
enough
we can compile it unmanaged (its Delphi, we can compile either way) and
export a managed interface.
--
Chad Z. Hower (a.k.a. Kudzu) - http://www.hower.org/Kudzu/
"Programming is an art form that fights back"

Empower ASP.NET with IntraWeb
http://www.atozed.com/IntraWeb/

Nov 16 '05 #5
Hi

I'm a Delphi programmer myself and I would need the function in a very tight
loop. Called often while typing. It is a search function on +- 5000 -
500000 strings while typing. And it is not the first char, if it is inside
the string, we have a hit, if it is not inside the string, we don't have a
hit.

The Q_strings library from delphi is really great for me. This also doesn't
translate the string first. It works with a mapping table. MapTable[Char]
of char = ...

This is very fast. I need the same or better speed.

Kind regards

Alexander
Nov 16 '05 #6
On Sun, 23 May 2004 09:46:10 +0200, "Alexander Muylaert"
<Re********@pandora.be> wrote:
Does anyone know a good starting point about high speed string processing in
C#?

What I need is a very fast routine for a case insensitive "contains". e ==
E == é == ë == ...


There are some algorithms in Cormen's (et. al) "Algorithms",
and in Knuth's "The Art of Computer Programming: Searching and
Sorting". You can also find the same algorithms in books on compiler
design. But these deal specifically with matching the string, and not
the replacement and case insensitivity.
I'd recommend using a separate buffer for what the user is
typing in. Every time a character is pressed, convert it to lower
case, and change your funky characters to the regular ones (é to e).
Also I wouldn't try to find each string in what the user typed, rather
I'd try to find what the user typed in the list of strings starting at
different offsets. Having the list of strings sorted, and doing a
binary search will make that a manageable task.
Reverse all of the strings in your list, and search backwards
through what the user typed. E.G. when the user types 'r' you only
have to search through your list for strings that end with 'r'. This
way you never have to back up.
That should be fast enough, but if it's not, sort the list by
length first, then alphabetically. Keep a list of indexes into the
list as to where the first word of each length is, then in your binary
search only search the strings which match the length you're looking
for.
500000 is a lot of strings. But, in a binary search, at most
you have to look at 19 strings to find a match, or determine there is
no match. Which is very manageable for processing while a user is
typing.
If you want to go one step further, use a separate thread for
matching the string. People tend to pause and burst when typing.
Nov 16 '05 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
by: Peter Kleiweg | last post by:
I implemented a lexer in Pylly and compared it to the version I had written in Flex. Processing 219062 lines took 0.9 seconds in C (from Flex), and 5 minutes 54 second in Python (from Pylly), a...
11
by: Fred Bennett | last post by:
I have a simulation project in which data can naturally be held in structures for processing. There are calls to multiple functions involved. Execution speed is an issue. Do I take a big hit for...
354
by: Montrose... | last post by:
After working in c# for a year, the only conclusion I can come to is that I wish I knew c. All I need is Linux, the gnu c compiler and I can do anything. Web services are just open sockets...
3
by: Ian Taite | last post by:
Hello, I'm exploring why one of my C# .NET apps has "high" memory usage, and whether I can reduce the memory usage. I have an app that wakes up and processes text files into a database...
3
by: Alexander Muylaert | last post by:
Hi I've made this small routine that performs a case insensitive String.IndexOf(). It is badly coded, but still it is 30% faster than the "ToUpper" equivalent. Perhaps someone can give me...
5
by: Joe Reazor | last post by:
I've got an asp.net page that has approximately 1,440 controls on it. You must think I'm crazy, but we need to have the page organized in this fashion. As you can imagine, the page takes a very...
12
by: PD | last post by:
I am currently making a dating website. I want to have some information on how to structure the database and the php files so that I can achieve speed and efficiency. Can some one please give...
12
by: ThunderMusic | last post by:
Hi, We have a part of our application that deals with millions of records and do some processing of them. We've achieved a pretty good performance gain by developping a custom DateTime.ToString...
8
by: SaltyBoat | last post by:
Needing to import and parse data from a large PDF file into an Access 2002 table: I start by converted the PDF file to a html file. Then I read this html text file, line by line, into a table...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.