High speed string processing

Alexander Muylaert

Hi

Does anyone know a good starting point about high speed string processing in
C#?

What I need is a very fast routine for a case insensitive "contains". e ==
E == é == ë == ...

Kind regards

Alexander

Nov 16 '05 #1

Subscribe Post Reply

4696

Mohamoss

hi Alexander
you need to be using the StringBuilder class , you can read more about it
on these links
http://msdn.microsoft.com/library/de...us/cpref/html/
frlrfsystemtextstringbuilderclasstopic.asp
http://www.csharpfriends.com/Article...x?articleID=13
Mohamed Mahfouz
MEA Developer Support Center
ITworx on behalf of Microsoft EMEA GTSC

Nov 16 '05 #2

Vijaye Raji

Simplest is to use already available methods on string.

1. Do a ToLower
2. Replace all occurences of "E", "é", "ë" with "e" in both the base and
matching string.
3. Use the Contains method on the base string.

Or write yourself one. Convert the string to a character array and then
walk thru the characters one by one.

HTH

-vJ

"Alexander Muylaert" <Re********@pandora.be> wrote in message
news:%2****************@TK2MSFTNGP10.phx.gbl...

Hi

Does anyone know a good starting point about high speed string processing
in
C#?

What I need is a very fast routine for a case insensitive "contains". e
==
E == é == ë == ...

Kind regards

Alexander

Nov 16 '05 #3

Chad Z. Hower aka Kudzu

"Vijaye Raji" <no*************@hotmail.com> wrote in news:uJYQV5OQEHA.4000
@TK2MSFTNGP10.phx.gbl:

1. Do a ToLower
2. Replace all occurences of "E", "?", "?" with "e" in both the base and
matching string.
3. Use the Contains method on the base string.

This causes string copies as string are immutable. Unless you are doing
single operations, its anything but "high speed" as he requested. And Im not
even sure StringBuilder will match his needs - it depends on what he means by
high speed.

For our needs we have a lot of ASCII text that we have to manage (its
sockets, thus its always ascii text, never will be other) so we have written
our own class. It's fully managed and quite fast. And if its not fast enough
we can compile it unmanaged (its Delphi, we can compile either way) and
export a managed interface.
--
Chad Z. Hower (a.k.a. Kudzu) - http://www.hower.org/Kudzu/
"Programming is an art form that fights back"

Empower ASP.NET with IntraWeb
http://www.atozed.com/IntraWeb/

Nov 16 '05 #4

Vijaye Raji

You're right. Using "string" for these would be horrible.

I was suggesting the algorithm to use, should have been more careful about
the use of the word "string"

-vJ

"Chad Z. Hower aka Kudzu" <cp**@hower.org> wrote in message
news:Xn******************@127.0.0.1...

"Vijaye Raji" <no*************@hotmail.com> wrote in news:uJYQV5OQEHA.4000
@TK2MSFTNGP10.phx.gbl:
1. Do a ToLower
2. Replace all occurences of "E", "?", "?" with "e" in both the base and
matching string.
3. Use the Contains method on the base string.

This causes string copies as string are immutable. Unless you are doing
single operations, its anything but "high speed" as he requested. And Im
not
even sure StringBuilder will match his needs - it depends on what he means
by
high speed.

For our needs we have a lot of ASCII text that we have to manage (its
sockets, thus its always ascii text, never will be other) so we have
written
our own class. It's fully managed and quite fast. And if its not fast
enough
we can compile it unmanaged (its Delphi, we can compile either way) and
export a managed interface.
--
Chad Z. Hower (a.k.a. Kudzu) - http://www.hower.org/Kudzu/
"Programming is an art form that fights back"

Empower ASP.NET with IntraWeb
http://www.atozed.com/IntraWeb/

Nov 16 '05 #5

Alexander Muylaert

Hi

I'm a Delphi programmer myself and I would need the function in a very tight
loop. Called often while typing. It is a search function on +- 5000 -
500000 strings while typing. And it is not the first char, if it is inside
the string, we have a hit, if it is not inside the string, we don't have a
hit.

The Q_strings library from delphi is really great for me. This also doesn't
translate the string first. It works with a mapping table. MapTable[Char]
of char = ...

This is very fast. I need the same or better speed.

Kind regards

Alexander

Nov 16 '05 #6

Greg Miller

On Sun, 23 May 2004 09:46:10 +0200, "Alexander Muylaert"
<Re********@pandora.be> wrote:

Does anyone know a good starting point about high speed string processing in
C#?

What I need is a very fast routine for a case insensitive "contains". e ==
E == é == ë == ...

There are some algorithms in Cormen's (et. al) "Algorithms",
and in Knuth's "The Art of Computer Programming: Searching and
Sorting". You can also find the same algorithms in books on compiler
design. But these deal specifically with matching the string, and not
the replacement and case insensitivity.
I'd recommend using a separate buffer for what the user is
typing in. Every time a character is pressed, convert it to lower
case, and change your funky characters to the regular ones (é to e).
Also I wouldn't try to find each string in what the user typed, rather
I'd try to find what the user typed in the list of strings starting at
different offsets. Having the list of strings sorted, and doing a
binary search will make that a manageable task.
Reverse all of the strings in your list, and search backwards
through what the user typed. E.G. when the user types 'r' you only
have to search through your list for strings that end with 'r'. This
way you never have to back up.
That should be fast enough, but if it's not, sort the list by
length first, then alphabetically. Keep a list of indexes into the
list as to where the first word of each length is, then in your binary
search only search the strings which match the length you're looking
for.
500000 is a lot of strings. But, in a binary search, at most
you have to look at 19 strings to find a match, or determine there is
no match. Which is very manageable for processing while a user is
typing.
If you want to go one step further, use a separate thread for
matching the string. People tend to pause and burst when typing.

Nov 16 '05 #7

Similar topics

speed

by: Peter Kleiweg | last post by:

I implemented a lexer in Pylly and compared it to the version I had written in Flex. Processing 219062 lines took 0.9 seconds in C (from Flex), and 5 minutes 54 second in Python (from Pylly), a...

Python

Structures ans speed

by: Fred Bennett | last post by:

I have a simulation project in which data can naturally be held in structures for processing. There are calls to multiple functions involved. Execution speed is an issue. Do I take a big hit for...

C / C++

354

c# is a good way to learn c

by: Montrose... | last post by:

After working in c# for a year, the only conclusion I can come to is that I wish I knew c. All I need is Linux, the gnu c compiler and I can do anything. Web services are just open sockets...

.NET Framework

High Memory Usage Garbage Collection Question

by: Ian Taite | last post by:

Hello, I'm exploring why one of my C# .NET apps has "high" memory usage, and whether I can reduce the memory usage. I have an app that wakes up and processes text files into a database...

.NET Framework

High speed string processing

by: Alexander Muylaert | last post by:

Hi I've made this small routine that performs a case insensitive String.IndexOf(). It is badly coded, but still it is 30% faster than the "ToUpper" equivalent. Perhaps someone can give me...

C# / C Sharp

Page Processing Speed

by: Joe Reazor | last post by:

I've got an asp.net page that has approximately 1,440 controls on it. You must think I'm crazy, but we need to have the page organized in this fashion. As you can imagine, the page takes a very...

ASP.NET

How can I achieve efficiency and speed for high traffic dating web site

by: PD | last post by:

I am currently making a dating website. I want to have some information on how to structure the database and the php files so that I can achieve speed and efficiency. Can some one please give...

PHP

How can I speed up ToString of decimal and double?

by: ThunderMusic | last post by:

Hi, We have a part of our application that deals with millions of records and do some processing of them. We've achieved a pretty good performance gain by developping a custom DateTime.ToString...

.NET Framework

How to speed up a code loop with INSERT INTO query?

by: SaltyBoat | last post by:

Needing to import and parse data from a large PDF file into an Access 2002 table: I start by converted the PDF file to a html file. Then I read this html text file, line by line, into a table...

Microsoft Access / VBA

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Basic Javascript concepts

by: aa123db | last post by:

Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...

Javascript

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing