Bytes | Software Development & Data Engineering Community

How to replace multi-spaces within a string with single-space

Hello,

I was wondering whether there is an existing method that replaces runs of
multiple spaces within a string with a single space.
e.g.:
"12   3  4 56" --> "12 3 4 56"

I think this could be done by looping over each char and copying it to a
StringBuilder instance unless both the current and the previous char are
spaces...
But as always, I would prefer to use an existing method ;-)

Thanks,
José
Nov 13 '05 #1
José Joye <jo*******@KILLTHESPAMSbluewin.ch> wrote:
Yes, in my case, efficiency is a concern. However, I agree that my strings are
quite small and there should not be too many multi-spaces.

I was wondering if the Regex solution is slower than the other solutions. If
yes, do you know how much slower?


I really don't know. Could you post a sample selection of strings
(including whatever proportion would have no multi-spaces at all)? If
so, I can benchmark a few ways of doing it...

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 13 '05 #2
José Joye <jo*******@KILLTHESPAMSbluewin.ch> wrote:
In fact, my strings are OCR-B lines read from Bank/Post Slips.
Each line should contain at most 80 chars. I have removed the leading and
trailing spaces with the Trim() method.

Here are some samples:
"0100 000187004>221 74101 02080003 95200208060+ 010184 473>"
"01 00000179008>0 00050175 7500100007054 24008+ 0103 97904>"
"01 0000006630 3>104922 351100079647820 000008+ 010194507>"
"0 00000000000 000111033122108 + 077782103 >"
"5 00002700>"
"0100000241504> 113730619000003 472360720026+ 010231043>"


Righto.

Running the code at the bottom, here are the results I got:

Benchmarking type MultiSpace
Run #1
  RegexReplace                  00:00:51.1034832
  RegexReplaceWithTest          00:00:45.9443136
  CompiledRegexReplaceWithTest  00:00:15.3420608
  StringReplace                 00:00:06.0687264
  StringBuilderSingleChar       00:00:03.6252128
  StringBuilderBlock            00:00:02.1831392
Run #2
  RegexReplace                  00:00:51.0333824
  RegexReplaceWithTest          00:00:45.9260384
  CompiledRegexReplaceWithTest  00:00:15.0316144
  StringReplace                 00:00:06.0386832
  StringBuilderSingleChar       00:00:03.6652704
  StringBuilderBlock            00:00:02.1330672

It looks like the StringBuilderBlock method is the best by a reasonably
significant margin. The code for that on its own would be:

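public static string FlattenSpaces (string x)
{
    // Note: the literal below is TWO spaces. Nothing to do if there
    // are no runs of spaces at all.
    if (x.IndexOf ("  ")==-1)
        return x;

    StringBuilder builder = new StringBuilder(x.Length);

    int start=0;
    while (true)
    {
        // Again, searching for TWO consecutive spaces
        int nextDoubleSpace = x.IndexOf ("  ", start);
        if (nextDoubleSpace==-1)
            break;
        // Copy the block up to and including the first space of the run
        builder.Append (x, start, nextDoubleSpace+1-start);
        // Skip the remaining spaces in the run
        start = nextDoubleSpace+2;
        while (start < x.Length && x[start]==' ')
            start++;
    }
    // Copy whatever is left after the last run
    builder.Append (x, start, x.Length-start);
    return builder.ToString();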
}
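
For anyone wanting to try the method quickly, here is a minimal console sketch. It assumes the method returns string (both the early return and the final ToString() hand back a string) and that the literals passed to IndexOf contain two spaces. The class name FlattenSpacesDemo and the sample inputs are made up for illustration and are not from the thread.

```csharp
using System;
using System.Text;

public static class FlattenSpacesDemo
{
    // Copy of the block-copy approach; "  " below is TWO spaces.
    public static string FlattenSpaces(string x)
    {
        if (x.IndexOf("  ") == -1)
            return x;

        StringBuilder builder = new StringBuilder(x.Length);
        int start = 0;
        while (true)
        {
            int nextDoubleSpace = x.IndexOf("  ", start);
            if (nextDoubleSpace == -1)
                break;
            // Copy up to and including the first space of the run.
            builder.Append(x, start, nextDoubleSpace + 1 - start);
            // Skip the rest of the run.
            start = nextDoubleSpace + 2;
            while (start < x.Length && x[start] == ' ')
                start++;
        }
        builder.Append(x, start, x.Length - start);
        return builder.ToString();
    }

    public static void Main()
    {
        // Hypothetical inputs, bracketed so trailing spaces are visible.
        Console.WriteLine("[" + FlattenSpaces("12   3  4 56") + "]"); // [12 3 4 56]
        Console.WriteLine("[" + FlattenSpaces("no runs here") + "]"); // [no runs here]
        Console.WriteLine("[" + FlattenSpaces("a   ") + "]");         // [a ]
    }
}
```

Note that a run of spaces at the very end is reduced to one space, not removed; call Trim() first if that matters.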

Benchmark code (run with -runtwice on my box):
// See http://www.pobox.com/~skeet/csharp/benchmark.html
// for how to run this code.

using System;
using System.Text;
using System.Text.RegularExpressions;

public class MultiSpace
{
    static readonly string[] TestCases =
    {
        "0100 000187004>221 74101 02080003 95200208060+ "+
        " 010184 473>",
        "01 00000179008>0 00050175 7500100007054 24008+ "+
        "0103 97904>",
        "01 0000006630 3>104922 351100079647820 000008+ 010194507>",
        "0 00000000000 000111033122108 + 077782103 >",
        "5 00002700>",
        "0100000241504> 113730619000003 472360720026+ 010231043>",
    };

    static long check;
    static int iterations = 100000;

    public static void Init(string[] args)
    {
        if (args.Length != 0)
            iterations = Int32.Parse(args[0]);
    }

    public static void Reset()
    {
        check=0;
    }

    public static void Check()
    {
        if (check != 279*iterations)
            throw new Exception ("Invalid check total: "+check);
    }

    [Benchmark]
    public static void RegexReplace()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                string x=s;
                x = Regex.Replace (x, " +", " ");
                total+=x.Length;
            }
        }
        check=total;
    }

    [Benchmark]
    public static void RegexReplaceWithTest()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                string x=s;
                // Test for TWO consecutive spaces before using the regex
                if (x.IndexOf("  ")!=-1)
                    x = Regex.Replace (x, " +", " ");
                total+=x.Length;
            }
        }
        check=total;
    }

    static Regex compiledRegex = new Regex (" +",
                                            RegexOptions.Compiled);

    [Benchmark]
    public static void CompiledRegexReplaceWithTest()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                string x=s;
                // Test for TWO consecutive spaces before using the regex
                if (x.IndexOf("  ")!=-1)
                    x = compiledRegex.Replace (x, " ");
                total+=x.Length;
            }
        }
        check=total;
    }

    [Benchmark]
    public static void StringReplace()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                string x=s;
                // Replace TWO spaces with one until no double space remains
                while (x.IndexOf("  ")!=-1)
                    x=x.Replace("  ", " ");
                total+=x.Length;
            }
        }
        check=total;
    }

    [Benchmark]
    public static void StringBuilderSingleChar()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                // No run of TWO spaces means nothing to do
                if (s.IndexOf ("  ")==-1)
                {
                    total+=s.Length;
                    continue;
                }

                StringBuilder builder = new StringBuilder(s.Length);
                bool inSpace=false;
                foreach (char c in s)
                {
                    if (c==' ')
                    {
                        // Only the first space of a run is copied
                        if (!inSpace)
                            builder.Append(c);
                        inSpace=true;
                    }
                    else
                    {
                        builder.Append(c);
                        inSpace=false;
                    }
                }
                total+=builder.ToString().Length;
            }
        }
        check=total;
    }

    [Benchmark]
    public static void StringBuilderBlock()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string x in TestCases)
            {
                // No run of TWO spaces means nothing to do
                if (x.IndexOf ("  ")==-1)
                {
                    total+=x.Length;
                    continue;
                }

                StringBuilder builder = new StringBuilder(x.Length);

                int start=0;
                while (true)
                {
                    // Searching for TWO consecutive spaces
                    int nextDoubleSpace = x.IndexOf ("  ", start);
                    if (nextDoubleSpace==-1)
                        break;
                    builder.Append (x, start, nextDoubleSpace+1-start);
                    start = nextDoubleSpace+2;
                    while (start < x.Length && x[start]==' ')
                        start++;
                }
                builder.Append (x, start, x.Length-start);
                total+=builder.ToString().Length;
            }
        }
        check=total;
    }
}
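
The [Benchmark] attribute and the Init/Reset/Check hooks come from the benchmark framework linked at the top of the listing. Where that framework isn't to hand, a rough comparison can be sketched with System.Diagnostics.Stopwatch instead. This is only a stand-in, not the harness that produced the numbers earlier in this post: the class name QuickTiming, the Time helper, the iteration count and the sample string are all invented for illustration, and only two of the approaches are timed.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class QuickTiming
{
    // Hypothetical sample; not one of the OCR-B lines from the thread.
    const string Sample = "01  0000   12345>  678    9+   42>";

    static readonly Regex Spaces = new Regex(" +", RegexOptions.Compiled);

    public static string WithRegex(string s)
    {
        return Spaces.Replace(s, " ");
    }

    public static string WithSingleChar(string s)
    {
        StringBuilder builder = new StringBuilder(s.Length);
        bool inSpace = false;
        foreach (char c in s)
        {
            if (c == ' ')
            {
                // Only the first space of a run is copied.
                if (!inSpace)
                    builder.Append(c);
                inSpace = true;
            }
            else
            {
                builder.Append(c);
                inSpace = false;
            }
        }
        return builder.ToString();
    }

    // Runs f on the sample `iterations` times and returns the elapsed time.
    static TimeSpan Time(Func<string, string> f, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        long total = 0;
        for (int i = 0; i < iterations; i++)
            total += f(Sample).Length; // use the result so the call is not dead code
        watch.Stop();
        if (total == 0)
            throw new Exception("Unexpected empty results");
        return watch.Elapsed;
    }

    public static void Main()
    {
        const int iterations = 100000;

        // The two approaches must agree before the timings mean anything.
        if (WithRegex(Sample) != WithSingleChar(Sample))
            throw new Exception("Implementations disagree");

        Console.WriteLine("Regex:      " + Time(WithRegex, iterations));
        Console.WriteLine("SingleChar: " + Time(WithSingleChar, iterations));
    }
}
```

Timings from a loop like this vary with the runtime, JIT warm-up and machine, so treat them as a relative indication at best.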
--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 13 '05 #3
