How to replace multi-spaces within a string with single-space

Hello,

I was wondering if there is a method that exists to replace multi-spaces
within a string with single-space.
e.g.:
"12  3   4 56" --> "12 3 4 56"

I think this could be done by looping over each char and copying it to a
StringBuilder instance unless both the current and the previous char are
spaces...
But as always, I would prefer to use an existing method ;-)
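
For reference, the loop described above could be sketched roughly as follows. This is only an illustration of the idea, not code from the thread; the names SpaceSketch and CollapseSpaces are made up for the example.

```csharp
using System;
using System.Text;

public static class SpaceSketch
{
    // Hypothetical helper: copies each char to a StringBuilder, skipping a
    // space whenever the previously seen char was also a space.
    public static string CollapseSpaces(string s)
    {
        StringBuilder builder = new StringBuilder(s.Length);
        bool previousWasSpace = false;
        foreach (char c in s)
        {
            if (c == ' ')
            {
                if (!previousWasSpace)
                    builder.Append(c);
                previousWasSpace = true;
            }
            else
            {
                builder.Append(c);
                previousWasSpace = false;
            }
        }
        return builder.ToString();
    }

    public static void Main()
    {
        // prints [12 3 4 56]
        Console.WriteLine("[" + CollapseSpaces("12  3   4 56") + "]");
    }
}
```

Only the space character is treated as whitespace here; tabs and newlines are copied through unchanged.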

Thanks,
José
Nov 13 '05 #1
José Joye <jo*******@KILLTHESPAMSbluewin.ch> wrote:
> Yes, in my case, efficiency is a concern. However, I agree that my strings are
> quite small and the number of multi-spaces should not be too many.
>
> I was wondering if the Regex solution is slower than the other solutions. If
> yes, do you know how much slower?


I really don't know. Could you post a sample selection of strings
(including whatever proportion would have no multi-spaces at all)? If
so, I can benchmark a few ways of doing it...

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 13 '05 #2
José Joye <jo*******@KILLTHESPAMSbluewin.ch> wrote:
> In fact, my strings are OCR-B lines read from Bank/Post slips.
> Each line should contain at most 80 chars. I have removed the leading and
> trailing spaces with the Trim() method.
>
> So these are some samples:
> "0100 000187004>221 74101 02080003 95200208060+ 010184 473>"
> "01 00000179008>0 00050175 7500100007054 24008+ 0103 97904>"
> "01 0000006630 3>104922 351100079647820 000008+ 010194507>"
> "0 00000000000 000111033122108 + 077782103 >"
> "5 00002700>"
> "0100000241504> 113730619000003 472360720026+ 010231043>"


Righto.

Running the code at the bottom, here are the results I got:

Benchmarking type MultiSpace
Run #1
  RegexReplace                  00:00:51.1034832
  RegexReplaceWithTest          00:00:45.9443136
  CompiledRegexReplaceWithTest  00:00:15.3420608
  StringReplace                 00:00:06.0687264
  StringBuilderSingleChar       00:00:03.6252128
  StringBuilderBlock            00:00:02.1831392
Run #2
  RegexReplace                  00:00:51.0333824
  RegexReplaceWithTest          00:00:45.9260384
  CompiledRegexReplaceWithTest  00:00:15.0316144
  StringReplace                 00:00:06.0386832
  StringBuilderSingleChar       00:00:03.6652704
  StringBuilderBlock            00:00:02.1330672

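It looks like the StringBuilderBlock method is the best by a reasonably
significant margin: roughly 1.7 times faster than StringBuilderSingleChar
and over 20 times faster than the plain RegexReplace. The code for that on
its own would be: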

public static string FlattenSpaces (string x)
{
    // Fast path: nothing to do if there is no run of two spaces.
    if (x.IndexOf ("  ")==-1)
        return x;

    StringBuilder builder = new StringBuilder(x.Length);

    int start=0;
    while (true)
    {
        int nextDoubleSpace = x.IndexOf ("  ", start);
        if (nextDoubleSpace==-1)
            break;
        // Copy everything up to and including the first space of the run.
        builder.Append (x, start, nextDoubleSpace+1-start);
        start = nextDoubleSpace+2;
        // Skip the remaining spaces of the run.
        while (start < x.Length && x[start]==' ')
            start++;
    }
    // Copy whatever is left after the last run.
    builder.Append (x, start, x.Length-start);
    return builder.ToString();
}

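
For comparison, the Regex approach asked about earlier is the slowest in the table but also the shortest to write. A minimal sketch follows; the class and method names are made up for the example, and the pattern " {2,}" (two or more spaces) is my own choice here rather than the " +" used in the benchmark.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSketch
{
    // Matches any run of two or more space characters.
    private static readonly Regex MultiSpace = new Regex(" {2,}");

    // Hypothetical helper: replaces each run of spaces with a single space.
    public static string CollapseSpaces(string s)
    {
        return MultiSpace.Replace(s, " ");
    }

    public static void Main()
    {
        // prints [5 00002700>]
        Console.WriteLine("[" + CollapseSpaces("5   00002700>") + "]");
    }
}
```

For short strings processed occasionally this is probably fine; the timings above suggest the hand-written loop only pays off when the call sits on a hot path.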
Benchmark code (run with -runtwice on my box):
// See http://www.pobox.com/~skeet/csharp/benchmark.html
// for how to run this code.

using System;
using System.Text;
using System.Text.RegularExpressions;

public class MultiSpace
{
    static readonly string[] TestCases =
    {
        "0100 000187004>221 74101 02080003 95200208060+ "+
        " 010184 473>",
        "01 00000179008>0 00050175 7500100007054 24008+ "+
        "0103 97904>",
        "01 0000006630 3>104922 351100079647820 000008+ 010194507>",
        "0 00000000000 000111033122108 + 077782103 >",
        "5 00002700>",
        "0100000241504> 113730619000003 472360720026+ 010231043>",
    };

    // Sum of the flattened string lengths, set by each benchmark method.
    static long check;
    static int iterations = 100000;

    public static void Init(string[] args)
    {
        if (args.Length != 0)
            iterations = Int32.Parse(args[0]);
    }

    public static void Reset()
    {
        check=0;
    }

    public static void Check()
    {
        if (check != 279*iterations)
            throw new Exception ("Invalid check total: "+check);
    }

    [Benchmark]
    public static void RegexReplace()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                string x=s;
                x = Regex.Replace (x, " +", " ");
                total+=x.Length;
            }
        }
        check=total;
    }

    [Benchmark]
    public static void RegexReplaceWithTest()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                string x=s;
                // Only pay for the regex when a double space is present.
                if (x.IndexOf("  ")!=-1)
                    x = Regex.Replace (x, " +", " ");
                total+=x.Length;
            }
        }
        check=total;
    }

    static Regex compiledRegex = new Regex (" +",
                                            RegexOptions.Compiled);

    [Benchmark]
    public static void CompiledRegexReplaceWithTest()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                string x=s;
                if (x.IndexOf("  ")!=-1)
                    x = compiledRegex.Replace (x, " ");
                total+=x.Length;
            }
        }
        check=total;
    }

    [Benchmark]
    public static void StringReplace()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                string x=s;
                // Repeatedly halve runs of spaces until no double space remains.
                while (x.IndexOf("  ")!=-1)
                    x=x.Replace("  ", " ");
                total+=x.Length;
            }
        }
        check=total;
    }

    [Benchmark]
    public static void StringBuilderSingleChar()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string s in TestCases)
            {
                if (s.IndexOf ("  ")==-1)
                {
                    total+=s.Length;
                    continue;
                }

                StringBuilder builder = new StringBuilder(s.Length);
                bool inSpace=false;
                foreach (char c in s)
                {
                    if (c==' ')
                    {
                        // Keep only the first space of each run.
                        if (!inSpace)
                            builder.Append(c);
                        inSpace=true;
                    }
                    else
                    {
                        builder.Append(c);
                        inSpace=false;
                    }
                }
                total+=builder.ToString().Length;
            }
        }
        check=total;
    }

    [Benchmark]
    public static void StringBuilderBlock()
    {
        long total=0;

        for (int i = iterations; i>0; i--)
        {
            foreach (string x in TestCases)
            {
                if (x.IndexOf ("  ")==-1)
                {
                    total+=x.Length;
                    continue;
                }

                StringBuilder builder = new StringBuilder(x.Length);

                int start=0;
                while (true)
                {
                    int nextDoubleSpace = x.IndexOf ("  ", start);
                    if (nextDoubleSpace==-1)
                        break;
                    // Copy the block up to and including the first space.
                    builder.Append (x, start, nextDoubleSpace+1-start);
                    start = nextDoubleSpace+2;
                    while (start < x.Length && x[start]==' ')
                        start++;
                }
                builder.Append (x, start, x.Length-start);
                total+=builder.ToString().Length;
            }
        }
        check=total;
    }
}
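
The [Benchmark] attribute above comes from the benchmark framework linked at the top of the code. Without that framework, a rough timing can still be taken with System.Diagnostics.Stopwatch, assuming a .NET version that has it. The sketch below is only that: TimingSketch, Time and the sample string are made up for the example, and the numbers it prints will not be comparable to the table above.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class TimingSketch
{
    // Hypothetical helper: runs the action repeatedly and returns the
    // elapsed wall-clock time for the timed iterations.
    public static TimeSpan Time(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not timed
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        // Made-up sample line with a few runs of spaces.
        string sample = "01  0000006630 3>104922   351100079647820";
        TimeSpan elapsed = Time(() => Regex.Replace(sample, " +", " "), 100000);
        Console.WriteLine("Regex.Replace: " + elapsed);
    }
}
```

Running each candidate several times and comparing the runs, as the two runs above do, helps to spot noise from other activity on the machine.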
--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 13 '05 #3
