Efficient fixed width string substitution puzzle

Hi,

I'm looking for an efficient way to do this, because I know it will be heavily used :-)

I have a fixed width string and I need to substitute a substring of characters with new values. I
can do this with 2 substring calls, but it will need to rebuild the string just to write a few
characters.

Here is the simple, but inefficient, version:
string s = "0123456789 ";
string r = "abc"; // Value to substitute

int Offset = 3; // Starting index of substring to change
int Length = 3; // Length of substring

// Replace a substring with one of equal length, based on offset and length:
Console.WriteLine("Substring: " + s.Substring(Offset, Length)); // Displays "345"
Console.WriteLine("Original: [" + s + "]");

s = s.Substring(0, Offset) + r.PadLeft(3, ' ') + s.Substring(Offset + Length);

Console.WriteLine("Result: [" + s + "]");

This will take the string "0123456789 " and replace the characters starting at offset 3 with "abc".
The result is "012abc6789 "

I am guaranteeing that the lengths are the same, so in C/C++ I could do something like this with a
memcpy, but that isn't a very friendly way :-)

TIA,

Jami

Nov 16 '05 #1
Use the StringBuilder class; it's optimized for things like this.

Tom Dacon
Dacon Software Consulting
Nov 16 '05 #2
Jami,
> but it will need to rebuild the string just to write a few characters.

Since strings are immutable, you'll always have to create a new string
one way or another. I'd use a StringBuilder or a char[] to reduce the
number of intermediate strings created.

Mattias

--
Mattias Sjögren [MVP] mattias @ mvps.org
http://www.msjogren.net/dotnet/ | http://www.dotnetinterop.com
Please reply only to the newsgroup.
Nov 16 '05 #3
Mattias Sjögren <ma********************@mvps.org> wrote:
>> but it will need to rebuild the string just to write a few characters.
>
> Since strings are immutable, you'll always have to create a new string
> one way or another. I'd use a StringBuilder or a char[] to reduce the
> number of intermediate strings created.

Of these, I'd go for the StringBuilder option, creating it with the
right buffer size to start with, and then using:

builder.Append(s, 0, Offset);
builder.Append(r);
builder.Append(s, Offset + Length, s.Length - (Offset + Length));

This should avoid creating any temporary objects other than the builder
itself.
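
For reference, the three Append calls above can be wrapped into a small helper. This is only a minimal sketch: the class name FixedWidth and the method name ReplaceAt are made up for illustration, and it assumes the replacement fits entirely inside the original string, so the result keeps the same length.

```csharp
using System;
using System.Text;

static class FixedWidth
{
    // Hypothetical helper (not from the thread): returns a copy of s with r
    // written over the characters starting at offset. Assumes r fits inside s.
    public static string ReplaceAt(string s, int offset, string r)
    {
        if (offset < 0 || offset + r.Length > s.Length)
            throw new ArgumentOutOfRangeException("offset");

        // Pre-size the builder so it never has to grow.
        StringBuilder builder = new StringBuilder(s.Length);
        builder.Append(s, 0, offset);               // text before the field
        builder.Append(r);                          // the new field value
        int tail = offset + r.Length;
        builder.Append(s, tail, s.Length - tail);   // text after the field
        return builder.ToString();
    }
}
```

With these assumptions, FixedWidth.ReplaceAt("0123456789", 3, "abc") returns "012abc6789".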

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #4
Thanks everyone for their tips. I decided to try all three options and time them to see what it
would take. I tried doing a simple 3 byte copy into the middle of the string - similar to what I
would expect during production. With 10M iterations, I had the following results:

// Test results on 2.4GHz P4:
// TestString: 10000000 Iterations in 3.8946357707474 seconds
// TestStringBuilder: 10000000 Iterations in 5.4614639570113 seconds
// TestCharArray: 10000000 Iterations in 2.0478365267094 seconds

Some interesting notes:
1. I needed to copy the StringBuilder back to a string so that I could loop - otherwise I would be
stepping on myself.
2. StringBuilder is slower than string! I presume this is mostly due to the extra copy.
3. As expected, the character array is the fastest.
4. All of these are extremely fast, I don't think the efficiency gains will matter! :-)

I hope this is useful to others. I've included the source below for those interested. Timing was
done with the PerformanceCounter.

Jami
----------------------------------------------------------------------------------

// Requires: using System.Text; (for StringBuilder)

static void TestString(int Count)
{
    string s = "0123456789 ";
    string r = "abc"; // Value to substitute

    int Offset = 3; // Starting index of substring to change
    int Length = 3; // Length of substring

    for (int Idx = 0; Idx < Count; ++Idx)
    {
        // Replace a substring with one of equal length, based on offset and length:
        //Console.WriteLine("Substring: " + s.Substring(Offset, Length)); // Displays "345"
        //Console.WriteLine("Original: [" + s + "]");
        s = s.Substring(0, Offset) + r.PadLeft(3, ' ') + s.Substring(Offset + Length);
        //Console.WriteLine("Result: [" + s + "]");
    }
    return;
}

static void TestStringBuilder(int Count)
{
    string s = "0123456789 ";
    string r = "abc"; // Value to substitute

    int Offset = 3; // Starting index of substring to change
    int Length = 3; // Length of substring

    for (int Idx = 0; Idx < Count; ++Idx)
    {
        // Replace a substring with one of equal length, based on offset and length:
        StringBuilder sb = new StringBuilder(s.Length);
        sb.Append(s, 0, Offset);
        sb.Append(r);
        sb.Append(s, Offset + Length, s.Length - (Offset + Length));
        s = sb.ToString(); // Copy back to a string so the next iteration can reuse s
        //Console.WriteLine("Result: [" + s + "]");
    }
    return;
}

static void TestCharArray(int Count)
{
    char[] s = "0123456789".ToCharArray();
    char[] r = "abc".ToCharArray(); // Value to substitute

    int Offset = 3; // Starting index of substring to change

    for (int Idx = 0; Idx < Count; ++Idx)
    {
        // Overwrite the field in place; no string is produced here
        r.CopyTo(s, Offset);
    }
    return;
}
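
The timing code itself is not shown in the post. A rough sketch of a harness that could drive the three test methods follows; it swaps in System.Diagnostics.Stopwatch for the performance counter mentioned above, and the names Bench and Time are hypothetical.

```csharp
using System;
using System.Diagnostics;

static class Bench
{
    // Hypothetical harness (not the poster's code): runs test(count) once,
    // prints a line in the same shape as the results above, and returns
    // the elapsed time in seconds.
    public static double Time(string name, Action<int> test, int count)
    {
        Stopwatch watch = Stopwatch.StartNew();
        test(count);
        watch.Stop();

        double seconds = watch.Elapsed.TotalSeconds;
        Console.WriteLine("{0}: {1} Iterations in {2} seconds", name, count, seconds);
        return seconds;
    }
}
```

A call might look like Bench.Time("TestString", TestString, 10000000). A single run like this includes JIT warm-up, so the numbers are only a rough comparison.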

Nov 16 '05 #5
And one more note :-)

I tried increasing the starting string to 300 bytes, so that it would be more like my problem, and
the timing results changed to the following:

TestString: 10000000 Iterations in 11.4631644524653 seconds
TestStringBuilder: 10000000 Iterations in 10.010672026752 seconds
TestCharArray: 10000000 Iterations in 2.07768892415097 seconds

Not surprisingly, the character array moves ahead. It is interesting to see the StringBuilder pass
the string - makes some sense.

Enjoy,

Jami

Nov 16 '05 #6
Your test isn't really fair:

1) You don't end up with a string at the end of the TestCharArray
method, which I thought was the point. Just adding a
string result = new string(s); at the end of the loop body makes the
TestCharArray version the slowest on my box.

2) You're only allocating the char array (and copying the original
contents) once in TestCharArray - which is no good unless you know
ahead of time what size all the strings you need to work with will be,
*and* that the "surrounding" string doesn't change between iterations -
and if that's the case, the StringBuilder case can be improved as well,
I suspect. (If it's not the case, you need to call ToCharArray on each
iteration, or use String.CopyTo if the first condition is true but not
the second.)
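
The String.CopyTo variant mentioned in point 2 could look roughly like the sketch below. It assumes every record has the same known length, so one char array can be allocated up front and refilled on each iteration; the names BufferReuse and ReplaceAt are hypothetical.

```csharp
using System;

static class BufferReuse
{
    // Hypothetical helper: buffer is allocated once by the caller and must be
    // exactly source.Length characters long (all records share one size).
    public static string ReplaceAt(char[] buffer, string source, int offset, string r)
    {
        if (buffer.Length != source.Length)
            throw new ArgumentException("buffer and source must be the same length");

        // Refill the buffer from the current record; no new array is allocated.
        source.CopyTo(0, buffer, 0, source.Length);
        // Overwrite the field in place (throws if r does not fit at offset).
        r.CopyTo(0, buffer, offset, r.Length);
        // End with a string, so the work is comparable to the other approaches.
        return new string(buffer);
    }
}
```

Because the buffer is refilled from the source each time, the surrounding text may differ between calls; only the record length has to stay fixed.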

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #7
It is true that the tests aren't quite identical.

In my case, I am essentially dealing with data records similar to DBase 2 (fixed width records).

My usage will be a small class which owns a record (string or other type) and has get/set methods
to update pieces of the record by field name. The record is fixed length for each record type,
guaranteed.

I really don't care how the record is stored internally, whether it is a string, StringBuilder, or
character array. At the end of the manipulations, I want to grab a copy of the record as a string.
So the typical usage would be 1) create empty fixed length record, 2) set a bunch of fields (about
50 calls), 3) get the record as a string type.

I'm not sure how to improve StringBuilder, because even if I keep the record in a StringBuilder,
I'll need to copy it to make the Append calls.
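
One option not raised so far in the thread: StringBuilder has a settable indexer, so a record kept in a StringBuilder can be overwritten character by character without any Append calls. The sketch below only illustrates that; the names BuilderInPlace and Overwrite are hypothetical, and it has not been timed against the other approaches.

```csharp
using System;
using System.Text;

static class BuilderInPlace
{
    // Hypothetical helper: overwrites characters of an existing builder,
    // starting at offset, without Append calls or a temporary copy.
    public static void Overwrite(StringBuilder record, int offset, string value)
    {
        if (offset < 0 || offset + value.Length > record.Length)
            throw new ArgumentOutOfRangeException("offset");

        for (int i = 0; i < value.Length; i++)
        {
            record[offset + i] = value[i];   // the StringBuilder indexer is settable
        }
    }
}
```

A record could start as new StringBuilder(new string(' ', 300)), receive its field updates through Overwrite, and be read out once at the end with ToString().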

Thanks,

Jami

Nov 16 '05 #8
Okay. It sounds like keeping it in a char array is indeed going to be
the fastest way of doing things. If you're going to be doing lots of
manipulations with a single record, it probably doesn't matter if you
create a new char array for each record - if it were a case of millions
of records with a couple of manipulations each, and efficiency were
*really* an issue, you could have kept just one char array and copied
to it at the start of each set of manipulations.
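
A record class along the lines discussed, backed by a char array, might be sketched as follows. Everything here is an assumption for illustration: the class name FixedWidthRecord, the DefineField/Set/Get methods, and the choice to left-pad short values with spaces (mirroring the PadLeft call in the original post).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record class; the API and field layout are invented for illustration.
class FixedWidthRecord
{
    private struct Field { public int Offset; public int Length; }

    private readonly char[] data;
    private readonly Dictionary<string, Field> fields = new Dictionary<string, Field>();

    public FixedWidthRecord(int length)
    {
        data = new string(' ', length).ToCharArray();   // start as a blank record
    }

    public void DefineField(string name, int offset, int length)
    {
        if (offset < 0 || length < 0 || offset + length > data.Length)
            throw new ArgumentOutOfRangeException("offset");
        Field f;
        f.Offset = offset;
        f.Length = length;
        fields[name] = f;
    }

    public void Set(string name, string value)
    {
        Field f = fields[name];
        if (value.Length > f.Length)
            throw new ArgumentException("value is wider than the field");
        // Pad to the field width, then copy into the array in place.
        value.PadLeft(f.Length, ' ').CopyTo(0, data, f.Offset, f.Length);
    }

    public string Get(string name)
    {
        Field f = fields[name];
        return new string(data, f.Offset, f.Length);
    }

    // Step 3 of the typical usage: grab the whole record as a string.
    public override string ToString()
    {
        return new string(data);
    }
}
```

The only full-record string allocation happens in ToString(), once per record, which fits the usage pattern of roughly 50 field updates followed by a single read.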

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #9
