Simply replace a string in a string...

Hello,

I have been struggling to replace a string within a string. The snippet
from the program below replaces <, & and > with their XML equivalents.

In the program, I allocate space for storing the XML value. This makes
my life a bit easier since I can easily reallocate space to include
the required new space.

Can you have a look at the source below and advise whether there is
an issue with the code?

....
if (value != NULL) // e.g. "Johnson & Johnson & Sons Attorneys"
{
nlen=strlen(value);
if ((evalue = (char*) malloc(nlen+1)) == NULL)
{
return NULL;
}
strncpy(evalue,value,nlen);
evalue[nlen] = 0;
e->elem_value = XMLEncodeString(evalue);
// expected "Johnson &amp; Johnson &amp; Sons Attorneys"
}
....

char *XMLEncodeString(char *encstr)
{
string = strrepl(encstr, "&", "&amp;");
string = strrepl(encstr, "<", "&lt;");
string = strrepl(encstr, ">", "&gt;");

return encstr;
}

char *strrepl(char *orgstr, char *oldstr, char *newstr)
{
int oldlen, newlen;
char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);
if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);
orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}

Thank you for your time,

Cheers,
Dirk

Nov 15 '05 #1
Locusta wrote:
...
if (value != NULL) // e.g. "Johnson & Johnson & Sons Attorneys"
{
nlen=strlen(value);
if ((evalue = (char*) malloc(nlen+1)) == NULL)
{
return NULL;
}
strncpy(evalue,value,nlen);
evalue[nlen] = 0;
e->elem_value = XMLEncodeString(evalue);
// expected "Johnson &amp; Johnson &amp; Sons Attorneys"
}

No need to cast the return value of malloc in C.
Since you already know the length of value, you don't need strncpy, but
could use memcpy instead. Or use strncpy(evalue,value,nlen+1) to get it
to copy the null terminator too.
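
To illustrate the memcpy suggestion, here is a minimal sketch. The name copy_string is hypothetical and not part of the original program; it only shows that, once the length is known, a single memcpy of nlen+1 bytes copies the terminator as well.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post): returns a freshly
 * malloc()ed copy of value, or NULL if the allocation fails.
 * The caller owns the result and must free() it. */
char *copy_string(const char *value)
{
    size_t nlen = strlen(value);
    char *evalue = malloc(nlen + 1);   /* no cast needed in C */

    if (evalue == NULL)
        return NULL;
    memcpy(evalue, value, nlen + 1);   /* +1 copies the '\0' too */
    return evalue;
}
```
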
...

char *XMLEncodeString(char *encstr)
{
string = strrepl(encstr, "&", "&amp;");
string = strrepl(encstr, "<", "&lt;");
string = strrepl(encstr, ">", "&gt;");

return encstr;
}

You probably meant encstr instead of string.

char *strrepl(char *orgstr, char *oldstr, char *newstr)
{
int oldlen, newlen;
char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);
if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);
orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}

Nov 15 '05 #2

"Locusta" <lo***********@gmail.com> wrote
[ comment here. Tell us what parameters the function takes, what it returns,
and what it is supposed to achieve ]
char *strrepl(char *orgstr, char *oldstr, char *newstr)
{
int oldlen, newlen;
char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);
if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);
orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}

You've got the basic idea.

The first comment is that sometimes you return the original string (if there
are no replacements) and sometimes you return a reallocated updated string.
This means that orgstr must always be allocated with malloc() - something
you need to document, as a caller cannot be expected to know this.

I haven't run this code but I cannot see any glaring errors. There are
potential problems with memory leaks in the unlikely event of realloc()
failing: assigning its result straight back to orgstr loses the only
pointer to the original block.
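
One common way to avoid that leak is to keep the realloc() result in a temporary until it is known to be valid. The sketch below assumes a hypothetical helper name, resize_buffer, and a non-zero new_size; it is one possible pattern, not the original poster's code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resize the malloc()ed buffer *pbuf to new_size
 * bytes (new_size is assumed to be non-zero).
 * Returns 0 on success and -1 on failure. On failure *pbuf is left
 * untouched, so the caller still owns it and can free() it. */
int resize_buffer(char **pbuf, size_t new_size)
{
    char *tmp = realloc(*pbuf, new_size);

    if (tmp == NULL)
        return -1;      /* old block is still valid, nothing leaked */
    *pbuf = tmp;
    return 0;
}
```

Because the function takes the address of the caller's pointer, it also makes the "must be allocated with malloc()" requirement visible in the interface.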
Nov 15 '05 #3
On 2005-10-26, Locusta <lo***********@gmail.com> wrote:
...
if (value != NULL) // e.g. "Johnson & Johnson & Sons Attorneys"
{
nlen=strlen(value);
if ((evalue = (char*) malloc(nlen+1)) == NULL)

It is possible to allocate enough space in the first place. You
can count the number of replacements and how many characters
you'll need exactly. Your program does a lot of reallocating you
could avoid with one traversal of the string.

{
return NULL;
}
strncpy(evalue,value,nlen);
evalue[nlen] = 0;
e->elem_value = XMLEncodeString(evalue);
// expected "Johnson &amp; Johnson &amp; Sons Attorneys"
}

What happens instead? A crash?

...

char *XMLEncodeString(char *encstr)
{
string = strrepl(encstr, "&", "&amp;");
string = strrepl(encstr, "<", "&lt;");
string = strrepl(encstr, ">", "&gt;");

return encstr;
}

char *strrepl(char *orgstr, char *oldstr, char *newstr)
{
int oldlen, newlen;
char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);

One problem is that p and s might get invalidated every time you
call realloc. You will need to store them as offsets instead, if
you don't want them invalidated.

if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);
orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}
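
A sketch of the offset idea, under some assumptions: the function name replace_all_by_offset and the int/char ** interface are hypothetical, and an empty oldstr is treated as "nothing to do". The match position is stored as an offset before realloc() and turned back into a pointer afterwards.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: replace every occurrence of oldstr in the
 * malloc()ed string *pstr with newstr. Returns 0 on success, -1 if
 * realloc() fails (then *pstr is still a valid, partly replaced string). */
int replace_all_by_offset(char **pstr, const char *oldstr, const char *newstr)
{
    size_t oldlen = strlen(oldstr);
    size_t newlen = strlen(newstr);
    size_t pos = 0;                 /* search offset, survives realloc() */
    char *p;

    if (oldlen == 0)
        return 0;                   /* assumption: empty pattern is a no-op */

    while ((p = strstr(*pstr + pos, oldstr)) != NULL) {
        size_t hit = (size_t)(p - *pstr);   /* remember match as offset */
        size_t len = strlen(*pstr);

        if (newlen > oldlen) {              /* only grow when necessary */
            char *tmp = realloc(*pstr, len - oldlen + newlen + 1);
            if (tmp == NULL)
                return -1;
            *pstr = tmp;
        }
        p = *pstr + hit;                    /* rebuild pointer from offset */
        /* shift the tail (including '\0'), then drop in the replacement */
        memmove(p + newlen, p + oldlen, len - hit - oldlen + 1);
        memcpy(p, newstr, newlen);
        pos = hit + newlen;                 /* continue after the replacement */
    }
    return 0;
}
```
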


--
Neil Cerutti
Nov 15 '05 #4
Locusta wrote:
Can you have a look at the source below and advise whether there is
an issue with the code?

Please, if you have a problem, then state it clearly.
Also try to be clear and concise in the description of what you
expect your functions to do, especially the interfaces.

And provide a compiling minimal example. It may be that there
are issues we cannot see as you did not show them.

You at least need
#include <stdlib.h>
#include <string.h>

...
if (value != NULL) // e.g. "Johnson & Johnson & Sons Attorneys"
{
nlen=strlen(value);
if ((evalue = (char*) malloc(nlen+1)) == NULL)

Do not cast the return value of malloc() in C.

{
return NULL;
}
strncpy(evalue,value,nlen);
evalue[nlen] = 0;

This is overly paranoid. You already catered for the string terminator
by using +1. Just use strcpy() here.

e->elem_value = XMLEncodeString(evalue);

Okay, this really can be harmless or give me a stomach ache:
Is the string to the start of which evalue points unchanged?
Do you have to free() evalue and elem_value even if nothing changed?

// expected "Johnson &amp; Johnson &amp; Sons Attorneys"
}
...

char *XMLEncodeString(char *encstr)

See above for potential questions. If encstr is not supposed to be
changed, then make it "const char *encstr"

{
string = strrepl(encstr, "&", "&amp;");
string = strrepl(encstr, "<", "&lt;");
string = strrepl(encstr, ">", "&gt;");

You probably mean encstr instead of string.
If you did, then this means that strrepl() must never fail or,
if it fails, must return NULL and also accept NULL as first argument.

return encstr;
}

Note: For a clear interface, I would either want to have
char *XMLEncodeString(const char *encstr);
i.e. an unchanged input string and get back an allocated
string, or
int XMLEncodeString(char **pEncStr);
where the return value just tells me about success/failure
and pEncStr is the address of a char pointer which points
either to dynamically allocated storage containing a string
or is a null pointer.
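
As a rough illustration of the first interface (unchanged input, newly allocated result), the following sketch escapes all three characters in one pass. The name xml_encode_copy is hypothetical, and handling only &, < and > is an assumption taken from the original post; handling them in a single pass also sidesteps the question of which replacement has to run first.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: return a newly malloc()ed copy of src in which
 * '&', '<' and '>' are replaced by "&amp;", "&lt;" and "&gt;".
 * src is not modified. Returns NULL if the allocation fails.
 * The caller must free() the result. */
char *xml_encode_copy(const char *src)
{
    size_t need = 1;                /* room for the terminator */
    const char *s;
    char *dst, *out;

    /* first pass: compute the exact size needed */
    for (s = src; *s != '\0'; s++) {
        switch (*s) {
        case '&': need += 5; break;             /* "&amp;" */
        case '<':
        case '>': need += 4; break;             /* "&lt;" or "&gt;" */
        default:  need += 1; break;
        }
    }

    dst = malloc(need);
    if (dst == NULL)
        return NULL;

    /* second pass: copy and replace on the fly */
    out = dst;
    for (s = src; *s != '\0'; s++) {
        switch (*s) {
        case '&': memcpy(out, "&amp;", 5); out += 5; break;
        case '<': memcpy(out, "&lt;", 4);  out += 4; break;
        case '>': memcpy(out, "&gt;", 4);  out += 4; break;
        default:  *out++ = *s;             break;
        }
    }
    *out = '\0';
    return dst;
}
```
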

char *strrepl(char *orgstr, char *oldstr, char *newstr)

The identifier strrepl invades the implementation namespace,
as str is followed by a letter from a-z.
Either make it str_repl or give it a completely different
name, say replaceString.

Okay, here you have essentially the same interface as above
but for the second and third parameter. Personally, I once
more would go for
int str_repl(char **pOrgStr,
const char *oldstr,
const char *newstr);
Okay, now for oldstr and newstr: You obviously want to be
able to have string literals as arguments; string literals
must not be modified. You promise this by declaring both
parameters as const char *.

{
int oldlen, newlen;

Sizes in C often are best expressed in their natural type,
the unsigned type size_t.

char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);
if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);

oldstr and newstr _never_ change, so you need to compute oldlen and
newlen only once: before the loop.

orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}


Okay, so let's have a look at the algorithm:
You are extending the string on-the-fly. This means that realloc()
always has to copy the whole thing. In addition, you have the problem
that you have many realloc() calls.

There are two easy ways to remedy this _and_ improve the simplicity
and the "debuggability": I first assume that orgstr and the target
string are disjoint.
1) Worst-case: If oldlen>=newlen, then do nothing.
Otherwise, extend your target string to contain up to
ceil( ((double)newlen/oldlen) * strlen(orgstr) )
characters (plus string terminator).

Now, you just copy the original string into the target string
while replacing oldstr on the fly (using strncpy() or memcpy(),
strstr() etc.)

Afterwards, you resize the target string storage "back" to
strlen(target string) + 1.
2) Count first: Count all non-overlapping occurrences of oldstr
in orgstr (numoccur). Extend or shorten the target string to
contain strlen(orgstr) + (newlen - oldlen)*numoccur characters
(plus string terminator).

Now, copy and replace on the fly.
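
The "count first" variant for disjoint source and target could look roughly like this. The name str_replace_counted is hypothetical, returning a plain copy for an empty oldstr is an assumption, and overflow of the size computation is not checked in this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of the "count first" approach: returns a newly
 * malloc()ed string in which every non-overlapping occurrence of oldstr
 * in orgstr is replaced by newstr, or NULL if the allocation fails.
 * orgstr itself is not modified. */
char *str_replace_counted(const char *orgstr, const char *oldstr,
                          const char *newstr)
{
    size_t oldlen = strlen(oldstr);
    size_t newlen = strlen(newstr);
    size_t orglen = strlen(orgstr);
    size_t numoccur = 0;
    const char *s, *p;
    char *target, *out;

    /* count occurrences; an empty oldstr is treated as "no occurrences" */
    if (oldlen > 0)
        for (s = orgstr; (p = strstr(s, oldstr)) != NULL; s = p + oldlen)
            numoccur++;

    /* exact size: strlen(orgstr) + (newlen - oldlen) * numoccur, plus '\0' */
    target = malloc(orglen - numoccur * oldlen + numoccur * newlen + 1);
    if (target == NULL)
        return NULL;

    /* copy and replace on the fly */
    out = target;
    s = orgstr;
    while (numoccur-- > 0) {
        size_t chunk;

        p = strstr(s, oldstr);          /* cannot fail: counted above */
        chunk = (size_t)(p - s);
        memcpy(out, s, chunk);          /* text before the match */
        memcpy(out + chunk, newstr, newlen);
        out += chunk + newlen;
        s = p + oldlen;
    }
    strcpy(out, s);                     /* remaining tail and terminator */
    return target;
}
```

Only one allocation is made, regardless of how many replacements occur.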

Above, orgstr also _is_ the target string. Either make them disjoint
and set free(orgstr); orgstr=target string at the very end or
take care:
1) In order not to have to copy _all_ the time, realloc(orgstr)
to the size described above (which is always >= strlen(orgstr)).
Now, memmove() the string at the beginning to the end of the
allocated storage, keep a pointer to the actual string start and
to the current "new end" (initially start of the storage area).
Now, at copying and replacing, you memmove() the storage area
between string start and next occurrence of oldstr to the new end
and append newstr to the new end. Increase the new end by the moved
bytes and newlen and the string start by the moved bytes and oldlen.
Then again, until you reach the string/storage end.
2) If newlen < oldlen, keep the old string and resize at the end.
Otherwise, do as described in 1).
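
The in-place scheme described above might be sketched as follows. This is an interpretation rather than the author's code: the name str_repl_inplace is hypothetical, the buffer is sized by counting occurrences first instead of using the worst-case bound, and in the shrinking case the storage is simply left at its old size.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: replace all occurrences of oldstr in the
 * malloc()ed string *pstr in place, with at most one realloc().
 * Returns 0 on success, -1 if realloc() fails (*pstr is then unchanged). */
int str_repl_inplace(char **pstr, const char *oldstr, const char *newstr)
{
    size_t oldlen = strlen(oldstr);
    size_t newlen = strlen(newstr);
    size_t orglen = strlen(*pstr);
    size_t numoccur = 0;
    size_t rd = 0;      /* offset of the unread part of the old string */
    size_t wr = 0;      /* offset of the "new end" of the result */
    char *buf = *pstr;
    char *p;

    if (oldlen == 0)
        return 0;
    for (p = buf; (p = strstr(p, oldstr)) != NULL; p += oldlen)
        numoccur++;
    if (numoccur == 0)
        return 0;

    if (newlen > oldlen) {
        /* grow once, then park the old string at the end of the storage */
        size_t newtotal = orglen + numoccur * (newlen - oldlen);
        char *tmp = realloc(buf, newtotal + 1);

        if (tmp == NULL)
            return -1;
        buf = tmp;
        *pstr = buf;
        rd = newtotal - orglen;
        memmove(buf + rd, buf, orglen + 1);
    }

    /* wr never overtakes rd, so unread text is never overwritten */
    while ((p = strstr(buf + rd, oldstr)) != NULL) {
        size_t chunk = (size_t)(p - (buf + rd));

        memmove(buf + wr, buf + rd, chunk);         /* text before match */
        memcpy(buf + wr + chunk, newstr, newlen);   /* the replacement */
        wr += chunk + newlen;
        rd += chunk + oldlen;
    }
    memmove(buf + wr, buf + rd, strlen(buf + rd) + 1);  /* tail and '\0' */
    return 0;
}
```
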

Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 15 '05 #5
