By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
454,502 Members | 1,620 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 454,502 IT Pros & Developers. It's quick & easy.

Simply replace a string in a string...

P: n/a
Hello,

I have been struggeling for replacing a string in a string. The snippet
from the program below replaces the <, & and > with the XML equivalent
values.

In the program, I allocate space for storing the XML value. This makes
my life a bit easier since I can easilly reallocate space to include
the required new space.

Can you have a look at the source below, and advice if there should be
an issue with the code?

....
if (value != NULL) // e.g. "Johnson & Johnson & Sons Attorneys"
{
nlen=strlen(value);
if ((evalue = (char*) malloc(nlen+1)) == NULL)
{
return NULL;
}
strncpy(evalue,value,nlen);
evalue[nlen] = 0;
e->elem_value = XMLEncodeString(evalue);
// expected "Johnson &amp; Johnson &amp; Sons Attorneys"
}
....

char *XMLEncodeString(char *encstr)
{
string = strrepl(encstr, "&", "&amp;");
string = strrepl(encstr, "<", "&lt;");
string = strrepl(encstr, ">", "&gt;");

return encstr;
}

char *strrepl(char *orgstr, char *oldstr, char *newstr)
{
int oldlen, newlen;
char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);
if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);
orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}

Thank you for your time,

Cheers,
Dirk

Nov 15 '05 #1
Share this Question
Share on Google+
4 Replies


P: n/a
Locusta wrote:
Hello,

I have been struggeling for replacing a string in a string. The snippet
from the program below replaces the <, & and > with the XML equivalent
values.

In the program, I allocate space for storing the XML value. This makes
my life a bit easier since I can easilly reallocate space to include
the required new space.

Can you have a look at the source below, and advice if there should be
an issue with the code?

...
if (value != NULL) // e.g. "Johnson & Johnson & Sons Attorneys"
{
nlen=strlen(value);
if ((evalue = (char*) malloc(nlen+1)) == NULL)
{
return NULL;
}
strncpy(evalue,value,nlen);
evalue[nlen] = 0;
e->elem_value = XMLEncodeString(evalue);
// expected "Johnson &amp; Johnson &amp; Sons Attorneys"
} No need to cast the return value of malloc in C.
Since you already know the length of value, you don't need strncpy, but
could use memcpy instead. Or use strncpy(evalue,value,nlen+1) to get it
to copy the null terminator too.
...

char *XMLEncodeString(char *encstr)
{
string = strrepl(encstr, "&", "&amp;");
string = strrepl(encstr, "<", "&lt;");
string = strrepl(encstr, ">", "&gt;");

return encstr;
} You probably meant encstr instead of string.

char *strrepl(char *orgstr, char *oldstr, char *newstr)
{
int oldlen, newlen;
char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);
if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);
orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}

Nov 15 '05 #2

P: n/a

"Locusta" <lo***********@gmail.com> wrote
[ comment here. Tell us what parameters the function takes, what it returns,
and what it is
supposed to achieve ]
char *strrepl(char *orgstr, char *oldstr, char *newstr)
{
int oldlen, newlen;
char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);
if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);
orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}

You've got the basic idea.

The first comment is that sometimes you return the original string (if there
are no replacements) and sometimes you return a reallocated updated string.
This means that orgstr must always be allocated with malloc() - something
you need to document, as a caller cannot be expected to know this.

I haven't run this code but I cannot see any glaring errors. There are
potential problems with memory leaks in the unlikely event of malloc()
failing.
Nov 15 '05 #3

P: n/a
On 2005-10-26, Locusta <lo***********@gmail.com> wrote:
Hello,

I have been struggeling for replacing a string in a string. The
snippet from the program below replaces the <, & and > with the
XML equivalent values.

In the program, I allocate space for storing the XML value.
This makes my life a bit easier since I can easilly reallocate
space to include the required new space.

Can you have a look at the source below, and advice if there
should be an issue with the code?

...
if (value != NULL) // e.g. "Johnson & Johnson & Sons Attorneys"
{
nlen=strlen(value);
if ((evalue = (char*) malloc(nlen+1)) == NULL)
It is possible to allocate enough space in the first place. You
can count the number of replacements and how many characters
you'll need exactly. Your program does a lot of reallocating you
could avoid with one traversal of the string.
{
return NULL;
}
strncpy(evalue,value,nlen);
evalue[nlen] = 0;
e->elem_value = XMLEncodeString(evalue);
// expected "Johnson &amp; Johnson &amp; Sons Attorneys"
}
What happens instead? A crash?
...

char *XMLEncodeString(char *encstr)
{
string = strrepl(encstr, "&", "&amp;");
string = strrepl(encstr, "<", "&lt;");
string = strrepl(encstr, ">", "&gt;");

return encstr;
}

char *strrepl(char *orgstr, char *oldstr, char *newstr)
{
int oldlen, newlen;
char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);
One problem is that p and s might get invalidated every time you
call realloc. You will need to store them as offsets instead, if
you don't want them invalidated.
if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);
orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}


--
Neil Cerutti
Nov 15 '05 #4

P: n/a
Locusta wrote:
Hello,

I have been struggeling for replacing a string in a string. The snippet
from the program below replaces the <, & and > with the XML equivalent
values.

In the program, I allocate space for storing the XML value. This makes
my life a bit easier since I can easilly reallocate space to include
the required new space.

Can you have a look at the source below, and advice if there should be
an issue with the code?
Please, if you have a problem, then state it clearly.
Also try to be clear and concise in the description of what you
expect your functions to do, especially the interfaces.

And provide a compiling minimal example. It may be that there
are issues we cannot see as you did not show them.

You at least need
#include <stdlib.h>
#include <string.h>

...
if (value != NULL) // e.g. "Johnson & Johnson & Sons Attorneys"
{
nlen=strlen(value);
if ((evalue = (char*) malloc(nlen+1)) == NULL)
Do not cast the return value of malloc() in C.
{
return NULL;
}
strncpy(evalue,value,nlen);
evalue[nlen] = 0;
This is overly paranoid. You already catered for the string terminator
by using +1. Just use strcpy() here.
e->elem_value = XMLEncodeString(evalue);
Okay, this really can be harmless or give me a stomach ache:
Is the string to the start of which evalue points unchanged?
Do you have to free() evalue and elem_value even if nothing changed?
// expected "Johnson &amp; Johnson &amp; Sons Attorneys"
}
...

char *XMLEncodeString(char *encstr)
See above for potential questions. If encstr is not supposed to be
changed, then make it "const char *encstr" {
string = strrepl(encstr, "&", "&amp;");
string = strrepl(encstr, "<", "&lt;");
string = strrepl(encstr, ">", "&gt;");
You probably mean encstr instead of string.
If you did, then this means that strrepl() must never fail or,
if it fails, must return NULL and also accept NULL as first argument.
return encstr;
}
Note: For a clear interface, I would either want to have
char *XMLEncodeString(const char *encstr);
i.e. an unchanged input string and get back an allocated
string or
int *XMLEncodeString(char **pEncStr);
where the return value just tells me about success/failure
and pEncStr is the address of a char pointer which points
either to dynamically allocated storage containing a string
or is a null pointer.

char *strrepl(char *orgstr, char *oldstr, char *newstr) The identifier strrepl invades the implementation namespace,
as str is followed by a letter from a-z.
Either make it str_repl or give it a completely different
name, say replaceString.

Okay, here you have essentially the same interface as above
but for the second and third parameter. Personally, I once
more would go for
int str_repl(char **pOrgStr,
Okay, now for oldstr and newstr: You obviously want to be
able to have string literals as arguments; string literals
must not be modified. You promise this by
const char *oldstr,
const char *newstr); {
int oldlen, newlen;
Sizes in C often are best expressed in their natural type,
the unsigned type size_t.
char *s, *p;
s = orgstr;
while (s != NULL)
{
p = strstr(s, oldstr);
if (p == NULL )
return orgstr;
oldlen = strlen(oldstr);
newlen = strlen(newstr);
oldstr and newstr _never_ change, so you need to compute oldlen and
newlen only once: before the loop.
orgstr = (char*)realloc(orgstr, strlen(orgstr)-oldlen+newlen+1);
if (orgstr == NULL)
return NULL;
memmove(p + newlen, p + oldlen, strlen(p + oldlen) + 1);
memcpy(p, newstr, newlen);
s = p + newlen;
}
return orgstr;
}


Okay, so let's have a look at the algorithm:
You are extending the string on-the-fly. This means that realloc()
always has to copy the whole thing. In addition, you have the problem
that you have many realloc() calls.

There are two easy ways to remedy this _and_ improve the simplicity
and the "debuggability": I first assume that orgstr and the target
string are disjoint.
1) Worst-case: If oldlen>=newlen, then do nothing.
Otherwise, extend your target string to contain up to
ceil( ((double)newlen/oldlen) * strlen(orgstr) )
characters (plus string terminator).

Now, you just copy the original string into the target string
while replacing oldstr on the fly (using strncpy() or memcpy(),
strstr() etc.)

Afterwards, you resize the target string storage "back" to
strlen(target string) + 1.
2) Count first: Count all non-overlapping occurrences of oldstr
in orgstr (numoccur). Extend or shorten the target string to
contain strlen(orgstr) + (newlen - oldlen)*numoccur characters
(plus string terminator).

Now, copy and replace on the fly.

Above, orgstr also _is_ the target string. Either make them disjoint
and set free(orgstr); orgstr=target string at the very end or
take care:
1) In order to not to have to copy _all_ the time, realloc(orgstr)
to the size described above (which is always >= strlen(orgstr)).
Now, memmove() the string at the beginning to the end of the
allocated storage, keep a pointer to the actual string start and
to the current "new end" (initially start of the storage area).
Now, at copying and replacing, you memmove() the storage area
between string start and next occurrence of oldstr to the new end
and append newstr to the new end. Increase the new end by the moved
bytes and newlen and the string start by the moved bytes and oldlen.
Then again, until you reach the string/storage end.
2) If newlen < oldlen, keep the old string and resize at the end.
Otherwise, do as described in 1).

Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 15 '05 #5

This discussion thread is closed

Replies have been disabled for this discussion.