strtok and strtok_r

"siddhu" <si***********@gmail.com> wrote in message news:
11**********************@g4g2000hsf.googlegroups.com...

Dear experts,

As I know strtok_r is re-entrant version of strtok.
strtok_r() is called with s1(lets say) as its first parameter.
Remaining tokens from s1 are obtained by calling strtok_r() with a
null pointer for the first parameter.
My confusion is that this behavior is same as strtok. So I assume
strtok_r must also be using any function static variable to keep the
information about s1. If this is the case then how strtok_r is re-
entrant?
Otherwise how it keeps the information about s1?

strtok_r takes an extra parameter, a pointer to a char * where it stores its
current state.

The implementation is quite straightforward:

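#include <string.h>

char *strtok_r(char *str, const char *delim, char **nextp)
{
    char *ret;

    if (str == NULL)
        str = *nextp;               /* continue from the saved position */
    str += strspn(str, delim);      /* skip leading delimiters */
    if (*str == '\0') {
        *nextp = str;               /* keep the state valid for later calls */
        return NULL;                /* no token left */
    }
    ret = str;
    str += strcspn(str, delim);     /* find the end of the token */
    if (*str)
        *str++ = '\0';              /* terminate it, step past the delimiter */
    *nextp = str;                   /* all state lives in the caller's variable */
    return ret;
}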
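
To make the re-entrancy point concrete, here is a small sketch that nests two tokenizing loops, something plain strtok cannot do because it has only one hidden static position. The names my_strtok_r and count_fields are made up for this example; my_strtok_r is modelled on the strtok_r above, renamed so that it cannot clash with a system-provided one, and it also stores the saved position when no token is found.

```c
#include <stddef.h>
#include <string.h>

/* Modelled on the strtok_r posted above; hypothetical name chosen to
   avoid a clash with a system strtok_r. */
char *my_strtok_r(char *str, const char *delim, char **nextp)
{
    char *ret;

    if (str == NULL)
        str = *nextp;
    str += strspn(str, delim);
    if (*str == '\0') {
        *nextp = str;
        return NULL;
    }
    ret = str;
    str += strcspn(str, delim);
    if (*str)
        *str++ = '\0';
    *nextp = str;
    return ret;
}

/* Count the comma-separated fields of every semicolon-separated record.
   The loops are nested, and each keeps its position in its own variable,
   so the inner scan never disturbs the outer one. The text is modified. */
size_t count_fields(char *text)
{
    char *rec_state, *fld_state;
    char *rec, *fld;
    size_t n = 0;

    for (rec = my_strtok_r(text, ";", &rec_state); rec != NULL;
         rec = my_strtok_r(NULL, ";", &rec_state)) {
        for (fld = my_strtok_r(rec, ",", &fld_state); fld != NULL;
             fld = my_strtok_r(NULL, ",", &fld_state))
            n++;
    }
    return n;
}
```

The only state is whatever rec_state and fld_state point to, which is why two scans (or two threads, each with its own variable) can proceed independently.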

--
Chqrlie.

Sep 14 '07 #4

siddhu wrote:

>
As I know strtok_r is re-entrant version of strtok.
strtok_r() is called with s1(lets say) as its first parameter.
Remaining tokens from s1 are obtained by calling strtok_r() with
a null pointer for the first parameter. My confusion is that
this behavior is same as strtok. So I assume strtok_r must also
be using any function static variable to keep the information
about s1. If this is the case then how strtok_r is re- entrant?
Otherwise how it keeps the information about s1?

There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The
tkn is terminated by the first appearance of tknchar,
or by the end of the source string.

The caller must supply sufficient space in tkn to
receive any tkn, Otherwise tkns will be truncated.

Returns: a pointer past the terminating tknchar.

This will happily return an infinity of empty tkns if
called with src pointing to the end of a string. Tokens
will never include a copy of tknchar.

A better name would be "strtkn", except that is reserved
for the system namespace. Change to that at your risk.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
Revised 2006-06-13 2007-05-26 (name)
*/

const char *tknsplit(const char *src, /* Source of tkns */
                     char tknchar,    /* tkn delimiting char */
                     char *tkn,       /* receiver of parsed tkn */
                     size_t lgh)      /* length tkn can receive */
                                      /* not including final '\0' */
{
   if (src) {
      while (' ' == *src) src++;

      while (*src && (tknchar != *src)) {
         if (lgh) {
            *tkn++ = *src;
            --lgh;
         }
         src++;
      }
      if (*src && (tknchar == *src)) src++;
   }
   *tkn = '\0';
   return src;
} /* tknsplit */

#ifdef TESTING
#include <stdio.h>

#define ABRsize 6 /* length of acceptable tkn abbreviations */

/* ---------------- */

static void showtkn(int i, char *tok)
{
   putchar(i + '1'); putchar(':');
   puts(tok);
} /* showtkn */

/* ---------------- */

int main(void)
{
   char teststring[] = "This is a test, ,, abbrev, more";

   const char *t, *s = teststring;
   int i;
   char tkn[ABRsize + 1];

   puts(teststring);
   t = s;
   for (i = 0; i < 4; i++) {
      t = tknsplit(t, ',', tkn, ABRsize);
      showtkn(i, tkn);
   }

   puts("\nHow to detect 'no more tkns' while truncating");
   t = s; i = 0;
   while (*t) {
      t = tknsplit(t, ',', tkn, 3);
      showtkn(i, tkn);
      i++;
   }

   puts("\nUsing blanks as tkn delimiters");
   t = s; i = 0;
   while (*t) {
      t = tknsplit(t, ' ', tkn, ABRsize);
      showtkn(i, tkn);
      i++;
   }
   return 0;
} /* main */

#endif
/* ------- end file tknsplit.c ----------*/

/* ------- file tknsplit.h ----------*/
#ifndef H_tknsplit_h
# define H_tknsplit_h

# ifdef __cplusplus
extern "C" {
# endif

#include <stddef.h>

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The
tkn is terminated by the first appearance of tknchar,
or by the end of the source string.

The caller must supply sufficient space in tkn to
receive any tkn, Otherwise tkns will be truncated.

Returns: a pointer past the terminating tknchar.

This will happily return an infinity of empty tkns if
called with src pointing to the end of a string. Tokens
will never include a copy of tknchar.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
revised 2007-05-26 (name)
*/

const char *tknsplit(const char *src, /* Source of tkns */
char tknchar, /* tkn delimiting char */
char *tkn, /* receiver of parsed tkn */
size_t lgh); /* length tkn can receive */
/* not including final '\0' */

# ifdef __cplusplus
}
# endif
#endif
/* ------- end file tknsplit.h ----------*/

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Sep 14 '07 #5

"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

siddhu wrote:
>>
As I know strtok_r is re-entrant version of strtok.
strtok_r() is called with s1(lets say) as its first parameter.
Remaining tokens from s1 are obtained by calling strtok_r() with
a null pointer for the first parameter. My confusion is that
this behavior is same as strtok. So I assume strtok_r must also
be using any function static variable to keep the information
about s1. If this is the case then how strtok_r is re- entrant?
Otherwise how it keeps the information about s1?

There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

Come on, strtok_r is part of POSIX. Are you claiming POSIX is not popular
enough?
Multiple implementations of strtok_r had been posted before your answer.

>
/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The

Why skip blanks? This is not strtok behaviour.
The code and the comment don't agree on what blanks are: per the C99
Standard, blanks are space and tab.

tkn is terminated by the first appearance of tknchar,
or by the end of the source string.

Your function definitely differs a lot from strtok that takes a collection
of delimiters instead of a single char.

The caller must supply sufficient space in tkn to
receive any tkn, Otherwise tkns will be truncated.

Returns: a pointer past the terminating tknchar.

This will happily return an infinity of empty tkns if
called with src pointing to the end of a string. Tokens
will never include a copy of tknchar.

Again, this is not the behaviour of strtok: a sequence of separators is
treated as a single separator.

A better name would be "strtkn", except that is reserved
for the system namespace. Change to that at your risk.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
Revised 2006-06-13 2007-05-26 (name)
*/

const char *tknsplit(const char *src, /* Source of tkns */
char tknchar, /* tkn delimiting char */
char *tkn, /* receiver of parsed tkn */
size_t lgh) /* length tkn can receive */
/* not including final '\0' */

I have reservations about your API:
- Instead of returning a const char *, you should return the number of chars
skipped. That would prevent const poisoning, where you pass a regular char *
but cannot store the return value back into the same variable. It would also
allow trivial testing for the end of the string.
- The lgh parameter should be the size of the destination array
(sizeof(buf)), for consistency with other C library functions such as
snprintf, and to avoid off-by-one errors: callers who pass
sizeof(destbuf) - 1 would not invoke UB, whereas with your current semantics
they do when they pass sizeof(destbuf).
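
A rough sketch of what those two suggestions could look like together, purely as an illustration. The name tkn_next and its exact conventions are assumptions made for this example, not part of tknsplit: it returns the number of chars consumed, and size is the full size of the destination array, terminator included.

```c
#include <stddef.h>

/* Hypothetical variant of tknsplit following the two suggestions above.
   Skips leading spaces, copies the next token from src into dest (at most
   size - 1 chars plus the terminating '\0') and returns how many chars of
   src were consumed, counting the delimiter if one was found.
   A return value of 0 means src was already at the end of the string. */
size_t tkn_next(const char *src, char delim, char *dest, size_t size)
{
    size_t i = 0;   /* chars consumed from src */
    size_t n = 0;   /* chars stored in dest */

    while (src[i] == ' ')
        i++;
    while (src[i] != '\0' && src[i] != delim) {
        if (n + 1 < size)       /* always leave room for the terminator */
            dest[n++] = src[i];
        i++;
    }
    if (delim != '\0' && src[i] == delim)
        i++;                    /* step past the delimiter */
    if (size > 0)
        dest[n] = '\0';
    return i;
}
```

A caller holding a plain char *p can then advance with p += tkn_next(p, ',', buf, sizeof buf) and stop when the call returns 0, with no const-qualified pointer to store anywhere.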

{
if (src) {
while (' ' == *src) src++;

while (*src && (tknchar != *src)) {
if (lgh) {
*tkn++ = *src;
--lgh;
}
src++;
}
if (*src && (tknchar == *src)) src++;
}
*tkn = '\0';
return src;
} /* tknsplit */

#ifdef TESTING
#include <stdio.h>

#define ABRsize 6 /* length of acceptable tkn abbreviations */

/* ---------------- */

static void showtkn(int i, char *tok)
{
putchar(i + '1'); putchar(':');
puts(tok);
} /* showtkn */

/* ---------------- */

int main(void)
{
char teststring[] = "This is a test, ,, abbrev, more";

const char *t, *s = teststring;
int i;
char tkn[ABRsize + 1];

puts(teststring);
t = s;
for (i = 0; i < 4; i++) {
t = tknsplit(t, ',', tkn, ABRsize);
showtkn(i, tkn);
}

puts("\nHow to detect 'no more tkns' while truncating");
t = s; i = 0;
while (*t) {
t = tknsplit(t, ',', tkn, 3);
showtkn(i, tkn);
i++;
}

puts("\nUsing blanks as tkn delimiters");
t = s; i = 0;
while (*t) {
t = tknsplit(t, ' ', tkn, ABRsize);
showtkn(i, tkn);
i++;
}
return 0;
} /* main */

#endif
/* ------- end file tknsplit.c ----------*/

/* ------- file tknsplit.h ----------*/
#ifndef H_tknsplit_h
# define H_tknsplit_h

# ifdef __cplusplus
extern "C" {
# endif

#include <stddef.h>

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The
tkn is terminated by the first appearance of tknchar,
or by the end of the source string.

The caller must supply sufficient space in tkn to
receive any tkn, Otherwise tkns will be truncated.

Returns: a pointer past the terminating tknchar.

This will happily return an infinity of empty tkns if
called with src pointing to the end of a string. Tokens
will never include a copy of tknchar.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
revised 2007-05-26 (name)
*/

const char *tknsplit(const char *src, /* Source of tkns */
char tknchar, /* tkn delimiting char */
char *tkn, /* receiver of parsed tkn */
size_t lgh); /* length tkn can receive */
/* not including final '\0' */

# ifdef __cplusplus
}
# endif
#endif
/* ------- end file tknsplit.h ----------*/

Posting the source code to a public version of strtok_r would have been more
helpful.
The only advantage your function offers over strtok_r is the fact that it
does not modify the source string.
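
For what it is worth, the two properties can be combined. The following is only a sketch under my own assumptions (the name tok_view and its interface are invented for this example): it keeps the strtok rule that runs of delimiters count as one, but reports each token as a pointer plus a length instead of writing '\0' into the string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the next token at *pos without writing to the
   string. Runs of delimiters are skipped, as strtok does. On success it
   returns a pointer to the token, stores its length in *len, and advances
   *pos past the token. It returns NULL when no token is left. */
const char *tok_view(const char **pos, const char *delim, size_t *len)
{
    const char *start = *pos + strspn(*pos, delim);

    if (*start == '\0') {
        *pos = start;
        return NULL;
    }
    *len = strcspn(start, delim);
    *pos = start + *len;
    return start;
}
```

Because the token is not terminated, it has to be printed or copied by length, for example with printf("%.*s\n", (int)len, tok).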

--
Chqrlie.

Sep 14 '07 #6

Martien verbruggen

On Sat, 15 Sep 2007 01:07:48 +0200,
Charlie Gordon <ne**@chqrlie.org> wrote:

"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...
>siddhu wrote:
>>>
As I know strtok_r is re-entrant version of strtok.

[snip]

>There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

Come on, strtok_r is part of POSIX. Do you pretend POSIX is not popular
enough.

POSIX is very popular. So is cricket. Neither, however is topical here.

If there were no other place where POSIX were already discussed, one
would have been created, given its popularity.

POSIX is discussed on comp.unix.programmer, and the people there are
very knowledgeable about the subject.

Regards,
Martien
--
|
Martien Verbruggen | Failure is not an option. It comes bundled
| with your Microsoft product.
|

Sep 15 '07 #7

"Martien verbruggen" <mg**@tradingpost.com.au> wrote in message
news: sl*****************@martien.heliotrope.home...

On Sat, 15 Sep 2007 01:07:48 +0200,
Charlie Gordon <ne**@chqrlie.org> wrote:
>"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...
>>siddhu wrote:

As I know strtok_r is re-entrant version of strtok.

[snip]

>>There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

Come on, strtok_r is part of POSIX. Do you pretend POSIX is not popular
enough.

POSIX is very popular. So is cricket. Neither, however is topical here.

If there were no other place where POSIX were already discussed, one
would have been created, given its popularity.

POSIX is discussed on comp.unix.programmer, and the people there are
very knowledgeable about the subject.

POSIX may not be topical here, but mentioning strtok_r as a widely available
_fixed_ version of broken strtok is more helpful to the OP than the useless
display of obtuse chauvinism expressed ad nauseam by some of the group's
regulars.

Why C99 was published without the reentrant alternatives to
strtok and similar functions is a mystery. I guess the national bodies were
too busy arguing about iso646.h. Other POSIX utility functions are missing
for no reason: strdup for instance. Did the POSIX guys patent those, or is
WG14 allergic to Unix?

--
Chqrlie.

Sep 15 '07 #8

Charlie Gordon wrote:

"CBFalconer" <cb********@yahoo.com> wrote:
>siddhu wrote:
>>>
As I know strtok_r is re-entrant version of strtok.
strtok_r() is called with s1(lets say) as its first parameter.
Remaining tokens from s1 are obtained by calling strtok_r() with
a null pointer for the first parameter. My confusion is that
this behavior is same as strtok. So I assume strtok_r must also
be using any function static variable to keep the information
about s1. If this is the case then how strtok_r is re- entrant?
Otherwise how it keeps the information about s1?

There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

Come on, strtok_r is part of POSIX. Do you pretend POSIX is not
popular enough. Multiple implementations of strtok_r have been
posted before your answer.

Popularity doesn't enter into it. Presence in the standard library
does. strtok_r doesn't exist there. That makes it off-topic here
in c.l.c. (barring source).

>>
/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The

Why skip blanks ? this is not strtok behaviour. The code and the
comment don't agree on what blanks are: by C99 Standard, blanks are
space and tab.

This is not strtok. It is tknsplit. This is behaviour that seems
more useful to me. You don't have to use it, but siddhu may wish
to.

>

.... snip ...

>
Posting the source code to a public version strtok_r would have
been more helpful. The only advantage your function offers over
strtok_r is the fact that it does not modify the source string.

Which, IMO, is a major improvement. It also detects missing
tokens. It (once more) is NOT strtok. I have no idea what
strtok_r is, except that it invades user namespace.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>


Sep 15 '07 #9

Sam Harris

On 15 Sep 2007 at 1:28, Charlie Gordon wrote:

Why did C99 get published without including the reentrant alternatives to
strtok and similar functions is a mystery. I guess the national bodies were
too busy arguing about iso646.h. Other Posix utility functions are missing
for no reason: strdup for instance. Did the Posix guys patent those or is
WG14 allergic to unix ?

You can easily write your own version of strdup in a couple lines. I use
the following:

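#include <stdlib.h>

char *strdup(const char *s)
{
    char *r = NULL;
    size_t i = 0;

    do {
        /* grow by one byte per character; clean up if realloc fails */
        char *tmp = realloc(r, ++i);
        if (tmp == NULL) {
            free(r);
            return NULL;
        }
        r = tmp;
    } while ((r[i - 1] = s[i - 1]) != '\0');
    return r;
}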

Sep 15 '07 #10

"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

Charlie Gordon wrote:
>"CBFalconer" <cb********@yahoo.com> wrote:
>>siddhu wrote:

As I know strtok_r is re-entrant version of strtok.
strtok_r() is called with s1(lets say) as its first parameter.
Remaining tokens from s1 are obtained by calling strtok_r() with
a null pointer for the first parameter. My confusion is that
this behavior is same as strtok. So I assume strtok_r must also
be using any function static variable to keep the information
about s1. If this is the case then how strtok_r is re- entrant?
Otherwise how it keeps the information about s1?

There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

Come on, strtok_r is part of POSIX. Do you pretend POSIX is not
popular enough. Multiple implementations of strtok_r have been
posted before your answer.

Popularity doesn't enter into it. Presence in the standard library
does. strtok_r doesn't exist there. That makes it off-topic here
in c.l.c. (barring source).

I did post source code (my own, put in the public domain)

>>>
/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The

Why skip blanks ? this is not strtok behaviour. The code and the
comment don't agree on what blanks are: by C99 Standard, blanks are
space and tab.

This is not strtok. It is tknsplit. This is behaviour that seems
more useful to me. You don't have to use it, but siddhu may wish
to.

You introduced your function like this: "I just happen to have a suitable
replacement function"
One would expect semantics to be a tad closer.

>>
... snip ...
>>
Posting the source code to a public version strtok_r would have
been more helpful. The only advantage your function offers over
strtok_r is the fact that it does not modify the source string.

Which, IMO, is a major improvement. It also detects missing
tokens. It (once more) is NOT strtok. I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer. You probably never heard of Unix, or even
Linux... Or do you live on this remote planet Microsoft has not settled yet?
If you have no idea what strtok_r is, learn something new today:
http://linux.die.net/man/3/strtok_r or if you like Microsoft's version
better (part of the secure string proposal)
http://msdn2.microsoft.com/en-us/lib...z3(VS.80).aspx

--
Chqrlie

Sep 15 '07 #11

CBFalconer said:

<snip>

I have no idea what strtok_r is, except that it invades user
namespace.

No, it doesn't.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 15 '07 #12

Charlie Gordon said:

"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

<snip>

>I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer.

No, he's toeing the group line, such as it is. As far as comp.lang.c is
concerned, there is *no such function* as strtok_r. If this question
were to arise in, say, comp.unix.programmer, Chuck's answer might be
very different.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 15 '07 #13

Tor Rustad

CBFalconer wrote:

[...]

I have no idea what
strtok_r is, except that it invades user namespace.

If you have no idea what those *_r functions are, it's time for you (as
a Linux user) to read Stevens APUE! :)
str[a-z] is reserved name space, so it isn't part of the user name space.
--
Tor <torust [at] online [dot] no>

Sep 15 '07 #14

Tor Rustad

Richard Heathfield wrote:

Charlie Gordon said:

>"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

<snip>

>>I have no idea what
strtok_r is, except that it invades user namespace.
You must be joking Mr Falconer.

No, he's toeing the group line, such as it is. As far as comp.lang.c is
concerned, there is *no such function* as strtok_r. If this question
were to arise in, say, comp.unix.programmer, Chuck's answer might be
very different.

I have seen Chuck post a number of times over in Linux forums, so it's
rather surprising if he doesn't know about POSIX.

Methinks he knows, but chooses here to pretend he doesn't! :-)

--
Tor <torust [at] online [dot] no>

Sep 15 '07 #15

jacob navia

Charlie Gordon wrote:

>
POSIX may not be topical here, but mentioning strtok_r as a widely available
_fixed_ version of broken strtok is more helpful to the OP than the useless
display of obtuse chauvinism expressed ad nauseam by some of the group's
regulars.

EXACTLY!

Why did C99 get published without including the reentrant alternatives to
strtok and similar functions is a mystery. I guess the national bodies were
too busy arguing about iso646.h. Other Posix utility functions are missing
for no reason: strdup for instance. Did the Posix guys patent those or is
WG14 allergic to unix ?

C99 did not change ANY of the bugs of the standard library

o non reentrant functions like strtok remained and no alternative
was proposed even if POSIX had developed one.

o Buffer overflows were written into the standard itself.
I had a lengthy discussion in comp.std.c about asctime()
and the fixed buffer of 26 positions it says it needs. It
suffices to put some wrong values into the input structure
and you have a buffer overflow. But no corrective action
was taken. Moreover, the committee told the people reporting
the bug that it was OK to have a buffer overflow there.

o gets() was maintained of course. Only after lengthy discussions,
Mr Gwyn felt forced to propose a "fix" that would have fixed the
input buffer size to at least BUFSIZ. The committee apparently
decided that gets() was deprecated, maybe because of the discussion
in comp.std.c, I do not know. In any case it would have been
better to do it when C99 was published.

o Trigraphs were maintained in the standard.

And I could go on with those examples...
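
To illustrate the asctime() item: the sample implementation printed in the standard formats into a fixed 26-char buffer with sprintf, so a large enough tm_year (for instance) writes past the end. The sketch below is one possible bounded alternative, not something the committee proposed; the name bounded_asctime and its return convention are assumptions made for this example.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical bounded replacement for asctime(). Writes at most size
   bytes into buf, using the same layout as asctime(). Returns the number
   of chars written (excluding the '\0'), or -1 if a field is out of range
   or the result does not fit. On -1 the contents of buf are unspecified,
   but nothing is ever written outside it. */
int bounded_asctime(const struct tm *tp, char *buf, size_t size)
{
    static const char wday_name[7][4] = {
        "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
    };
    static const char mon_name[12][4] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    int n;

    if (tp == NULL || buf == NULL || size == 0)
        return -1;
    /* these two fields index arrays, so they must be checked first */
    if (tp->tm_wday < 0 || tp->tm_wday > 6 ||
        tp->tm_mon < 0 || tp->tm_mon > 11)
        return -1;

    /* long long arithmetic avoids int overflow in 1900 + tm_year */
    n = snprintf(buf, size, "%.3s %.3s%3d %.2d:%.2d:%.2d %lld\n",
                 wday_name[tp->tm_wday], mon_name[tp->tm_mon],
                 tp->tm_mday, tp->tm_hour, tp->tm_min, tp->tm_sec,
                 1900LL + tp->tm_year);
    if (n < 0 || (size_t)n >= size)
        return -1;              /* encoding error or truncated output */
    return n;
}
```

With a 26-byte buffer this accepts every date asctime() is specified for and rejects, rather than overruns on, something like tm_year = 100000.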

Sep 15 '07 #16

pete said:

<snip>

Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.

memcpy, surely? Why measure the string twice?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 15 '07 #17

"Richard Heathfield" <rj*@see.sig.invalid> wrote in message news:
Ma*********************@bt.com...

Charlie Gordon said:

>"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

<snip>

>>I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer.

No, he's toeing the group line, such as it is. As far as comp.lang.c is
concerned, there is *no such function* as strtok_r. If this question
were to arise in, say, comp.unix.programmer, Chuck's answer might be
very different.

If he would give a different answer on a different group, one of these
statements would be a lie or a joke.
So he is a fundamentalist, ostracist, extremist...

--
Chqrlie.

Sep 15 '07 #18

Charlie Gordon said:

"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news: Ma*********************@bt.com...
>Charlie Gordon said:

>>"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

<snip>

>>>I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer.

No, he's toeing the group line, such as it is. As far as comp.lang.c
is concerned, there is *no such function* as strtok_r. If this
question were to arise in, say, comp.unix.programmer, Chuck's answer
might be very different.

If he would give a different answer on a different group, one of these
statements would be a lie or a joke.

....or a way of making a point, a la "Ich bin ein Berliner", with which
John F Kennedy bolstered the morale of West Berlin's citizens in June
1963. It was not "true" in the literal sense, but neither was it a lie
or a joke.

So he is a fundamentalist, ostracist, extremist...

If you feel forced to resort to personal attacks, I can only assume you
have no logical arguments to put forward.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 15 '07 #19

jacob navia

Richard Heathfield wrote:

>
If you feel forced to resort to personal attacks, I can only assume you
have no logical arguments to put forward.

Personal attacks are allowed only for friends of Heathfield & Co.

Sep 15 '07 #20

"Richard Heathfield" <rj*@see.sig.invalid> wrote in message news:
tJ******************************@bt.com...

Charlie Gordon said:

>"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news: Ma*********************@bt.com...
>>Charlie Gordon said:

"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

<snip>

I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer.

No, he's toeing the group line, such as it is. As far as comp.lang.c
is concerned, there is *no such function* as strtok_r. If this
question were to arise in, say, comp.unix.programmer, Chuck's answer
might be very different.

If he would give a different answer on a different group, one of these
statements would be a lie or a joke.

...or a way of making a point, a la "Ich bin ein Berliner", with which
John F Kennedy bolstered the morale of West Berlin's citizens in June
1963. It was not "true" in the literal sense, but neither was it a lie
or a joke.

Except CBFalconer is no John F. Kennedy ;-)
His blunt rhetoric does not bolster anyone's morale; sarcasm does no good.

>So he is a fundamentalist, ostracist, extremist...

If you feel forced to resort to personal attacks, I can only assume you
have no logical arguments to put forward.

You are right, I should not have attributed to malice that which can be
adequately explained by plain ignorance. But I repeat: do not use strtok
anymore; check for availability of strtok_r or implement it locally from the
public domain source that has been posted above.

--
Chqrlie.

Sep 15 '07 #21

Francine.Neary

On Sep 15, 1:40 pm, "Joachim Schmitz" <nospam.j...@schmitz-digital.de>
wrote:

If s is NULL, this version only returns NULL if the implementation's
malloc(0) returns NULL too

Entirely consistent with the standard library string functions - if
you pass them a char * that doesn't point to a string, the behavior is
undefined.

<OT>
And this is one case where "the thing you hope will happen" probably
doesn't (e.g. trying to compute strlen(NULL) on a GNU system produces
a seg fault).
</OT>

>
Bye, Jojo

Sep 15 '07 #22

"Joachim Schmitz" <no*********@schmitz-digital.de> wrote in message
news: fc**********@online.de...

"Joachim Schmitz" <no*********@schmitz-digital.de> wrote in message
news:fc**********@online.de...
>>
<Fr************@googlemail.com> wrote in message
news:11**********************@g4g2000hsf.googlegroups.com...
<snip>
My suggestion would be:

#include <stdlib.h>
#include <string.h>

char *my_strdup(const char *s)
{
size_t len;
char *t;
if(t=malloc(len=strlen(s)+1))
memcpy(t, s, len);
return t;
}

If s is NULL, this version only returns NULL if the implementation's
malloc(0) returns NULL too
oops, sorry somehow my quoting was wrong.

Only the last sentence was mine...

And it does not make much sense ;-)
If s is NULL, strlen(s) invokes undefined behaviour.
Otherwise, len is always > 0, and the code does not depend on the behaviour
of malloc(0).

--
Chqrlie.

Sep 15 '07 #23

Joachim Schmitz

<Fr************@googlemail.com> wrote in message
news:11*********************@n39g2000hsh.googlegroups.com...

On Sep 15, 1:40 pm, "Joachim Schmitz" <nospam.j...@schmitz-digital.de>
wrote:
>If s is NULL, this version only returns NULL if the implementation's
malloc(0) returns NULL too

Entirely consistent with the standard library string functions - if
you pass them a char * that doesn't point to a string, the behavior is
undefined.

Fair enough, but I'd prefer my own functions to do better than that, so I
like Charlie Gordon's implementation better.

<OT>
And this is one case where "the thing you hope will happen" probably
doesn't (e.g. trying to compute strlen(NULL) on a GNU system produces
a seg fault).
</OT>

Damn, here too...
anyway: see above

Bye, Jojo

Sep 15 '07 #24

<Fr************@googlemail.com> wrote in message news:
11*********************@g4g2000hsf.googlegroups.com...

On Sep 15, 2:02 pm, "Charlie Gordon" <n...@chqrlie.org> wrote:
><Francine.Ne...@googlemail.com> wrote in message news:
1189858659.251030.194...@g4g2000hsf.googlegroups.com...
>>My suggestion would be:

#include <stdlib.h>
#include <string.h>

>>char *my_strdup(const char *s)
{
size_t len;
char *t;
if(t=malloc(len=strlen(s)+1))
memcpy(t, s, len);
return t;
}

Your code performs the task, but I find it misleading to call len a
variable that is not the length of the string. I prefer to use size for
this purpose.

Furthermore, this code would not pass my default warning settings.
Assignment as a test expression is considered sloppy and error prone.

Tell that to Kernighan and Ritchie. :)

They might read this thread, I am sure they would care to comment.

Coding conventions are a very effective tool to catch bugs at an early
stage in development. Using all the help the compiler and other
automated tools can give in tracking potential errors disguised as
suspicious use of certain operators enhances productivity.

There is no gain in writing

size_t len;
char *t;
if(t=malloc(len=strlen(s)+1)) ...

instead of

size_t size = strlen(s) + 1;
char *t = malloc(size);
if (t) ...

The latter is much more readable and less error prone.

Your version did improve on mine by using memcpy to copy the '\0'
instead of writing separate code for that.

--
Chqrlie.

Is there a reason for the typo in your signature?

chqrlie is my handle, is there a reason you don't sign your messages ?

Sep 15 '07 #25

Joachim Schmitz

"Charlie Gordon" <ne**@chqrlie.org> wrote in message
news:46***********************@news.free.fr...

<Fr************@googlemail.com> wrote in message news:
11*********************@g4g2000hsf.googlegroups.com...
>On Sep 15, 2:02 pm, "Charlie Gordon" <n...@chqrlie.org> wrote:
>><Francine.Ne...@googlemail.com> wrote in message news:
1189858659.251030.194...@g4g2000hsf.googlegroups.com...
My suggestion would be:

#include <stdlib.h>
#include <string.h>

char *my_strdup(const char *s)
{
size_t len;
char *t;
if(t=malloc(len=strlen(s)+1))
memcpy(t, s, len);
return t;
}

Your code performs the task, but I find it misleading to call len a
variable that is not the length of the string. I prefer to use size for
this purpose.

Furthermore, this code would not pass my default warning settings.
Assignment as a test expression is considered sloppy and error prone.

Tell that to Kernighan and Ritchie. :)

They might read this thread, I am sure they would care to comment.

Coding conventions are a very effective tool to catch bugs at an early
stage in development. Using all the help the compiler and other
automated tools can give in tracking potential errors disguised as
suspicious use of certain operators enhances productivity.

There is no gain in writing

size_t len;
char *t;
if(t=malloc(len=strlen(s)+1)) ...

instead of

size_t size = strlen(s) + 1;
char *t = malloc(size);
if (t) ...

The latter is much more readable and less error prone.

Your version did improve on mine by using memcpy to copy the '\0'
instead of writing separate code for that.

True but quite easy to fix:
char *my_strdup(const char *str) {
    char *dest = NULL;

    if (str) {
        size_t size = strlen(str) + 1;
        dest = malloc(size);
        if (dest)
            memcpy(dest, str, size);
    }
    return dest;
}

Sep 15 '07 #26

Charlie Gordon wrote:

"CBFalconer" <cb********@yahoo.com> wrote:

.... snip about tknsplit and strtok ...

>
>Which, IMO, is a major improvement. It also detects missing
tokens. It (once more) is NOT strtok. I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer. You probably never heard of
Unix, or even Linux... Or do you live on this remote planet
Microsoft has not settled yet ? If you have no idea what
strtok_r is, learn something new today:
http://linux.die.net/man/3/strtok_r
or if you like Microsoft's version better (part of the secure
string proposal)
http://msdn2.microsoft.com/en-us/lib...z3(VS.80).aspx

This newsgroup is comp.lang.c. C is defined by the various C
standards, present or past, and includes K&R for times previous to
1989. None of these define, or even mention, strtok_r. Thus,
without standard C code, published in the same message, discussion
of it is off-topic here. The name is still reserved for the
implementor. As I said, it doesn't exist. Unix, Linux, Microsoft
have no influence whatsoever.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Sep 15 '07 #27

Richard Heathfield wrote:

CBFalconer said:

<snip>

>I have no idea what strtok_r is, except that it invades user
namespace.

No, it doesn't.

Well, it's not exactly a typo, but ....

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>


Sep 15 '07 #28

Charlie Gordon wrote:

"pete" <pf*****@mindspring.com> wrote:

.... snip ...

>>
Instead of using realloc in a loop, I think most programmers
would write strdup with one function call to strlen and one to
malloc and one to strcpy.

Or more efficiently calling memcpy instead of strcpy.

char *strdup(const char *str) {
    size_t len;
    char *dest = NULL;

    if (str) {
        len = strlen(str);
        dest = malloc(len + 1);
        if (dest) {
            memcpy(dest, str, len);
            dest[len] = '\0';
        }
    }
    return dest;
}

I challenge the 'more efficient'. It will be highly dependent on
the compiler, but at the simplest you would be trading the effort
of an extra procedure call against the possible efficiency
improvement. Since most strings are short (in my case, probably
under 10 or 20 chars) this 'improvement' is a chimera. Also
bearing in mind that strdup is a system reserved name, my version
(with #include <stdlib.h> and #include <string.h>) is:

char *dupstr(const char *str) {
    char *dest, *temp;

    if (dest = malloc(1 + strlen(str))) {
        temp = dest;
        while (*temp++ = *str++) continue;
    }
    return dest;
}

and I am willing to let it go boom when str is NULL, for early
warning etc. of problems.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>


Sep 15 '07 #29

Joe Wright

Richard Heathfield wrote:

pete said:

<snip>

>Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.

memcpy, surely? Why measure the string twice?

Huh? strcpy doesn't measure anything.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Sep 15 '07 #30

Joe Wright said:

Richard Heathfield wrote:
>pete said:

<snip>

>>Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.

memcpy, surely? Why measure the string twice?

Huh? strcpy doesn't measure anything.

Sorry, Joe - I was guilty of truncated exegesis. What I meant was this:
that strcpy must keep going until it hits a null terminator, and it
doesn't know in advance where that null terminator will be found, so it
must test every character. So, although it isn't measuring the string
as such, that's only because it doesn't bother to write down how long
the string is. It's still ploughing through the string, character by
character. But we've already done that with our strlen call. By using
memcpy, we can take advantage of the fact that the string has already
been measured - memcpy can use any number of platform-specific tricks
for copying multiple bytes at a time. Therefore, if the length of the
string to be copied is known in advance, it is (likely to be) more
efficient to use memcpy than strcpy.
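
A minimal sketch of that argument, assuming a hypothetical helper named dup_known_length (the name is not from the thread or from any library): the string is measured once with strlen, and memcpy then reuses that size instead of searching for the terminator a second time.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for illustration: duplicate a string, measuring it once. */
char *dup_known_length(const char *s)
{
    size_t size = strlen(s) + 1;   /* one pass to find the terminator */
    char *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, s, size);     /* size already known; copies the '\0' too */
    return copy;                   /* NULL if malloc failed */
}
```

Whether this is measurably faster than strcpy depends on the library and the string lengths involved; the sketch only shows where the second scan is avoided.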

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 15 '07 #31

"Charlie Gordon" <ne**@chqrlie.org> writes:
[...]

But I repeat: to stop using
strtok, check for availability of strtok_r or implement it
locally from the public domain source that has been posted above.

Well, maybe.

The standard function strtok() is non-reentrant *and* it has some
other -- well, not bugs necessarily, but quirks. For example, the
fact that it merges multiple adjacent delimiters can be inconvenient,
though it might be just what you need. (In practice, you usually want
this behavior if the delimiter is whitespace, but not if it's
something else.)
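
The merging behavior can be seen with a small sketch. The helper name count_tokens is hypothetical and exists only for this illustration; it uses nothing but standard strtok().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the tokens strtok() finds in buf.
   buf is modified in place, as strtok() always does. */
size_t count_tokens(char *buf, const char *delims)
{
    size_t n = 0;
    char *tok;

    for (tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;
    return n;
}
```

Given a writable buffer holding "a,,b" and the delimiter string ",", this counts two tokens: the empty field between the two commas is skipped rather than reported. For whitespace-separated words that is usually what is wanted; for comma-separated fields it often is not.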

If you're using strtok() and it already does exactly what you want
except for the lack of reentrancy, then strtok_r() (if it's available
on your system -- and if not, you can compile it yourself) is just the
thing. If your requirements are less specific, then tknsplit() might
turn out to be perfect for you -- or some other non-standard function
might be better.
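
As a sketch of what the reentrancy buys, assume the strtok_r() logic posted earlier in the thread, copied here under the hypothetical name tok_r so it cannot clash with a system version. The caller-supplied state pointer lets two tokenizations run at the same time, one nested inside the other; count_words is likewise a made-up name for the example.

```c
#include <stddef.h>
#include <string.h>

/* Same logic as the strtok_r() source posted earlier in the thread,
   under a hypothetical name. The state lives in *nextp, not in a static. */
char *tok_r(char *str, const char *delim, char **nextp)
{
    char *ret;

    if (str == NULL)
        str = *nextp;
    str += strspn(str, delim);      /* skip leading delimiters */
    if (*str == '\0')
        return NULL;
    ret = str;
    str += strcspn(str, delim);     /* find the end of the token */
    if (*str)
        *str++ = '\0';
    *nextp = str;
    return ret;
}

/* Count words across all ';'-separated records, using two independent states. */
size_t count_words(char *text)
{
    size_t n = 0;
    char *outer, *inner;
    char *rec, *word;

    for (rec = tok_r(text, ";", &outer); rec != NULL;
         rec = tok_r(NULL, ";", &outer))
        for (word = tok_r(rec, " ", &inner); word != NULL;
             word = tok_r(NULL, " ", &inner))
            n++;
    return n;
}
```

With plain strtok() the inner loop would overwrite the single hidden state and the outer loop would lose its place; here each loop keeps its own.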

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 15 '07 #32

"Joachim Schmitz" <no*********@schmitz-digital.de> writes:

<Fr************@googlemail.com> wrote in message
news:11*********************@n39g2000hsh.googlegroups.com...
>On Sep 15, 1:40 pm, "Joachim Schmitz" <nospam.j...@schmitz-digital.de>
wrote:
>>If s is NULL, this version only returns NULL if the implementation's
malloc(0) returns NULL too

Entirely consistent with the standard library string functions - if
you pass them a char * that doesn't point to a string, the behavior is
undefined.
Fair enough, but I'd prefer my own functions to do better than that, so I
like Charlie Gordon's implementation better.

><OT>
And this is one case where "the thing you hope will happen" probably
doesn't (e.g. trying to compute strlen(NULL) on a GNU system produces
a seg fault).
</OT>
Damn, here too...
anyway: see above

How is returning (size_t)-1 better than a seg fault?

If I pass NULL to strlen(), there's a bug in my program. I'd like to
find out about it as early as possible. If strlen() quietly returns
-1, I might not detect the error until much later.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 15 '07 #33

Chris Torek

>Charlie Gordon wrote:

>>Why C99 was published without including the reentrant alternatives to
strtok and similar functions is a mystery.

Ask the people who were on the committees. :-) (Seriously, I do not
know why the non-reentrant versions were retained without at least
some sort of cleanup.)

In article <46**********************@news.orange.fr>
jacob navia <ja***@jacob.remcomp.fr> wrote:

>C99 did not change ANY of the bugs of the standard library

I am not sure I would call all of these "bugs". ("Misfeatures",
perhaps, especially trigraphs :-) . More seriously, just two
points here:)

>o non reentrant functions like strtok remained and no alternative
was proposed even if POSIX had developed one.

While strtok_r() is an improvement on strtok(), it leaves one of
strtok()'s fundamental flaws in place. If one is going to "improve"
strtok(), one should at least look at the BSD strsep().

Still, importing the whole set of POSIX "_r" functions would, I
think, have been better than doing nothing.
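
For readers who have not met strsep(): one commonly cited difference from strtok() and strtok_r(), and possibly the flaw meant here, is that it reports empty fields instead of merging adjacent delimiters. The sketch below is a stand-in written from the documented behavior of BSD strsep(), not the BSD source, and the name sep_field is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in modeled on the documented behavior of BSD strsep().
   *stringp is advanced past each field and set to NULL after the last one. */
char *sep_field(char **stringp, const char *delims)
{
    char *start = *stringp;
    char *end;

    if (start == NULL)
        return NULL;                 /* no more fields */
    end = start + strcspn(start, delims);
    if (*end != '\0') {
        *end = '\0';                 /* terminate this field */
        *stringp = end + 1;          /* next field starts after the delimiter */
    } else {
        *stringp = NULL;             /* that was the last field */
    }
    return start;                    /* may be an empty string */
}
```

Applied to "a,,b" with "," as the delimiter, successive calls yield "a", an empty string, and "b", then NULL. Like strtok_r(), the state is held by the caller, so the function is reentrant.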

>o Buffer overflows were written into the standard itself.

This is, at best, an overstatement.

I had a lengthy discussion in comp.std.c about asctime()
and the fixed buffer of 26 positions it says it needs. It
suffices to put some wrong values into the input structure
and you have a buffer overflow.

If you "put some wrong values" in, you have little hope of expecting
*anything* -- what happens in lcc-win32, for instance, if I write:

struct big { int a[1000]; };
struct big main(double oops) {
    short x = strlen((char *)0x98766542);
    ... /* more "wrong values" as inputs as needed */
    return *(struct big *)42;
}

? If you want to protect against bad inputs, you need to think
hard about which kinds of "bad inputs" to guard against, and do
some serious cost/benefit analysis.

Moreover, if your objection is that values of .tm_year greater
than 8100 (or less than or equal to some negative number) cause
problems, you can always test for that in your own implementation:

__internal_return_type __internal_worker_function_for_times(...) {
    ...
    if (OUT_OF_RANGE(tm->tm_year)) ... signal error ...
    ...
}

which might be used as, e.g.:

char *asctime(const struct tm *tm) {
    ...
    if (__internal_worker_function_for_times(...) == ERROR)
        __runtime_error_trap_report("invalid parameter to asctime()");
    ...
}

and thus demonstrate the superior Quality of Implementation of
lcc-win32, with regard to this particular possibility. (Presumably
__runtime_error_trap_report saves the state of the program for use
in the debugger, prints a stack trace, and/or does whatever else
is good for fixing program bugs.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Sep 15 '07 #34

rafaelc

On 2007-09-15, Keith Thompson <ks***@mib.org> wrote:

How is returning (size_t)-1 better than a seg fault?

If I pass NULL to strlen(), there's a bug in my program. I'd like to
find out about it as early as possible. If strlen() quietly returns
-1, I might not detect the error until much later.

By returning -1, strlen is not being so quiet. If you check the function's
return value, you can catch the error at least as well as if it had just
segfaulted. It might not segfault for some reason, but you would always be
able to check the return value and tell that something is wrong.

Sep 15 '07 #35

pete

Richard Heathfield wrote:

>
pete said:

<snip>

Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.

memcpy, surely? Why measure the string twice?

I suppose I was unduly influenced by the strdup example in K&R2.

--
pete

Sep 15 '07 #36

ra*****@dcc.ufmg.br writes:

On 2007-09-15, Keith Thompson <ks***@mib.org> wrote:
>How is returning (size_t)-1 better than a seg fault?
If I pass NULL to strlen(), there's a bug in my program. I'd like to
find out about it as early as possible. If strlen() quietly returns
-1, I might not detect the error until much later.

By returning -1, strlen is not being so quiet. If you check the function's
return value, you can catch the error at least as well as if it had just
segfaulted. It might not segfault for some reason, but you would always be
able to check the return value and tell that something is wrong.

Since returning (size_t)-1 is non-standard behavior (though it's
allowed), I'm not likely to check for it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 15 '07 #37

Sam Harris

On 15 Sep 2007 at 2:31, Charlie Gordon wrote:

Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes undefined behaviour when running out of memory, it
should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company, I think I can just
about write a strdup function that works more than adequately on any
machine I'd ever want to run it on.

Sep 15 '07 #38

Sam Harris said:

On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
>Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes undefined behaviour when running out of memory,
it should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company, I think I can
just about write a strdup function that works more than adequately on
any machine I'd ever want to run it on.

Claiming that you work for a Fortune 500 company might impress your
aunt, but the fact remains that your implementation of a string
duplication function left a lot to be desired. You would do well to
learn from your mistakes, rather than try to justify them.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 15 '07 #39

Mark McIntyre

On Sat, 15 Sep 2007 13:07:28 +0200, in comp.lang.c, "Charlie Gordon"
<ne**@chqrlie.org> wrote:

>
If he would give a different answer on a different group, one of these
statements would be a lie or a joke.

If my teenage son asks me how they work out the price of credit
default swaps, I give one answer.

If a junior quant analyst in a bank asks me the same question, I give an
entirely different answer.

I assure you, neither answer is a lie or a joke.

>So he is a fundamentalist, ostracist, extremist...

Or he's tailoring his answer to the forum of the question.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Sep 15 '07 #40

P.J. Plauger

"jacob navia" <ja***@jacob.remcomp.fr> wrote in message
news:46***********************@news.orange.fr...

Mr Clive Feather submitted a defect report saying in substance the
same thing as I said. The committee answer was:

<quote>
Thus, asctime() may exhibit undefined behavior if any of the members of
timeptr produce undefined behavior in the sample algorithm (for example,
if the timeptr->tm_wday is outside the range 0 to 6 the function may
index beyond the end of an array).

As always, the range of undefined behavior permitted includes:
- Corrupting memory
- Aborting the program
- Range checking the argument and returning a failure indicator (e.g., a
  null pointer)
- Returning truncated results within the traditional 26 byte buffer.

There is no consensus to make the suggested change or any change along
this line.
<end quote>

You read correctly. Corrupting memory (i.e. a buffer overflow) is
within the range of undefined behavior acceptable!!!!

I have the right then, to name a buffer overflow for what it is, a
buffer overflow in the C standard with all the committee behind it.

You have the right to misread anything. You have a responsibility, as
a self-proclaimed expert, to think with something other than your
gonads. The response you quoted clearly encompasses a variety
of nicer behaviors than buffer overflow, but you neglect to take
them in.

The committee *accepts* that buffer overflow can occur in a
conforming implementation. The same is true of:

int a[10];
a[300] = 4;

And the committee is "behind" an implementation that overwrites
storage when this ill-formed program executes. (The committee
is also "behind" an implementation that aborts with a diagnostic
message.)

Get over it.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Sep 15 '07 #41

Mark McIntyre

On Sun, 16 Sep 2007 00:21:43 +0200 (CEST), in comp.lang.c, Sam Harris
<no****@in.valid> wrote:

>On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
>Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes undefined behaviour when running out of memory, it
should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company,

Whoopy doo. *Anyone* will tell you that this is absolutely no
guarantee whatsoever of coding quality or ability.

I think I can just
about write a strdup function that works more than adequately on any
machine I'd ever want to run it on.

Pardon me, but you had a few actual errors pointed out, perhaps you
should consider being less arrogant?
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Sep 15 '07 #42

pete

Sam Harris wrote:

>
On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes
undefined behaviour when running out of memory, it
should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company,
I think I can just
about write a strdup function that works more than adequately on any
machine I'd ever want to run it on.

You posted an extraordinarily crappy code example.

--
pete

Sep 15 '07 #43

Tor Rustad

Richard Heathfield wrote:

Joe Wright said:

>Richard Heathfield wrote:
>>pete said:

<snip>

Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.
memcpy, surely? Why measure the string twice?

Huh? strcpy doesn't measure anything.

Sorry, Joe - I was guilty of truncated exegesis. What I meant was this:
that strcpy must keep going until it hits a null terminator, and it
doesn't know in advance where that null terminator will be found, so it
must test every character. So, although it isn't measuring the string
as such, that's only because it doesn't bother to write down how long
the string is. It's still ploughing through the string, character by
character. But we've already done that with our strlen call. By using
memcpy, we can take advantage of the fact that the string has already
been measured - memcpy can use any number of platform-specific tricks
for copying multiple bytes at a time. Therefore, if the length of the
string to be copied is known in advance, it is (likely to be) more
efficient to use memcpy than strcpy.

Well, some 5 years ago, I made a similar comment on your code Richard,
which was using strcpy() at the time. We had a rather "long" argument
about it, and in the end, I tried to make my point by measuring memcpy()
vs strcpy() performance.

IIRC, the result of those tests was rather humiliating for me, as your
strcpy() performed excellently! :-)

Is there a reason to believe that strcpy() has become more CPU
bound in recent years? If not, I don't think you will have much success
in measuring an improvement by using memcpy().

Making good measurements on this is a challenge. We don't want to
measure L1 cache performance only.

--
Tor <torust [at] online [dot] no>

Sep 16 '07 #44

Sam Harris <no****@in.valid> writes:

On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
>Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes undefined behaviour when running out of memory, it
should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company, I think I can just
about write a strdup function that works more than adequately on any
machine I'd ever want to run it on.

You haven't demonstrated it so far. Frankly, when I read your
implementation upthread I assumed it was a joke. You call realloc()
once for each character; why not compute the length and call malloc()
just once?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 16 '07 #45

jacob navia <ja***@jacob.remcomp.fr> writes:

Chris Torek wrote:
>jacob navia <ja***@jacob.remcomp.fr> wrote:
>>o Buffer overflows were written into the standard itself.
This is, at best, an overstatement.

A buffer overflow happens when a fixed size memory area is defined
but a program writes PAST the fixed size buffer. This is a buffer
overflow.

Now, the standard specifies a buffer length of 26 for the buffer of
asctime.

[...]

Yes, calling asctime() with certain arguments can result in a buffer
overflow.

Calling strcpy() with certain arguments can result in a buffer
overflow. Likewise for sprintf(), sscanf(), memcpy(), memmove(),
strcat(), etc. In all these cases, the arguments passed are under the
program's control; the problem can reliably be avoided by checking the
arguments before invoking the function.
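
For asctime() specifically, such a check might look like the sketch below. The wrapper name checked_asctime is hypothetical, and the limits are an assumption chosen conservatively: fields outside their usual ranges, or a year outside 0..9999, are rejected instead of being passed on.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical wrapper: refuse inputs that could overflow the 26-byte
   buffer described for asctime(), instead of calling it with them. */
char *checked_asctime(const struct tm *tm)
{
    if (tm == NULL
        || tm->tm_wday < 0 || tm->tm_wday > 6
        || tm->tm_mon  < 0 || tm->tm_mon  > 11
        || tm->tm_mday < 1 || tm->tm_mday > 31
        || tm->tm_hour < 0 || tm->tm_hour > 23
        || tm->tm_min  < 0 || tm->tm_min  > 59
        || tm->tm_sec  < 0 || tm->tm_sec  > 60
        || tm->tm_year < -1900 || tm->tm_year > 8099)  /* years 0..9999 */
        return NULL;
    return asctime(tm);   /* same static buffer as a direct call */
}
```

The caller then tests for NULL, much as with malloc(); the check does not verify that the weekday actually matches the date, only that every field fits the output format.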

I happen to agree that asctime() should be defined to use a larger
buffer, one big enough so that the buffer won't overflow for any
possible arguments. But the problem is so easy to avoid that it's
hardly a fatal flaw in the language -- and it can't overflow if you
give it an argument corresponding to the current time (at least not
for the next 8000 years or so). It's certainly not nearly as
dangerous as gets().

I generally wouldn't use asctime() anyway. The format it uses isn't
my favorite (I prefer YYYY-MM-DD for dates), and the trailing '\n' is
more trouble than it's worth. In real code, I'd use strftime()
instead, which is more flexible and doesn't have asctime()'s problems.
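
A minimal sketch of that alternative, assuming a hypothetical helper named format_iso_date: strftime() writes into a buffer whose size the caller passes in, so an undersized buffer yields a return value of 0 rather than an overflow.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: write the date as YYYY-MM-DD into buf.
   Returns the number of characters written (excluding the '\0'),
   or 0 if buf is too small to hold the result. */
size_t format_iso_date(char *buf, size_t size, const struct tm *tm)
{
    return strftime(buf, size, "%Y-%m-%d", tm);
}
```

A 16-byte buffer is more than enough for four-digit years, and there is no trailing '\n' to strip afterwards.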

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 16 '07 #46