strtok and strtok_r

Dear experts,

As far as I know, strtok_r is the re-entrant version of strtok.
strtok_r() is called with s1 (let's say) as its first parameter.
The remaining tokens from s1 are obtained by calling strtok_r() with a
null pointer as the first parameter.
My confusion is that this behavior is the same as strtok's. So I assume
strtok_r must also be using a function-static variable to keep the
information about s1. If that is the case, how is strtok_r re-entrant?
Otherwise, how does it keep the information about s1?

Regards,
Siddharth

Sep 14 '07 #1
siddhu wrote:
> Dear experts,
>
> As I know strtok_r is re-entrant version of strtok.
> strtok_r() is called with s1(lets say) as its first parameter.
> Remaining tokens from s1 are obtained by calling strtok_r() with a
> null pointer for the first parameter.
> My confusion is that this behavior is same as strtok. So I assume
> strtok_r must also be using any function static variable to keep the
> information about s1. If this is the case then how strtok_r is
> re-entrant?
> Otherwise how it keeps the information about s1?
>
> Regards,
> Siddharth

The reentrant version takes one more argument where it stores its progress:
http://www.bullfreeware.com/download...-1.0.9/support
// Skip GNU copyright
#include <string.h>
/* Parse S into tokens separated by characters in DELIM.
   If S is NULL, the saved pointer in SAVE_PTR is used as
   the next starting point.  For example:
        char s[] = "-abc-=-def";
        char *sp;
        x = strtok_r(s, "-", &sp);      // x = "abc", sp = "=-def"
        x = strtok_r(NULL, "-=", &sp);  // x = "def", sp = "" (the final '\0')
        x = strtok_r(NULL, "=", &sp);   // x = NULL
                // s = "-abc\0=-def\0"
*/
char *strtok_r (char *s,
                const char *delim,
                char **save_ptr)
{
  char *token;

  if (s == NULL)
    s = *save_ptr;

  /* Scan leading delimiters. */
  s += strspn (s, delim);
  if (*s == '\0')
    return NULL;

  /* Find the end of the token. */
  token = s;
  s = strpbrk (token, delim);
  if (s == NULL)
    /* This token finishes the string. */
    *save_ptr = strchr (token, '\0');
  else
    {
      /* Terminate the token and make *SAVE_PTR point past it. */
      *s = '\0';
      *save_ptr = s + 1;
    }
  return token;
}
Sep 14 '07 #2
siddhu <si***********@gmail.com> writes:
> As I know strtok_r is re-entrant version of strtok.
This is true on a system compliant with, e.g., POSIX, but it is
not required by C. Followups set.
[...misunderstanding...]
I think the problem is that you do not realize that strtok_r
takes one more parameter than strtok, and uses that parameter to
save state from one call to the next.
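
To make the difference concrete, here is a minimal sketch of what goes wrong when plain strtok is nested. The helper name count_fields_with_strtok and the record format (fields separated by commas, records by semicolons) are made up for illustration; only strtok itself is standard.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: tries to count the comma-separated fields in
   every semicolon-separated record of buf, using plain strtok for
   both loops.  buf is modified in place. */
static int count_fields_with_strtok(char *buf)
{
    char *record, *field;
    int n = 0;

    assert(buf != NULL);
    for (record = strtok(buf, ";"); record != NULL;
         record = strtok(NULL, ";")) {
        /* This inner call overwrites the single position that strtok
           remembered on behalf of the outer loop. */
        for (field = strtok(record, ","); field != NULL;
             field = strtok(NULL, ",")) {
            n++;
        }
    }
    return n;
}
```

For the buffer "a,b;c,d,e;f" this returns 2 rather than 6: once the inner loop has exhausted the first record, the outer strtok(NULL, ";") resumes from the inner loop's saved position and finds nothing left.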
--
char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
=b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}
Sep 14 '07 #3
"siddhu" <si***********@gmail.com> wrote in message news:
11**********************@g4g2000hsf.googlegroups.com...
> Dear experts,
>
> As I know strtok_r is re-entrant version of strtok.
> strtok_r() is called with s1(lets say) as its first parameter.
> Remaining tokens from s1 are obtained by calling strtok_r() with a
> null pointer for the first parameter.
> My confusion is that this behavior is same as strtok. So I assume
> strtok_r must also be using any function static variable to keep the
> information about s1. If this is the case then how strtok_r is
> re-entrant?
> Otherwise how it keeps the information about s1?

strtok_r takes an extra parameter, a pointer to a char * where it stores its
current state.

The implementation is quite straightforward:

char *strtok_r(char *str, const char *delim, char **nextp)
{
    char *ret;

    if (str == NULL)
        str = *nextp;
    str += strspn(str, delim);
    if (*str == '\0')
        return NULL;
    ret = str;
    str += strcspn(str, delim);
    if (*str)
        *str++ = '\0';
    *nextp = str;
    return ret;
}
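
Because the state lives in a variable owned by the caller, two tokenizations can be interleaved. The following is a minimal sketch, assuming a POSIX system that declares strtok_r in <string.h>; count_fields and the record format are hypothetical names chosen for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* request the POSIX declaration of strtok_r */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts the comma-separated fields in every
   semicolon-separated record of buf.  buf is modified in place,
   just as strtok would modify it. */
static int count_fields(char *buf)
{
    char *outer_state;  /* progress of the record loop */
    char *inner_state;  /* progress of the field loop  */
    char *record, *field;
    int n = 0;

    assert(buf != NULL);
    for (record = strtok_r(buf, ";", &outer_state);
         record != NULL;
         record = strtok_r(NULL, ";", &outer_state)) {
        for (field = strtok_r(record, ",", &inner_state);
             field != NULL;
             field = strtok_r(NULL, ",", &inner_state)) {
            n++;
        }
    }
    return n;
}
```

Each loop passes its own save pointer, so the inner loop cannot disturb the outer one; for "a,b;c,d,e;f" all six fields are counted.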

--
Chqrlie.
Sep 14 '07 #4
siddhu wrote:
>
> As I know strtok_r is re-entrant version of strtok.
> strtok_r() is called with s1(lets say) as its first parameter.
> Remaining tokens from s1 are obtained by calling strtok_r() with
> a null pointer for the first parameter. My confusion is that
> this behavior is same as strtok. So I assume strtok_r must also
> be using any function static variable to keep the information
> about s1. If this is the case then how strtok_r is re-entrant?
> Otherwise how it keeps the information about s1?

There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The
tkn is terminated by the first appearance of tknchar,
or by the end of the source string.

The caller must supply sufficient space in tkn to
receive any tkn, Otherwise tkns will be truncated.

Returns: a pointer past the terminating tknchar.

This will happily return an infinity of empty tkns if
called with src pointing to the end of a string. Tokens
will never include a copy of tknchar.

A better name would be "strtkn", except that is reserved
for the system namespace. Change to that at your risk.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
Revised 2006-06-13 2007-05-26 (name)
*/

const char *tknsplit(const char *src, /* Source of tkns */
                     char tknchar,    /* tkn delimiting char */
                     char *tkn,       /* receiver of parsed tkn */
                     size_t lgh)      /* length tkn can receive */
                                      /* not including final '\0' */
{
   if (src) {
      while (' ' == *src) src++;

      while (*src && (tknchar != *src)) {
         if (lgh) {
            *tkn++ = *src;
            --lgh;
         }
         src++;
      }
      if (*src && (tknchar == *src)) src++;
   }
   *tkn = '\0';
   return src;
} /* tknsplit */

#ifdef TESTING
#include <stdio.h>

#define ABRsize 6 /* length of acceptable tkn abbreviations */

/* ---------------- */

static void showtkn(int i, char *tok)
{
putchar(i + '1'); putchar(':');
puts(tok);
} /* showtkn */

/* ---------------- */

int main(void)
{
char teststring[] = "This is a test, ,, abbrev, more";

const char *t, *s = teststring;
int i;
char tkn[ABRsize + 1];

puts(teststring);
t = s;
for (i = 0; i < 4; i++) {
t = tknsplit(t, ',', tkn, ABRsize);
showtkn(i, tkn);
}

puts("\nHow to detect 'no more tkns' while truncating");
t = s; i = 0;
while (*t) {
t = tknsplit(t, ',', tkn, 3);
showtkn(i, tkn);
i++;
}

puts("\nUsing blanks as tkn delimiters");
t = s; i = 0;
while (*t) {
t = tknsplit(t, ' ', tkn, ABRsize);
showtkn(i, tkn);
i++;
}
return 0;
} /* main */

#endif
/* ------- end file tknsplit.c ----------*/

/* ------- file tknsplit.h ----------*/
#ifndef H_tknsplit_h
# define H_tknsplit_h

# ifdef __cplusplus
extern "C" {
# endif

#include <stddef.h>

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The
tkn is terminated by the first appearance of tknchar,
or by the end of the source string.

The caller must supply sufficient space in tkn to
receive any tkn, Otherwise tkns will be truncated.

Returns: a pointer past the terminating tknchar.

This will happily return an infinity of empty tkns if
called with src pointing to the end of a string. Tokens
will never include a copy of tknchar.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
revised 2007-05-26 (name)
*/

const char *tknsplit(const char *src, /* Source of tkns */
char tknchar, /* tkn delimiting char */
char *tkn, /* receiver of parsed tkn */
size_t lgh); /* length tkn can receive */
/* not including final '\0' */

# ifdef __cplusplus
}
# endif
#endif
/* ------- end file tknsplit.h ----------*/

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Sep 14 '07 #5
"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...
siddhu wrote:
>>
As I know strtok_r is re-entrant version of strtok.
strtok_r() is called with s1(lets say) as its first parameter.
Remaining tokens from s1 are obtained by calling strtok_r() with
a null pointer for the first parameter. My confusion is that
this behavior is same as strtok. So I assume strtok_r must also
be using any function static variable to keep the information
about s1. If this is the case then how strtok_r is re- entrant?
Otherwise how it keeps the information about s1?

There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:
Come on, strtok_r is part of POSIX. Are you claiming POSIX is not popular
enough?
Multiple implementations of strtok_r had been posted before your answer.
>
/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The
Why skip blanks? This is not strtok behaviour.
The code and the comment don't agree on what blanks are: by the C99 Standard,
blanks are space and tab.
tkn is terminated by the first appearance of tknchar,
or by the end of the source string.
Your function definitely differs a lot from strtok that takes a collection
of delimiters instead of a single char.
The caller must supply sufficient space in tkn to
receive any tkn, Otherwise tkns will be truncated.

Returns: a pointer past the terminating tknchar.

This will happily return an infinity of empty tkns if
called with src pointing to the end of a string. Tokens
will never include a copy of tknchar.
Again, this is not the behaviour of strtok, which treats a sequence of
separators as a single one.
A better name would be "strtkn", except that is reserved
for the system namespace. Change to that at your risk.

released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
Revised 2006-06-13 2007-05-26 (name)
*/

const char *tknsplit(const char *src, /* Source of tkns */
char tknchar, /* tkn delimiting char */
char *tkn, /* receiver of parsed tkn */
size_t lgh) /* length tkn can receive */
/* not including final '\0' */
I have reservations about your API:
- Instead of returning a const char *, you should return the number of chars
skipped. That would prevent const poisoning when you pass a regular char * but
cannot store the return value into the same variable. It would also allow
trivial testing for the end of the string.
- The lgh parameter should be the size of the destination array
(sizeof(buf)), for consistency with other C library functions such as
snprintf, and to avoid off-by-one errors: if callers passed sizeof(destbuf) -
1, they wouldn't invoke UB, whereas they do by passing sizeof(destbuf)
with your current semantics.
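
As an illustration only, the two suggestions could look like the sketch below. The name tkn_copy, its parameter names, and the decision to drop the leading-blank skipping are all assumptions of this example, not code from either poster.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of tknsplit:
   - size is the full size of dst, terminator included (as with snprintf);
   - the return value is the number of source characters consumed, so
     the caller advances with src += n and no const pointer leaks out.
   A return value of 0 means src already pointed at the end of the string. */
static size_t tkn_copy(const char *src, char delim, char *dst, size_t size)
{
    size_t n = 0;    /* characters consumed from src */
    size_t out = 0;  /* characters stored in dst     */

    assert(src != NULL && dst != NULL);
    while (src[n] != '\0' && src[n] != delim) {
        if (out + 1 < size)        /* always leave room for '\0' */
            dst[out++] = src[n];
        n++;
    }
    if (delim != '\0' && src[n] == delim)
        n++;                       /* step over the delimiter */
    if (size > 0)
        dst[out] = '\0';
    return n;
}
```

With this shape a caller can pass sizeof buf directly, and an over-long token is truncated instead of overflowing the buffer.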
{
if (src) {
while (' ' == *src) src++;

while (*src && (tknchar != *src)) {
if (lgh) {
*tkn++ = *src;
--lgh;
}
src++;
}
if (*src && (tknchar == *src)) src++;
}
*tkn = '\0';
return src;
} /* tknsplit */

[... rest of the quoted tknsplit.c test code and tknsplit.h snipped ...]
Posting the source code to a public version of strtok_r would have been more
helpful.
The only advantage your function offers over strtok_r is the fact that it
does not modify the source string.

--
Chqrlie.
Sep 14 '07 #6
On Sat, 15 Sep 2007 01:07:48 +0200,
Charlie Gordon <ne**@chqrlie.org> wrote:
"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...
>siddhu wrote:
>>>
As I know strtok_r is re-entrant version of strtok.
[snip]
>There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

Come on, strtok_r is part of POSIX. Do you pretend POSIX is not popular
enough.
POSIX is very popular. So is cricket. Neither, however is topical here.

If there were no other place where POSIX were already discussed, one
would have been created, given its popularity.

POSIX is discussed on comp.unix.programmer, and the people there are
very knowledgeable about the subject.

Regards,
Martien
--
|
Martien Verbruggen | Failure is not an option. It comes bundled
| with your Microsoft product.
|
Sep 15 '07 #7
"Martien Verbruggen" <mg**@tradingpost.com.au> wrote in message
news: sl*****************@martien.heliotrope.home...
On Sat, 15 Sep 2007 01:07:48 +0200,
Charlie Gordon <ne**@chqrlie.org> wrote:
>"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...
>>siddhu wrote:

As I know strtok_r is re-entrant version of strtok.

[snip]
>>There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

Come on, strtok_r is part of POSIX. Do you pretend POSIX is not popular
enough.

POSIX is very popular. So is cricket. Neither, however is topical here.

If there were no other place where POSIX were already discussed, one
would have been created, given its popularity.

POSIX is discussed on comp.unix.programmer, and the people there are
very knowledgeable about the subject.
POSIX may not be topical here, but mentioning strtok_r as a widely available
_fixed_ version of broken strtok is more helpful to the OP than the useless
display of obtuse chauvinism expressed ad nauseam by some of the group's
regulars.

Why C99 was published without including the reentrant alternatives to
strtok and similar functions is a mystery. I guess the national bodies were
too busy arguing about iso646.h. Other POSIX utility functions are missing
for no reason: strdup, for instance. Did the POSIX guys patent those, or is
WG14 allergic to Unix?

--
Chqrlie.
Sep 15 '07 #8
Charlie Gordon wrote:
"CBFalconer" <cb********@yahoo.com> wrote:
>siddhu wrote:
>>>
As I know strtok_r is re-entrant version of strtok.
strtok_r() is called with s1(lets say) as its first parameter.
Remaining tokens from s1 are obtained by calling strtok_r() with
a null pointer for the first parameter. My confusion is that
this behavior is same as strtok. So I assume strtok_r must also
be using any function static variable to keep the information
about s1. If this is the case then how strtok_r is re- entrant?
Otherwise how it keeps the information about s1?

There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

Come on, strtok_r is part of POSIX. Do you pretend POSIX is not
popular enough. Multiple implementations of strtok_r have been
posted before your answer.
Popularity doesn't enter into it. Presence in the standard library
does. strtok_r doesn't exist there. That makes it off-topic here
in c.l.c. (barring source).
>>
/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The

Why skip blanks ? this is not strtok behaviour. The code and the
comment don't agree on what blanks are: by C99 Standard, blanks are
space and tab.
This is not strtok. It is tknsplit. This is behaviour that seems
more useful to me. You don't have to use it, but siddhu may wish
to.
>
.... snip ...
>
Posting the source code to a public version strtok_r would have
been more helpful. The only advantage your function offers over
strtok_r is the fact that it does not modify the source string.
Which, IMO, is a major improvement. It also detects missing
tokens. It (once more) is NOT strtok. I have no idea what
strtok_r is, except that it invades user namespace.
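
The "detects missing tokens" point can be shown with a small comparison. This is a sketch under assumptions: both helper names are invented here, and the first one only illustrates the non-destructive idea, it is not tknsplit.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts fields separated by any character in
   delims, keeping empty fields and leaving src unmodified. */
static size_t count_fields_keep_empty(const char *src, const char *delims)
{
    size_t fields = 0;

    assert(src != NULL && delims != NULL);
    for (;;) {
        size_t len = strcspn(src, delims);  /* length of this field */
        fields++;
        if (src[len] == '\0')
            break;                          /* last field reached */
        src += len + 1;                     /* step over one delimiter */
    }
    return fields;
}

/* The same count done with strtok, which merges runs of delimiters
   and writes '\0' characters into buf. */
static size_t count_fields_strtok(char *buf, const char *delims)
{
    size_t fields = 0;
    char *tok;

    assert(buf != NULL && delims != NULL);
    for (tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims))
        fields++;
    return fields;
}
```

For the input "a,,b" the first function reports three fields, the middle one empty, while the strtok version reports two.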

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Sep 15 '07 #9
On 15 Sep 2007 at 1:28, Charlie Gordon wrote:
Why did C99 get published without including the reentrant alternatives to
strtok and similar functions is a mystery. I guess the national bodies were
too busy arguing about iso646.h. Other Posix utility functions are missing
for no reason: strdup for instance. Did the Posix guys patent those or is
WG14 allergic to unix ?
You can easily write your own version of strdup in a couple lines. I use
the following:

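#include <stdlib.h>

char *strdup(const char *s)
{
    char *r = NULL;
    size_t i = 0;
    do {
        /* grow the copy by one byte per iteration */
        char *t = realloc(r, ++i);
        if (t == NULL) {
            free(r);
            return NULL;
        }
        r = t;
    } while ((r[i-1] = s[i-1]) != '\0');
    return r;
}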

Sep 15 '07 #10
"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...
Charlie Gordon wrote:
>"CBFalconer" <cb********@yahoo.com> wrote:
>>siddhu wrote:

As I know strtok_r is re-entrant version of strtok.
strtok_r() is called with s1(lets say) as its first parameter.
Remaining tokens from s1 are obtained by calling strtok_r() with
a null pointer for the first parameter. My confusion is that
this behavior is same as strtok. So I assume strtok_r must also
be using any function static variable to keep the information
about s1. If this is the case then how strtok_r is re- entrant?
Otherwise how it keeps the information about s1?

There is no such standard C function as strtok_r(). To discuss
such a function you have to give its source, in standard C.
However, I just happen to have a suitable replacement function
lying about, whose source follows:

Come on, strtok_r is part of POSIX. Do you pretend POSIX is not
popular enough. Multiple implementations of strtok_r have been
posted before your answer.

Popularity doesn't enter into it. Presence in the standard library
does. strtok_r doesn't exist there. That makes it off-topic here
in c.l.c. (barring source).
I did post source code (my own, put in the public domain)
>>>
/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"

/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The

Why skip blanks ? this is not strtok behaviour. The code and the
comment don't agree on what blanks are: by C99 Standard, blanks are
space and tab.

This is not strtok. It is tknsplit. This is behaviour that seems
more useful to me. You don't have to use it, but siddhu may wish
to.
You introduced your function like this: "I just happen to have a suitable
replacement function"
One would expect semantics to be a tad closer.
>>
... snip ...
>>
Posting the source code to a public version strtok_r would have
been more helpful. The only advantage your function offers over
strtok_r is the fact that it does not modify the source string.

Which, IMO, is a major improvement. It also detects missing
tokens. It (once more) is NOT strtok. I have no idea what
strtok_r is, except that it invades user namespace.
You must be joking, Mr Falconer. You probably never heard of Unix, or even
Linux... Or do you live on this remote planet Microsoft has not settled yet?
If you have no idea what strtok_r is, learn something new today:
http://linux.die.net/man/3/strtok_r or if you like Microsoft's version
better (part of the secure string proposal)
http://msdn2.microsoft.com/en-us/lib...z3(VS.80).aspx

--
Chqrlie
Sep 15 '07 #11
CBFalconer said:

<snip>
I have no idea what strtok_r is, except that it invades user
namespace.
No, it doesn't.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 15 '07 #12
Charlie Gordon said:
"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...
<snip>
>I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer.
No, he's toeing the group line, such as it is. As far as comp.lang.c is
concerned, there is *no such function* as strtok_r. If this question
were to arise in, say, comp.unix.programmer, Chuck's answer might be
very different.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 15 '07 #13
CBFalconer wrote:

[...]
I have no idea what
strtok_r is, except that it invades user namespace.
If you have no idea what those *_r functions are, it's time for you (as
a Linux user) to read Stevens APUE! :)
str[a-z] is reserved name space, so it isn't part of the user name space.
--
Tor <torust [at] online [dot] no>
Sep 15 '07 #14
Richard Heathfield wrote:
Charlie Gordon said:
>"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

<snip>
>>I have no idea what
strtok_r is, except that it invades user namespace.
You must be joking Mr Falconer.

No, he's toeing the group line, such as it is. As far as comp.lang.c is
concerned, there is *no such function* as strtok_r. If this question
were to arise in, say, comp.unix.programmer, Chuck's answer might be
very different.
I have seen Chuck post a number of times over in Linux forums, so it's
rather surprising if he doesn't know about POSIX.

Methinks he knows, but chooses here to pretend he doesn't! :-)

--
Tor <torust [at] online [dot] no>
Sep 15 '07 #15
Charlie Gordon wrote:
>
POSIX may not be topical here, but mentioning strtok_r as a widely available
_fixed_ version of broken strtok is more helpful to the OP than the useless
display of obtuse chauvinism expressed ad nauseam by some of the group's
regulars.
EXACTLY!
Why did C99 get published without including the reentrant alternatives to
strtok and similar functions is a mystery. I guess the national bodies were
too busy arguing about iso646.h. Other Posix utility functions are missing
for no reason: strdup for instance. Did the Posix guys patent those or is
WG14 allergic to unix ?
C99 did not change ANY of the bugs of the standard library

o non reentrant functions like strtok remained and no alternative
was proposed even if POSIX had developed one.

o Buffer overflows were written into the standard itself.
I had a lengthy discussion in comp.std.c about asctime()
and the fixed buffer of 26 positions it says it needs. It
suffices to put some wrong values into the input structure
and you have a buffer overflow. But no corrective action
was taken. What is more, the committee told the people reporting
the bug that it was OK to have a buffer overflow there.

o gets() was maintained of course. Only after lengthy discussions,
Mr Gwyn felt forced to propose a "fix" that would have fixed the
input buffer size to at least BUFSIZ. The committee apparently
decided that gets() was deprecated, maybe because of the discussion
in comp.std.c, I do not know. In any case it would have been
better to do it when C99 was published.

o Trigraphs were maintained in the standard.

And I could go on with those examples...
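
The asctime() point can be checked without overflowing anything. The sketch below measures how many characters the format string from the C standard's reference asctime algorithm would produce; the helper name asctime_length is hypothetical, and the 26-byte figure is the buffer size mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: returns the number of characters (terminator
   excluded) that the reference asctime format would produce for *tp.
   snprintf with a zero size only measures; nothing is written. */
static int asctime_length(const struct tm *tp)
{
    static const char wday[7][4] = {
        "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
    };
    static const char mon[12][4] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    assert(tp != NULL);
    assert(tp->tm_wday >= 0 && tp->tm_wday < 7);
    assert(tp->tm_mon >= 0 && tp->tm_mon < 12);
    return snprintf(NULL, 0, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
                    wday[tp->tm_wday], mon[tp->tm_mon], tp->tm_mday,
                    tp->tm_hour, tp->tm_min, tp->tm_sec,
                    1900 + tp->tm_year);
}
```

An ordinary date needs 25 characters plus the terminator, which just fits in 26 bytes. A five-digit year such as 10000 needs 26 characters plus the terminator, one byte more than the buffer holds.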
Sep 15 '07 #16
pete said:

<snip>
Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.
memcpy, surely? Why measure the string twice?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 15 '07 #17
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message news:
Ma*********************@bt.com...
Charlie Gordon said:
>"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

<snip>
>>I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer.

No, he's toeing the group line, such as it is. As far as comp.lang.c is
concerned, there is *no such function* as strtok_r. If this question
were to arise in, say, comp.unix.programmer, Chuck's answer might be
very different.
If he would give a different answer on a different group, one of these
statements would be a lie or a joke.
So he is a fundamentalist, ostracist, extremist...

--
Chqrlie.
Sep 15 '07 #18
Charlie Gordon said:
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news: Ma*********************@bt.com...
>Charlie Gordon said:
>>"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

<snip>
>>>I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer.

No, he's toeing the group line, such as it is. As far as comp.lang.c
is concerned, there is *no such function* as strtok_r. If this
question were to arise in, say, comp.unix.programmer, Chuck's answer
might be very different.

If he would give a different answer on a different group, one of these
statements would be a lie or a joke.
....or a way of making a point, a la "Ich bin ein Berliner", with which
John F Kennedy bolstered the morale of West Berlin's citizens in June
1963. It was not "true" in the literal sense, but neither was it a lie
or a joke.

So he is a fundamentalist, ostracist, extremist...
If you feel forced to resort to personal attacks, I can only assume you
have no logical arguments to put forward.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 15 '07 #19
Richard Heathfield wrote:
>
If you feel forced to resort to personal attacks, I can only assume you
have no logical arguments to put forward.
Personal attacks are allowed only for friends of Heathfield & Co.

Sep 15 '07 #20
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message news:
tJ******************************@bt.com...
Charlie Gordon said:
>"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news: Ma*********************@bt.com...
>>Charlie Gordon said:

"CBFalconer" <cb********@yahoo.com> wrote in message news:
46***************@yahoo.com...

<snip>

I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer.

No, he's toeing the group line, such as it is. As far as comp.lang.c
is concerned, there is *no such function* as strtok_r. If this
question were to arise in, say, comp.unix.programmer, Chuck's answer
might be very different.

If he would give a different answer on a different group, one of these
statements would be a lie or a joke.

...or a way of making a point, a la "Ich bin ein Berliner", with which
John F Kennedy bolstered the morale of West Berlin's citizens in June
1963. It was not "true" in the literal sense, but neither was it a lie
or a joke.
Except CBFalconer is no John F. Kennedy ;-)
His blunt rhetoric does not bolster anyone's morale; sarcasm does no good.
>So he is a fundamentalist, ostracist, extremist...

If you feel forced to resort to personal attacks, I can only assume you
have no logical arguments to put forward.
You are right, I should not have attributed to malice that which can be
adequately explained by plain ignorance. But I repeat: to not use strtok
anymore, check for availability of strtok_r or implement it locally from the
public domain source that has been posted above.

--
Chqrlie.
Sep 15 '07 #21
On Sep 15, 1:40 pm, "Joachim Schmitz" <nospam.j...@schmitz-digital.de>
wrote:
If s is NULL, this version only returns NULL if the implementation's
malloc(0) returns NULL too
Entirely consistent with the standard library string functions - if
you pass them a char * that doesn't point to a string, the behavior is
undefined.

<OT>
And this is one case where "the thing you hope will happen" probably
doesn't - e.g. trying to compute strlen(NULL) on a GNU system produces
a seg fault).
</OT>
>
Bye, Jojo
Sep 15 '07 #22
"Joachim Schmitz" <no*********@schmitz-digital.de> wrote in message
news: fc**********@online.de...
"Joachim Schmitz" <no*********@schmitz-digital.de> wrote in message
news:fc**********@online.de...
>>
<Fr************@googlemail.com> wrote in message
news:11**********************@g4g2000hsf.googlegroups.com...
<snip>
My suggestion would be:

#include <stdlib.h>
#include <string.h>

char *my_strdup(const char *s)
{
    size_t len;
    char *t;
    if(t=malloc(len=strlen(s)+1))
        memcpy(t, s, len);
    return t;
}

If s is NULL, this version only returns NULL if the implementation's
malloc(0) returns NULL too
Oops, sorry, somehow my quoting was wrong.

Only the last sentence was mine...
And it does not make much sense ;-)
If s is NULL, strlen(s) invokes undefined behaviour.
Otherwise, len is always > 0, and the code does not depend on the behaviour
of malloc(0).

--
Chqrlie.
Sep 15 '07 #23
<Fr************@googlemail.com> wrote in message
news:11*********************@n39g2000hsh.googlegroups.com...
On Sep 15, 1:40 pm, "Joachim Schmitz" <nospam.j...@schmitz-digital.de>
wrote:
>If s is NULL, this version only returns NULL if the implementation's
malloc(0) returns NULL too

Entirely consistent with the standard library string functions - if
you pass them a char * that doesn't point to a string, the behavior is
undefined.
Fair enough, but I'd prefer my own functions to do better than that, so I
like Charlie Gordon's implementation better.
<OT>
And this is one case where "the thing you hope will happen" probably
doesn't - e.g. trying to compute strlen(NULL) on a GNU system produces
a seg fault).
</OT>
Damn, here too...
anyway: see above

Bye, Jojo
Sep 15 '07 #24
<Fr************@googlemail.com> wrote in message news:
11*********************@g4g2000hsf.googlegroups.com...
On Sep 15, 2:02 pm, "Charlie Gordon" <n...@chqrlie.org> wrote:
><Francine.Ne...@googlemail.com> wrote in message news:
1189858659.251030.194...@g4g2000hsf.googlegroups.com...
>>My suggestion would be:

#include <stdlib.h>
#include <string.h>
>>char *my_strdup(const char *s)
{
size_t len;
char *t;
if(t=malloc(len=strlen(s)+1))
memcpy(t, s, len);
return t;
}

Your code performs the task, but I find it misleading to call len a
variable that is not the length of the string. I prefer to use size for
this purpose.

Furthermore, this code would not pass my default warning settings.
Assignment as a test expression is considered sloppy and error prone.

Tell that to Kernighan and Ritchie. :)
They might read this thread, I am sure they would care to comment.

Coding conventions are a very effective tool to catch bugs at an early
stage in development. Using all the help the compiler and other
automated tools can give in tracking potential errors disguised as
suspicious use of certain operators enhances productivity.

There is no gain in writing

size_t len;
char *t;
if(t=malloc(len=strlen(s)+1)) ...

instead of

size_t size = strlen(s) + 1;
char *t = malloc(size);
if (t) ...

The latter is much more readable and less error prone.

Your version did improve on mine by using memcpy to copy the '\0'
instead of writing separate code for that.

--
Chqrlie.
Is there a reason for the typo in your signature?
chqrlie is my handle. Is there a reason you don't sign your messages?
Sep 15 '07 #25
"Charlie Gordon" <ne**@chqrlie.org> wrote in message
news:46***********************@news.free.fr...
<Fr************@googlemail.com> wrote in message news:
11*********************@g4g2000hsf.googlegroups.com...
>On Sep 15, 2:02 pm, "Charlie Gordon" <n...@chqrlie.org> wrote:
>><Francine.Ne...@googlemail.com> wrote in message news:
1189858659.251030.194...@g4g2000hsf.googlegroups.com...
My suggestion would be:

#include <stdlib.h>
#include <string.h>

char *my_strdup(const char *s)
{
    size_t len;
    char *t;
    if(t=malloc(len=strlen(s)+1))
        memcpy(t, s, len);
    return t;
}

Your code performs the task, but I find it misleading to call len a
variable that is not the length of the string. I prefer to use size for
this purpose.

Furthermore, this code would not pass my default warning settings.
Assignment as a test expression is considered sloppy and error prone.

Tell that to Kernighan and Ritchie. :)

They might read this thread, I am sure they would care to comment.

Coding conventions are a very effective tool to catch bugs at an early
stage in development. Using all the help the compiler and other
automated tools can give in tracking potential errors disguised as
suspicious use of certain operators enhances productivity.

There is no gain in writing

size_t len;
char *t;
if(t=malloc(len=strlen(s)+1)) ...

instead of

size_t size = strlen(s) + 1;
char *t = malloc(size);
if (t) ...

The latter is much more readable and less error prone.

Your version did improve on mine by using memcpy to copy the '\0'
instead of writing separate code for that.
True but quite easy to fix:
char *my_strdup(const char *str) {
    char *dest = NULL;

    if (str) {
        size_t size = strlen(str) + 1;
        dest = malloc(size);
        if (dest)
            memcpy(dest, str, size);
    }
    return dest;
}
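A minimal sketch of how a caller might use the NULL-tolerant version above. The helper name copy_matches is hypothetical and only there to exercise the function; the caller is assumed to own the copy and free it:

```c
#include <stdlib.h>
#include <string.h>

/* Copy str into freshly malloc'ed storage; NULL in, NULL out. */
char *my_strdup(const char *str)
{
    char *dest = NULL;

    if (str) {
        size_t size = strlen(str) + 1;   /* include the terminating '\0' */
        dest = malloc(size);
        if (dest)
            memcpy(dest, str, size);
    }
    return dest;
}

/* Hypothetical caller: returns 1 if a copy was made and matches, else 0. */
int copy_matches(const char *str)
{
    char *copy = my_strdup(str);
    int ok = (copy != NULL) && strcmp(copy, str) == 0;

    free(copy);                          /* free(NULL) is a no-op */
    return ok;
}
```

Note that a NULL result is ambiguous here: it can mean either a NULL argument or an allocation failure, so a caller that cares has to check its own argument first.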
Sep 15 '07 #26
Charlie Gordon wrote:
"CBFalconer" <cb********@yahoo.com> wrote:
.... snip about tknsplit and strtok ...
>
>Which, IMO, is a major improvement. It also detects missing
tokens. It (once more) is NOT strtok. I have no idea what
strtok_r is, except that it invades user namespace.

You must be joking Mr Falconer. You probably never heard of
Unix, or even Linux... Or do you live on this remote planet
Microsoft has not settled yet? If you have no idea what
strtok_r is, learn something new today:
http://linux.die.net/man/3/strtok_r
or if you like Microsoft's version better (part of the secure
string proposal)
http://msdn2.microsoft.com/en-us/lib...z3(VS.80).aspx
This newsgroup is comp.lang.c. C is defined by the various C
standards, present or past, and includes K&R for times previous to
1989. None of these define, or even mention, strtok_r. Thus,
without standard C code, published in the same message, discussion
of it is off-topic here. The name is still reserved for the
implementor. As I said, it doesn't exist. Unix, Linux, Microsoft
have no influence whatsoever.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Sep 15 '07 #27
Richard Heathfield wrote:
CBFalconer said:

<snip>
>I have no idea what strtok_r is, except that it invades user
namespace.

No, it doesn't.
Well, it's not exactly a typo, but ....

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Sep 15 '07 #28
Charlie Gordon wrote:
"pete" <pf*****@mindspring.com> wrote:
.... snip ...
>>
Instead of using realloc in a loop, I think most programmers
would write strdup with one function call to strlen and one to
malloc and one to strcpy.

Or more efficiently calling memcpy instead of strcpy.

char *strdup(const char *str) {
    size_t len;
    char *dest = NULL;

    if (str) {
        len = strlen(str);
        dest = malloc(len + 1);
        if (dest) {
            memcpy(dest, str, len);
            dest[len] = '\0';
        }
    }
    return dest;
}
I challenge the 'more efficient'. It will be highly dependent on
the compiler, but at the simplest you would be trading the effort
of an extra procedure call against the possible efficiency
improvement. Since most strings are short (in my case, probably
under 10 or 20 chars) this 'improvement' is a chimera. Also
bearing in mind that strdup is a system reserved name, my version
(with #include <stdlib.h> and #include <string.h>) is:

char *dupstr(const char *str) {
    char *dest, *temp;

    if (dest = malloc(1 + strlen(str))) {
        temp = dest;
        while (*temp++ = *str++) continue;
    }
    return dest;
}

and I am willing to let it go boom when str is NULL, for early
warning etc. of problems.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Sep 15 '07 #29
Richard Heathfield wrote:
pete said:

<snip>
>Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.

memcpy, surely? Why measure the string twice?
Huh? strcpy doesn't measure anything.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Sep 15 '07 #30
Joe Wright said:
Richard Heathfield wrote:
>pete said:

<snip>
>>Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.

memcpy, surely? Why measure the string twice?
Huh? strcpy doesn't measure anything.
Sorry, Joe - I was guilty of truncated exegesis. What I meant was this:
that strcpy must keep going until it hits a null terminator, and it
doesn't know in advance where that null terminator will be found, so it
must test every character. So, although it isn't measuring the string
as such, that's only because it doesn't bother to write down how long
the string is. It's still ploughing through the string, character by
character. But we've already done that with our strlen call. By using
memcpy, we can take advantage of the fact that the string has already
been measured - memcpy can use any number of platform-specific tricks
for copying multiple bytes at a time. Therefore, if the length of the
string to be copied is known in advance, it is (likely to be) more
efficient to use memcpy than strcpy.
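
The two approaches being compared can be sketched side by side. The function names are illustrative only, and which one is actually faster depends on the implementation and the string lengths involved:

```c
#include <stdlib.h>
#include <string.h>

/* Variant 1: strlen walks the string, then strcpy walks it again
   looking for the terminator. */
char *dup_with_strcpy(const char *s)
{
    char *t = malloc(strlen(s) + 1);

    if (t)
        strcpy(t, s);
    return t;
}

/* Variant 2: the size is computed once and handed to memcpy, which
   copies a known number of bytes (terminator included). */
char *dup_with_memcpy(const char *s)
{
    size_t size = strlen(s) + 1;
    char *t = malloc(size);

    if (t)
        memcpy(t, s, size);
    return t;
}
```

Both variants produce the same result and both have undefined behaviour when s is NULL; only the amount of per-character testing differs.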

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 15 '07 #31
"Charlie Gordon" <ne**@chqrlie.org> writes:
[...]
But I repeat: to not use
strtok anymore, check for availability of strtok_r or implement it
locally from the public domain source that has been posted above.
Well, maybe.

The standard function strtok() is non-reentrant *and* it has some
other -- well, not bugs necessarily, but quirks. For example, the
fact that it merges multiple adjacent delimiters can be inconvenient,
though it might be just what you need. (In practice, you usually want
this behavior if the delimiter is whitespace, but not if it's
something else.)
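
The delimiter-merging quirk can be seen in a small sketch that simply counts the tokens strtok() reports. The helper name count_tokens is an assumption made for illustration:

```c
#include <string.h>

/* Count the tokens strtok() finds in a writable buffer.
   Note: strtok() modifies buf and keeps hidden static state. */
size_t count_tokens(char *buf, const char *delim)
{
    size_t n = 0;
    char *tok;

    for (tok = strtok(buf, delim); tok != NULL; tok = strtok(NULL, delim))
        n++;
    return n;
}
```

With the buffer "a,,b" and the delimiter "," this counts two tokens, not three: the empty field between the commas is silently skipped. That is usually fine for whitespace-separated words and usually wrong for comma-separated records.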

If you're using strtok() and it already does exactly what you want
except for the lack of reentrancy, then strtok_r() (if it's available
on your system -- and if not, you can compile it yourself) is just the
thing. If your requirements are less specific, then tknsplit() might
turn out to be perfect for you -- or some other non-standard function
might be better.
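
For systems without strtok_r(), a local version can be written in standard C. The following is a sketch in the style of the POSIX function, not the GNU or POSIX source itself; the names my_strtok_r and count_fields are illustrative. It also shows where the state lives: in the pointer the caller passes in, not in a static variable.

```c
#include <string.h>

/* Reentrant tokenizer in the style of POSIX strtok_r(), written with
   standard C only. All state lives in *save_ptr, which the caller owns. */
char *my_strtok_r(char *s, const char *delim, char **save_ptr)
{
    char *token;

    if (s == NULL)
        s = *save_ptr;          /* continue where the last call stopped */

    s += strspn(s, delim);      /* skip leading delimiters */
    if (*s == '\0') {
        *save_ptr = s;          /* stay on the terminator for later calls */
        return NULL;
    }

    token = s;
    s += strcspn(s, delim);     /* find the end of the token */
    if (*s != '\0')
        *s++ = '\0';            /* terminate the token, step past delimiter */
    *save_ptr = s;
    return token;
}

/* Nested tokenizing: split "k=v" pairs on ';', then each pair on '='.
   Returns the number of key/value fields seen. */
size_t count_fields(char *buf)
{
    size_t n = 0;
    char *outer, *inner;        /* one saved position per tokenizing loop */
    char *pair, *field;

    for (pair = my_strtok_r(buf, ";", &outer); pair != NULL;
         pair = my_strtok_r(NULL, ";", &outer)) {
        for (field = my_strtok_r(pair, "=", &inner); field != NULL;
             field = my_strtok_r(NULL, "=", &inner))
            n++;
    }
    return n;
}
```

The nested loop is the case plain strtok() cannot handle: the inner calls would overwrite the single hidden position that the outer loop relies on.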

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 15 '07 #32
"Joachim Schmitz" <no*********@schmitz-digital.de> writes:
<Fr************@googlemail.com> wrote in message
news:11*********************@n39g2000hsh.googlegroups.com...
>On Sep 15, 1:40 pm, "Joachim Schmitz" <nospam.j...@schmitz-digital.de>
wrote:
>>If s is NULL, this version only returns NULL if the implementation's
malloc(0) returns NULL too

Entirely consistent with the standard library string functions - if
you pass them a char * that doesn't point to a string, the behavior is
undefined.
Fair enough, but I'd prefer my own functions to do better than that, so I
like Charlie Gordon's implementation better.
><OT>
And this is one case where "the thing you hope will happen" probably
doesn't - e.g. trying to compute strlen(NULL) on a GNU system produces
a seg fault.
</OT>
Damn, here too...
anyway: see above
How is returning (size_t)-1 better than a seg fault?

If I pass NULL to strlen(), there's a bug in my program. I'd like to
find out about it as early as possible. If strlen() quietly returns
-1, I might not detect the error until much later.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 15 '07 #33
>Charlie Gordon wrote:
>>Why C99 was published without the reentrant alternatives to
strtok and similar functions is a mystery.
Ask the people who were on the committees. :-) (Seriously, I do not
know why the non-reentrant versions were retained without at least
some sort of cleanup.)

In article <46**********************@news.orange.fr>
jacob navia <ja***@jacob.remcomp.fr> wrote:
>C99 did not change ANY of the bugs of the standard library
I am not sure I would call all of these "bugs". ("Misfeatures",
perhaps, especially trigraphs :-) . More seriously, just two
points here:)
>o non reentrant functions like strtok remained and no alternative
was proposed even if POSIX had developed one.
While strtok_r() is an improvement on strtok(), it leaves one of
strtok()'s fundamental flaws in place. If one is going to "improve"
strtok(), one should at least look at the BSD strsep().

Still, importing the whole set of POSIX "_r" functions would, I
think, have been better than doing nothing.
>o Buffer overflows were written into the standard itself.
This is, at best, an overstatement.
I had a lengthy discussion in comp.std.c about asctime()
and the fixed buffer of 26 positions it says it needs. It
suffices to put some wrong values into the input structure
and you have a buffer overflow.
If you "put some wrong values" in, you have little hope of expecting
*anything* -- what happens in lcc-win32, for instance, if I write:

struct big { int a[1000]; };
struct big main(double oops) {
    short x = strlen((char *)0x98766542);
    ... /* more "wrong values" as inputs as needed */
    return *(struct big *)42;
}

? If you want to protect against bad inputs, you need to think
hard about which kinds of "bad inputs" to guard against, and do
some serious cost/benefit analysis.

Moreover, if your objection is that values of .tm_year greater
than 8100 (or less than or equal to some negative number) cause
problems, you can always test for that in your own implementation:

__internal_return_type __internal_worker_function_for_times(...) {
    ...
    if (OUT_OF_RANGE(tm->tm_year)) ... signal error ...
    ...
}

which might be used as, e.g.:

char *asctime(const struct tm *tm) {
    ...
    if (__internal_worker_function_for_times(...) == ERROR)
        __runtime_error_trap_report("invalid parameter to asctime()");
    ...
}

and thus demonstrate the superior Quality of Implementation of
lcc-win32, with regard to this particular possibility. (Presumably
__runtime_error_trap_report saves the state of the program for use
in the debugger, prints a stack trace, and/or does whatever else
is good for fixing program bugs.)
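
The same range checking can be sketched in user code, without touching the library. The wrapper name checked_asctime is hypothetical, and the year range is deliberately conservative (years 0 through 9999), which is an assumption of this sketch rather than anything the standard specifies:

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical wrapper: returns NULL instead of calling asctime() when
   any field lies outside the range the 26-byte format can represent. */
char *checked_asctime(const struct tm *tm)
{
    if (tm == NULL)
        return NULL;
    if (tm->tm_wday < 0 || tm->tm_wday > 6)   return NULL;
    if (tm->tm_mon  < 0 || tm->tm_mon  > 11)  return NULL;
    if (tm->tm_mday < 1 || tm->tm_mday > 31)  return NULL;
    if (tm->tm_hour < 0 || tm->tm_hour > 23)  return NULL;
    if (tm->tm_min  < 0 || tm->tm_min  > 59)  return NULL;
    if (tm->tm_sec  < 0 || tm->tm_sec  > 60)  return NULL;  /* 60: leap second */
    /* 1900 + tm_year must print in at most four digits;
       conservatively restricted to years 0..9999 */
    if (tm->tm_year < -1900 || tm->tm_year > 8099)
        return NULL;
    return asctime(tm);   /* result points to static storage */
}
```

This is the "range check and return a failure indicator" option; it does not verify that the fields are mutually consistent (for example, that the weekday matches the date).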
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Sep 15 '07 #34
On 2007-09-15, Keith Thompson <ks***@mib.org> wrote:
How is returning (size_t)-1 better than a seg fault?

If I pass NULL to strlen(), there's a bug in my program. I'd like to
find out about it as early as possible. If strlen() quietly returns
-1, I might not detect the error until much later.
By returning -1 strlen is not being so quiet. If you check the function's return
value then you can catch the error at least as well as if it had just segfaulted.
It might not segfault for some reason, but you'd always be able to check the
return value and tell yourself there's something wrong.
Sep 15 '07 #35
Richard Heathfield wrote:
>
pete said:

<snip>
Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.

memcpy, surely? Why measure the string twice?
I suppose I was unduly influenced by the strdup example in K&R2.

--
pete
Sep 15 '07 #36
ra*****@dcc.ufmg.br writes:
On 2007-09-15, Keith Thompson <ks***@mib.org> wrote:
>How is returning (size_t)-1 better than a seg fault?
If I pass NULL to strlen(), there's a bug in my program. I'd like to
find out about it as early as possible. If strlen() quietly returns
-1, I might not detect the error until much later.

By returning -1 strlen is not being so quiet. If you check the function's return
value then you can catch the error at least as well as if it had just segfaulted.
It might not segfault for some reason, but you'd always be able to check the
return value and tell yourself there's something wrong.
Since returning (size_t)-1 is non-standard behavior (though it's
allowed), I'm not likely to check for it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 15 '07 #37
On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes undefined behaviour when running out of memory, it
should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company, I think I can just
about write a strdup function that works more than adequately on any
machine I'd ever want to run it on.

Sep 15 '07 #38
Sam Harris said:
On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
>Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes undefined behaviour when running out of memory,
it should return NULL instead.


Yeah, whatever. I'm a coder at a Fortune 500 company, I think I can
just about write a strdup function that works more than adequately on
any machine I'd ever want to run it on.
Claiming that you work for a Fortune 500 company might impress your
aunt, but the fact remains that your implementation of a string
duplication function left a lot to be desired. You would do well to
learn from your mistakes, rather than try to justify them.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Sep 15 '07 #39
On Sat, 15 Sep 2007 13:07:28 +0200, in comp.lang.c , "Charlie Gordon"
<ne**@chqrlie.org> wrote:
>
If he would give a different answer on a different group, one of these
statements would be a lie or a joke.
If my teenage son asks me how they work out the price of credit
default swaps, I give one answer.

If a junior quant analyst in a bank asks me the same question, I give an
entirely different answer.

I assure you, neither answer is a lie or a joke.
>So he is a fundamentalist, ostracist, extremist...
Or he's tailoring his answer to the forum of the question.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Sep 15 '07 #40
"jacob navia" <ja***@jacob.remcomp.fr> wrote in message
news:46***********************@news.orange.fr...
Mr Clive Feather submitted a defect report saying in substance the
same thing as I said. The committee answer was:

<quote>
Thus, asctime() may exhibit undefined behavior if any of the members of
timeptr produce undefined behavior in the sample algorithm (for example,
if the timeptr->tm_wday is outside the range 0 to 6 the function may
index beyond the end of an array).

As always, the range of undefined behavior permitted includes:
Corrupting memory
Aborting the program
Range checking the argument and returning a failure indicator (e.g., a
null pointer)
Returning truncated results within the traditional 26 byte buffer.
There is no consensus to make the suggested change or any change along
this line.
<end quote>

You read correctly. Corrupting memory (i.e. a buffer overflow) is
within the range of undefined behavior acceptable!!!!

I have the right then, to name a buffer overflow for what it is, a
buffer overflow in the C standard with all the committee behind it.
You have the right to misread anything. You have a responsibility, as
a self-proclaimed expert, to think with something other than your
gonads. The response you quoted clearly encompasses a variety
of nicer behaviors than buffer overflow, but you neglect to take
them in.

The committee *accepts* that buffer overflow can occur in a
conforming implementation. The same is true of:

int a[10];
a[300] = 4;

And the committee is "behind" an implementation that overwrites
storage when this ill-formed program executes. (The committee
is also "behind" an implementation that aborts with a diagnostic
message.)

Get over it.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Sep 15 '07 #41
On Sun, 16 Sep 2007 00:21:43 +0200 (CEST), in comp.lang.c , Sam Harris
<no****@in.valid> wrote:
>On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
>Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes undefined behaviour when running out of memory, it
should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company,
Whoopy doo. *Anyone* will tell you that this is absolutely no
guarantee whatsoever of coding quality or ability.
I think I can just
about write a strdup function that works more than adequately on any
machine I'd ever want to run it on.
Pardon me, but you had a few actual errors pointed out, perhaps you
should consider being less arrogant?
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
Sep 15 '07 #42
Sam Harris wrote:
>
On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes
undefined behaviour when running out of memory, it
should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company,
I think I can just
about write a strdup function that works more than adequately on any
machine I'd ever want to run it on.
You posted an extraordinarily crappy code example.

--
pete
Sep 15 '07 #43
Richard Heathfield wrote:
Joe Wright said:
>Richard Heathfield wrote:
>>pete said:

<snip>

Instead of using realloc in a loop,
I think most programmers would write strdup with
one function call to strlen and one to malloc and one to strcpy.
memcpy, surely? Why measure the string twice?
Huh? strcpy doesn't measure anything.

Sorry, Joe - I was guilty of truncated exegesis. What I meant was this:
that strcpy must keep going until it hits a null terminator, and it
doesn't know in advance where that null terminator will be found, so it
must test every character. So, although it isn't measuring the string
as such, that's only because it doesn't bother to write down how long
the string is. It's still ploughing through the string, character by
character. But we've already done that with our strlen call. By using
memcpy, we can take advantage of the fact that the string has already
been measured - memcpy can use any number of platform-specific tricks
for copying multiple bytes at a time. Therefore, if the length of the
string to be copied is known in advance, it is (likely to be) more
efficient to use memcpy than strcpy.
Well, some 5 years ago, I made a similar comment on your code, Richard,
which was using strcpy() at the time. We had a rather "long" argument
about it, and in the end, I tried to make my point by measuring memcpy()
vs strcpy() performance.

IIRC, the result of those tests was rather humiliating for me, as your
strcpy() performed excellently! :-)

Is there a reason to believe that strcpy() has become more CPU
bound in recent years? If not, I don't think you will have much success
in measuring an improvement by using memcpy().

Making good measurements on this is a challenge. We don't want to
measure L1 cache performance only.

--
Tor <torust [at] online [dot] no>
Sep 16 '07 #44
Sam Harris <no****@in.valid> writes:
On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
>Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes undefined behaviour when running out of memory, it
should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company, I think I can just
about write a strdup function that works more than adequately on any
machine I'd ever want to run it on.
You haven't demonstrated it so far. Frankly, when I read your
implementation upthread I assumed it was a joke. You call realloc()
once for each character; why not compute the length and call malloc()
just once?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 16 '07 #45
jacob navia <ja***@jacob.remcomp.fr> writes:
Chris Torek wrote:
>jacob navia <ja***@jacob.remcomp.fr> wrote:
>>o Buffer overflows were written into the standard itself.
This is, at best, an overstatement.

A buffer overflow happens when a fixed size memory area is defined
but a program writes PAST the fixed size buffer. This is a buffer
overflow.

Now, the standard specifies a buffer length of 26 for the buffer of
asctime.
[...]

Yes, calling asctime() with certain arguments can result in a buffer
overflow.

Calling strcpy() with certain arguments can result in a buffer
overflow. Likewise for sprintf(), sscanf(), memcpy(), memmove(),
strcat(), etc. In all these cases, the arguments passed are under the
program's control; the problem can reliably be avoided by checking the
arguments before invoking the function.

I happen to agree that asctime() should be defined to use a larger
buffer, one big enough so that the buffer won't overflow for any
possible arguments. But the problem is so easy to avoid that it's
hardly a fatal flaw in the language -- and it can't overflow if you
give it an argument corresponding to the current time (at least not
for the next 8000 years or so). It's certainly not nearly as
dangerous as gets().

I generally wouldn't use asctime() anyway. The format it uses isn't
my favorite (I prefer YYYY-MM-DD for dates), and the trailing '\n' is
more trouble than it's worth. In real code, I'd use strftime()
instead, which is more flexible and doesn't have asctime()'s problems.
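
A minimal sketch of the strftime() alternative for the YYYY-MM-DD format mentioned above. The wrapper name format_iso_date is illustrative; the point is that the caller supplies the buffer and its size, so there is no fixed 26-byte assumption:

```c
#include <stddef.h>
#include <time.h>

/* Format tm as "YYYY-MM-DD" into buf. Returns the number of characters
   written (excluding the '\0'), or 0 if buf is too small. */
size_t format_iso_date(char *buf, size_t bufsize, const struct tm *tm)
{
    return strftime(buf, bufsize, "%Y-%m-%d", tm);
}
```

When the result does not fit, strftime() returns 0 and the buffer contents are indeterminate, so the return value has to be checked before the buffer is used.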

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 16 '07 #46
"Keith Thompson" <ks***@mib.org> wrote in message news:
ln************@nuthaus.mib.org...
Sam Harris <no****@in.valid> writes:
>On 15 Sep 2007 at 2:31, Charlie Gordon wrote:
>>Your function should take a const char *.
sizeof(char) is 1 by definition
Why do you cast the result of realloc ?
Your function invokes undefined behaviour when running out of memory, it
should return NULL instead.

Yeah, whatever. I'm a coder at a Fortune 500 company, I think I can just
about write a strdup function that works more than adequately on any
machine I'd ever want to run it on.

You haven't demonstrated it so far. Frankly, when I read your
implementation upthread I assumed it was a joke. You call realloc()
once for each character; why not compute the length and call malloc()
just once?
Frankly, I too thought the repeated calls to realloc were some sort of joke
from a forum regular trying to come up with the most inefficient yet correct
implementation and was surprised to find the small klotzy details I pointed
out.

If you are actually proud of the code you posted, and consider that a good
example of what you are paid for by a large corporation, shame on you! You
have some serious progress to make to reach 'decent' status. So far you
qualify for 'best of the worst'. I guess being the best is what prompts
your arrogance, but rest assured everyone here can come up with an even
worse proposal, one you would not even understand.

No matter how efficient and powerful the hardware guys make their products,
there will be software bums to destroy these gains, and managers to come up
with lame excuses and marketers to ship lousy crap. Sturgeon was so right!

--
Chqrlie.
Sep 16 '07 #47
Keith Thompson wrote:
ra*****@dcc.ufmg.br writes:
>On 2007-09-15, Keith Thompson <ks***@mib.org> wrote:
>>How is returning (size_t)-1 better than a seg fault?
If I pass NULL to strlen(), there's a bug in my program. I'd like to
find out about it as early as possible. If strlen() quietly returns
-1, I might not detect the error until much later.
By returning -1 strlen is not being so quiet. If you check the function's return
value then you can catch the error at least as well as if it had just segfaulted.
It might not segfault for some reason, but you'd always be able to check the
return value and tell yourself there's something wrong.

Since returning (size_t)-1 is non-standard behavior (though it's
allowed), I'm not likely to check for it.
I don't want to check strlen for error. SIZE_MAX may well be valid.
Passing in a NULL should be a NOP in my view.

size_t strlen(const char *s) {
    size_t r = 0;
    if (s) while (*s++) ++r;
    return r;
}

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Sep 16 '07 #48

"Joe Wright" <jo********@comcast.net> wrote in message
news:6J******************************@comcast.com...
Keith Thompson wrote:
>ra*****@dcc.ufmg.br writes:
>>On 2007-09-15, Keith Thompson <ks***@mib.org> wrote:
How is returning (size_t)-1 better than a seg fault?
If I pass NULL to strlen(), there's a bug in my program. I'd like to
find out about it as early as possible. If strlen() quietly returns
-1, I might not detect the error until much later.
By returning -1 strlen is not being so quiet. If you check the function's
return value then you can catch the error at least as well as if it had just
segfaulted. It might not segfault for some reason, but you'd always be able
to check the return value and tell yourself there's something wrong.

Since returning (size_t)-1 is non-standard behavior (though it's
allowed), I'm not likely to check for it.
I don't want to check strlen for error. SIZE_MAX may well be valid.
Passing in a NULL should be a NOP in my view.

size_t strlen(const char *s) {
    size_t r = 0;
    if (s) while (*s++) ++r;
    return r;
}
Well, that version of strlen doesn't distinguish between a NULL and an empty
string. This would:
size_t strlen(const char *s) {
    size_t r = (size_t)-1; /* returned unchanged when s is NULL */
    if (s) {
        r = 0;
        while (*s++) ++r;
    }
    return r;
}

And neither is a NOP if being passed a NULL...

Bye, Jojo
Sep 16 '07 #49
In article <6J******************************@comcast.com>,
Joe Wright <jo********@comcast.net> wrote:
>I don't want to check strlen for error. SIZE_MAX may well be valid.
Passing in a NULL should be a NOP in my view.
>size_t strlen(const char *s) {
size_t r = 0;
if (s) while (*s++) ++r;
return r;
}
Then how will you distinguish between the string containing just
the terminating nul, and the null pointer?? strlen() is often
used to determine array indices; you don't want to be indexing
the NULL pointer (for one thing, the result of the indexing
might get you to a readable or writable memory location -- and yes,
there are real systems on which virtual addresses near 0 are
accessible.)
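
One way to keep the null pointer and the empty string apart without overloading the returned length is to report success separately. This is a sketch only; try_strlen is a hypothetical name, not a standard or proposed function:

```c
#include <stddef.h>

/* Hypothetical helper: reports success separately from the length, so a
   null pointer and an empty string cannot be confused.
   Returns 1 and stores the length in *len on success, 0 if s is NULL. */
int try_strlen(const char *s, size_t *len)
{
    const char *p = s;

    if (s == NULL)
        return 0;               /* *len is left untouched */
    while (*p != '\0')
        p++;
    *len = (size_t)(p - s);
    return 1;
}
```

The cost is that every caller has to test the return value, which is exactly the kind of check that tends to be forgotten; that is the argument made upthread for letting the bad call fail loudly instead.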
--
Okay, buzzwords only. Two syllables, tops. -- Laurie Anderson
Sep 16 '07 #50
