strndup: RFC - C / C++

jacob navia

Hi
Reading comp.std.c I noticed that there was a message like this:
The WG14 Post Portland mailing is now available from the WG14 web site
at http://www.open-std.org/jtc1/sc22/wg14/

Best regards
Keld Simonsen

I went there and found that there is a report called
ISO/IEC JTC1 SC22 WG14 WG14/N1193
Specification for Safer C Library Functions —
Part II: Dynamic Allocation Functions

It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few weeks
ago.

It proposes strdup and strndup too. Lcc-win32 already provides strdup,
but strndup was missing. Here is a proposed implementation. I would like
to see if your sharp eyes see any bug or serious problem with it.

Thanks in advance

jacob
---------------------------------------------------------cut here
#include <string.h>
#include <stdlib.h>
/*
The strndup function copies not more than n characters (characters that
follow a null character are not copied) from string to a dynamically
allocated buffer. The copied string shall always be null terminated.
*/
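char *strndup(const char *string,size_t s)
{
    const char *p;  /* scans the const source, so it must be const too */
    char *r;
    if (string == NULL)
        return NULL;
    p = string;
    /* advance at most s characters, stopping at the terminator */
    while (s > 0) {
        if (*p == 0)
            break;
        p++;
        s--;
    }
    s = (p - string);
    r = malloc(1+s);
    if (r) {
        strncpy(r,string,s);
        r[s] = 0;
    }
    return r;
}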

#ifdef TEST
#include <stdio.h>
#define MAXTEST 60
int main(void)
{
    char *table[MAXTEST];
    char *str = "The quick brown fox jumps over the lazy dog";

    for (int i=0; i<MAXTEST;i++) {
        table[i] = strndup(str,i);
    }
    for (int i=0; i<MAXTEST;i++) {
        printf("[%4d] %s\n",i,table[i]);
    }
    return 0;
}
#endif

Dec 2 '06 #1


Richard Heathfield

jacob navia said:

<snip>

> I would like
> to see if your sharp eyes see any bug or serious problem with it.

I don't see any serious problem with the code in terms of meeting its
specification, although I would consider replacing

strncpy(r,string,s);

with:

memcpy(r, string, s);

since you've already detected the terminator and know precisely where it is.
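
As a minimal sketch of that suggestion (not the code from the original post): once the length is known, memcpy can do the copy. The name dup_prefix is hypothetical, chosen so the example does not clash with a library strndup, and memchr stands in for the hand-written scan.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name, chosen to avoid clashing with a library strndup. */
char *dup_prefix(const char *src, size_t n)
{
    const char *end;
    size_t len;
    char *copy;

    if (src == NULL)
        return NULL;                 /* same NULL policy as the posted code */

    end = memchr(src, '\0', n);      /* terminator within the first n bytes? */
    len = end ? (size_t)(end - src) : n;

    copy = malloc(len + 1);
    if (copy != NULL) {
        memcpy(copy, src, len);      /* length already known, so no strncpy */
        copy[len] = '\0';
    }
    return copy;
}
```

The allocation is sized to the characters actually copied, so a short string with a large n does not reserve n + 1 bytes.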

The thing that does concern me is the spec itself, which seems to me to
suffer from the same flaw as strncpy - i.e. it gives no indication of
whether truncation occurred. But of course that's a design issue, not a C
issue.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 2 '06 #2

jacob navia

Richard Heathfield wrote:

> jacob navia said:
>
> <snip>
>
>> I would like
>> to see if your sharp eyes see any bug or serious problem with it.
>
> I don't see any serious problem with the code in terms of meeting its
> specification, although I would consider replacing
>
>> strncpy(r,string,s);
>
> with:
>
> memcpy(r, string, s);
>
> since you've already detected the terminator and know precisely where it is.
>
> The thing that does concern me is the spec itself, which seems to me to
> suffer from the same flaw as strncpy - i.e. it gives no indication of
> whether truncation occurred. But of course that's a design issue, not a C
> issue.

Interesting. I did not think about that.

You know of a version that gives that information back to the user?
It is sometimes important to know.

< off topic>
Since lcc-win32 supports optional arguments I could do:
char *strndup(char *str,size_t siz,bool *pTruncated=NULL);
strndup(str,30) would be strndup(str,30,NULL)...
< / off topic >
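
In standard C, which has no default arguments, the same idea could be sketched with an explicit out-parameter. This is only an illustration: the name dup_prefix_flag and its signature are hypothetical, and the function assumes src is a properly terminated string, because it looks at src[n] to decide whether anything was cut off.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: *truncated (if non-NULL) is set to true when
   src holds more than n characters before its terminator.
   Assumes src is null terminated. */
char *dup_prefix_flag(const char *src, size_t n, bool *truncated)
{
    const char *end;
    size_t len;
    char *copy;

    if (src == NULL)
        return NULL;

    end = memchr(src, '\0', n);       /* terminator within the first n bytes? */
    len = end ? (size_t)(end - src) : n;

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;                  /* flag is left untouched on failure */

    memcpy(copy, src, len);
    copy[len] = '\0';
    if (truncated != NULL)
        *truncated = (end == NULL && src[n] != '\0');
    return copy;
}
```

Callers that do not care about truncation pass NULL as the third argument, which is what the default argument would have done.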

Dec 2 '06 #3

Richard Tobin

In article <cK******************************@bt.com>,
Richard Heathfield <rj*@see.sig.invalid> wrote:

> The thing that does concern me is the spec itself, which seems to me to
> suffer from the same flaw as strncpy - i.e. it gives no indication of
> whether truncation occurred. But of course that's a design issue, not a C
> issue.

When I've used my own version of strndup, it's always been to make an
ordinary string from a "counted" string, so there is no question of
truncation. I suspect this is the more common use of it, rather than
copying a string to a buffer that might not be big enough.
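
A sketch of that use, assuming a hypothetical struct counted that holds a pointer and a length, with no terminator and no embedded null characters. The type and function names are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted-string type: len bytes, no terminator required. */
struct counted {
    const char *data;
    size_t len;
};

/* Returns a freshly allocated, null-terminated copy; the caller frees it.
   Never reads more than cs.len bytes from cs.data. */
char *counted_to_cstr(struct counted cs)
{
    char *copy = malloc(cs.len + 1);

    if (copy != NULL) {
        if (cs.len > 0)
            memcpy(copy, cs.data, cs.len);
        copy[cs.len] = '\0';
    }
    return copy;
}
```

Here the count is the exact length of the data, so nothing is lost. A strndup call with the same pointer and count would give the same result as long as the data contains no null character.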

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Dec 3 '06 #4

Tom St Denis

jacob navia wrote:

It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few weeks
ago.

It proposes strdup and strndup too. Lcc-win32 proposes already strdup,
but strndup was missing. Here is a proposed implementation. I would like
to see if your sharp eyes see any bug or serious problem with it.

What would the point of strndup be? It allocates the memory so there
really isn't a problem of an overflow. If you're low on memory ... why
are you duplicating a string?

Methinks this is a solution looking for a problem [not directed at you
specifically Jacob, just the whole question of whether we should care
about strndup at all].

Tom

Dec 3 '06 #5

Thad Smith

jacob navia wrote:

It proposes strdup and strndup too. Lcc-win32 proposes already strdup,
but strndup was missing. Here is a proposed implementation. I would like
to see if your sharp eyes see any bug or serious problem with it.

....

---------------------------------------------------------cut here
#include <string.h>
#include <stdlib.h>
/*
The strndup function copies not more than n characters (characters that
follow a null character are not copied) from string to a dynamically
allocated buffer. The copied string shall always be null terminated.
*/
char *strndup(const char *string,size_t s)

I haven't looked at the code, but I am a big believer in accurate
documentation.

Your comment references "n characters", but defines a parameter s. Is
that n? The comment doesn't say how many characters are copied, only
that it is not more than n, so I can't rely on the function to copy the
full strlen(string), even if s exceeds this. I suggest calling the
result a dynamically allocated char array, rather than a buffer, since
it isn't necessarily buffering anything. As a matter of fact, since it
is presumably sized to the input string, there probably isn't room for
additional characters to be added later, thus not used as a buffer.

The description doesn't say what the results are if insufficient memory
exists for the new array.

--
Thad

Dec 3 '06 #6

Richard Heathfield

Richard Tobin said:

> In article <cK******************************@bt.com>,
> Richard Heathfield <rj*@see.sig.invalid> wrote:
>
>> The thing that does concern me is the spec itself, which seems to me to
>> suffer from the same flaw as strncpy - i.e. it gives no indication of
>> whether truncation occurred. But of course that's a design issue, not a C
>> issue.
>
> When I've used my own version of strndup, it's always been to make an
> ordinary string from a "counted" string, so there is no question of
> truncation. I suspect this is the more common use of it, rather than
> copying a string to a buffer that might not be big enough.

No, that's not more common - it's just a more *intelligent* use of strncpy.
By far the most common usage of strncpy is from the cargo cult bunch: "I'm
smart, I know about buffer overruns, I know I should use strncpy instead of
strcpy, oops, oh look, I just threw away data, ohdearhowsadnevermind."
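
For contrast, a minimal sketch of a copy that reports the problem instead of hiding it. The helper copy_checked is hypothetical, not a standard function; it refuses the copy when the source does not fit, so the caller has to decide what to do about it.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dstsize, counting the
   terminator). Returns false and leaves dst empty if src does not fit. */
bool copy_checked(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);

    if (dstsize == 0)
        return false;
    if (len >= dstsize) {
        dst[0] = '\0';            /* refuse rather than silently truncate */
        return false;
    }
    memcpy(dst, src, len + 1);    /* the + 1 copies the terminator too */
    return true;
}
```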

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 3 '06 #7

Richard Heathfield

jacob navia said:

You know of a version that gives that information back to the user?

No, but that doesn't mean much because I know of no version of strndup
whatsoever, apart from the one you just posted. I don't see any particular
use for it, so I've never gone looking for it. (That is not the same as
saying it's useless - one man's useless is another man's essential.)

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 3 '06 #8

CBFalconer

Tom St Denis wrote:

jacob navia wrote:

>It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few
weeks ago.

It proposes strdup and strndup too. Lcc-win32 proposes already
strdup, but strndup was missing. Here is a proposed implementation.
I would like to see if your sharp eyes see any bug or serious
problem with it.

What would the point of strndup be? It allocates the memory so
there really isn't a problem of an overflow. If you're low on
memory ... why are you duplicating a string?

Me thinks this is a solution looking for a problem [not directed at
you specifically Jacob, just the whole question of whether we should
care about strndup at all].

I can see a possible use for it - to duplicate the initial portion
of a string only, i.e. to truncate it on the right.

#include <stdlib.h>
#include <stddef.h>

/* The strndup function copies not more than n characters
(characters that follow a null character are not copied)
from string to a dynamically allocated buffer. The copied
string shall always be null terminated.
*/
char *strndup(const char *_string, size_t _len) {
    char *s, *p;

    if ((p = s = malloc(_len + 1))) {
        if (_string) /* else interpret NULL as empty string */
            while (_len-- && (*p++ = *_string++)) continue;
        *p = '\0';
    }
    return s;
} /* strndup, untested */

However N1193 is, in general, a Microsoft proposal, not a
standard. It is their lame attempt to catch their own foolish
clueless programmers. It is ugly, ugly, ugly.

BTW, Jacob's code carelessly fails to define size_t. It also returns
NULL for other reasons than lack of memory, which can only cause
confusion.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 3 '06 #9

jacob navia

CBFalconer wrote:

> BTW, Jacob's code carelessly fails to define size_t.

???
I have
#include <stdlib.h>
and that file includes stddef.h that defines size_t

> It also returns
> NULL for other reasons than lack of memory, which can only cause
> confusion.

??? Why confusion?

Null means failure

Dec 3 '06 #10

jacob navia

Thad Smith wrote:

jacob navia wrote:

>It proposes strdup and strndup too. Lcc-win32 proposes already strdup,
but strndup was missing. Here is a proposed implementation. I would like
to see if your sharp eyes see any bug or serious problem with it.

...

>---------------------------------------------------------cut here
#include <string.h>
#include <stdlib.h>
/*
The strndup function copies not more than n characters (characters that
follow a null character are not copied) from string to a dynamically
allocated buffer. The copied string shall always be null terminated.
*/
char *strndup(const char *string,size_t s)

I haven't looked at the code, but I am a big believer in accurate
documentation.

Your comment references "n characters", but defines a parameter s. Is
that n? The comment doesn't say how many characters are copied, only
that it is not more than n, so I can't rely on the function to copy the
full strlen(string), even if s exceeds this. I suggest calling the
result a dynamically allocated char array, rather than a buffer, since
it isn't necessarily buffering anything. As a matter of fact, since it
is presumably sized to the input string, there probably isn't room for
additional characters to be added later, thus not used as a buffer.

The description doesn't say what the results are if insufficient memory
exists for the new array.

Excuse me, I just cut and pasted the specification from the
standards document. It is not my documentation.

Dec 3 '06 #11

jacob navia

CBFalconer wrote:

Tom St Denis wrote:

char *strndup(const char *_string, size_t _len) {
char *s, *p;

if ((p = s = malloc(_len + 1)))

Here you allocate _len characters even if the string could be
considerably shorter... This wastes space.

Dec 3 '06 #12

Roland Pibinger

On Sat, 02 Dec 2006 23:53:26 +0100, jacob navia wrote:

>Reading comp.std.c

....

>It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few weeks
ago.
It proposes strdup and strndup too. Lcc-win32 proposes already strdup,
but strndup was missing.

strdup, getline and other functions require deallocation by the user.
They were deliberately excluded from the C Standards. Not by
oversight, not because they were difficult to implement, not because
they wouldn't have been 'useful'.

Best regards,
Roland Pibinger

Dec 3 '06 #13

jacob navia

Roland Pibinger wrote:

On Sat, 02 Dec 2006 23:53:26 +0100, jacob navia wrote:

>>Reading comp.std.c

...

>>It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few weeks
ago.
It proposes strdup and strndup too. Lcc-win32 proposes already strdup,
but strndup was missing.

strdup, getline and other functions require deallocation by the user.
They were deliberately excluded from the C Standards. Not by
oversight, not because they were difficult to implement, not because
they wouldn't have been 'useful'.

Best regards,
Roland Pibinger

1) What's wrong with the user deallocating?
2) Maybe this view is changing since that technical report is there...

Dec 3 '06 #14

Richard Heathfield

jacob navia said:

CBFalconer wrote:
>BTW, Jacobs code carelessly fails to define size_t.

(Wrong, Chuck!)

>
???
I have
#include <stdlib.h>
and that file includes stddef.h that defines size_t

On your particular implementation, possibly, but the Standard doesn't
require this as far as I know. What it *does* require is that size_t is
available as a type to those translation units that have included
<stdlib.h>. So yes, your code was correct in that regard.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 3 '06 #15

jacob navia

Richard Heathfield wrote:

jacob navia said:

>>CBFalconer wrote:

>>>BTW, Jacobs code carelessly fails to define size_t.

(Wrong, Chuck!)

>>???
I have
#include <stdlib.h>
and that file includes stddef.h that defines size_t

On your particular implementation, possibly, but the Standard doesn't
require this as far as I know. What it *does* require is that size_t is
available as a type to those translation units that have included
<stdlib.h>. So yes, your code was correct in that regard.

stdlib.h defines

calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);

and many others, so I do not see how size_t could be unknown after
including stdlib.h...

Obviously in other implementations they could have defined size_t
several times in several files.

Dec 3 '06 #16

Richard Heathfield

jacob navia said:

<snip>

>
stdlib.h defines

calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);

and many others, so I do not see how size_t could be unknown after
including stdlib.h...

True enough - or, better still, we can simply quote chapter and verse at
Chuck:

"4.10 GENERAL UTILITIES <stdlib.h>

The header <stdlib.h> declares four types and several functions of
general utility, and defines several macros./113/

The types declared are size_t and wchar_t (both described in $4.1.5),..."

<snip>

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 3 '06 #17

CBFalconer

jacob navia wrote:

CBFalconer wrote:

>char *strndup(const char *_string, size_t _len) {
char *s, *p;

if ((p = s = malloc(_len + 1)))

Here you allocate _len characters even if the string could be
considerably shorter... This wastes space.

The only possible use for the function is to truncate input
strings, as I pointed out. In that case there is no space wasted.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 3 '06 #18

CBFalconer

Roland Pibinger wrote:

On Sat, 02 Dec 2006 23:53:26 +0100, jacob navia wrote:

>Reading comp.std.c
...
>It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few
weeks ago.
It proposes strdup and strndup too. Lcc-win32 proposes already
strdup, but strndup was missing.

strdup, getline and other functions require deallocation by the user.
They were deliberately excluded from the C Standards. Not by
oversight, not because they were difficult to implement, not because
they wouldn't have been 'useful'.

By that reasoning malloc, calloc, and realloc should also be
omitted. Not to mention fopen.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 3 '06 #19

CBFalconer

jacob navia wrote:

CBFalconer wrote:

>BTW, Jacobs code carelessly fails to define size_t.

???
I have
#include <stdlib.h>
and that file includes stddef.h that defines size_t

Sorry, you are right there. However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).

>It also return NULL for other reasons than lack of memory, which
can only cause confusion.

??? Why confusion?

Null means failure

Because the caller can't tell whether or not the system is out of
memory. If you are going to define the 'undefined behaviour' from
calling with an invalid parameter, you might as well make it
something innocuous. Otherwise the only way the user can tell the
cause of the error is to save the input parameter, and test it for
NULL himself after the routine returns a NULL. If he does that he
might as well test first, and not call the routine.

The other basic choice is to let a bad parameter blow up in the
function. Both will work properly if the programmer tests and
avoids passing that bad parameter in the first place. If he
doesn't the 'innocuous behavior' has a much better chance of
producing a user friendly end application. The unsophisticated end
user has little use for a segfault message.
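
One way to let the caller tell the two failures apart, sketched under the assumption of a hypothetical status-returning interface. None of these names come from N1193 or from the code posted in this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes for the sketch below. */
enum dup_status { DUP_OK, DUP_BAD_ARG, DUP_NO_MEMORY };

/* The copy is delivered through *out; the return value says why a call
   failed. On any failure *out is set to NULL (when out itself is valid). */
enum dup_status dup_prefix_status(const char *src, size_t n, char **out)
{
    const char *end;
    size_t len;
    char *copy;

    if (out == NULL)
        return DUP_BAD_ARG;
    *out = NULL;
    if (src == NULL)
        return DUP_BAD_ARG;          /* bad parameter, not an allocation failure */

    end = memchr(src, '\0', n);
    len = end ? (size_t)(end - src) : n;

    copy = malloc(len + 1);
    if (copy == NULL)
        return DUP_NO_MEMORY;        /* the only out-of-memory path */

    memcpy(copy, src, len);
    copy[len] = '\0';
    *out = copy;
    return DUP_OK;
}
```

The cost is a clumsier call site than a function that simply returns the pointer, which is the trade-off being argued about here.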

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 3 '06 #20

pete

Richard Heathfield wrote:

>
jacob navia said:

<snip>

stdlib.h defines

calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);

and many others, so I do not see how size_t could be unknown after
including stdlib.h...

True enough - or, better still,
we can simply quote chapter and verse at Chuck:

"4.10 GENERAL UTILITIES <stdlib.h>

The header <stdlib.h> declares four types and several functions of
general utility, and defines several macros./113/

The types declared are size_t and wchar_t
(both described in $4.1.5),..."

<snip>

We can, but I also use jacob navia's logic
to deduce what's defined in a header,
except for stddef.h which is small enough to learn.

I tend to be more familiar with standard function descriptions
from looking them up repeatedly,
than I am with the standard header descriptions in their entirety.

I know that the ctype functions are described as being
able to take the value of EOF as an argument,
so I know that EOF is defined in ctype.h.

I know that putchar is described as being
able to return the value of EOF,
so I know that EOF is defined in stdio.h.

--
pete

Dec 3 '06 #21

Joe Wright

jacob navia wrote:
[ snip ]

stdlib.h defines

calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);

and many others, so I do not see how size_t could be unknown after
including stdlib.h...

Obviously in other implementation they could have defined size_t
several times in several files.

Lazy? Headers tend to declare rather than define. All four of your
examples fail as prototypes.

realloc's prototype..

void *realloc(void *_ptr, size_t _size);

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Dec 3 '06 #22

jacob navia

Joe Wright wrote:

jacob navia wrote:
[ snip ]

>stdlib.h defines

calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);

and many others, so I do not see how size_t could be unknown after
including stdlib.h...

Obviously in other implementation they could have defined size_t
several times in several files.

Lazy? Headers tend to declare rather than define. All four of your
examples fail as prototypes.

realloc's prototype..

void *realloc(void *_ptr, size_t _size);

Please Joe do not use my mail messages as header files :-)

I was just copying, and I wrote (etc...) sometimes to
speed it up. Maybe it is laziness, but I just wanted to make a point,
not to write a header file.

Dec 3 '06 #23

Chris Torek

[comp.compilers.lcc snipped as there is nothing lcc-specific here]

In article <45***************@yahoo.com>
CBFalconer <cb********@maineline.net> wrote:

>... However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).

Indeed, in fact including <stdlib.h> must *not* include <stddef.h>,
because stddef.h will (e.g.) define "offsetof", and it must not be
defined if you have included only stdlib.h:

% cat x.c
#include <stdlib.h>
#ifdef offsetof
# error offsetof defined when it should not be
#endif
void f(void){}
%

This x.c translation unit must translate without error, i.e., the
#error must not fire.

(Now that it has been officially announced I can mention this: One
rather painful aspect of obtaining POSIX PSE52 certification for
vxWorks 6.4 was cleaning up our header-file organization so that
"inappropriate" symbols were not defined by including various POSIX
headers. The certification process includes code that reads the
actual headers, searching for all things that look like identifiers,
and emits a test module to make sure they are not "#define"d when
compiling in "POSIX mode". Only names reserved to the implementor
-- things like __users_must_not_use_this_identifier -- are omitted
from this test.

Making header files "do the right thing" when writing them from
scratch is not that bad, but retroactively enforcing such rules in
a system with many years of history is more difficult.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Dec 3 '06 #24

Roland Pibinger

On Sun, 03 Dec 2006 12:49:37 +0100, jacob navia wrote:

>1) What's wrong with the user deallocating?

It's bad style. The responsibility for deallocation becomes unclear.
In your program you have some functions that return a char* that must
be freed and other functions returning char* without that requirement:
a perfect recipe for a leaking program.
Moreover, functions like getline foster an inefficient style. They
dynamically allocate memory for each line even when most of the lines
would fit into a char[80] buffer and only exceptional cases needed
dynamic allocation.

>2) Maybe this view is changing since that technical report is there...

Why should they abandon good style?

Best regards,
Roland Pibinger

Dec 3 '06 #25

Roland Pibinger

On Sun, 03 Dec 2006 10:01:01 -0500, CBFalconer wrote:

>By that reasoning malloc, calloc, and realloc should also be
omitted.

Why? Do those functions force you to free something you haven't
allocated?

>Not to mention fopen.

Wrong analogy again. Would you write a function like the following
(probably not):

/* user must call fclose() on the returned FILE* */
FILE *do_something (int i);

Best regards,
Roland Pibinger

Dec 3 '06 #26

jacob navia

Roland Pibinger wrote:

On Sun, 03 Dec 2006 12:49:37 +0100, jacob navia wrote:

>>1) What's wrong with the user deallocating?

It's bad style. The responsibility for deallocation becomes unclear.
In your program you have some functions that return a char* that must
be freed and other functions returning char* without that requirement:
a perfect recipe for a leaking program.
Moreover, functions like getline foster an inefficient style. They
dynamically allocate memory for each line even when most of the lines
would fit into a char[80] buffer and only exceptional cases needed
dynamic allocation.

>>2) Maybe this view is changing since that technical report is there...

Why should they abandon good style?

Best regards,
Roland Pibinger

The responsibility for the deallocation is perfectly clear: the specs
specify that the user should deallocate the new space.

And to the argument that this is inefficient, just pass a buffer
(allocated with malloc) to this function. It will NOT touch the
passed buffer unless IT NEEDS TO!!!

If all your lines are less than 80 characters and you pass it
a buffer of 80, the buffer will be reused!
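
A minimal sketch of that reuse pattern, assuming the POSIX getline (the proposal's version is described as having the same interface). The function longest_line is a hypothetical example, not part of any library.

```c
#define _POSIX_C_SOURCE 200809L   /* for getline and ssize_t */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Reads every line of fp through one reused buffer. Returns the length of
   the longest line, not counting the newline, or -1 if the initial
   allocation fails. */
long longest_line(FILE *fp)
{
    size_t cap = 80;
    char *line;
    ssize_t got;
    long longest = 0;

    assert(fp != NULL);
    line = malloc(cap);            /* must come from malloc: getline may realloc it */
    if (line == NULL)
        return -1;

    while ((got = getline(&line, &cap, fp)) != -1) {
        if (got > 0 && line[got - 1] == '\n')
            got--;                 /* do not count the newline */
        if ((long)got > longest)
            longest = (long)got;
    }
    free(line);                    /* free the current pointer; getline may have moved it */
    return longest;
}
```

When every line fits in the initial 80 bytes, the loop performs no further allocation. The buffer only grows for the exceptional long line.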

Dec 3 '06 #27

Thad Smith

jacob navia wrote:

Thad Smith wrote:
>jacob navia wrote:

>> Here is a proposed implementation. I would like
to see if your sharp eyes see any bug or serious problem with it.

....

>>/*
The strndup function copies not more than n characters (characters that
follow a null character are not copied) from string to a dynamically
allocated buffer. The copied string shall always be null terminated.
*/
char *strndup(const char *string,size_t s)

I haven't looked at the code, but I am a big believer in accurate
documentation.

[suggestions to improve function description deleted]

>
Excuse me, I just cut and pasted the specification from the
standards document. It is not my documentation.

Feel free to forward my comments to the author. Still, if you
distribute such a function, I recommend you use a better description.

--
Thad

Dec 3 '06 #28

Stephen Sprunk

"Roland Pibinger" <rp*****@yahoo.com> wrote in message
news:45**************@news.utanet.at...

On Sun, 03 Dec 2006 10:01:01 -0500, CBFalconer wrote:
>>Not to mention fopen.

Wrong analogy again. Would you write a function like the following
(probably not):

/* user must call fclose() on the returned FILE* */
FILE *do_something (int i);

I've written code like that, yes, though more often with <OT>open() /
close()</OT> than fopen() / fclose(). It's also pretty standard when
working with <OT>sockets</OT>; because they require so much f'ing work
to create, people tend to put all that in another function to avoid
clutter.

This whole "who deallocates the returned string" argument is one of the
largest problems I have with C; yes, you can always find the correct
answer if you look at the function specs (assuming they exist), but it's
not obvious and is therefore prone to errors. <OT>I'm often tempted to
use the C-like subset of C++ just so I'll have string objects that
deallocate (er, destruct) themselves when appropriate rather than having
to read function specs to figure things out. The hassle of requiring a
working C++ environment isn't yet worth the gain, though it's getting
closer. </OT>

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
--
Posted via a free Usenet account from http://www.teranews.com

Dec 3 '06 #29

Keith Thompson

Joe Wright <jo********@comcast.net> writes:

jacob navia wrote:
[ snip ]
>stdlib.h defines
calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);
and many others, so I do not see how size_t could be unknown after
including stdlib.h...
Obviously in other implementation they could have defined size_t
several times in several files.

Lazy? Headers tend to declare rather than define. All four of your
examples fail as prototypes.

How so? A prototype is a function declaration that declares the types
(not necessarily the names) of its parameters. Only the definition
needs the parameter names.
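
A small illustration of that point, using a hypothetical function count_char: the first declaration is a prototype even though it names no parameters, and the definition supplies the names it needs.

```c
#include <stddef.h>

/* A prototype: the parameter types are given, the names are omitted. */
size_t count_char(const char *, int);

/* The definition names its parameters so that the body can use them. */
size_t count_char(const char *s, int c)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (*s == (char)c)
            n++;
    return n;
}
```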

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Dec 3 '06 #30

Keith Thompson

Chris Torek <no****@torek.net> writes:

[comp.compilers.lcc snipped as there is nothing lcc-specific here]

In article <45***************@yahoo.com>
CBFalconer <cb********@maineline.net> wrote:
>>... However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).

Indeed, in fact including <stdlib.h> must *not* include <stddef.h>,
because stddef.h will (e.g.) define "offsetof", and it must not be
defined if you have included only stdlib.h:

% cat x.c
#include <stdlib.h>
#ifdef offsetof
# error offsetof defined when it should not be
#endif
void f(void){}
%

This x.c translation unit must translate without error, i.e., the
#error must not fire.

[...]

<stdlib.h> and <stddef.h> could have some kind of logic that causes
<stddef.h> to define offsetof normally, but not to define it if it's
#include'd from <stdlib.h>. The restriction isn't that <stdlib.h> may
not include <stddef.h>; it's that, as your test program shows,
including just <stdlib.h> may not define offsetof.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Dec 3 '06 #31

CBFalconer

jacob navia wrote:

>

.... snip ...

>
And to the argument that this is inefficient, just pass a buffer
(allocated with malloc) to this function. It will NOT touch the
passed buffer unless IT NEEEDS TO!!!

If all your lines are less than 80 characters and you pass it
a buffer of 80, the buffer will be reused!

Oh? What if you pass it a local (automatic storage) buffer? How
is the function going to expand that buffer? What is there to
prevent the noob user (or the one who doesn't read the fine print)
from passing such a buffer? The only way the routine can tell is
by trying to realloc, and if the program blows up the buffer was in
automatic storage. Very efficient. Highly user friendly.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 3 '06 #32

CBFalconer

Roland Pibinger wrote:

On Sun, 03 Dec 2006 10:01:01 -0500, CBFalconer wrote:

>By that reasoning malloc, calloc, and realloc should also be
omitted.

Why? Do those functions force you to free something you haven't
allocated?

Yes. Which function does the allocating? If you answer 'all
three' then what is wrong with making it 'all four'.

>Not to mention fopen.

Wrong analogy again. Would you write a function like the following
(probably not):

/* user must call fclose() on the returned FILE* */
FILE *do_something (int i);

Certainly would. Although the passed parameters would probably
have a bit more to say. For example, you want to open a file with
a specified name, or prefix, in some system dependent directory,
subject to some condition or other. Or you want to conditionally
override a default file name. Consider logging files.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 3 '06 #33

jacob navia

CBFalconer wrote:

jacob navia wrote:

... snip ...

>>And to the argument that this is inefficient, just pass a buffer
(allocated with malloc) to this function. It will NOT touch the
passed buffer unless IT NEEEDS TO!!!

If all your lines are less than 80 characters and you pass it
a buffer of 80, the buffer will be reused!

Oh? What if you pass it a locally (automatic storage) buffer. How
is the function going to expand that buffer? What is there to
prevent the noob user (or the one who doesn't read the fine print)
from passing such a buffer? The only way the routine can tell is
by trying to realloc, and if the program blows up the buffer was in
automatic storage. Very efficient. Highly user friendly.

You have a good point here.

The specs are:

< quote >

4 The application shall ensure that *lineptr is a valid argument that
could be passed to the free function. If *n is nonzero, the application
shall ensure that *lineptr points to an object containing at least *n
characters.
5 The size of the object pointed to by *lineptr shall be increased to
fit the incoming line, if it isn’t already large enough. The characters
read shall be stored in the string pointed to by the argument lineptr

< end quote >

That error would be absolutely fatal, since it would mean that the
function would pass a wrong pointer to realloc, with disastrous
consequences!

Since there is no portable way in C to determine whether a memory
block is the result of malloc(), we are screwed...
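
One way to stay inside the quoted contract is for the caller to start from a NULL pointer and a zero size, so that every buffer the function ever sees came from realloc. The sketch below is plain ISO C; the name read_line and its long return type are assumptions made for this illustration, and it is not the getline of the quoted specification.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not the TR's getline): read one line from fp into
   *lineptr, growing the buffer with realloc when it is too small.
   *lineptr must be NULL or a pointer obtained from malloc/calloc/realloc,
   and *n must be the size of that object in bytes.
   Returns the number of characters stored (newline included, terminator
   excluded), or -1 at end of file, on bad arguments, or if realloc fails. */
long read_line(char **lineptr, size_t *n, FILE *fp)
{
    size_t used = 0;
    int c;

    if (lineptr == NULL || n == NULL || fp == NULL)
        return -1;
    if (*lineptr == NULL)
        *n = 0;

    while ((c = getc(fp)) != EOF) {
        if (used + 2 > *n) {              /* need room for c and '\0' */
            size_t newsize = *n ? *n * 2 : 80;  /* overflow not handled */
            char *tmp = realloc(*lineptr, newsize);
            if (tmp == NULL)
                return -1;                /* old buffer is still valid */
            *lineptr = tmp;
            *n = newsize;
        }
        (*lineptr)[used++] = (char)c;
        if (c == '\n')
            break;
    }
    if (used == 0)
        return -1;                        /* nothing read: EOF or error */
    (*lineptr)[used] = '\0';
    return (long)used;
}
```

A caller that declares `char *buf = NULL; size_t n = 0;`, passes `&buf` and `&n` on every call, and frees `buf` once at the end never hands the function anything that realloc cannot accept. A caller that passes an automatic array is still in undefined-behaviour territory, as discussed above.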

jacob

Dec 3 '06 #34

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

jacob navia wrote:
>>
... snip ...
>>
And to the argument that this is inefficient, just pass a buffer
(allocated with malloc) to this function. It will NOT touch the
passed buffer unless IT NEEDS TO!!!

If all your lines are less than 80 characters and you pass it
a buffer of 80, the buffer will be reused!

Oh? What if you pass it a local (automatic storage) buffer? How
is the function going to expand that buffer? What is there to
prevent the noob user (or the one who doesn't read the fine print)
from passing such a buffer? The only way the routine can tell is
by trying to realloc, and if the program blows up the buffer was in
automatic storage. Very efficient. Highly user friendly.

It's worse than that. Attempting to realloc() something that wasn't
allocated by one of the *alloc() functions invokes undefined behavior.
You're as likely to silently corrupt the heap (if the implementation
uses a "heap" for dynamic memory allocation) as to make the program
"blow up".

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Dec 3 '06 #35

Richard Tobin

In article <0d********************@bt.com>,
Richard Heathfield <rj*@see.sig.invalid> wrote:

>>>The thing that does concern me is the spec itself, which seems to me to
suffer from the same flaw as strncpy - i.e. it gives no indication of
whether truncation occurred. But of course that's a design issue, not a C
issue.

>When I've used my own version of strndup, it's always been to make an
                                  ^^^^^^^

>ordinary string from a "counted" string, so there is no question of
truncation. I suspect this is the more common use of it, rather than
copying a string to a buffer that might not be big enough.

>No, that's not more common - it's just a more *intelligent* use of strncpy.
                                                                    ^^^^^^^

>By far the most common usage of strncpy is from the cargo cult bunch: "I'm
smart, I know about buffer overruns, I know I should use strncpy instead of
strcpy, oops, oh look, I just threw away data, ohdearhowsadnevermind."

Um, you seem to have mixed up strndup and strncpy here. But looking
at my own posting, I seem to have done the same thing in the last
sentence. If you know the string is null-terminated but don't know
how big it is, just use strdup. I can't imagine why you'd use strndup
unless you had a counted string or wanted to just copy a prefix of the
string.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Dec 4 '06 #36

Richard Tobin

In article <11*********************@79g2000cws.googlegroups.com>,
Tom St Denis <to********@gmail.com> wrote:

>What would the point of strndup be?

As I said in another article, it's useful for converting a counted
string to a null-terminated one.
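
As a rough illustration of that use, the sketch below turns a counted character array into an ordinary C string with malloc and memcpy. The name counted_to_cstr is invented here to avoid clashing with any strndup a library may already provide; it is a sketch of the idea, not the implementation proposed earlier in the thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy at most n characters from s into a freshly
   malloc'ed, null-terminated string. s must either be null-terminated or
   have at least n readable characters. Returns NULL if malloc fails. */
char *counted_to_cstr(const char *s, size_t n)
{
    const char *end;
    size_t len;
    char *r;

    assert(s != NULL);                    /* caller's responsibility */
    end = memchr(s, '\0', n);             /* terminator within n chars? */
    len = end ? (size_t)(end - s) : n;
    r = malloc(len + 1);
    if (r != NULL) {
        memcpy(r, s, len);
        r[len] = '\0';
    }
    return r;
}
```

With a five-character array holding h, e, l, l, o and no terminator, `counted_to_cstr(raw, 5)` yields a proper string of length 5. With an ordinary string and a small n it yields a prefix.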

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Dec 4 '06 #37

Stephen Sprunk

"CBFalconer" <cb********@yahoo.com> wrote in message
news:45***************@yahoo.com...

Tom St Denis wrote:
>jacob navia wrote:
>>It proposes strdup and strndup too. Lcc-win32 proposes already
strdup, but strndup was missing. Here is a proposed implementation.
I would like to see if your sharp eyes see any bug or serious
problem with it.

What would the point of strndup be? It allocates the memory so
there really isn't a problem of an overflow. If you're low on
memory ... why are you duplicating a string?

....

I can see a possible use for it - to duplicate the initial portion
of a string only, i.e. to truncate it on the right.

Ah, but it's in a "Specification for Safer C Library Functions"; that
implies that strndup() is somehow a safer version of strdup(), like
strncat() and strncpy() are safer versions of strcat() and strcpy().

There is nothing unsafe about strdup() per se, though. The problem with
strdup() is how it's normally used, which is to copy strings that are
returned from functions as pointers to static buffers. This is not
thread-safe/re-entrant, and strndup() doesn't solve that problem at all,
because the problem is needing strdup() in the first place.

If one needs to truncate a string on the right, it's easy enough to do
that with strdup() and some extra code, with malloc() and strncpy(), or
by creating a new function which truncates any string (such as one
returned by strdup()). But such a function is not a "safer" version of
strdup() -- it's entirely new functionality.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

Dec 4 '06 #38

Stan Milam

CBFalconer wrote:

jacob navia wrote:
>CBFalconer wrote:

>>BTW, Jacob's code carelessly fails to define size_t.
???
I have
#include <stdlib.h>
and that file includes stddef.h that defines size_t

Sorry, you are right there. However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).

>>It also returns NULL for other reasons than lack of memory, which
can only cause confusion.

??? Why confusion?

Null means failure

Because the caller can't tell whether or not the system is out of
memory. If you are going to define the 'undefined behaviour' from
calling with an invalid parameter, you might as well make it
something innocuous. Otherwise the only way the user can tell the
cause of the error is to save the input parameter, and test it for
NULL himself after the routine returns a NULL. If he does that he
might as well test first, and not call the routine.

The other basic choice is to let a bad parameter blow up in the
function. Both will work properly if the programmer tests and
avoids passing that bad parameter in the first place. If he
doesn't the 'innocuous behavior' has a much better chance of
producing a user friendly end application. The unsophisticated end
user has little use for a segfault message.

Why can't you test the arguments for valid conditions and set errno to
EINVAL when one of the conditions fails, and return NULL? Now, if the
validations are good and the memory allocation fails errno will be set
to ENOMEM. Now you know why you failed and can get the error message
with strerror().

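#include <errno.h>
#include <stdlib.h>

/* Duplicate at most size characters of source. Returns NULL with errno
   set to EINVAL for a NULL or empty source or a zero size. */
char *
dupnstr( const char *source, size_t size )
{
    size_t length;
    const char *wrk;
    char *dst, *rv = NULL;

    if ( source == NULL || *source == 0 || size == 0 )
        errno = EINVAL;
    else {
        /* Measure the source, but never look past size characters. */
        for ( length = 0, wrk = source; length < size && *wrk; length++ ) wrk++;
        size = length;
        if ( ( rv = malloc( size + 1 ) ) != NULL ) {
            for ( dst = rv; size; size-- ) *dst++ = *source++;
            *dst = 0;
        }
    }
    return rv;
}

Regards,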
Stan Milam.

Dec 4 '06 #39

Richard Heathfield

Stan Milam said:

<snip>

>
Why can't you test the arguments for valid conditions and set errno to
EINVAL when one of the conditions fails, and return NULL? Now, if the
validations are good and the memory allocation fails errno will be set
to ENOMEM. Now you know why you failed and can get the error message
with strerror().

Except that neither EINVAL nor ENOMEM is defined by the Standard.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 4 '06 #40

Richard Heathfield

jacob navia said:

<snip>

>
Since there is no portable way in C to determine whether a memory
block is the result of malloc(), we are screwed...

There is actually such a way. It's called "programming". When we call
malloc, obviously we know we're calling malloc, so we're in an ideal
position to record the fact that this particular pointer was returned by
malloc. The fact that there isn't some magical "is_malloc" function in ISO
C doesn't mean that there is no portable way to achieve what we want.

It's the same as with block sizes - "if you need to know this, well, at one
point in the program you *do* know it, so the answer is simply: DON'T
FORGET".
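
A minimal sketch of what "don't forget" might look like in code: keep the pointer's provenance next to the pointer, and let the growth routine consult it before going anywhere near realloc. The struct and function names below are invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct buffer {
    char  *data;
    size_t size;
    int    from_malloc;   /* nonzero only if data came from malloc/realloc */
};

/* Grow b to at least newsize bytes. Returns 1 on success, 0 on failure;
   on failure b is left unchanged. */
int buffer_grow(struct buffer *b, size_t newsize)
{
    char *tmp;

    assert(b != NULL);
    if (newsize <= b->size)
        return 1;                         /* already large enough */
    if (b->from_malloc) {
        tmp = realloc(b->data, newsize);  /* safe: we recorded the origin */
        if (tmp == NULL)
            return 0;
    } else {
        tmp = malloc(newsize);            /* never realloc a foreign block */
        if (tmp == NULL)
            return 0;
        if (b->size > 0)
            memcpy(tmp, b->data, b->size);
        b->from_malloc = 1;
    }
    b->data = tmp;
    b->size = newsize;
    return 1;
}

/* Free the storage only if this module allocated it. */
void buffer_release(struct buffer *b)
{
    assert(b != NULL);
    if (b->from_malloc)
        free(b->data);
    b->data = NULL;
    b->size = 0;
    b->from_malloc = 0;
}
```

A buffer that starts life as an automatic array is described with `from_malloc` set to 0. The first time it has to grow, its contents are copied into malloc'ed storage and the flag flips, so realloc and free only ever see pointers that the allocator handed out.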

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 4 '06 #41

Richard Heathfield

Richard Tobin said:

<snip>

>
Um, you seem to have mixed up strndup and strncpy here.

Mea culpa.

But looking
at my own posting, I seem to have done the same thing in the last
sentence.

Youa culpa!

If you know the string is null-terminated but don't know
how big it is, just use strdup.

If you don't know how big it is, you might want to find out before calling
strdup. (I'm thinking of possible Denial of Memory attacks.)

I can't imagine why you'd use strndup
unless you had a counted string or wanted to just copy a prefix of the
string.

Yes - splitting up a CSV line into tokens, perhaps (where you've answered
the "how long" question with something like strchr and ptr arithmetic).
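
For illustration, here is a sketch of that strchr-and-pointer-arithmetic approach. The function name next_field and its cursor convention are assumptions made for this example, and it ignores quoted fields and embedded commas, which a real CSV reader would have to handle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'ed copy of the field that starts at
   *cursor and advance *cursor past the following comma. When the last
   field has been returned, *cursor is set to NULL. Returns NULL when there
   are no more fields or when malloc fails (the cursor is not advanced in
   the latter case). */
char *next_field(const char **cursor)
{
    const char *start, *comma;
    size_t len;
    char *field;

    assert(cursor != NULL);
    start = *cursor;
    if (start == NULL)
        return NULL;                      /* line already exhausted */

    comma = strchr(start, ',');
    len = comma ? (size_t)(comma - start) : strlen(start);

    field = malloc(len + 1);
    if (field == NULL)
        return NULL;
    memcpy(field, start, len);            /* length is known: no strncpy */
    field[len] = '\0';

    *cursor = comma ? comma + 1 : NULL;
    return field;
}
```

Starting with a cursor that points at "a,bc,,d", four successive calls return "a", "bc", an empty string and "d"; the caller frees each one.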

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 4 '06 #42

Barry Schwarz

On Sun, 03 Dec 2006 21:55:00 GMT, Keith Thompson <ks***@mib.org>
wrote:

>Joe Wright <jo********@comcast.net> writes:
>jacob navia wrote:
[ snip ]
>>stdlib.h defines
calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);
and many others, so I do not see how size_t could be unknown after
including stdlib.h...
Obviously in other implementations they could have defined size_t
several times in several files.

Lazy? Headers tend to declare rather than define. All four of your
examples fail as prototypes.

How so? A prototype is a function declaration that declares the types
(not necessarily the names) of its parameters. Only the definition
needs the parameter names.

All the samples seem to be missing the return type.
Remove del for email

Dec 4 '06 #43

Richard Heathfield

Barry Schwarz said:

On Sun, 03 Dec 2006 21:55:00 GMT, Keith Thompson <ks***@mib.org>
wrote:

>>Joe Wright <jo********@comcast.net> writes:
>>jacob navia wrote:
[ snip ]
stdlib.h defines
calloc(size_t,size_t)
malloc(size_t)
qsort(void *,size_t,size_t,int (*)(...etc));
realloc(size_t);
and many others, so I do not see how size_t could be unknown after
including stdlib.h...
Obviously in other implementations they could have defined size_t
several times in several files.

Lazy? Headers tend to declare rather than define. All four of your
examples fail as prototypes.

How so? A prototype is a function declaration that declares the types
(not necessarily the names) of its parameters. Only the definition
needs the parameter names.

All the samples seem to be missing the return type.

Guys, guys, what's the matter with you? Mr Navia was just showing how often
size_t crops up in <stdlib.h>, for heaven's sake! He has a perfectly valid
point, which is borne out by the Standard. If you want <stdlib.h> you know
where to find it.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 4 '06 #44

CBFalconer

Stan Milam wrote:

CBFalconer wrote:
>jacob navia wrote:
>>CBFalconer wrote:

.... snip ...

>>>>
It also returns NULL for other reasons than lack of memory, which
can only cause confusion.

??? Why confusion?

Null means failure

Because the caller can't tell whether or not the system is out of
memory. If you are going to define the 'undefined behaviour' from
calling with an invalid parameter, you might as well make it
something innocuous. Otherwise the only way the user can tell the
cause of the error is to save the input parameter, and test it for
NULL himself after the routine returns a NULL. If he does that he
might as well test first, and not call the routine.

The other basic choice is to let a bad parameter blow up in the
function. Both will work properly if the programmer tests and
avoids passing that bad parameter in the first place. If he
doesn't the 'innocuous behavior' has a much better chance of
producing a user friendly end application. The unsophisticated end
user has little use for a segfault message.

Why can't you test the arguments for valid conditions and set errno
to EINVAL when one of the conditions fails, and return NULL? Now,
if the validations are good and the memory allocation fails errno
will be set to ENOMEM. Now you know why you failed and can get the
error message with strerror().

In part because I disapprove of global variables.
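
One alternative that avoids errno entirely is to return the reason for failure and hand back the new string through an out-parameter. This is only a sketch of that design; the enum and the function name are hypothetical and appear nowhere in the proposal being discussed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes: the caller can tell a bad argument from an
   out-of-memory condition without consulting any global. */
enum dup_status { DUP_OK, DUP_BAD_ARG, DUP_NO_MEMORY };

/* Copy at most n characters of src into a new string stored in *out.
   *out is written only when DUP_OK is returned. */
enum dup_status dup_prefix(const char *src, size_t n, char **out)
{
    const char *end;
    size_t len;
    char *copy;

    if (src == NULL || out == NULL)
        return DUP_BAD_ARG;

    end = memchr(src, '\0', n);           /* stop at an early terminator */
    len = end ? (size_t)(end - src) : n;

    copy = malloc(len + 1);
    if (copy == NULL)
        return DUP_NO_MEMORY;

    memcpy(copy, src, len);
    copy[len] = '\0';
    *out = copy;
    return DUP_OK;
}
```

The cost is a clumsier call site than `p = strndup(s, n)`, which is presumably why the errno route keeps being suggested.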

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 4 '06 #45

CBFalconer

Richard Heathfield wrote:

Richard Tobin said:

<snip>
>>
Um, you seem to have mixed up strndup and strncpy here.

Mea culpa.

>But looking at my own posting, I seem to have done the same thing
in the last sentence.

Youa culpa!

We'alla culpa!

.... snip ...

>
>I can't imagine why you'd use strndup unless you had a counted
string or wanted to just copy a prefix of the string.

Yes - splitting up a CSV line into tokens, perhaps (where you've
answered the "how long" question with something like strchr and
ptr arithmetic).

Which is basically my function 'toksplit', which can be thought of
as a combination of strchr and strncpy, plus fol-de-rol:

#include <stddef.h>  /* size_t */

const char *toksplit(const char *src, /* Source of tokens */
                     char tokchar,    /* token delimiting char */
                     char *token,     /* receiver of parsed token */
                     size_t lgh)      /* length token can receive */
                                      /* not including final '\0' */
{
    if (src) {
        while (' ' == *src) src++;   /* skip leading blanks */

        while (*src && (tokchar != *src)) {
            if (lgh) {               /* excess characters are dropped */
                *token++ = *src;
                --lgh;
            }
            src++;
        }
        if (*src && (tokchar == *src)) src++;  /* step past the delimiter */
    }
    *token = '\0';
    return src;
} /* toksplit */

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 4 '06 #46

Richard Tobin

In article <u7******************************@bt.com>,
Richard Heathfield <rj*@see.sig.invalid> wrote:

>If you know the string is null-terminated but don't know
how big it is, just use strdup.

>If you don't know how big it is, you might want to find out before calling
strdup. (I'm thinking of possible Denial of Memory attacks.)

It wouldn't be a very *good* denial of memory attack. strdup only
produces a string as big as one you've already got. You've already
made the mistake by reading in or creating the existing string.

It's not impossible for it to be a real consideration I suppose. You
might not want to call strdup inside your operating system kernel on a
string passed in by the user. But strndup is probably not the place
to fix the problem in that case either.
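
For what it's worth, "finding out first" does not need strndup at all. The sketch below refuses to duplicate a string longer than a caller-chosen limit and, unlike a truncating copy, reports that it refused. The name dup_bounded and the too_long flag are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate s only if its length does not exceed max.
   Examines at most max + 1 characters of s. On return, *too_long is 1 if
   the string was rejected for its length and 0 otherwise, so a NULL result
   with *too_long == 0 means malloc failed. */
char *dup_bounded(const char *s, size_t max, int *too_long)
{
    size_t len = 0;
    char *copy;

    assert(s != NULL && too_long != NULL);
    *too_long = 0;

    while (s[len] != '\0') {
        if (len == max) {                 /* one more char would exceed max */
            *too_long = 1;
            return NULL;
        }
        len++;
    }

    copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, s, len + 1);         /* includes the terminator */
    return copy;
}
```

Whether a library function or the caller should own this check is exactly the question being argued here; the sketch only shows that the check itself is a few lines.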

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Dec 4 '06 #47

Roland Pibinger

On Sun, 3 Dec 2006 14:02:54 -0600, "Stephen Sprunk" wrote:

>"Roland Pibinger" <rp*****@yahoo.com> wrote
>Would you write a function like the following
(probably not):

/* user must call fclose() on the returned FILE* */
FILE *do_something (int i);

I've written code like that, yes, though more often with <OT>open() /
close()</OT> than fopen() / fclose(). It's also pretty standard when
working with <OT>sockets</OT>; because they require so much f'ing work
to create, people tend to put all that in another function to avoid
clutter.

IMO, there is no problem when you write your own symmetric open
(create, connect, ...) and close (cleanup, disconnect, ...) functions,
e.g.

Handle h = my_open (...);
// ...
my_close (h);
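
A slightly fuller sketch of such a pair, under the assumption that the handle wraps a temporary file. The signatures and the my_write_line helper are invented for this example; the only point is that whatever my_open acquires, my_close releases, so the caller never calls fclose or free directly.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical handle type: callers use it only through the my_* functions. */
typedef struct handle_impl *Handle;

struct handle_impl {
    FILE         *fp;
    unsigned long lines;   /* number of lines written so far */
};

/* Acquire everything the handle needs; returns NULL on any failure. */
Handle my_open(void)
{
    Handle h = malloc(sizeof *h);

    if (h == NULL)
        return NULL;
    h->fp = tmpfile();
    if (h->fp == NULL) {
        free(h);
        return NULL;
    }
    h->lines = 0;
    return h;
}

/* Write one line of text; returns 1 on success, 0 on failure. */
int my_write_line(Handle h, const char *text)
{
    if (h == NULL || text == NULL || fprintf(h->fp, "%s\n", text) < 0)
        return 0;
    h->lines++;
    return 1;
}

/* Release everything my_open acquired; a NULL handle is ignored. */
void my_close(Handle h)
{
    if (h != NULL) {
        fclose(h->fp);
        free(h);
    }
}
```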

>This whole "who deallocates the returned string" argument is one of the
largest problems I have with C; yes, you can always find the correct
answer if you look at the function specs (assuming they exist), but it's
not obvious and is therefore prone to errors.

So just don't do it, i.e. don't return a string that has to be freed
by the caller. Even the Windows API uses that convention, AFAIK.

Best wishes,
Roland Pibinger

Dec 4 '06 #48

stmilam

Richard Heathfield wrote:

Stan Milam said:

<snip>

Why can't you test the arguments for valid conditions and set errno to
EINVAL when one of the conditions fails, and return NULL? Now, if the
validations are good and the memory allocation fails errno will be set
to ENOMEM. Now you know why you failed and can get the error message
with strerror().

Except that neither EINVAL nor ENOMEM is defined by the Standard.

Ah, another glaring deficiency of the standard. Every implementation I
have used for over 20 years has had EINVAL and ENOMEM defined.

Regards,
Stan Milam.

Dec 4 '06 #49

Richard Heathfield

st*****@yahoo.com said:

>
Richard Heathfield wrote:
>Stan Milam said:

<snip>
>
Why can't you test the arguments for valid conditions and set errno to
EINVAL when one of the conditions fails, and return NULL? Now, if the
validations are good and the memory allocation fails errno will be set
to ENOMEM. Now you know why you failed and can get the error message
with strerror().

Except that neither EINVAL nor ENOMEM is defined by the Standard.

Ah, another glaring deficiency of the standard. Every implementation I
have used for over 20 years has had EINVAL and ENOMEM defined.

I don't see why that means the Standard is deficient. By the same reasoning,
someone who has only ever used Turbo C 2.0 could claim that the Standard is
glaringly deficient in its omission of initgraph() from the library section
despite its being present in every implementation that person has used for
over 20 years - but this would not say so much about the Standard as it
would about the person making the claim!

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 4 '06 #50