strndup: RFC - Page 4

jacob navia

Hi
Reading comp.std.c I noticed that there was a message like this:
The WG14 Post Portland mailing is now available from the WG14 web site
at http://www.open-std.org/jtc1/sc22/wg14/

Best regards
Keld Simonsen

I went there and found that there is a report called
ISO/IEC JTC1 SC22 WG14 WG14/N1193
Specification for Safer C Library Functions —
Part II: Dynamic Allocation Functions

It proposes really interesting functions, among others getline and
getdelim, that I introduced into lcc-win32 (by coincidence) a few weeks
ago.

It proposes strdup and strndup too. Lcc-win32 already provides strdup,
but strndup was missing. Here is a proposed implementation. I would like
to see if your sharp eyes see any bug or serious problem with it.

Thanks in advance

jacob
---------------------------------------------------------cut here
#include <string.h>
#include <stdlib.h>
/*
The strndup function copies not more than n characters (characters that
follow a null character are not copied) from string to a dynamically
allocated buffer. The copied string shall always be null terminated.
*/
char *strndup(const char *string, size_t s)
{
    const char *p;  /* scan pointer; const to match the source string */
    char *r;

    if (string == NULL)
        return NULL;
    p = string;
    /* Advance at most s characters, stopping early at a null character. */
    while (s > 0) {
        if (*p == 0)
            break;
        p++;
        s--;
    }
    s = (size_t)(p - string);  /* number of characters to copy */
    r = malloc(1 + s);
    if (r) {
        strncpy(r, string, s);
        r[s] = 0;
    }
    return r;
}

#ifdef TEST
#include <stdio.h>
#define MAXTEST 60
int main(void)
{
    char *table[MAXTEST];
    char *str = "The quick brown fox jumps over the lazy dog";

    for (int i = 0; i < MAXTEST; i++) {
        table[i] = strndup(str, i);
    }
    for (int i = 0; i < MAXTEST; i++) {
        printf("[%4d] %s\n", i, table[i]);
    }
    return 0;
}
#endif

Dec 2 '06


Keith Thompson

Chris Torek <no****@torek.net> writes:

[comp.compilers.lcc snipped as there is nothing lcc-specific here]

In article <45***************@yahoo.com>
CBFalconer <cb********@maineline.net> wrote:
>>... However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).

Indeed, in fact including <stdlib.h> must *not* include <stddef.h>,
because stddef.h will (e.g.) define "offsetof", and it must not be
defined if you have included only stdlib.h:

% cat x.c
#include <stdlib.h>
#ifdef offsetof
# error offsetof defined when it should not be
#endif
void f(void){}
%

This x.c translation unit must translate without error, i.e., the
#error must not fire.

[...]

<stdlib.h> and <stddef.h> could have some kind of logic that causes
<stddef.h> to define offsetof normally, but not to define it if it's
#include'd from <stdlib.h>. The restriction isn't that <stdlib.h> may
not include <stddef.h>; it's that, as your test program shows,
including just <stdlib.h> may not define offsetof.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Dec 3 '06 #31

CBFalconer

jacob navia wrote:

... snip ...

And to the argument that this is inefficient, just pass a buffer
(allocated with malloc) to this function. It will NOT touch the
passed buffer unless IT NEEDS TO!!!

If all your lines are less than 80 characters and you pass it
a buffer of 80, the buffer will be reused!

Oh? What if you pass it a local (automatic storage) buffer? How
is the function going to expand that buffer? What is there to
prevent the noob user (or the one who doesn't read the fine print)
from passing such a buffer? The only way the routine can tell is
by trying to realloc, and if the program blows up the buffer was in
automatic storage. Very efficient. Highly user friendly.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 3 '06 #32

CBFalconer

Roland Pibinger wrote:

On Sun, 03 Dec 2006 10:01:01 -0500, CBFalconer wrote:

>By that reasoning malloc, calloc, and realloc should also be
omitted.

Why? Do those functions force you to free something you haven't
allocated?

Yes. Which function does the allocating? If you answer 'all
three' then what is wrong with making it 'all four'.

>Not to mention fopen.

Wrong analogy again. Would you write a function like the following
(probably not):

/* user must call fclose() on the returned FILE* */
FILE *do_something (int i);

Certainly would. Although the passed parameters would probably
have a bit more to say. For example, you want to open a file with
a specified name, or prefix, in some system dependent directory,
subject to some condition or other. Or you want to conditionally
override a default file name. Consider logging files.
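
A minimal sketch of the kind of function described here, one that hands back a FILE * the caller must later close. The names choose_log_name, open_log, and DEFAULT_LOG_NAME are hypothetical and chosen only for illustration; nothing in the thread specifies them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical default; the thread does not name one. */
#define DEFAULT_LOG_NAME "app.log"

/* Pick the caller's name when one is given, otherwise the default. */
const char *choose_log_name(const char *override_name)
{
    if (override_name != NULL && override_name[0] != '\0')
        return override_name;
    return DEFAULT_LOG_NAME;
}

/*
 * Open a log file for appending.
 * Ownership: the caller must call fclose() on a non-NULL result,
 * exactly as with a stream obtained directly from fopen().
 */
FILE *open_log(const char *override_name)
{
    const char *name = choose_log_name(override_name);

    assert(name != NULL);   /* choose_log_name never returns NULL */
    return fopen(name, "a");
}
```

A caller would write something like FILE *log = open_log(NULL); check the result against NULL, write to it, and call fclose(log) when done. The ownership rule lives in the comment, which is the same situation as with fopen itself.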

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Dec 3 '06 #33

jacob navia

CBFalconer wrote:

jacob navia wrote:

... snip ...

>>And to the argument that this is inefficient, just pass a buffer
(allocated with malloc) to this function. It will NOT touch the
passed buffer unless IT NEEDS TO!!!

If all your lines are less than 80 characters and you pass it
a buffer of 80, the buffer will be reused!

Oh? What if you pass it a local (automatic storage) buffer? How
is the function going to expand that buffer? What is there to
prevent the noob user (or the one who doesn't read the fine print)
from passing such a buffer? The only way the routine can tell is
by trying to realloc, and if the program blows up the buffer was in
automatic storage. Very efficient. Highly user friendly.

You have a good point here.

The specs are:

< quote >

4 The application shall ensure that *lineptr is a valid argument that
could be passed to the free function. If *n is nonzero, the application
shall ensure that *lineptr points to an object containing at least *n
characters.
5 The size of the object pointed to by *lineptr shall be increased to
fit the incoming line, if it isn’t already large enough. The characters
read shall be stored in the string pointed to by the argument lineptr

< end quote >

That error would be absolutely fatal, since it would mean that the
function would pass a wrong pointer to realloc, with disastrous
consequences!

Since there isn't in C a portable way to determine if a memory
block is the result of malloc() we are screwed...
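
To make the constraint in paragraph 4 concrete, here is a minimal sketch of a getline-like reader. It is not the N1193 text or any library's implementation: the name read_line_sketch, the long return type, and the doubling growth policy are assumptions made for illustration, and overflow of the capacity is not checked. The point is the realloc call, which is only valid when *lineptr is NULL or came from malloc, calloc, or realloc.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical, simplified getline-like reader.
 * Reads up to and including the next '\n' from fp into *lineptr,
 * growing the buffer as needed. Returns the number of characters
 * stored (not counting the terminator), or -1 at end of file with
 * nothing read or on allocation failure.
 */
long read_line_sketch(char **lineptr, size_t *n, FILE *fp)
{
    size_t used = 0;
    int c;

    /* Passing null pointers here is treated as a programming error. */
    assert(lineptr != NULL && n != NULL && fp != NULL);

    if (*lineptr == NULL)
        *n = 0;

    while ((c = getc(fp)) != EOF) {
        if (used + 2 > *n) {            /* need room for c and '\0' */
            size_t newcap = (*n == 0) ? 16 : *n * 2;
            /* Undefined behavior if *lineptr points to automatic or
               static storage: realloc cannot detect that. */
            char *tmp = realloc(*lineptr, newcap);
            if (tmp == NULL)
                return -1;
            *lineptr = tmp;
            *n = newcap;
        }
        (*lineptr)[used++] = (char)c;
        if (c == '\n')
            break;
    }
    if (used == 0)
        return -1;
    (*lineptr)[used] = '\0';
    return (long)used;
}
```

The safe calling pattern is to start with char *line = NULL; size_t cap = 0; and let the function allocate, then free(line) once at the end. A buffer declared as char buf[80] must never be passed, because nothing inside the function can tell it apart from a malloc'ed block.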

jacob

Dec 3 '06 #34

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

jacob navia wrote:
>>
... snip ...
>>
And to the argument that this is inefficient, just pass a buffer
(allocated with malloc) to this function. It will NOT touch the
passed buffer unless IT NEEDS TO!!!

If all your lines are less than 80 characters and you pass it
a buffer of 80, the buffer will be reused!

Oh? What if you pass it a local (automatic storage) buffer? How
is the function going to expand that buffer? What is there to
prevent the noob user (or the one who doesn't read the fine print)
from passing such a buffer? The only way the routine can tell is
by trying to realloc, and if the program blows up the buffer was in
automatic storage. Very efficient. Highly user friendly.

It's worse than that. Attempting to realloc() something that wasn't
allocated by one of the *alloc() functions invokes undefined behavior.
You're as likely to silently corrupt the heap (if the implementation
uses a "heap" for dynamic memory allocation) as to make the program
"blow up".

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Dec 3 '06 #35

Richard Tobin

In article <0d********************@bt.com>,
Richard Heathfield <rj*@see.sig.invalid> wrote:

>>>The thing that does concern me is the spec itself, which seems to me to
suffer from the same flaw as strncpy - i.e. it gives no indication of
whether truncation occurred. But of course that's a design issue, not a C
issue.

>When I've used my own version of strndup, it's always been to make an
                                  ^^^^^^^

>ordinary string from a "counted" string, so there is no question of
truncation. I suspect this is the more common use of it, rather than
copying a string to a buffer that might not be big enough.

>No, that's not more common - it's just a more *intelligent* use of strncpy.
                                                                    ^^^^^^^

>By far the most common usage of strncpy is from the cargo cult bunch: "I'm
smart, I know about buffer overruns, I know I should use strncpy instead of
strcpy, oops, oh look, I just threw away data, ohdearhowsadnevermind."

Um, you seem to have mixed up strndup and strncpy here. But looking
at my own posting, I seem to have done the same thing in the last
sentence. If you know the string is null-terminated but don't know
how big it is, just use strdup. I can't imagine why you'd use strndup
unless you had a counted string or wanted to just copy a prefix of the
string.
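
The counted-string case can be sketched as follows. The helper name dup_counted is hypothetical, picked so it does not collide with a library strndup; the sketch assumes src points to at least n readable characters or to a shorter null-terminated string, and it treats a null src as a programming error.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy at most n characters from src into a
 * freshly allocated, null-terminated string. src itself need not be
 * null-terminated as long as n characters are readable.
 * Returns NULL only when malloc fails; the caller frees the result.
 */
char *dup_counted(const char *src, size_t n)
{
    size_t len = 0;
    char *copy;

    assert(src != NULL);    /* assumption: a null src is a caller bug */

    /* Never read past the first n characters. */
    while (len < n && src[len] != '\0')
        len++;

    copy = malloc(len + 1);
    if (copy != NULL) {
        memcpy(copy, src, len);
        copy[len] = '\0';
    }
    return copy;
}
```

With char raw[5] = { 'h', 'e', 'l', 'l', 'o' }; the call dup_counted(raw, sizeof raw) yields the ordinary string "hello", and dup_counted("hello world", 5) copies just the prefix. Because the scan stops at n, the source is never read beyond the count the caller supplied.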

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Dec 4 '06 #36

Richard Tobin

In article <11*********************@79g2000cws.googlegroups.com>,
Tom St Denis <to********@gmail.com> wrote:

>What would the point of strndup be?

As I said in another article, it's useful for converting a counted
string to a null-terminated one.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Dec 4 '06 #37

Stephen Sprunk

"CBFalconer" <cb********@yahoo.com> wrote in message
news:45***************@yahoo.com...

Tom St Denis wrote:
>jacob navia wrote:
>>It proposes strdup and strndup too. Lcc-win32 already provides
strdup, but strndup was missing. Here is a proposed implementation.
I would like to see if your sharp eyes see any bug or serious
problem with it.

What would the point of strndup be? It allocates the memory so
there really isn't a problem of an overflow. If you're low on
memory ... why are you duplicating a string?

....

I can see a possible use for it - to duplicate the initial portion
of a string only, i.e. to truncate it on the right.

Ah, but it's in a "Specification for Safer C Library Functions"; that
implies that strndup() is somehow a safer version of strdup(), like
strncat() and strncpy() are safer versions of strcat() and strcpy().

There is nothing unsafe about strdup() per se, though. The problem with
strdup() is how it's normally used, which is to copy strings that are
returned from functions as pointers to static buffers. This is not
thread-safe/re-entrant, and strndup() doesn't solve that problem at all,
because the problem is needing strdup() in the first place.

If one needs to truncate a string on the right, it's easy enough to do
that with strdup() and some extra code, with malloc() and strncpy(), or
by creating a new function which truncates any string (such as one
returned by strdup()). But such a function is not a "safer" version of
strdup() -- it's entirely new functionality.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
--
Posted via a free Usenet account from http://www.teranews.com

Dec 4 '06 #38

Stan Milam

CBFalconer wrote:

jacob navia wrote:
>CBFalconer wrote:

>>BTW, Jacob's code carelessly fails to define size_t.
???
I have
#include <stdlib.h>
and that file includes stddef.h that defines size_t

Sorry, you are right there. However there is no requirement that
stdlib includes stddef, only that it define size_t (among other
things).

>>It also returns NULL for other reasons than lack of memory, which
can only cause confusion.

??? Why confusion?

Null means failure

Because the caller can't tell whether or not the system is out of
memory. If you are going to define the 'undefined behaviour' from
calling with an invalid parameter, you might as well make it
something innocuous. Otherwise the only way the user can tell the
cause of the error is to save the input parameter, and test it for
NULL himself after the routine returns a NULL. If he does that he
might as well test first, and not call the routine.

The other basic choice is to let a bad parameter blow up in the
function. Both will work properly if the programmer tests and
avoids passing that bad parameter in the first place. If he
doesn't the 'innocuous behavior' has a much better chance of
producing a user friendly end application. The unsophisticated end
user has little use for a segfault message.

Why can't you test the arguments for valid conditions and set errno to
EINVAL when one of the conditions fails, and return NULL? Now, if the
validations are good and the memory allocation fails errno will be set
to ENOMEM. Now you know why you failed and can get the error message
with strerror().

#include <errno.h>
#include <stdlib.h>

char *
dupnstr( const char *source, size_t size )
{
    size_t length;
    char *wrk, *rv = NULL;

    /* Reject a null pointer, an empty string, or a zero size. */
    if ( source == NULL || *source == 0 || size == 0 )
        errno = EINVAL;
    else {
        /* Measure the source, but never look past the first size characters. */
        for ( length = 0; length < size && source[length]; length++ )
            ;
        size = length;
        /* malloc() reports failure by returning NULL. */
        if ( ( rv = malloc( size + 1 ) ) != NULL ) {
            for ( wrk = rv; size; size-- ) *wrk++ = *source++;
            *wrk = 0;
        }
    }
    return rv;
}
Regards,
Stan Milam.

Dec 4 '06 #39

Richard Heathfield

Stan Milam said:

<snip>

>
Why can't you test the arguments for valid conditions and set errno to
EINVAL when one of the conditions fails, and return NULL? Now, if the
validations are good and the memory allocation fails errno will be set
to ENOMEM. Now you know why you failed and can get the error message
with strerror().

Except that neither EINVAL nor ENOMEM is defined by the Standard.
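
One way to report the cause of failure without relying on those macros is an explicit status out-parameter. This is only a sketch under stated assumptions: the enum dup_status, its three values, and the function name dupnstr_status are all hypothetical. Unlike dupnstr above, it accepts an empty string or a zero size and returns an empty copy; that is a choice made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes; none of these names come from the Standard. */
enum dup_status { DUP_OK, DUP_BAD_ARG, DUP_NO_MEMORY };

/*
 * Copy at most size characters of source into a new string.
 * On return *status tells the caller why NULL was returned, if it was.
 */
char *dupnstr_status(const char *source, size_t size, enum dup_status *status)
{
    size_t length = 0;
    char *rv;

    assert(status != NULL);     /* the out-parameter is mandatory */

    if (source == NULL) {
        *status = DUP_BAD_ARG;
        return NULL;
    }

    /* Bounded scan: stop at the terminator or after size characters. */
    while (length < size && source[length] != '\0')
        length++;

    rv = malloc(length + 1);
    if (rv == NULL) {
        *status = DUP_NO_MEMORY;
        return NULL;
    }
    memcpy(rv, source, length);
    rv[length] = '\0';
    *status = DUP_OK;
    return rv;
}
```

The caller declares enum dup_status st; passes &st, and switches on it after a NULL return. This uses nothing beyond what ISO C guarantees, at the cost of one extra parameter.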

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Dec 4 '06 #40