Bytes IT Community

Question about the clc string lib

P: n/a
In the function below, can size ever be 0 (zero)?

char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);

size = strlen(s) + 1;
if (size == 0)
p = NULL;
else if ((p = malloc(size)) != NULL)
memcpy(p, s, size);

return p;
}

Jan 26 '06 #1
53 Replies


P: n/a
On 2006-01-26, Jeff <jo******@gmail.com> wrote:
In the function below, can size ever be 0 (zero)?
I've never heard of the "clc string lib" - where can i find it?
char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);
No idea how this would work in a way that needs the function pointer as
its first argument. Is it a macro that stringizes its first argument to
print an error?

size = strlen(s) + 1;
The result of this assignment cannot be zero.
if (size == 0)
p = NULL;
hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.
else if ((p = malloc(size)) != NULL)
memcpy(p, s, size);

return p;
}


Incidentally, speaking of prefixes, the str* prefix, and reserved
identifiers, is something along the lines of

#define strdup clc_strdup

legal? My reading of the standard says it is, but thought i'd ask here.
Jan 26 '06 #2

P: n/a
Jeff wrote:
In the function below, can size ever be 0 (zero)?

char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);

size = strlen(s) + 1;
if (size == 0)
p = NULL;
else if ((p = malloc(size)) != NULL)
memcpy(p, s, size);

return p;
}


If s is a null terminated string, even if s[0] == '\0', then strlen of s
will be 0. As you add one to that, in this case, size cannot be 0.

If s is of such a length that it overflows a size_t (on my system that's
SIZE_MAX = UINT32_MAX = 4294967295 [+ 1]), then size will go to 0. Again, as
you add 1 to that, I can't see how size can ever be zero.

--
==============
*Not a pedant*
==============
Jan 26 '06 #3

P: n/a
On 2006-01-26, pemo <us***********@gmail.com> wrote:
Jeff wrote:
In the function below, can size ever be 0 (zero)?

char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);

size = strlen(s) + 1;
if (size == 0)
p = NULL;
else if ((p = malloc(size)) != NULL)
memcpy(p, s, size);

return p;
}


If s is a null terminated string, even if s[0] == '\0', then strlen of s
will be 0. As you add one to that, in this case, size cannot be 0.

If s is of such a length that it overflows a size_t (on my system that's
SIZE_MAX = UINT32_MAX = 4294967295 [+ 1]), then size will go to 0. Again, as
you add 1 to that, I can't see how size can ever be zero.


Except, if the length of s _is_ SIZE_MAX, then subsequently adding 1
will "overflow".

suppose we have an unrealistic system where SIZE_MAX is 63. [clearly not
ISO compliant, but i don't want to type out 65535 characters]

then, the string
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ 0123456789"

has a length of 63, and 63+1 will wrap to 0.
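
The wrap described here can be checked without typing out a huge string. What follows is a minimal sketch, assuming only that size_t is an unsigned type (so arithmetic on it is modulo SIZE_MAX + 1). The helper names size_with_terminator, terminator_fits and demo_wrap are made up for illustration and are not part of libclc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Buffer size for a string of length len, computed the same way as in
   clc_strdup. size_t is unsigned, so the addition wraps to 0 when len
   is SIZE_MAX instead of overflowing. */
size_t size_with_terminator(size_t len)
{
    return len + 1;
}

/* Hypothetical helper: returns 1 if len + 1 is representable in size_t. */
int terminator_fits(size_t len)
{
    return len < SIZE_MAX;
}

/* Small self-check of the two boundary cases discussed in the thread. */
void demo_wrap(void)
{
    assert(size_with_terminator(0) == 1);        /* empty string */
    assert(size_with_terminator(SIZE_MAX) == 0); /* the wrap     */
}
```

Whether strlen can ever actually return SIZE_MAX is a separate question; the sketch only shows what the addition does if it does.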
Jan 26 '06 #4

P: n/a
Jordan Abel wrote:
On 2006-01-26, pemo <us***********@gmail.com> wrote:
Jeff wrote:
In the function below, can size ever be 0 (zero)?

char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);

size = strlen(s) + 1;
if (size == 0)
p = NULL;
else if ((p = malloc(size)) != NULL)
memcpy(p, s, size);

return p;
}


If s is a null terminated string, even if s[0] == '\0', then strlen
of s will be 0. As you add one to that, in this case, size cannot
be 0.

If s is of such a length that it overflows a size_t (on my system
that's SIZE_MAX = UINT32_MAX = 4294967295 [+ 1]), then size will go
to 0. Again, as you add 1 to that, I can't see how size can ever be
zero.


Except, if the length of s _is_ SIZE_MAX, then subsequently adding 1
will "overflow".

suppose we have an unrealistic system where SIZE_MAX is 63. [clearly
not ISO compliant, but i don't want to type out 65535 characters]

then, the string
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ 0123456789"

has a length of 63, and 63+1 will wrap to 0.


Yup - agreed, if the string is exactly SIZE_MAX in length. Thanks for the
correction.

--
==============
*Not a pedant*
==============
Jan 26 '06 #5

P: n/a
Jordan Abel wrote:

< snip OP and tidbits >
Incidentally, speaking of prefixes, the str* prefix, and reserved
identifiers, is something along the lines of

#define strdup clc_strdup

legal? My reading of the standard says it is, but thought i'd ask
here.


I believe it is (and can't be asked to open the Standard right now).

In any case, by the time the compiler proper sees the code, all
instances of `strdup` will be replaced by the pre-processor with the
`clc_strdup` which does not violate the "str[lowercaseletter] is
reserved" requirement.

Cheers

Vladimir

--
If you make people think they're thinking, they'll love you; but if you
really make them think they'll hate you.
Jan 26 '06 #6

P: n/a
boa
Jordan Abel wrote:
On 2006-01-26, Jeff <jo******@gmail.com> wrote:
In the function below, can size ever be 0 (zero)?
I've never heard of the "clc string lib" - where can i find it?


http://libclc.sf.net
char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);


No idea how this would work in a way that needs the function pointer as
its first argument. Is it a macro that stringizes its first argument to
print an error?
size = strlen(s) + 1;


The result of this assignment cannot be zero.
if (size == 0)
p = NULL;


hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.


size can never be SIZE_MAX as size doesn't include '\0'. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX.

boa
Jan 26 '06 #7

P: n/a
boa
boa wrote:
Jordan Abel wrote:
On 2006-01-26, Jeff <jo******@gmail.com> wrote:
In the function below, can size ever be 0 (zero)?


I've never heard of the "clc string lib" - where can i find it?


http://libclc.sf.net
char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);


No idea how this would work in a way that needs the function pointer as
its first argument. Is it a macro that stringizes its first argument to
print an error?
size = strlen(s) + 1;


The result of this assignment cannot be zero.
if (size == 0)
p = NULL;


hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.


size can never be SIZE_MAX as size doesn't include '\0'. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX.

boa


One more try: strlen() can never return SIZE_MAX. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX, so the max value strlen() can return is SIZE_MAX - 1.

boa
Jan 26 '06 #8

P: n/a
On 2006-01-26, boa <bo*****@gmail.com> wrote:
boa wrote:
Jordan Abel wrote:
On 2006-01-26, Jeff <jo******@gmail.com> wrote:
In the function below, can size ever be 0 (zero)?

I've never heard of the "clc string lib" - where can i find it?


http://libclc.sf.net

char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);

No idea how this would work in a way that needs the function pointer as
its first argument. Is it a macro that stringizes its first argument to
print an error?

size = strlen(s) + 1;

The result of this assignment cannot be zero.

if (size == 0)
p = NULL;

hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.


size can never be SIZE_MAX as size doesn't include '\0'. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX.

boa


One more try: strlen() can never return SIZE_MAX. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX, so the max value strlen() can return is SIZE_MAX - 1.

boa


take SIZE_MAX 65535

char * foo = calloc(256,256);
memset(foo,'x',SIZE_MAX);
strlen(foo) == SIZE_MAX;

I don't think there's an explicit requirement in the standard that there
never be an object larger than SIZE_MAX. There may, however, be a
requirement that the product of the arguments to calloc be less than
SIZE_MAX. [that, i don't know.]
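
On a typical hosted system SIZE_MAX is far larger than 65535, so the snippet above cannot be tried literally. As a rough sketch, the same scenario can be modelled by using uint16_t as a stand-in for size_t. The names small_size, SMALL_SIZE_MAX, small_strlen, make_max_string and small_dup_size are assumptions made for this illustration only, and the model says nothing about whether a real implementation would let the calloc call succeed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for size_t on a hypothetical system where SIZE_MAX is 65535. */
typedef uint16_t small_size;
#define SMALL_SIZE_MAX UINT16_MAX

/* Counts the characters before the terminator, keeping the count in the
   16-bit type, as strlen would on such a system. */
small_size small_strlen(const char *s)
{
    small_size n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}

/* Builds a string of exactly 65535 characters in a 65536-byte buffer.
   Returns NULL if the allocation fails; the caller frees the result. */
char *make_max_string(void)
{
    char *foo = calloc(256, 256);          /* 65536 zero bytes       */

    if (foo != NULL)
        memset(foo, 'x', SMALL_SIZE_MAX);  /* last byte stays '\0'   */
    return foo;
}

/* The 16-bit "size" that strlen(s) + 1 would produce in this model. */
small_size small_dup_size(const char *s)
{
    return (small_size)(small_strlen(s) + 1);
}
```

In this model small_strlen reports 65535 for the buffer and small_dup_size reports 0, which is the case the `size == 0` test in clc_strdup would catch.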
Jan 26 '06 #9

P: n/a

Jordan Abel wrote:
On 2006-01-26, pemo <us***********@gmail.com> wrote:
Jeff wrote:
In the function below, can size ever be 0 (zero)?

char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);

size = strlen(s) + 1;
if (size == 0)
p = NULL;
else if ((p = malloc(size)) != NULL)
memcpy(p, s, size);

return p;
}


If s is a null terminated string, even if s[0] == '\0', then strlen of s
will be 0. As you add one to that, in this case, size cannot be 0.

If s is of such a length that it overflows a size_t (on my system that's
SIZE_MAX = UINT32_MAX = 4294967295 [+ 1]), then size will go to 0. Again, as
you add 1 to that, I can't see how size can ever be zero.


Except, if the length of s _is_ SIZE_MAX, then subsequently adding 1
will "overflow".

suppose we have an unrealistic system where SIZE_MAX is 63. [clearly not
ISO compliant, but i don't want to type out 65535 characters]

then, the string
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ 0123456789"

has a length of 63, and 63+1 will wrap to 0.


But if the most you can allocate is SIZE_MAX, then your string can only
be SIZE_MAX-1 if it's going to be null terminated. Therefore I don't
see how strlen can return SIZE_MAX.

Jeff

Jan 26 '06 #10

P: n/a
boa
Jordan Abel wrote:
On 2006-01-26, boa <bo*****@gmail.com> wrote:
boa wrote:
Jordan Abel wrote:
On 2006-01-26, Jeff <jo******@gmail.com> wrote:
> In the function below, can size ever be 0 (zero)?
I've never heard of the "clc string lib" - where can i find it?
http://libclc.sf.net

> char *clc_strdup(const char * CLC_RESTRICT s)
> {
> size_t size;
> char *p;
>
> clc_assert_not_null(clc_strdup, s);
No idea how this would work in a way that needs the function pointer as
its first argument. Is it a macro that stringizes its first argument to
print an error?

> size = strlen(s) + 1;
The result of this assignment cannot be zero.

> if (size == 0)
> p = NULL;
hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.
size can never be SIZE_MAX as size doesn't include '\0'. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX.

boa

One more try: strlen() can never return SIZE_MAX. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX, so the max value strlen() can return is SIZE_MAX - 1.

boa


take SIZE_MAX 65535

char * foo = calloc(256,256);
memset(foo,'x',SIZE_MAX);
strlen(foo) == SIZE_MAX;

I don't think there's an explicit requirement in the standard that there
never be an object larger than SIZE_MAX. There may, however, be a
requirement that the product of the arguments to calloc be less than
SIZE_MAX. [that, i don't know.]


It goes without saying? ;-)

FWIW, I've looked through both the C99 standard, the C rationale and
TC1, looking for some description of the relationship between size_t and
SIZE_MAX, but found nothing. SIZE_T is only mentioned twice in C99.

So I really don't have much of a case here. One could argue that even if
it is OK to calloc() memory the way you do above, you don't allocate a
string, just 256 objects of 256 bytes so you cannot use it as an
argument to strlen().

The standard is vague on this issue, 7.21.1 says this:

The header <string.h> declares one type and several functions, and
defines one macro useful for manipulating arrays of character type and
other objects treated as arrays of character type.


Pretty clear that strlen() manipulates arrays of character type, but
what's "other objects"?

boa

Jan 26 '06 #11

P: n/a
Jordan Abel wrote:
I've never heard of the "clc string lib" -


Silly me, I hadn't either!
clc_assert_not_null(clc_strdup, s);


No idea how this would work in a way that needs the function pointer as
its first argument. Is it a macro that stringizes its first argument to
print an error?


Good question. But it could also just be used as a
guaranteed-unique, easy-to-produce token.

My suggestion, though, would be to omit that assertion entirely
and replace it with a simple

if(s == NULL)
return NULL;
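
Put together with the wrap check from the original function, that suggestion might look roughly like the sketch below. The name dup_or_null is hypothetical and this is not the libclc code; it simply returns NULL for a null argument, for a length whose size would wrap, and for a failed malloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement sketch; not part of libclc. Returns a newly
   allocated copy of s, or NULL if s is NULL, if strlen(s) + 1 would
   wrap to 0, or if malloc fails. The caller frees the result. */
char *dup_or_null(const char *s)
{
    size_t len;
    char *p;

    if (s == NULL)
        return NULL;

    len = strlen(s);
    if (len == SIZE_MAX)        /* len + 1 would wrap to 0 */
        return NULL;

    p = malloc(len + 1);
    if (p != NULL)
        memcpy(p, s, len + 1);  /* copies the terminator too */
    return p;
}

/* Small self-check showing the intended behaviour. */
void dup_or_null_demo(void)
{
    char *copy = dup_or_null("hello");

    assert(dup_or_null(NULL) == NULL);
    assert(copy == NULL || strcmp(copy, "hello") == 0);
    free(copy);
}
```

The trade-off is that a null argument now fails quietly and looks the same to the caller as an out-of-memory condition, which an assertion would have reported loudly.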
Jan 26 '06 #12

P: n/a
Jordan Abel <ra*******@gmail.com> writes:
On 2006-01-26, Jeff <jo******@gmail.com> wrote:
In the function below, can size ever be 0 (zero)?


I've never heard of the "clc string lib" - where can i find it?
char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);


No idea how this would work in a way that needs the function pointer as
its first argument. Is it a macro that stringizes its first argument to
print an error?


That's the most likely case (but then why not just pass a string?),
but it *could* be a function that compares its first argument against
the address of each function that it knows about. Silly, but
possible.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 26 '06 #13

P: n/a

In article <dr**********@nwrdmz01.dmz.ncs.ea.ibs-infra.bt.com>, "Vladimir S. Oka" <no****@btopenworld.com> writes:
Jordan Abel wrote:
Incidentally, speaking of prefixes, the str* prefix, and reserved
identifiers, is something along the lines of

#define strdup clc_strdup

legal? My reading of the standard says it is, but thought i'd ask
here.
I believe it is (and can't be asked to open the Standard right now).


If you can't be bothered to check the Standard, why reply? That's
a serious question. Jordan's question was about the Standard, not
about anyone else's belief.

I *am* looking at the Standard (I'll cite C99 here, but C90 has
equivalent, indeed mostly identical, language). My interpretation
of the Standard contradicts Jordan's.

Note first that macro names are identifiers (6.2.1).

7.1.3 ("Reserved identifiers"):

Each header declares or defines all identifiers listed in its
associated subclause, and optionally declares or defines identifiers
listed in its associated future library directions subclause and
identifiers which are always reserved either for any use or for use
as file scope identifiers.
...
Each identifier with file scope listed in any of the following
subclauses (including the future library directions) is reserved for
use as a macro name and as an identifier with file scope in the same
name space if any of its associated headers is included.

Note the "str..." identifiers in string.h are identifiers with file
scope.

7.26.10 ("General utilities <stdlib.h>"):

Function names that begin with str and a lowercase letter may be
added to the declarations in the <stdlib.h> header.

7.26.11 ("String handling <string.h>"):

Function names that begin with str, mem, or wcs and a lowercase
letter may be added to the declarations in the <string.h> header.

"strdup" as a macro name is an identifier that begins with "str"
and a lowercase letter. It has file scope, because it is a macro
name. That means it is covered by 7.26.10 and 7.26.11. By 7.1.3,
it is thus reserved if stdlib.h or string.h is included.

The only thing that making it a macro name rather than simply having
a function (with external linkage) named "strdup" gives you is
relief from 7.26:

7.26 ("Future library directions"):

All external names described below are reserved no matter what
headers are included by the program.

So calling the function "clc_strdup" and using a macro to refer to it
as "strdup" is legal provided stdlib.h and string.h are not included
- but that seems rather unlikely, and you could achieve the same thing
by giving "strdup" internal linkage (ie by declaring it static).
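
One way to stay clear of the whole question is to pick an alias that does not begin with str, mem, or wcs followed by a lowercase letter. A minimal sketch, assuming a stand-in function clc_like_strdup (written here for illustration, not copied from libclc) and a made-up alias dup_str:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the library function under discussion; the real
   clc_strdup lives in libclc and is not reproduced here. */
char *clc_like_strdup(const char *s)
{
    size_t size = strlen(s) + 1;
    char *p = malloc(size);

    if (p != NULL)
        memcpy(p, s, size);
    return p;
}

/* Hypothetical short alias. It does not start with str, mem, or wcs
   plus a lowercase letter, so it is outside the names that 7.26.10 and
   7.26.11 set aside, even with <stdlib.h> and <string.h> included. */
#define dup_str clc_like_strdup

/* Small self-check that the alias reaches the function. */
void dup_str_demo(void)
{
    char *copy = dup_str("abc");

    assert(copy == NULL || strcmp(copy, "abc") == 0);
    free(copy);
}
```

This gives up the familiar strdup spelling, but it works regardless of which standard headers the translation unit includes.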

In any case, by the time the compiler proper sees the code, all
instances of `strdup` will be replaced by the pre-processor with the
`clc_strdup` which does not violate the "str[lowercaseletter] is
reserved" requirement.


This is mostly wrong. There is no "compiler proper" as far as the
Standard is concerned; the replacement of macro-name identifiers with
macro bodies is part of translation phase four, carried out by the
same notional "implementation" as all other translation phases. More
importantly, some of the restrictions on reserved identifiers, like
this one, apply to macro names. That the macro name is replaced with
its associated body is irrelevant in this case.

Now, Jordan certainly knows how preprocessing directives and macro
expansion work. His question had to do with what the Standard says
about reserved identifiers and whether an identifier of particular
type and name was reserved. This is a question which can only be
answered by recourse to the Standard, not by speculation about what
happens during compilation; it is a point of law, not a point of
fact. It is, in other words, a question of pedantry, and only
pedantry will satisfy it.

Fortunately, c.l.c contains one of the world's largest herds of free-
roaming pedants, thundering majestically across the virtual plains...
--
Michael Wojcik mi************@microfocus.com

World domination has encountered a momentary setback. Talk amongst
yourselves. -- Darby Conley
Jan 26 '06 #14

P: n/a
mw*****@newsguy.com (Michael Wojcik) writes:
[...]
Fortunately, c.l.c contains one of the world's largest herds of free-
roaming pedants, thundering majestically across the virtual plains...


I just wanted to quote that sentence.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 26 '06 #15

P: n/a
On 2006-01-26, Michael Wojcik <mw*****@newsguy.com> wrote:
Each identifier with file scope listed in any of the following
subclauses (including the future library directions) is reserved for
use as a macro name and as an identifier with file scope in the same
name space if any of its associated headers is included.

Note the "str..." identifiers in string.h are identifiers with file
scope.

"strdup" as a macro name is an identifier that begins with "str"
and a lowercase letter. It has file scope, because it is a macro
name. That means it is covered by 7.26.10 and 7.26.11. By 7.1.3,
it is thus reserved if stdlib.h or string.h is included.


Can I #undef it and then #define my own? It seems like this would fall
into the same dubious area as redefining a keyword, but would still be
legal (just as it is to redefine a keyword)
Jan 26 '06 #16

P: n/a
Michael Wojcik wrote:
In article <dr**********@nwrdmz01.dmz.ncs.ea.ibs-infra.bt.com>,
"Vladimir S. Oka" <no****@btopenworld.com> writes:
Jordan Abel wrote:
> Incidentally, speaking of prefixes, the str* prefix, and reserved
> identifiers, is something along the lines of
>
> #define strdup clc_strdup
>
> legal? My reading of the standard says it is, but thought i'd ask
> here.
I believe it is (and can't be asked to open the Standard right now).


If you can't be bothered to check the Standard, why reply? That's
a serious question. Jordan's question was about the Standard, not
about anyone else's belief.


Your premise here seems to be that I have never actually opened the
Standard, and am in fact guessing. If you happened to agree with me (or
Jordan), I wonder whether you'd have bothered to chastise me. Instead,
you proceed believing (no pun intended) that your reading of the
Standard will justify the above paragraph, and the below statement.
I *am* looking at the Standard (I'll cite C99 here, but C90 has
equivalent, indeed mostly identical, language). My interpretation
of the Standard contradicts Jordan's.
But, let's see what your reading of the standard yields:
Note first that macro names are identifiers (6.2.1).

7.1.3 ("Reserved identifiers"):

Each header declares or defines all identifiers listed in its
associated subclause, and optionally declares or defines
identifiers listed in its associated future library directions
subclause and identifiers which are always reserved either for any
use or for use as file scope identifiers.
...
Each identifier with file scope listed in any of the following
subclauses (including the future library directions) is reserved
for use as a macro name and as an identifier with file scope in the
same name space if any of its associated headers is included.

Note the "str..." identifiers in string.h are identifiers with file
scope.

7.26.10 ("General utilities <stdlib.h>"):

Function names that begin with str and a lowercase letter may be
added to the declarations in the <stdlib.h> header.

7.26.11 ("String handling <string.h>"):

Function names that begin with str, mem, or wcs and a lowercase
letter may be added to the declarations in the <string.h> header.

"strdup" as a macro name is an identifier that begins with "str"
and a lowercase letter. It has file scope, because it is a macro
name. That means it is covered by 7.26.10 and 7.26.11. By 7.1.3,
it is thus reserved if stdlib.h or string.h is included.

The only thing that making it a macro name rather than simply having
a function (with external linkage) named "strdup" gives you is
relief from 7.26:

7.26 ("Future library directions"):

All external names described below are reserved no matter what
headers are included by the program.

So calling the function "clc_strdup" and using a macro to refer to it
as "strdup" is legal provided stdlib.h and string.h are not included
- but that seems rather unlikely, and you could achieve the same thing
by giving "strdup" internal linkage (ie by declaring it static).
In other words, there _are_ circumstances where it actually _is_ legal,
even according to your reading of the Standard. Why would it be
unlikely for a source file not to include certain standard headers?
I've seen quite a few.

So, your reading does not actually contradict Jordan's (or mine, for
that matter), at least not entirely.
In any case, by the time the compiler proper sees the code, all
instances of `strdup` will be replaced by the pre-processor with the
`clc_strdup` which does not violate the "str[lowercaseletter] is
reserved" requirement.


This is mostly wrong. There is no "compiler proper" as far as the
Standard is concerned; the replacement of macro-name identifiers with
macro bodies is part of translation phase four, carried out by the
same notional "implementation" as all other translation phases. More
importantly, some of the restrictions on reserved identifiers, like
this one, apply to macro names. That the macro name is replaced with
its associated body is irrelevant in this case.


I do agree that I got this bit wrong.
Now, Jordan certainly knows how preprocessing directives and macro
expansion work. His question had to do with what the Standard says
about reserved identifiers and whether an identifier of particular
type and name was reserved.
Thanks for clarifying this for me...
This is a question which can only be
answered by recourse to the Standard, not by speculation about what
happens during compilation; it is a point of law, not a point of
fact. It is, in other words, a question of pedantry, and only
pedantry will satisfy it.

Fortunately, c.l.c contains one of the world's largest herds of free-
roaming pedants, thundering majestically across the virtual plains...


And sometimes, just sometimes, being a bit too self-righteous... ;-)

Cheers

Vladimir
--
"Who cares if it doesn't do anything? It was made with our new
Triple-Iso-Bifurcated-Krypton-Gate-MOS process ..."

Jan 26 '06 #17

P: n/a

Jordan Abel wrote:
On 2006-01-26, boa <bo*****@gmail.com> wrote:
boa wrote:
Jordan Abel wrote:
On 2006-01-26, Jeff <jo******@gmail.com> wrote:
> In the function below, can size ever be 0 (zero)?

I've never heard of the "clc string lib" - where can i find it?

http://libclc.sf.net
> char *clc_strdup(const char * CLC_RESTRICT s)
> {
> size_t size;
> char *p;
>
> clc_assert_not_null(clc_strdup, s);

No idea how this would work in a way that needs the function pointer as
its first argument. Is it a macro that stringizes its first argument to
print an error?

> size = strlen(s) + 1;

The result of this assignment cannot be zero.

> if (size == 0)
> p = NULL;

hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.

size can never be SIZE_MAX as size doesn't include '\0'. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX.

boa


One more try: strlen() can never return SIZE_MAX. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX, so the max value strlen() can return is SIZE_MAX - 1.

boa


take SIZE_MAX 65535

char * foo = calloc(256,256);
memset(foo,'x',SIZE_MAX);
strlen(foo) == SIZE_MAX;

I don't think there's an explicit requirement in the standard that there
never be an object larger than SIZE_MAX. There may, however, be a
requirement that the product of the arguments to calloc be less than
SIZE_MAX. [that, i don't know.]


That's my point below. You can't allocate more than SIZE_MAX so the
strlen would be SIZE_MAX -1, so in the example above, size could not be
0 (zero). I may be missing something, but the check 'if (size == 0)'
seems pointless.

Jeff

Jan 26 '06 #18

P: n/a
boa wrote:
Jordan Abel wrote:
On 2006-01-26, boa <bo*****@gmail.com> wrote:
boa wrote:
<snip>
One more try: strlen() can never return SIZE_MAX. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX, so the max value strlen() can return is SIZE_MAX - 1.

boa
take SIZE_MAX 65535

char * foo = calloc(256,256);
memset(foo,'x',SIZE_MAX);
strlen(foo) == SIZE_MAX;
This is a variation of something I suggested in another thread some time
ago. I don't remember anyone proving it invalid back then, so I'll see
if I can find the justification again.
I don't think there's an explicit requirement in the standard that there
never be an object larger than SIZE_MAX. There may, however, be a
requirement that the product of the arguments to calloc be less than
SIZE_MAX. [that, i don't know.]


It goes without saying? ;-)


The definition of calloc places no limit on what you can pass to it. Of
course, an implementation could legally always return NULL if you try to
allocate an object larger than SIZE_MAX.
FWIW, I've looked through both the C99 standard, the C rationale and
TC1, looking for some description of the relationship between size_t and
SIZE_MAX, but found nothing. SIZE_T is only mentioned twice in C99.
size_t is mentioned a lot more than twice in C99, although SIZE_T isn't
mentioned at all. SIZE_MAX is mentioned 3 times in C99 and once is
defining it as the limit of size_t.
So I really don't have much of a case here.
I see nothing in the standard that forbids it, and definitely nothing
that forbids the implementation from successfully allocating such an object.
One could argue that even if
it is OK to calloc() memory the way you do above, you don't allocate a
string, just 256 objects of 256 bytes so you cannot use it as an
argument to strlen().
You allocate space for an array of objects, and that space must be
contiguous. Since the standard defines an object as, "region of data
storage in the execution environment, the contents of which can
represent values" the space allocated by calloc clearly meets the
definition of an object. The standard also states, "When a pointer to an
object is converted to a pointer to a character type, the result points
to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining
bytes of the object." so it is clearly legal to increment a char pointer
over such an object created by a call to calloc, assuming the calloc
call succeeds. Now for the big one, section 7.1.1 of N1124:

| A string is a contiguous sequence of characters terminated by and
| including the first null character. The term multibyte string is
| sometimes used instead to emphasize special processing given to
| multibyte characters contained in the string or to avoid confusion
| with a wide string. A pointer to a string is a pointer to its initial
| (lowest addressed) character. The length of a string is the number of
| bytes preceding the null character and the value of a string is the
| sequence of the values of the contained characters, in order.

Nowhere does the above place any limitations on the length of the
string or how you create it.

So if the implementation allows the call to calloc to succeed, it is
perfectly legal to construct a string with a length of SIZE_MAX.
The standard is vague on this issue, 7.21.1 says this:

The header <string.h> declares one type and several functions, and
defines one macro useful for manipulating arrays of character type and
other objects treated as arrays of character type.


Pretty clear that strlen() manipulates arrays of character type, but
what's "other objects"?


You can treat any object as an array of type unsigned char. See the bit
about being allowed to convert any pointer to a character pointer. See
also http://www.open-std.org/jtc1/sc22/wg...ocs/dr_274.htm which
says that the string functions should treat the characters as type
unsigned char which definitely has no trap representations or padding bits.

So I believe it is technically possible to generate a string of size
SIZE_MAX without invoking undefined behaviour if the implementation
allows it, but the implementation is not required to allow you to do
this (the calloc call can fail).

Note that using a 2d array instead of calloc'd space still does not
force the implementation to allow you to do this since you would be
exceeding an environmental limit, this is covered by another DR.
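
For code that would rather never end up holding an object whose size does not fit in a size_t, one conservative option is to test the product before calling calloc. This is only a sketch under that assumption; product_exceeds_size_max and checked_calloc are hypothetical names, and the wrapper deliberately refuses the very request that the argument above says an implementation is allowed to grant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if nmemb * size cannot be represented in a size_t. */
int product_exceeds_size_max(size_t nmemb, size_t size)
{
    return size != 0 && nmemb > SIZE_MAX / size;
}

/* Hypothetical wrapper: refuses requests whose total size is not
   representable, and otherwise defers to calloc, which may still
   return NULL. */
void *checked_calloc(size_t nmemb, size_t size)
{
    if (product_exceeds_size_max(nmemb, size))
        return NULL;
    return calloc(nmemb, size);
}

/* Small self-check of the refusal path. */
void checked_calloc_demo(void)
{
    void *p = checked_calloc(4, 8);

    assert(checked_calloc(SIZE_MAX, 2) == NULL);
    free(p);
}
```

With this in place the largest possible allocation is SIZE_MAX bytes, so a string stored in it has a length of at most SIZE_MAX - 1 and strlen(s) + 1 cannot wrap.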
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
Jan 26 '06 #19

P: n/a
Jeff wrote:
Jordan Abel wrote:
On 2006-01-26, boa <bo*****@gmail.com> wrote:
boa wrote:
Jordan Abel wrote:
> On 2006-01-26, Jeff <jo******@gmail.com> wrote:
>> In the function below, can size ever be 0 (zero)?
> I've never heard of the "clc string lib" - where can i find it?
http://libclc.sf.net

>> char *clc_strdup(const char * CLC_RESTRICT s)
>> {
>> size_t size;
>> char *p;
>>
>> clc_assert_not_null(clc_strdup, s);
> No idea how this would work in a way that needs the function pointer as
> its first argument. Is it a macro that stringizes its first argument to
> print an error?
>
>> size = strlen(s) + 1;
> The result of this assignment cannot be zero.
>
>> if (size == 0)
>> p = NULL;
> hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.
size can never be SIZE_MAX as size doesn't include '\0'. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX.

boa
One more try: strlen() can never return SIZE_MAX. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX, so the max value strlen() can return is SIZE_MAX - 1.

boa

take SIZE_MAX 65535

char * foo = calloc(256,256);
memset(foo,'x',SIZE_MAX);
strlen(foo) == SIZE_MAX;

I don't think there's an explicit requirement in the standard that there
never be an object larger than SIZE_MAX. There may, however, be a
requirement that the product of the arguments to calloc be less than
SIZE_MAX. [that, i don't know.]


That's my point below. You can't allocate more than SIZE_MAX so the
strlen would be SIZE_MAX -1, so in the example above, size could not be
0 (zero). I may be missing something, but the check 'if (size == 0)'
seems pointless.


See my other post, the standard does not forbid Jordan's code as far as
I can see. So if the call to calloc succeeds you have *legally* created
an object larger than SIZE_MAX.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
Jan 26 '06 #20

P: n/a
On 2006-01-26, Flash Gordon <sp**@flash-gordon.me.uk> wrote:
Jeff wrote:
Jordan Abel wrote:
On 2006-01-26, boa <bo*****@gmail.com> wrote:
boa wrote:
> Jordan Abel wrote:
>> On 2006-01-26, Jeff <jo******@gmail.com> wrote:
>>> In the function below, can size ever be 0 (zero)?
>> I've never heard of the "clc string lib" - where can i find it?
> http://libclc.sf.net
>
>>> char *clc_strdup(const char * CLC_RESTRICT s)
>>> {
>>> size_t size;
>>> char *p;
>>>
>>> clc_assert_not_null(clc_strdup, s);
>> No idea how this would work in a way that needs the function pointer as
>> its first argument. Is it a macro that stringizes its first argument to
>> print an error?
>>
>>> size = strlen(s) + 1;
>> The result of this assignment cannot be zero.
>>
>>> if (size == 0)
>>> p = NULL;
>> hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.
> size can never be SIZE_MAX as size doesn't include '\0'. A valid string
> always has the '\0' and the max size of a buffer containing a string is
> SIZE_MAX.
>
> boa
One more try: strlen() can never return SIZE_MAX. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX, so the max value strlen() can return is SIZE_MAX - 1.

boa
take SIZE_MAX 65535

char * foo = calloc(256,256);
memset(foo,'x',SIZE_MAX);
strlen(foo) == SIZE_MAX;

I don't think there's an explicit requirement in the standard that there
never be an object larger than SIZE_MAX. There may, however, be a
requirement that the product of the arguments to calloc be less than
SIZE_MAX. [that, i don't know.]


That's my point below. You can't allocate more than SIZE_MAX so the
strlen would be SIZE_MAX -1, so in the example above, size could not be
0 (zero). I may be missing something, but the check 'if (size == 0)'
seems pointless.


See my other post, the standard does not forbid Jordan's code as far as
I can see. So if the call to calloc succeeds you have *legally* created
an object larger than SIZE_MAX.


Yeah - it's not particularly _likely_ to succeed on any reasonable
implementation [relevant code from my implementation: if (size != 0 &&
SIZE_T_MAX / size < num) { errno = ENOMEM; return (NULL); }], but if it
does succeed, there's nothing wrong with the rest of the code, and
strlen()+1 will be 0. [with some work, it would be possible to make a
strdup that uses the same trickery to successfully duplicate such an
object]
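The overflow check quoted above can be written as a standalone wrapper. This is only a sketch under assumptions: checked_calloc is a made-up name, not part of any library, and it uses SIZE_MAX from <stdint.h> in place of the SIZE_T_MAX spelling. It also leaves errno alone, since ENOMEM is not required by standard C.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: refuse any request whose total byte count
   num * size cannot be represented in a size_t. */
void *checked_calloc(size_t num, size_t size)
{
    if (size != 0 && num > SIZE_MAX / size)
        return NULL;              /* num * size would wrap around */
    return calloc(num, size);
}
```

With a 16-bit size_t, checked_calloc(256, 256) would be refused, because 256 is greater than 65535 / 256 = 255.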

Note that this strdup does have another problem with implementations
that allow an object larger than SIZE_MAX.

/* SIZE_MAX 65535 */
char *p = calloc(256,257);
memset(p,'a',256);
memset(p+256,'a',SIZE_MAX);
char *q = clc_strdup(p);

if everything succeeds (and strlen does what can reasonably be expected
- it's unclear, and thus probably undefined, what passing a string longer
than SIZE_MAX to strlen actually will result in), q points to an array
of 256 chars, each set to 'a', and not null-terminated.

Attempting to catch a corner case like a string that is exactly SIZE_MAX
long seems pointless when it ignores the possibility of a string longer
than SIZE_MAX.
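The arithmetic in the calloc(256,257) example can be checked on any machine by modelling a 16-bit size_t with uint16_t. A sketch only: model_size is a hypothetical helper, and it assumes strlen reports the true length reduced modulo SIZE_MAX + 1, which the standard does not promise.

```c
#include <stdint.h>

/* Model of `size = strlen(s) + 1` on an implementation whose size_t is
   16 bits wide. true_len is the real number of characters before the
   terminator. Assumption: strlen wraps modulo 65536. */
uint16_t model_size(unsigned long true_len)
{
    uint16_t len = (uint16_t)(true_len % 65536ul);  /* what strlen would report */
    return (uint16_t)(len + 1u);                    /* size handed to malloc */
}
```

Under that assumption, a true length of 65791 (the 256 + 65535 'a' characters above) gives a size of 256, and a true length of exactly 65535 gives a size of 0, which is the only case the `size == 0` test catches.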
Jan 26 '06 #21

P: n/a

Jordan Abel wrote:
On 2006-01-26, boa <bo*****@gmail.com> wrote:
boa wrote:
Jordan Abel wrote:
On 2006-01-26, Jeff <jo******@gmail.com> wrote:
> In the function below, can size ever be 0 (zero)?

I've never heard of the "clc string lib" - where can i find it?

http://libclc.sf.net
> char *clc_strdup(const char * CLC_RESTRICT s)
> {
> size_t size;
> char *p;
>
> clc_assert_not_null(clc_strdup, s);

No idea how this would work in a way that needs the function pointer as
its first argument. Is it a macro that stringizes its first argument to
print an error?

> size = strlen(s) + 1;

The result of this assignment cannot be zero.

> if (size == 0)
> p = NULL;

hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.

size can never be SIZE_MAX as size doesn't include '\0'. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX.

boa
One more try: strlen() can never return SIZE_MAX. A valid string
always has the '\0' and the max size of a buffer containing a string is
SIZE_MAX, so the max value strlen() can return is SIZE_MAX - 1.

boa


take SIZE_MAX 65535

char * foo = calloc(256,256);
memset(foo,'x',SIZE_MAX);


I'm pretty sure this is undefined behavior.
It parallels, I think, with an issue brought up
months ago (in comp.std.c) that labeled the
following as undefined behavior.

int array[10][10];

array[0][10] = 10;

and that each array object is not guaranteed to
be adjacent to the other.
strlen(foo) == SIZE_MAX;

I don't think there's an explicit requirement in the standard that there
never be an object larger than SIZE_MAX. There may, however, be a
requirement that the product of the arguments to calloc be less than
SIZE_MAX. [that, i don't know.]


--
aegis

Jan 27 '06 #22

P: n/a
On 2006-01-27, aegis <ae***@mad.scientist.com> wrote:

Jordan Abel wrote:
On 2006-01-26, boa <bo*****@gmail.com> wrote:
> boa wrote:
>> Jordan Abel wrote:
>>> On 2006-01-26, Jeff <jo******@gmail.com> wrote:
>>>> In the function below, can size ever be 0 (zero)?
>>>
>>> I've never heard of the "clc string lib" - where can i find it?
>>
>> http://libclc.sf.net
>>
>>>
>>>> char *clc_strdup(const char * CLC_RESTRICT s)
>>>> {
>>>> size_t size;
>>>> char *p;
>>>>
>>>> clc_assert_not_null(clc_strdup, s);
>>>
>>> No idea how this would work in a way that needs the function pointer as
>>> its first argument. Is it a macro that stringizes its first argument to
>>> print an error?
>>>
>>>> size = strlen(s) + 1;
>>>
>>> The result of this assignment cannot be zero.
>>>
>>>> if (size == 0)
>>>> p = NULL;
>>>
>>> hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.
>>
>> size can never be SIZE_MAX as size doesn't include '\0'. A valid string
>> always has the '\0' and the max size of a buffer containing a string is
>> SIZE_MAX.
>>
>> boa
>
> One more try: strlen() can never return SIZE_MAX. A valid string
> always has the '\0' and the max size of a buffer containing a string is
> SIZE_MAX, so the max value strlen() can return is SIZE_MAX - 1.
>
> boa
take SIZE_MAX 65535

char * foo = calloc(256,256);
memset(foo,'x',SIZE_MAX);


I'm pretty sure this is undefined behavior. It parallels, I think,
with an issue brought up months ago (in comp.std.c) that labeled the
following as undefined behavior.


No it doesn't.

int array[10][10];

array[0][10] = 10;

and that each array object is not guaranteed to
be adjacent to the other.


Yes they are, but that's another discussion. Even if they weren't,
that doesn't apply to calloc - nothing in the standard provides for
there to be any padding "gaps" in the memory returned by calloc.

How would that even work?
Jan 27 '06 #23

P: n/a
"aegis" <ae***@mad.scientist.com> writes:
Jordan Abel wrote: [...]
take SIZE_MAX 65535

char * foo = calloc(256,256);
memset(foo,'x',SIZE_MAX);


I'm pretty sure this is undefined behavior.


How so? The calloc() call can either succeed or fail; I don't see any
permission for it to do anything else. The memset() call is ok (if
calloc() succeeded). Note that it doesn't set the entire object; it
leaves the last byte as '\0'.
It parallels, I think, with an issue brought up
months ago (in comp.std.c) that labeled the
following as undefined behavior.

int array[10][10];

array[0][10] = 10;

and that each array object is not guaranteed to
be adjacent to the other.


As I understand the argument, the array elements are guaranteed to be
adjacent; the assignment invokes undefined behavior because an
implementation could do explicit bounds checking, not because the
address might be invalid. I don't see the connection between this and
the calloc() issue.
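The layout half of that argument is easy to check. A small sketch (row_stride is a made-up helper) showing that consecutive rows of a 2D array sit exactly one row apart, with no padding between them. It says nothing about whether array[0][10] itself is a defined access; that is the separate bounds question.

```c
#include <stddef.h>

int array[10][10];

/* Distance in bytes from the start of row 0 to the start of row 1. */
size_t row_stride(void)
{
    return (size_t)((char *)&array[1] - (char *)&array[0]);
}
```

The stride equals sizeof array[0], and sizeof array equals 100 * sizeof(int), so the 100 ints are contiguous.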

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 27 '06 #24

P: n/a
Jeff wrote:

In the function below, can size ever be 0 (zero)?

char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);

size = strlen(s) + 1;
if (size == 0)
p = NULL;
else if ((p = malloc(size)) != NULL)
memcpy(p, s, size);

return p;
}


Very poor code. Apart from the missing definitions of clc_assert...
and CLC_RESTRICT, strlen returns a size_t, which is unsigned. Thus
size can never be less than 1, and the "p = NULL" will never be
executed. Assuming CLC_RESTRICT has something to do with the
restrict qualifier, it is pointless because s is const. Better
code might be:

char *clc_strdup(const char *s) {
size_t size;
char *p;

if (!s) p = NULL;
else {
size = strlen(s) + 1;
if ((p = malloc(size)) != NULL) memcpy(p, s, size);
}
return p;
}

Some will object to the guard against s == NULL.
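For what it's worth, a null guard and the original's zero-size check are not mutually exclusive. A minimal sketch combining the two, assuming nothing beyond the standard library; dup_string is a made-up name, not libclc's function or anyone's posted code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical strdup variant: returns NULL for a null argument, for a
   length whose size (length + 1) wraps to 0, or if malloc fails. */
char *dup_string(const char *s)
{
    size_t size;
    char *p;

    if (s == NULL)
        return NULL;
    size = strlen(s) + 1;         /* wraps to 0 if strlen returned SIZE_MAX */
    if (size == 0)
        return NULL;
    p = malloc(size);
    if (p != NULL)
        memcpy(p, s, size);       /* copies the terminator as well */
    return p;
}
```

Whether the `size == 0` branch can ever be taken is exactly what the thread is arguing about; the sketch only shows that keeping it costs one comparison.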

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Jan 27 '06 #25

P: n/a
boa wrote:
Jordan Abel wrote:
On 2006-01-26, Jeff <jo******@gmail.com> wrote:

.... snip ...
size = strlen(s) + 1;


The result of this assignment cannot be zero.
if (size == 0)
p = NULL;


hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.


size can never be SIZE_MAX as size doesn't include '\0'. A valid
string always has the '\0' and the max size of a buffer containing
a string is SIZE_MAX.


wo****@newsguy.com (Michael Wojcik) writes:
[...] Fortunately, c.l.c contains one of the world's largest herds of free-
roaming pedants, thundering majestically across the virtual plains...


Goring all who stand in their path with their sharp horns, such as
the above claim that zero is a possible value. See, I can quote it
too :-)

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Jan 27 '06 #26

P: n/a
On 2006-01-27, CBFalconer <cb********@yahoo.com> wrote:
Jeff wrote:

In the function below, can size ever be 0 (zero)?

char *clc_strdup(const char * CLC_RESTRICT s)
{
size_t size;
char *p;

clc_assert_not_null(clc_strdup, s);

size = strlen(s) + 1;
if (size == 0)
p = NULL;
else if ((p = malloc(size)) != NULL)
memcpy(p, s, size);

return p;
}
Very poor code. Apart from the missing definition of clc_assert...
and CLC_RESTRICT


in headers whose inclusion was not pasted. geez, next you'll complain
that there's no definition for main().
strlen returns a size_t, which is unsigned. Thus
size can never be less than 1, and the "p = NULL" will never be
executed. Assuming CLC_RESTRICT has something to do with the
restrict qualifier, it is pointless because s is const. Better
code might be:

char *clc_strdup(const char *s) {
size_t size;
char *p;

if (!s) p = NULL;
else {
size = strlen(s) + 1;
if ((p = malloc(size)) != NULL) memcpy(p, s, size);
}
return p;
}

Some will object to the guard against s == NULL.


It's an assertion, not a guard. Like assert(), it is only active in debug
mode; outside of debug mode it expands to ((void)0).
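A macro of that kind might look roughly like the following. This is a guess at the general shape, not libclc's actual definition: MY_ASSERT_NOT_NULL and demo_length are hypothetical names, and the use of NDEBUG as the switch is an assumption borrowed from assert().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical macro, NOT libclc's definition. The first argument is
   only stringized for the message, so no function pointer is used. */
#ifdef NDEBUG
#define MY_ASSERT_NOT_NULL(func, ptr) ((void)0)
#else
#define MY_ASSERT_NOT_NULL(func, ptr) \
    ((ptr) != NULL \
        ? (void)0 \
        : (fprintf(stderr, "%s: %s is NULL\n", #func, #ptr), abort()))
#endif

/* Example caller: aborts in debug builds if s is NULL. */
size_t demo_length(const char *s)
{
    MY_ASSERT_NOT_NULL(demo_length, s);
    return strlen(s);
}
```

With NDEBUG defined the check disappears entirely, so a null argument goes straight to strlen; that is the difference from a guard that returns NULL.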
Jan 27 '06 #27

P: n/a
On 2006-01-27, CBFalconer <cb********@yahoo.com> wrote:
boa wrote:
Jordan Abel wrote:
On 2006-01-26, Jeff <jo******@gmail.com> wrote:

... snip ...

size = strlen(s) + 1;

The result of this assignment cannot be zero.

if (size == 0)
p = NULL;

hold on - i take that back. size can be 0 if strlen returns SIZE_MAX.


size can never be SIZE_MAX as size doesn't include '\0'. A valid
string always has the '\0' and the max size of a buffer containing
a string is SIZE_MAX.


wo****@newsguy.com (Michael Wojcik) writes:
[...]
Fortunately, c.l.c contains one of the world's largest herds of free-
roaming pedants, thundering majestically across the virtual plains...


Goring all who stand in their path with their sharp horns, such as
the above claim that zero is a possible value. See, I can quote it
too :-)


So what do you think of this?

char *p = calloc(SIZE_MAX,2), *q;
if(!p) exit(EXIT_FAILURE);
memset(p,'a',SIZE_MAX);
q = strdup(p); /* oops */
Jan 27 '06 #28

P: n/a
Jordan Abel wrote:
On 2006-01-27, CBFalconer <cb********@yahoo.com> wrote:

.... snip ...
Better code might be:

char *clc_strdup(const char *s) {
size_t size;
char *p;

if (!s) p = NULL;
else {
size = strlen(s) + 1;
if ((p = malloc(size)) != NULL) memcpy(p, s, size);
}
return p;
}

Some will object to the guard against s == NULL.


It's an assertion, not a guard. Like assert(), it is only active in
debug mode; outside of debug mode it expands to ((void)0).


Not in my code. It stands guard and returns sane values whenever
possible. It's called robustness.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>

Jan 27 '06 #29

P: n/a
Jordan Abel wrote:

<snip handling string of length SIZE_MAX>
Note that this strdup does have another problem with implementations
that allow an object larger than SIZE_MAX.

/* SIZE_MAX 65535 */
char *p = calloc(256,257);
memset(p,'a',256);
memset(p+256,'a',SIZE_MAX);
char *q = clc_strdup(p);

if everything succeeds (and strlen does what can reasonably be expected
- it's unclear, and thus probably undefined, what passing a string longer
than SIZE_MAX to strlen actually will result in), q points to an array
of 256 chars, each set to 'a', and not null-terminated.
I believe passing a string with length longer than SIZE_MAX invokes
undefined behaviour. My reasoning being as follows:
The standard says strlen returns a result of type size_t
The standard says that strlen returns the number of characters before
the terminating null character of a string.
The standard does not say what strlen does if the result does not fit
in size_t
The standard says that in any instance that it does not define the
behaviour, the behaviour is undefined.
Attempting to catch a corner case like a string that is exactly SIZE_MAX
long seems pointless when it ignores the possibility of a string longer
than SIZE_MAX.


The string length being exactly SIZE_MAX is easy to catch, the string
length being longer is difficult to catch efficiently since you would
have to implement an error checking version of strlen as well.
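An error-checking strlen along those lines could be sketched as below. checked_strlen is a made-up name and the explicit limit parameter is an assumption added so the idea can be tried with small values; passing SIZE_MAX as the limit gives the case under discussion.

```c
#include <stddef.h>

/* Hypothetical error-checking strlen. On success stores the length in
   *len_out and returns 0. Returns -1, without touching *len_out, if the
   string is longer than `limit` characters. */
int checked_strlen(const char *s, size_t limit, size_t *len_out)
{
    size_t n = 0;

    while (s[n] != '\0') {
        if (n == limit)
            return -1;            /* counting further would exceed limit */
        ++n;
    }
    *len_out = n;
    return 0;
}
```

The byte-at-a-time loop with an extra comparison per character is likely slower than a tuned library strlen, which is the efficiency cost mentioned above.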

Personally, when I wrote a strdup implementation (named ffstrdup) I
decided that anyone generating a string with length SIZE_MAX or longer
deserves to be shot, so I left it as undefined behaviour what the
library would do in such cases.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
Jan 27 '06 #30

P: n/a
On 2006-01-27, Flash Gordon <sp**@flash-gordon.me.uk> wrote:
Jordan Abel wrote:

<snip handling string of length SIZE_MAX>
Note that this strdup does have another problem with implementations
that allow an object larger than SIZE_MAX.

/* SIZE_MAX 65535 */
char *p = calloc(256,257);
memset(p,'a',256);
memset(p+256,'a',SIZE_MAX);
char *q = clc_strdup(p);

if everything succeeds (and strlen does what can reasonably be expected
- it's unclear, and thus probably undefined, what passing a string longer
than SIZE_MAX to strlen actually will result in), q points to an array
of 256 chars, each set to 'a', and not null-terminated.
I believe passing a string with length longer than SIZE_MAX invokes
undefined behaviour. My reasoning being as follows:
The standard says strlen returns a result of type size_t
The standard says that strlen returns the number of characters before
the terminating null character of a string.
The standard does not say what strlen does if the result does not fit
in size_t
The standard says that in any instance that it does not define the
behaviour, the behaviour is undefined.


[size_t being an unsigned type and the fact that arithmetic involving
such types is always reduced modulo max+1 creates an argument that it is
not undefined]
Attempting to catch a corner case like a string that is exactly SIZE_MAX
long seems pointless when it ignores the possibility of a string longer
than SIZE_MAX.


The string length being exactly SIZE_MAX is easy to catch, the string
length being longer is difficult to catch efficiently since you would
have to implement an error checking version of strlen as well.


You could force the buffer to be null-terminated, or check for a null
terminator at the offset of what strlen has told you the length is.
(that won't tell you where it is, but it'll tell you where it's not)
Though, that's certainly not something you want to waste time doing
unless you're routinely passed huge strings.
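The second suggestion amounts to a one-line test. A sketch, with terminator_at as a made-up name; it assumes the reported length is never greater than the true length (true for a length that merely wrapped), since otherwise the read would be out of bounds.

```c
#include <stddef.h>

/* Returns 1 if s[reported_len] is the null character, 0 otherwise.
   A result of 0 proves the reported length is wrong. */
int terminator_at(const char *s, size_t reported_len)
{
    return s[reported_len] == '\0';
}
```

For an eight-character string reported as length 3, the byte at offset 3 is not the terminator, so the mismatch is detected without rescanning the whole string.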
Personally, when I wrote a strdup implementation (named ffstrdup) I
decided that anyone generating a string with length SIZE_MAX or longer
deserves to be shot, so I left it as undefined behaviour what the
library would do in such cases.


my implementation's strdup also relies on strlen, but that's beside the
point - my implementation also doesn't have a way to malloc an object
let alone a string larger than SIZE_MAX [though, i believe it is
possible to create one with mmap and a carefully-constructed file, i'm
not sure if such calls actually succeed]
Jan 27 '06 #31

P: n/a
Jordan Abel wrote:
On 2006-01-27, Flash Gordon <sp**@flash-gordon.me.uk> wrote:
Jordan Abel wrote:

<snip handling string of length SIZE_MAX>
Note that this strdup does have another problem with implementations
that allow an object larger than SIZE_MAX.

/* SIZE_MAX 65535 */
char *p = calloc(256,257);
memset(p,'a',256);
memset(p+256,'a',SIZE_MAX);
char *q = clc_strdup(p);

if everything succeeds (and strlen does what can reasonably be expected
- it's unclear, and thus probably undefined, what passing a string longer
than SIZE_MAX to strlen actually will result in), q points to an array
of 256 chars, each set to 'a', and not null-terminated.

I believe passing a string with length longer than SIZE_MAX invokes
undefined behaviour. My reasoning being as follows:
The standard says strlen returns a result of type size_t
The standard says that strlen returns the number of characters before
the terminating null character of a string.
The standard does not say what strlen does if the result does not fit
in size_t
The standard says that in any instance that it does not define the
behaviour, the behaviour is undefined.


[size_t being an unsigned type and the fact that arithmetic involving
such types is always reduced modulo max+1 creates an argument that it is
not undefined]


I'm aware of the behaviour of arithmetic on unsigned types and that
size_t is such a type, but that is not the issue here because you are
not doing unsigned arithmetic; you are calling a library function. The
issue is as I stated purely one of the standard not defining what strlen
does if the result does not fit in size_t.

strlen does not have to be implemented in standard C. It could increment
a pointer to the end then use a non-standard method to subtract the
start pointer from the end pointer in a way that produces an unsigned
result in range of size_t if it fits and crashes the program if it doesn't.
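In portable C, the pointer-walking approach looks roughly like this. It is only an approximation of the idea above (ptr_strlen is a made-up name), and it has its own limit: the subtraction yields a ptrdiff_t, and if the difference is not representable in that type the behaviour is undefined, so it gives no wraparound guarantee either.

```c
#include <stddef.h>

/* Hypothetical strlen built on pointer subtraction. */
size_t ptr_strlen(const char *s)
{
    const char *end = s;

    while (*end != '\0')          /* walk to the terminator */
        ++end;
    return (size_t)(end - s);     /* end - s has type ptrdiff_t */
}
```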
Attempting to catch a corner case like a string that is exactly SIZE_MAX
long seems pointless when it ignores the possibility of a string longer
than SIZE_MAX.

The string length being exactly SIZE_MAX is easy to catch, the string
length being longer is difficult to catch efficiently since you would
have to implement an error checking version of strlen as well.


You could force the buffer to be null-terminated, or check for a null
terminator at the offset of what strlen has told you the length is.
(that won't tell you where it is, but it'll tell you where it's not)
Though, that's certainly not something you want to waste time doing
unless you're routinely passed huge strings.


It also won't work with the strlen implementation I suggested above
because the program will already have crashed.
Personally, when I wrote a strdup implementation (named ffstrdup) I
decided that anyone generating a string with length SIZE_MAX or longer
deserves to be shot, so I left it as undefined behaviour what the
library would do in such cases.


my implementation's strdup also relies on strlen, but that's beside the
point - my implementation also doesn't have a way to malloc an object
let alone a string larger than SIZE_MAX [though, i believe it is
possible to create one with mmap and a carefully-constructed file, i'm
not sure if such calls actually succeed]


I've no idea if it does either. I don't think it is tremendously important.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
Jan 27 '06 #32

P: n/a
Flash Gordon wrote:
Jordan Abel wrote:
On 2006-01-27, Flash Gordon <sp**@flash-gordon.me.uk> wrote:

.... snip ...
Personally, when I wrote a strdup implementation (named ffstrdup)
I decided that anyone generating a string with length SIZE_MAX or
longer deserves to be shot, so I left it as undefined behaviour
what the library would do in such cases.


my implementation's strdup also relies on strlen, but that's
beside the point - my implementation also doesn't have a way to
malloc an object let alone a string larger than SIZE_MAX [though,
i believe it is possible to create one with mmap and a carefully-
constructed file, i'm not sure if such calls actually succeed]


I've no idea if it does either. I don't think it is tremendously
important.


Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Jan 28 '06 #33

P: n/a
On 2006-01-28, CBFalconer <cb********@yahoo.com> wrote:
Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.


A medium-length text file embedded as a char[] object is about the only
way i see this making sense - and that's for a SIZE_MAX of 65536.
Assuming an average line length of 64, that's 1024 lines, or 17 pages at
60 lines per page.

[I assume an ASR33 can handle printing 17 pages of text, or half that
anyway, even if it will take a few... hour and a half or so]

Jan 28 '06 #34

P: n/a
Jordan Abel wrote:
On 2006-01-28, CBFalconer <cb********@yahoo.com> wrote:
Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.

A medium-length text file embedded as a char[] object is about the only
way i see this making sense - and that's for a SIZE_MAX of 65536.
Assuming an average line length of 64, that's 1024 lines, or 17 pages at
60 lines per page.

[I assume an ASR33 can handle printing 17 pages of text, or half that
anyway, even if it will take a few... hour and a half or so]

What makes you want a whole document inside a
single string??
If you approach programming that way, be prepared
for a lot of problems (especially those 4GB files).
Jan 28 '06 #35

P: n/a
Sjouke Burry <bu*************@ppllaanneett.nnlll> writes:
Jordan Abel wrote:
On 2006-01-28, CBFalconer <cb********@yahoo.com> wrote:
Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.

A medium-length text file embedded as a char[] object is about the
only
way i see this making sense - and that's for a SIZE_MAX of 65536.
Assuming an average line length of 64, that's 1024 lines, or 17 pages at
60 lines per page.
[I assume an ASR33 can handle printing 17 pages of text, or half that
anyway, even if it will take a few... hour and a half or so]

What makes you want a whole document inside a
single string??
If you approach programming that way, be prepared
for a lot of problems (especially those 4GB files).


Suppose you want to sort it. Having the whole thing in memory (if it
fits) is a *lot* more efficient than trying to sort it on disk.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 28 '06 #36

P: n/a
Keith Thompson <ks***@mib.org> writes:
Sjouke Burry <bu*************@ppllaanneett.nnlll> writes:

[...]
What makes you want a whole document inside a
single string??
If you approach programming that way, be prepared
for a lot of problems (especially those 4GB files).


Suppose you want to sort it. Having the whole thing in memory (if it
fits) is a *lot* more efficient than trying to sort it on disk.


Correction: it's *likely* to be a lot more efficient. (C doesn't say
anything about the relative efficiency of memory access vs. disk
access, and thrashing on a virtual memory system can slow things down
considerably.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jan 28 '06 #37

P: n/a
On 2006-01-28, Sjouke Burry <bu*************@ppllaanneett.nnlll> wrote:
Jordan Abel wrote:
On 2006-01-28, CBFalconer <cb********@yahoo.com> wrote:
Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.

A medium-length text file embedded as a char[] object is about the only
way i see this making sense - and that's for a SIZE_MAX of 65536.
Assuming an average line length of 64, that's 1024 lines, or 17 pages at
60 lines per page.

[I assume an ASR33 can handle printing 17 pages of text, or half that
anyway, even if it will take a few... hour and a half or so]

What makes you want a whole document inside a
single string??
If you approach programming that way, be prepared
for a lot of problems (especially those 4GB files).


How about it's your usage message? x11vnc, the longest i've seen,
clocks in at 74084. X has a relatively long one [but nowhere
approaching the min max for SIZE_MAX] at 6090.

[the more relevant question is why would you want _two_ of such a
thing - given that this is a strdup implementation we're talking
about]
Jan 28 '06 #38

P: n/a
Keith Thompson wrote:
Sjouke Burry <bu*************@ppllaanneett.nnlll> writes:
Jordan Abel wrote:
On 2006-01-28, CBFalconer <cb********@yahoo.com> wrote:
Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.

A medium-length text file embedded as a char[] object is about the
only
way i see this making sense - and that's for a SIZE_MAX of 65536.
Assuming an average line length of 64, that's 1024 lines, or 17 pages at
60 lines per page.
[I assume an ASR33 can handle printing 17 pages of text, or half that
anyway, even if it will take a few... hour and a half or so]


What makes you want a whole document inside a
single string??
If you approach programming that way, be prepared
for a lot of problems (especially those 4GB files).

Suppose you want to sort it. Having the whole thing in memory (if it
fits) is a *lot* more efficient than trying to sort it on disk.

He was talking about everything in one string!!
I don't see how you are going to sort that.
Of course, if you read each item into a separate
string, you can sort it, but then you will not
run into SIZE_MAX.
In 99 out of 100 cases, if you run into SIZE_MAX
your program has (or does) something wrong.
Jan 29 '06 #39

P: n/a
CBFalconer wrote:
Flash Gordon wrote:
Jordan Abel wrote:
On 2006-01-27, Flash Gordon <sp**@flash-gordon.me.uk> wrote:

... snip ...
Personally, when I wrote a strdup implementation (named ffstrdup)
I decided that anyone generating a string with length SIZE_MAX or
longer deserves to be shot, so I left it as undefined behaviour
what the library would do in such cases.

my implementation's strdup also relies on strlen, but that's
beside the point - my implementation also doesn't have a way to
malloc an object let alone a string larger than SIZE_MAX [though,
i believe it is possible to create one with mmap and a carefully-
constructed file, i'm not sure if such calls actually succeed]


I've no idea if it does either. I don't think it is tremendously
important.


Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.


In the tests for "The Better String Library", when running in a 16-bit
environment, the test strings exceed this value. (This is an important
part of the test, BTW.) This can be an issue if you have single string
objects (like the contents of a .HTML file for example) that exceeds
32K in Bstrlib. (Bstrlib correctly deals with all these cases as
detected errors -- you should generally use bstreams, not bstrings for
large entries like that.)

Obviously, if you were to try to write big string/text manipulation
programs in a 16-bit environments without Bstrlib, you would run into
at least those problems.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Jan 29 '06 #40

In article <sl***********************@random.yi.org>, Jordan Abel <ra*******@gmail.com> writes:
On 2006-01-26, Michael Wojcik <mw*****@newsguy.com> wrote:
Each identifier with file scope listed in any of the following
subclauses (including the future library directions) is reserved for
use as a macro name and as an identifier with file scope in the same
name space if any of its associated headers is included.

Note the "str..." identifiers in string.h are identifiers with file
scope.

"strdup" as a macro name is an identifier that begins with "str"
and a lowercase letter. It has file scope, because it is a macro
name. That means it is covered by 7.26.10 and 7.26.11. By 7.1.3,
it is thus reserved if stdlib.h or string.h is included.
Can I #undef it and then #define my own?


It's my understanding that this invokes undefined behavior. Whether
it's a problem for any actual existing implementation is another
question, of course.
It seems like this would fall
into the same dubious area as redefining a keyword, but would still be
legal (just as it is to redefine a keyword)


Redefining a keyword is only "legal" because they're only reserved
in translation phases 7 and 8 (C99 6.4.1#2). (This is a "shall"
requirement, so violating it causes UB, by 4#2.) Since macros are
expanded in translation phase 4, by the time phase 7 is reached,
macros that have the same names as keywords (eg "#define if foo")
no longer appear as identifiers in the translation unit.

No such special rule exists for other reserved identifiers, as far
as I can see.

--
Michael Wojcik mi************@microfocus.com

Push up the bottom with your finger, it will puffy and makes stand up.
-- instructions for "swan" from an origami kit
Jan 31 '06 #41

CBFalconer wrote:

Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.


The item on hand is whether or not 'size==0' can ever be true. I claim
no.

Jeff

Jan 31 '06 #42

In article <dr**********@nwrdmz03.dmz.ncs.ea.ibs-infra.bt.com>, "Vladimir S. Oka" <no****@btopenworld.com> writes:
Michael Wojcik wrote:
In article <dr**********@nwrdmz01.dmz.ncs.ea.ibs-infra.bt.com>,
"Vladimir S. Oka" <no****@btopenworld.com> writes:
Jordan Abel wrote:

> Incidentally, speaking of prefixes, the str* prefix, and reserved
> identifiers, is something along the lines of
>
> #define strdup clc_strdup
>
> legal? My reading of the standard says it is, but thought i'd ask
> here.

I believe it is (and can't be asked to open the Standard right now).
If you can't be bothered to check the Standard, why reply? That's
a serious question. Jordan's question was about the Standard, not
about anyone else's belief.


Your premise here seems to be that I have never actually opened the
Standard, and am in fact guessing.


No, my premise is that you explicitly claimed that you had not
checked the Standard in this particular instance, and I can't see
any justification for your conclusion above.
If you happened to agree with me (or
Jordan), I wonder whether you'd have bothered to chastise me.
My argument appears to apply regardless of whether I agreed with you.
Instead,
you proceed believing (no pun intended) that your reading of the
Standard will justify the above paragraph, and the below statement.
I certainly do not believe that any reading of the Standard justifies
"the above paragraph", assuming the antecedent is my first paragraph
(and I don't see what else it could reasonably be). The Standard says
nothing about whether you should reply to a Usenet query, or whether
Jordan's question referred to the Standard, which are the only two
claims in the paragraph in question.
I *am* looking at the Standard (I'll cite C99 here, but C90 has
equivalent, indeed mostly identical, language). My interpretation
of the Standard contradicts Jordan's.


But, let's see what your reading of the standard yields:


Yes, that's the idea.
Note first that macro names are identifiers (6.2.1).

7.1.3 ("Reserved identifiers"):

Each header declares or defines all identifiers listed in its
associated subclause, and optionally declares or defines
identifiers listed in its associated future library directions
subclause and identifiers which are always reserved either for any
use or for use as file scope identifiers.
...
Each identifier with file scope listed in any of the following
subclauses (including the future library directions) is reserved
for use as a macro name and as an identifier with file scope in the
same name space if any of its associated headers is included.

Note the "str..." identifiers in string.h are identifiers with file
scope.

7.26.10 ("General utilities <stdlib.h>"):

Function names that begin with str and a lowercase letter may be
added to the declarations in the <stdlib.h> header.

7.26.11 ("String handling <string.h>"):

Function names that begin with str, mem, or wcs and a lowercase
letter may be added to the declarations in the <string.h> header.
"strdup" as a macro name is an identifier that begins with "str"
and a lowercase letter. It has file scope, because it is a macro
name. That means it is covered by 7.26.10 and 7.26.11. By 7.1.3,
it is thus reserved if stdlib.h or string.h is included.

The only thing that making it a macro name rather than simply having
a function (with external linkage) named "strdup" gives you is
relief from 7.26:

7.26 ("Future library directions"):

All external names described below are reserved no matter what
headers are included by the program.

So calling the function "clc_strdup" and using a macro to refer to it
as "strdup" is legal provided stdlib.h and string.h are not included
- but that seems rather unlikely, and you could achieve the same thing
by giving "strdup" internal linkage (ie by declaring it static).
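
For reference, a sketch of the non-macro route: keep the function under its own clc_ name and call it directly. This follows the function quoted at the start of the thread, but drops CLC_RESTRICT and replaces clc_assert_not_null with a plain assert(), since the definitions of those macros are not shown in the thread; treat both substitutions as assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only. The name begins with "clc_", not with "str" followed by a
 * lowercase letter, so it stays outside the names that 7.26.10 and
 * 7.26.11 reserve for future library use. */
char *clc_strdup(const char *s)
{
    size_t size;
    char *p;

    assert(s != NULL);        /* stand-in for clc_assert_not_null */

    size = strlen(s) + 1;     /* wraps to 0 only if strlen(s) == SIZE_MAX */
    if (size == 0)
        return NULL;

    p = malloc(size);
    if (p != NULL)
        memcpy(p, s, size);   /* copies the terminating '\0' as well */
    return p;
}
```

Callers then write clc_strdup(s) and free() the result, with no #define strdup involved, so the question of reserved macro names never comes up even when string.h is included.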


In other words, there _are_ circumstances where it actually _is_ legal,
even according to your reading of the Standard.


Since Jordan's question did not refer to those restricted circum-
stances, it applies to the general case; and in the general case,
his proposition ("this action is permitted by the Standard") is
false. "p -> q" does not imply "q".

Had Jordan asked, in some appropriate venue, "is it legal to stick a
knife into someone?", would you consider, "yes, if you're a surgeon
performing within the terms of your medical license and other
applicable regulation", and from that conclude that the answer to his
question is "yes"?

If so, I suggest you reconsider your personal definition of "legal",
as it does not appear to accord with common usage.
Why would it be
unlikely for a source file not to include certain standard headers?
I've seen quite a few.
Actually, this is more plausible than I originally thought. I
was thinking that the Standard allows standard headers to include
other standard headers, but it does not (there was a thread about
this on comp.std.c back in 2001[1]). The C++ standard does allow
this, which may have been what I was thinking of.

However, the point is moot; as I noted above, Jordan's question does
not specify any special circumstances, and neither does your initial
reply. As written they are false. (Well, technically yours isn't
necessarily false, since it's a claim of belief; but your belief, as
stated, does not accord with fact.)
So, your reading does not actually contradict Jordan's (or mine, for
that matter), at least not entirely.


I submit that it does - that as the question is expressed, there is no
partial correctness that can attach. It is incorrect as stated.
Fortunately, c.l.c contains one of the world's largest herds of free-
roaming pedants, thundering majestically across the virtual plains...


And sometimes, just sometimes, being a bit too self-righteous... ;-)


I don't believe I was self-righteous in the least. My claim about
the unlikelihood of the special circumstances was overly strong (due
to an erroneous unstated understanding which I ought to have verified
first), but aside from that I stand by everything I wrote. None of
it is intended to glorify me; I merely report what the Standard says,
and question - I think correctly - the utility of a response to a
question about the Standard which fails to refer to the Standard.
(That it did so ostentatiously might, by some, seem a bit self-
indulgent, but I will pass over that in silence.)
1. http://groups.google.com/group/comp....7020e3ea6fda78

--
Michael Wojcik mi************@microfocus.com

Advertising Copy in a Second Language Dept.:
The precious ovum itself is proof of the oath sworn to those who set
eyes upon Mokona: Your wishes will be granted if you are able to invest
it with eternal radiance... -- Noriyuki Zinguzi
Jan 31 '06 #43

In article <43**************@yahoo.com>, CBFalconer <cb********@yahoo.com> writes:
wo****@newsguy.com (Michael Wojcik) writes:
Fortunately, c.l.c contains one of the world's largest herds of free-
roaming pedants, thundering majestically across the virtual plains...


Goring all who stand in their path with their sharp horns, such as
the above claim that zero is a possible value. See, I can quote it
too :-)


Well, you know what they say: "Starve a troll, feed a sigmonster".

--
Michael Wojcik mi************@microfocus.com
Jan 31 '06 #44

In article <aq************@news.flash-gordon.me.uk>, Flash Gordon <sp**@flash-gordon.me.uk> writes:

Personally, when I wrote a strdup implementation (named ffstrdup) I
decided that anyone generating a string with length SIZE_MAX or longer
deserves to be shot, so I left it as undefined behaviour what the
library would do in such cases.


While I'll agree that excessively-long strings are not a hallmark of
good programming, I'll note that leaving such as undefined behavior
has produced a common and widely-exploited group of security holes,
most of which are exploited using integer-underflow attacks where
signed computations on data supplied by the attacker (eg a "request
length" field in a message received over a network) are coerced into
large unsigned values. (This is Sin 3 in Howard/LeBlanc/Viega's _19
Deadly Sins of Software Security_.)

Like many security holes, this relies on UB - but one of the
difficulties of secure coding is that the attacker can use UB, while
the defender must eschew it.

For that reason, in production code, I like to have my own object
size limits (configurable by the software administrator), rather than
simply trusting the implementation to decide when I've asked for too
much. I tend to side with CBF on this issue: the more sanity
checking, the better - and while UB is never my friend, it may be
someone else's.
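
A minimal sketch of the kind of check described above, assuming the length arrives as a signed value parsed from untrusted input. The names max_request_len and checked_request_len, and the default cap of 65536, are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical cap on a single request; in a real program this would be
 * set from configuration by the administrator. */
static size_t max_request_len = 65536;

/* Sketch: validate a length field before it is used as an allocation
 * size. Returns 1 and stores the length in *out if it is acceptable,
 * 0 otherwise. */
int checked_request_len(long claimed, size_t *out)
{
    if (claimed < 0)          /* would convert to a huge size_t */
        return 0;
    if ((unsigned long)claimed > max_request_len)
        return 0;             /* over the configured limit */
    *out = (size_t)claimed;
    return 1;
}
```

The negative test comes first because converting a negative value to an unsigned type yields a very large number, which is exactly the coercion the attack relies on.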

--
Michael Wojcik mi************@microfocus.com

Shakespeare writes bombast and knows it; Mr Thomas writes bombast and
doesn't. That is the difference. -- Geoffrey Johnson
Jan 31 '06 #45

In article <dr*********@news1.newsguy.com>, mw*****@newsguy.com (Michael Wojcik) writes:
In article <sl***********************@random.yi.org>, Jordan Abel <ra*******@gmail.com> writes:
It seems like this would fall
into the same dubious area as redefining a keyword, but would still be
legal (just as it is to redefine a keyword)


Redefining a keyword is only "legal" because they're only reserved
in translation phases 7 and 8 (C99 6.4.1#2). (This is a "shall"
requirement, so violating it causes UB, by 4#2.) Since macros are
expanded in translation phase 4, by the time phase 7 is reached,
macros that have the same names as keywords (eg "#define if foo")
no longer appear as identifiers in the translation unit.


I just realized there's an additional restriction on macro names
that are keywords - there cannot be any such before the inclusion
of a standard header (C99 7.1.2#4). That's irrelevant to the
original question, but I thought I should add it for completeness.

--
Michael Wojcik mi************@microfocus.com

The movie culminated with a bit of everything. -- Jeremy Stephens
Jan 31 '06 #46

Jeff wrote:
CBFalconer wrote:

Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.


The item on hand is whether or not 'size==0' can ever be true.
I claim no.


If it is the result of a "size = strlen(s);" statement, of course
it can be zero.

--
"The power of the Executive to cast a man into prison without
formulating any charge known to the law, and particularly to
deny him the judgement of his peers, is in the highest degree
odious and is the foundation of all totalitarian government
whether Nazi or Communist." -- W. Churchill, Nov 21, 1943
Jan 31 '06 #47

Michael Wojcik wrote:
In article <dr**********@nwrdmz03.dmz.ncs.ea.ibs-infra.bt.com>, "Vladimir S. Oka" <no****@btopenworld.com> writes:
Michael Wojcik wrote:
In article <dr**********@nwrdmz01.dmz.ncs.ea.ibs-infra.bt.com>,
"Vladimir S. Oka" <no****@btopenworld.com> writes:
> Jordan Abel wrote:
>
> > Incidentally, speaking of prefixes, the str* prefix, and reserved
> > identifiers, is something along the lines of
> >
> > #define strdup clc_strdup
> >
> > legal? My reading of the standard says it is, but thought i'd ask
> > here.
>
> I believe it is (and can't be asked to open the Standard right now).

If you can't be bothered to check the Standard, why reply? That's
a serious question. Jordan's question was about the Standard, not
about anyone else's belief.
Your premise here seems to be that I have never actually opened the
Standard, and am in fact guessing.


No, my premise is that you explicitly claimed that you had not
checked the Standard in this particular instance, and I can't see
any justification for your conclusion above.


My wording was obviously unfortunate. I wanted to say: "I did not check
the Standard on this immediately before posting, I had before and this
is what I believe it said". FWIW, I'm not a native speaker.
If you happened to agree with me (or
Jordan), I wonder whether you'd have bothered to chastise me.


My argument appears to apply regardless of whether I agreed with you.


Yes, of course it does. However, I believe the manner of your response
would have been different, provided you responded at all.
Instead,
you proceed believing (no pun intended) that your reading of the
Standard will justify the above paragraph, and the below statement.


I certainly do not believe that any reading of the Standard justifies
"the above paragraph", assuming the antecedent is my first paragraph
(and I don't see what else it could reasonably be). The Standard says
nothing about whether you should reply to a Usenet query, or whether
Jordan's question referred to the Standard, which are the only two
claims in the paragraph in question.


You're right. Even I don't see now how I could have referred to the
"above paragraph". I apologise for any confusion, or worse.

<snipped quotes from the Standard and their discussion>

As I think I've said (if I didn't I wanted to), I do agree with your
reading of the Standard.
So calling the function "clc_strdup" and using a macro to refer to it
as "strdup" is legal provided stdlib.h and string.h are not included
- but that seems rather unlikely, and you could achieve the same thing
by giving "strdup" internal linkage (ie by declaring it static).


In other words, there _are_ circumstances where it actually _is_ legal,
even according to your reading of the Standard.


Since Jordan's question did not refer to those restricted circum-
stances, it applies to the general case; and in the general case,
his proposition ("this action is permitted by the Standard") is
false. "p -> q" does not imply "q".


Jordan said: "it is legal to do X".
You said: "no, what you just said is false".

To me, if "is legal" is false, "is not legal" is true, and that does
not allow exceptions, especially in pedantic mode you claim to be in.
As I see it "true" and "false" cannot be applied directly to what
Jordan said (and what I agreed with in original post).

If you told me: 'your reply to Jordan (and/or Jordan's original claim)
is not strictly correct (or not correct in all circumstances -- as is
the case), the only correct answer is "in many/most circumstances
you're not allowed to do it, but there are some where you can, and
here's why"', I'd have remained as quiet as a mouse.

Again, I /do/ agree with your analysis/interpretation of the Standard.
What I /don't/ agree with is your trying to disqualify me using logic
that is broken, and your belief of what I mean and/or know.

I'll henceforth try to stay out of this sort of
pedantic-mode-standard-related discussions, as it became obvious to me
that I'm prone to misuse of my non-native English.

(BTW, In my graduate maths class we used `->` to express implication,
so I genuinely don't understand your: "p -> q" does not imply "q".)
Had Jordan asked, in some appropriate venue, "is it legal to stick a
knife into someone?", would you consider, "yes, if you're a surgeon
performing within the terms of your medical license and other
applicable regulation", and from that conclude that the answer to his
question is "yes"?

If so, I suggest you reconsider your personal definition of "legal",
as it does not appear to accord with common usage.
If my analysis above is correct, I'm now worried about yours. Yours
seems to want to lock up all the surgeons. ;-)
Why would it be
unlikely for a source file not to include certain standard headers?
I've seen quite a few.
Actually, this is more plausible than I originally thought. I
was thinking that the Standard allows standard headers to include
other standard headers, but it does not (there was a thread about
this on comp.std.c back in 2001[1]). The C++ standard does allow
this, which may have been what I was thinking of.


I was talking about (user) source files. Even Jordan did not say
anything about (standard) headers. Not being specific, I chose to
interpret the snippet he gave as sitting in a user source file (header
or not). As we've seen, that would be perfectly OK (legal? ;-) )
provided standard headers stdlib.h and string.h were not included.
However, the point is moot; as I noted above, Jordan's question does
not specify any special circumstances, and neither does your initial
reply. As written they are false. (Well, technically yours isn't
necessarily false, since it's a claim of belief; but your belief, as
stated, does not accord with fact.)
Your problem with my post seems to revolve around my (quite possibly
poor) choice of words. I wonder whether, had I worded my post differently
(yet saying essentially the same thing), your reply would have been so heated. This
is not meant as a defense, but as I already stated I am /not/ a native
English speaker.
So, your reading does not actually contradict Jordan's (or mine, for
that matter), at least not entirely.
I submit that it does - that as the question is expressed, there is no
partial correctness that can attach. It is incorrect as stated.


Here, I'll respectfully disagree.

I still firmly believe that the existence of (not at all contrived)
circumstances where it is legal, does not make that statement
"incorrect as stated" (to my ear, it was stated neither very precisely
nor in a very pedantic way).
Fortunately, c.l.c contains one of the world's largest herds of free-
roaming pedants, thundering majestically across the virtual plains...
And sometimes, just sometimes, being a bit too self-righteous... ;-)


I don't believe I was self-righteous in the least.


This was not written to offend (note the smiley). If it did, I apologise
unreservedly.
My claim about
the unlikelihood of the special circumstances was overly strong (due
to an erroneous unstated understanding which I ought to have verified
first), but aside from that I stand by everything I wrote. None of
it is intended to glorify me; I merely report what the Standard says,
and question - I think correctly - the utility of a response to a
question about the Standard which fails to refer to the Standard.


My understanding of the question was that Jordan has read the Standard
himself and has drawn his own conclusions (I had a look above, and it /is/
what he said). I didn't think that quoting the Standard to him was what
he asked for, rather what conclusions other people have drawn from
their understanding of the Standard. In different circumstances, I'd
agree that quoting the Standard would have been sine qua non.

Let's drop this now. I agreed that you're correct in your
interpretation of the Standard (at least twice), and I don't intend to
hold a grudge for anything that was said. I'll also try harder to spot
the pedantic-mode discussions and stay away from them. Hopefully, I'll
still be able to contribute something useful in other areas...

Cheers

Vladimir

Jan 31 '06 #48

CBFalconer wrote:
Jeff wrote:
CBFalconer wrote:
Strings of length approaching SIZE_MAX are so common in my code
that I worry about this possibility all the time. They are playing
havoc with my printer, and eating up the toner. It is especially
bad when I have to dump those strings out on a 110 baud ASR33
Teletype. Wears out the clutch and makes holes in the ribbon.

I challenge any c.l.c reader to provide any real working code that
uses a string of even SIZE_MAX / 2 length.

The item on hand is whether or not 'size==0' can ever be true.
I claim no.


If it is the result of a "size = strlen(s);" statement, of course
it can be zero.


I happen to remember that the size in question was strlen(s)+1, but if
Jeff wants to claim this cannot be 0 he should address the points made
in this thread by myself and, IIRC, Jordan, on how a string with a
strlen of SIZE_MAX can legally be generated.
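
The arithmetic behind that point can be sketched in a few lines, assuming only that size_t is an unsigned type (which the Standard guarantees) and that stdint.h provides SIZE_MAX. The helper name alloc_size_for is made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX (C99) */

/* Hypothetical helper: the number of bytes clc_strdup would request for
 * a string whose strlen() result is len. Unsigned arithmetic wraps
 * modulo SIZE_MAX + 1, so the result is 0 exactly when len == SIZE_MAX. */
size_t alloc_size_for(size_t len)
{
    return len + 1;
}
```

With this helper, alloc_size_for(0) is 1 (the empty string still needs its terminator) and alloc_size_for(SIZE_MAX) is 0, which is the one case the size == 0 test in clc_strdup guards against.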
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
Jan 31 '06 #49

Michael Wojcik wrote:
In article <aq************@news.flash-gordon.me.uk>, Flash Gordon <sp**@flash-gordon.me.uk> writes:
Personally, when I wrote a strdup implementation (named ffstrdup) I
decided that anyone generating a string with length SIZE_MAX or longer
deserves to be shot, so I left it as undefined behaviour what the
library would do in such cases.

To clarify. Although the interface is not documented as to the
behaviour, it will do one of the following:
Fail on the memory allocation and abort the program, which is
accepted behaviour for this system.
Produce a resultant string shorter than the original string.

Seeing as there is no mechanism provided for the user to enter a long
string which will be executed, this means the worst you could achieve is
storing a truncated document.

Not documenting the behaviour means I can also put in checking for these
conditions and aborting the program if they happen (acceptable behaviour
for this system) without breaking any code that could be considered valid.
While I'll agree that excessively-long strings are not a hallmark of
good programming, I'll note that leaving such as undefined behavior
<snip>

I agree in the general case.
For that reason, in production code, I like to have my own object
size limits (configurable by the software administrator), rather than
simply trusting the implementation to decide when I've asked for too
We actually have limits elsewhere in the system that will prevent
strings long enough to invoke undefined behaviour from occurring. These
limits being applied during the generation of the string.
much. I tend to side with CBF on this issue: the more sanity
checking, the better - and while UB is never my friend, it may be
someone else's.


Agreed. However if you can ensure that something cannot happen then
there is no need to check everywhere else that it hasn't happened.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
Jan 31 '06 #50
