Bytes | Software Development & Data Engineering Community

Degenerate strcmp

One way I've seen strcmp(char *s1, char *s2) implemented is: return
immediately if s1==s2 (equality of pointers); otherwise do the usual
thing of searching through the memory at s1 and s2.

Of course the reason for doing this is to save time in case equal
pointers are passed to strcmp. But it seems to me that this could create
an inconsistency in the degenerate case when s1 points to memory that is
not null-terminated, i.e. by some freak chance, all of the memory from
s1 till the computer reaches the end of all its memory pages (however
that works) doesn't contain a single null byte. In this case, strcmp
should not say that s1 and s2 are "equal strings" since neither is
actually a string (because not null terminated).

Is my thinking correct?

--
Q: "What is the burning question on the mind of every dyslexic
existentialist?"
A: "Is there a dog?"
Aug 17 '07 #1
fi******@invalid.com wrote:
> One way I've seen strcmp(char *s1, char *s2) implemented is: return
> immediately if s1==s2 (equality of pointers); otherwise do the usual
> thing of searching through the memory at s1 and s2.
>
> Of course the reason for doing this is to save time in case equal
> pointers are passed to strcmp. But it seems to me that this could create
> an inconsistency in the degenerate case when s1 points to memory that is
> not null-terminated, i.e. by some freak chance, all of the memory from
> s1 till the computer reaches the end of all its memory pages (however
> that works) doesn't contain a single null byte. In this case, strcmp
> should not say that s1 and s2 are "equal strings" since neither is
> actually a string (because not null terminated).
>
> Is my thinking correct?

No.
The behavior of strcmp is only defined for
cases when s1 and s2 both point to strings.
If it's not null terminated, it's not a string.

In cases where the behavior of the code is not defined,
the running program can do whatever it wants.
That's the rules of the C programming language.

--
pete
Aug 17 '07 #2
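
A minimal sketch of the point made in this reply, assuming the caller knows the size of each buffer: a char array holds a string only if a null byte occurs inside it, and it is the caller who must guarantee that before calling strcmp. The names `holds_string` and `checked_strcmp` are hypothetical helpers, not standard library functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the first n bytes of buf contain a
   null byte (so buf holds a string), 0 otherwise. */
static int holds_string(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* Hypothetical wrapper: compares two fixed-size buffers only when both
   really hold strings. Returns 1 and stores the strcmp result in *result
   on success; returns 0 without calling strcmp if either buffer lacks a
   terminator. */
static int checked_strcmp(const char *a, size_t na,
                          const char *b, size_t nb, int *result)
{
    if (!holds_string(a, na) || !holds_string(b, nb))
        return 0;
    *result = strcmp(a, b);
    return 1;
}
```

The check happens outside strcmp on purpose: strcmp itself receives only pointers and has no way to know where a buffer ends.
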
On 17 Aug 2007 at 23:01, pete wrote:
> fi******@invalid.com wrote:
>> One way I've seen strcmp(char *s1, char *s2) implemented is: return
>> immediately if s1==s2 (equality of pointers); otherwise do the usual
>> thing of searching through the memory at s1 and s2.
>>
>> Of course the reason for doing this is to save time in case equal
>> pointers are passed to strcmp. But it seems to me that this could create
>> an inconsistency in the degenerate case when s1 points to memory that is
>> not null-terminated, i.e. by some freak chance, all of the memory from
>> s1 till the computer reaches the end of all its memory pages (however
>> that works) doesn't contain a single null byte. In this case, strcmp
>> should not say that s1 and s2 are "equal strings" since neither is
>> actually a string (because not null terminated).
>>
>> Is my thinking correct?
>
> No.
> The behavior of strcmp is only defined for
> cases when s1 and s2 both point to strings.
> If it's not null terminated, it's not a string.

You're right that a string has to be null terminated, but for random
memory maybe by chance there just isn't any null byte.

For example, for the program

main() { printf("%d\n",strlen(malloc(0))); }

this will print out a random number, but in principle the strlen call
might never terminate, if the memory at the pointer returned by malloc
doesn't have any null bytes. (OK, it's very unlikely, but it could
happen in theory...)

> In cases where the behavior of the code is not defined,
> the running program can do whatever it wants.
> That's the rules of the C programming language.
>
> --
> pete

--
Hlade's Law:
If you have a difficult task, give it to a lazy person --
they will find an easier way to do it.
Aug 17 '07 #3
fi******@invalid.com wrote:
> On 17 Aug 2007 at 23:01, pete wrote:
> but for random
> memory maybe by chance there just isn't any null byte.
>
>> In cases where the behavior of the code is not defined,
>> the running program can do whatever it wants.
>> That's the rules of the C programming language.

Do you understand what I've quoted of myself here?

--
pete
Aug 17 '07 #4
On Aug 17, 3:47 pm, fishp...@invalid.com wrote:
> One way I've seen strcmp(char *s1, char *s2) implemented is: return
> immediately if s1==s2 (equality of pointers); otherwise do the usual
> thing of searching through the memory at s1 and s2.

Adding an if() test for that is not (in general) a good idea.
A missed branch prediction is expensive.
How often are you really going to do this:
if (strcmp(p,p)==0) call_captain_obvious();
A library function with a quirk like that would make me worry about
the quality of the implementation.

> Of course the reason for doing this is to save time in case equal
> pointers are passed to strcmp. But it seems to me that this could create
> an inconsistency in the degenerate case when s1 points to memory that is
> not null-terminated, i.e. by some freak chance, all of the memory from
> s1 till the computer reaches the end of all its memory pages (however
> that works) doesn't contain a single null byte. In this case, strcmp
> should not say that s1 and s2 are "equal strings" since neither is
> actually a string (because not null terminated).
>
> Is my thinking correct?

It is undefined behavior in any case to call strcmp() with pointers
that do not point to null-terminated strings.

> --
> Q: "What is the burning question on the mind of every dyslexic
> existentialist?"
> A: "Is there a dog?"

The actual joke goes:
Q: What does an agnostic, insomniac, dyslexic person do?
A: He lays awake at night, wondering if there is a dog.

Aug 17 '07 #5
user923005 wrote:
> On Aug 17, 3:47 pm, fishp...@invalid.com wrote:
>> One way I've seen strcmp(char *s1, char *s2) implemented is: return
>> immediately if s1==s2 (equality of pointers); otherwise do the usual
>> thing of searching through the memory at s1 and s2.
>
> Adding an if() test for that is not (in general) a good idea.
> A missed branch prediction is expensive.
> How often are you really going to do this:
> if (strcmp(p,p)==0) call_captain_obvious();
> A library function with a quirk like that would make me worry about
> the quality of the implementation.

Anybody who writes code to compare string p with string p,
isn't in a rush.
And that's one of the reasons why I think that it's usually bad
to optimize the degenerate special case
at any cost at all to the general case.

--
pete
Aug 18 '07 #6
fi******@invalid.com wrote:
> One way I've seen strcmp(char *s1, char *s2) implemented is: return
> immediately if s1==s2 (equality of pointers); otherwise do the usual
> thing of searching through the memory at s1 and s2.
>
> Of course the reason for doing this is to save time in case equal
> pointers are passed to strcmp. But it seems to me that this could create
> an inconsistency in the degenerate case when s1 points to memory that is
> not null-terminated, i.e. by some freak chance, all of the memory from
> s1 till the computer reaches the end of all its memory pages (however
> that works) doesn't contain a single null byte. In this case, strcmp
> should not say that s1 and s2 are "equal strings" since neither is
> actually a string (because not null terminated).
>
> Is my thinking correct?

What you seem to have missed is that there is no "correct"
behavior in the case you describe: The behavior is undefined
because the arguments are not strings. Returning zero is one
possible behavior, a SIGSEGV is another, a graphic of a nasal
demon whistling "Dixie" while riding backwards on a bicycle
is yet another.

--
Eric Sosman
es*****@ieee-dot-org.invalid
Aug 18 '07 #7
user923005 wrote:
> On Aug 17, 3:47 pm, fishp...@invalid.com wrote:
>> One way I've seen strcmp(char *s1, char *s2) implemented is: return
>> immediately if s1==s2 (equality of pointers); otherwise do the usual
>> thing of searching through the memory at s1 and s2.
>
> Adding an if() test for that is not (in general) a good idea.
> A missed branch prediction is expensive.
> How often are you really going to do this:
> if (strcmp(p,p)==0) call_captain_obvious();
> A library function with a quirk like that would make me worry about
> the quality of the implementation.

I tried to use strcmp(p, p) as a deliberate time-waster
once. I wanted to study the behavior of a sorting function
with "fast" and "slow" user-supplied comparators: the "slow"
one was strcmp(p, p) followed by a call to the "fast" one.
My program adjusted the length of the string at p until I got
a ten-to-one speed ratio.

This worked fine on several systems, but alas! I ran
across one where my setup code kept making p longer and longer
without slowing anything down -- and it turned out that strcmp
was returning zero immediately, as described.

BUT: Was this a stupid test? I don't think so. The
strcmp implementation made a bunch of (highly non-portable)
tests to decide whether it could replace a byte-by-byte loop
with a loop that took bigger, er, bites: two, four, or even
eight at a time. The decision was based on the alignments
of the two operands -- and the "both operands equal" case just
fell out of the alignment tests.

Eventually, I arranged for p to consist entirely of 'X'
and called strcmp(p, p+1), proceeding to call the "fast" method
if ("if") strcmp didn't return zero. Works like a charm.

--
Eric Sosman
es*****@ieee-dot-org.invalid
Aug 18 '07 #8
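
The workaround described at the end of this post can be sketched as follows. This is an illustration under assumptions, not the poster's actual test harness: the helper names `make_x_string` and `slow_step` are hypothetical, and the length is assumed to be at least 1 so that p + 1 still points to a string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds a string of len 'X' characters.
   len >= 1 is assumed. Returns NULL if allocation fails; the caller
   frees the result. */
static char *make_x_string(size_t len)
{
    char *p = malloc(len + 1);
    if (p == NULL)
        return NULL;
    memset(p, 'X', len);
    p[len] = '\0';
    return p;
}

/* Hypothetical "slow" step: p and p + 1 are different pointers, so an
   equal-pointer shortcut cannot fire. The two strings agree on every
   character until the shorter one ends, so the comparison has to reach
   the end of the data before it finds the difference ('X' versus the
   terminator) and returns a positive value. */
static int slow_step(const char *p)
{
    return strcmp(p, p + 1);
}
```

Because the result is never zero, a comparator built on this would call the "fast" method when `slow_step` returns nonzero, as the post describes.
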
user923005 said:
> On Aug 17, 3:47 pm, fishp...@invalid.com wrote:
>> --
>> Q: "What is the burning question on the mind of every dyslexic
>> existentialist?"
>> A: "Is there a dog?"
>
> The actual joke goes:
> Q: What does an agnostic, insomniac, dyslexic person do?
> A: He lays awake at night, wondering if there is a dog.

Can we lay off the dyslexia jokes, please? They're not clever, and
they're not furry.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 18 '07 #9
On Aug 17, 6:56 pm, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
> user923005 wrote:
>> On Aug 17, 3:47 pm, fishp...@invalid.com wrote:
>>> One way I've seen strcmp(char *s1, char *s2) implemented is: return
>>> immediately if s1==s2 (equality of pointers); otherwise do the usual
>>> thing of searching through the memory at s1 and s2.
>> Adding an if() test for that is not (in general) a good idea.
>> A missed branch prediction is expensive.
>> How often are you really going to do this:
>> if (strcmp(p,p)==0) call_captain_obvious();
>> A library function with a quirk like that would make me worry about
>> the quality of the implementation.
>
> I tried to use strcmp(p, p) as a deliberate time-waster
> once. I wanted to study the behavior of a sorting function
> with "fast" and "slow" user-supplied comparators: the "slow"
> one was strcmp(p, p) followed by a call to the "fast" one.
> My program adjusted the length of the string at p until I got
> a ten-to-one speed ratio.
>
> This worked fine on several systems, but alas! I ran
> across one where my setup code kept making p longer and longer
> without slowing anything down -- and it turned out that strcmp
> was returning zero immediately, as described.
>
> BUT: Was this a stupid test? I don't think so. The
> strcmp implementation made a bunch of (highly non-portable)
> tests to decide whether it could replace a byte-by-byte loop
> with a loop that took bigger, er, bites: two, four, or even
> eight at a time. The decision was based on the alignments
> of the two operands -- and the "both operands equal" case just
> fell out of the alignment tests.
>
> Eventually, I arranged for p to consist entirely of 'X'
> and called strcmp(p, p+1), proceeding to call the "fast" method
> if ("if") strcmp didn't return zero. Works like a charm.

Turns out that the possible missed branch prediction is
inconsequential (after all, it happens only once):

int testing_strcmp(const char *_s1, const char *_s2)
{
    /* Return at once when both arguments are the same pointer. */
    if (_s1 == _s2)
        return 0;
    while (*_s1 && (*_s1 == *_s2)) {
        _s1++;
        _s2++;
    }
    return (*_s1 > *_s2) - (*_s1 < *_s2);
} /* strcmp */

int non_testing_strcmp(const char *_s1, const char *_s2)
{
    while (*_s1 && (*_s1 == *_s2)) {
        _s1++;
        _s2++;
    }
    return (*_s1 > *_s2) - (*_s1 < *_s2);
} /* strcmp */

#include <time.h>
#include <stdio.h>
#include <stdlib.h>

char s1[32767];
char s2[32767];

/* Read the file two lines at a time and compare each pair with cmp. */
void cmptest1(FILE * f, int (*cmp) (const char *, const char *))
{
    while (fgets(s1, sizeof s1, f)) {
        if (fgets(s2, sizeof s2, f)) {
            (void) cmp(s1, s2);
        } else
            break;
    }
}

int main(int argc, char **argv)
{
    clock_t end, start;
    if (argc <= 1) {
        puts("USAGE: strcmptest <filename>");
        exit(EXIT_FAILURE);
    }
    {
        FILE *f = fopen(argv[1], "rt");
        if (f == NULL) {
            printf("ERROR: Failed to open file %s\n", argv[1]);
            exit(EXIT_FAILURE);
        }
        start = clock();
        cmptest1(f, testing_strcmp);
        end = clock();
        printf("With testing strcmp time is %f seconds\n",
               (end - start) * 1.0 / CLOCKS_PER_SEC);
        rewind(f);
        start = clock();
        cmptest1(f, non_testing_strcmp);
        end = clock();
        printf("With non-testing strcmp time is %f seconds\n",
               (end - start) * 1.0 / CLOCKS_PER_SEC);
    }
    return 0;
}

/*
C:\tmp>dir b.txt
Volume in drive C has no label.
Volume Serial Number is 0890-87CA

Directory of C:\tmp

08/18/2007 12:34 AM 65,969,707 b.txt
1 File(s) 65,969,707 bytes
0 Dir(s) 1,602,658,304 bytes free

C:\tmp>strcmptest b.txt
With testing strcmp time is 0.859000 seconds
With non-testing strcmp time is 0.859000 seconds

C:\tmp>strcmptest b.txt
With testing strcmp time is 0.859000 seconds
With non-testing strcmp time is 0.875000 seconds

C:\tmp>strcmptest b.txt
With testing strcmp time is 0.859000 seconds
With non-testing strcmp time is 0.859000 seconds

C:\tmp>strcmptest b.txt
With testing strcmp time is 0.859000 seconds
With non-testing strcmp time is 0.859000 seconds

C:\tmp>strcmptest b.txt
With testing strcmp time is 0.875000 seconds
With non-testing strcmp time is 0.843000 seconds
*/

Aug 18 '07 #10
On Aug 17, 3:47 pm, fishp...@invalid.com wrote:
> One way I've seen strcmp(char *s1, char *s2) implemented is: return
> immediately if s1==s2 (equality of pointers); otherwise do the usual
> thing of searching through the memory at s1 and s2.
>
> Of course the reason for doing this is to save time in case equal
> pointers are passed to strcmp. But it seems to me that this could create
> an inconsistency in the degenerate case when s1 points to memory that is
> not null-terminated, i.e. by some freak chance, all of the memory from
> s1 till the computer reaches the end of all its memory pages (however
> that works) doesn't contain a single null byte. In this case, strcmp
> should not say that s1 and s2 are "equal strings" since neither is
> actually a string (because not null terminated).
>
> Is my thinking correct?

Indefinitely searching for '\0' will do no good. In the simplest case
it may either cause some sort of memory protection exception or hang
the system if the address wrap around is permitted. If the access goes
where memory mapped devices are, it may be worse and may even damage
the hardware. In these cases strcmp() may not return at all and so I'm
not sure if talking about consistency is at all meaningful.

Alex

Aug 18 '07 #11
On 18 Aug 2007 at 1:40, Eric Sosman wrote:
> fi******@invalid.com wrote:
>> One way I've seen strcmp(char *s1, char *s2) implemented is: return
>> immediately if s1==s2 (equality of pointers); otherwise do the usual
>> thing of searching through the memory at s1 and s2.
>>
>> Of course the reason for doing this is to save time in case equal
>> pointers are passed to strcmp. But it seems to me that this could create
>> an inconsistency in the degenerate case when s1 points to memory that is
>> not null-terminated, i.e. by some freak chance, all of the memory from
>> s1 till the computer reaches the end of all its memory pages (however
>> that works) doesn't contain a single null byte. In this case, strcmp
>> should not say that s1 and s2 are "equal strings" since neither is
>> actually a string (because not null terminated).
>>
>> Is my thinking correct?
>
> What you seem to have missed is that there is no "correct"
> behavior in the case you describe: The behavior is undefined
> because the arguments are not strings. Returning zero is one
> possible behavior, a SIGSEGV is another, a graphic of a nasal
> demon whistling "Dixie" while riding backwards on a bicycle
> is yet another.

I think the subtle point is the following: a char * isn't actually the
same thing as a string. A char * is a pointer to some bytes of memory,
but if s is a char * then for s to be a string, we need the sequence
*s, *(s+1), *(s+2), ..., *(s+i), ... to actually contain a 0 byte for
some i. In practice memory will have 0 bytes all over the place, but
there's still a theoretical possibility that there won't be a zero byte
for any i until the memory space is completely exhausted.

Maybe the program I put in the other thread

main() { printf("%d\n",strlen(malloc(0))); }

illustrates this more simply than strcmp: malloc(0) returns a pointer to
some random place in memory, and there's no absolute guarantee that a
0-byte will occur later in memory, so what gets printed could be a
random number or in theory the program could just never terminate.

Part of the confusion seems to be the names: for example, strlen takes a
char * and returns an int. If the parameter is a string, then the
integer is the length of the string and that makes perfect sense. But
what strlen actually takes is a general char *, not necessarily a
string, and if you pass strlen a char * that isn't a string then you
need to think more carefully about how to interpret the return value of
strlen (or strlen might not terminate at all).

> --
> Eric Sosman
> es*****@ieee-dot-org.invalid
--
The difference between a career and a job is about 20 hours a week.
Aug 18 '07 #12
On Aug 18, 10:19 am, Antoninus Twink <spam...@invalid.com> wrote:
> On 18 Aug 2007 at 1:40, Eric Sosman wrote:
>> fishp...@invalid.com wrote:
>>> One way I've seen strcmp(char *s1, char *s2) implemented is: return
>>> immediately if s1==s2 (equality of pointers); otherwise do the usual
>>> thing of searching through the memory at s1 and s2.
>>> Of course the reason for doing this is to save time in case equal
>>> pointers are passed to strcmp. But it seems to me that this could create
>>> an inconsistency in the degenerate case when s1 points to memory that is
>>> not null-terminated, i.e. by some freak chance, all of the memory from
>>> s1 till the computer reaches the end of all its memory pages (however
>>> that works) doesn't contain a single null byte. In this case, strcmp
>>> should not say that s1 and s2 are "equal strings" since neither is
>>> actually a string (because not null terminated).
>>> Is my thinking correct?
>> What you seem to have missed is that there is no "correct"
>> behavior in the case you describe: The behavior is undefined
>> because the arguments are not strings. Returning zero is one
>> possible behavior, a SIGSEGV is another, a graphic of a nasal
>> demon whistling "Dixie" while riding backwards on a bicycle
>> is yet another.
>
> I think the subtle point is the following: a char * isn't actually the
> same thing as a string. A char * is a pointer to some bytes of memory,
> but if s is a char * then for s to be a string, we need the sequence
> *s, *(s+1), *(s+2), ..., *(s+i), ... to actually contain a 0 byte for
> some i. In practice memory will have 0 bytes all over the place, but
> there's still a theoretical possibility that there won't be a zero byte
> for any i until the memory space is completely exhausted.

What is "subtle" about this? It's just the definition of a string, and
it's very simple.

> Maybe the program I put in the other thread
>
> main() { printf("%d\n",strlen(malloc(0))); }
>
> illustrates this more simply than strcmp: malloc(0) returns a pointer to
> some random place in memory, and there's no absolute guarantee that a
> 0-byte will occur later in memory, so what gets printed could be a
> random number or in theory the program could just never terminate.

I don't understand your point. You seem to be working hard to tell us
that a string is an array of chars up to and including the first zero-
valued character. Of course it is, since that's what it's defined to
be. If there is no zero-valued character in the array, then the array
doesn't contain a string.

> Part of the confusion seems to be the names: for example, strlen takes a
> char * and returns an int. If the parameter is a string, then the
> integer is the length of the string and that makes perfect sense. But
> what strlen actually takes is a general char *, not necessarily a
> string, and if you pass strlen a char * that isn't a string then you
> need to think more carefully about how to interpret the return value of
> strlen (or strlen might not terminate at all).

You don't need to be careful about anything if you do this, since you
cannot predict how the system will behave. What you need to be careful
about is not doing this in the first place. strlen() takes a string;
it is your responsibility to ensure that you only ever give it a
string; if you give it anything else, there's no saying what might
happen.

Aug 18 '07 #13
On Aug 18, 12:47 am, fishp...@invalid.com wrote:
> One way I've seen strcmp(char *s1, char *s2) implemented is: return
> immediately if s1==s2 (equality of pointers); otherwise do the usual
> thing of searching through the memory at s1 and s2.
>
> Of course the reason for doing this is to save time in case equal
> pointers are passed to strcmp.

It would be interesting to see how many times (passing two equal
pointers) to strcmp() happens.

Aug 18 '07 #14
Antoninus Twink wrote:
> malloc(0) returns a pointer to
> some random place in memory,

malloc(0) may also return a null pointer instead.

--
pete
Aug 18 '07 #15
Antoninus Twink wrote:
> Part of the confusion seems to be the names: for example, strlen takes a
> char * and returns an int. If the parameter is a string, then the
> integer is the length of the string and that makes perfect sense. But
> what strlen actually takes is a general char *, not necessarily a
> string, and if you pass strlen a char * that isn't a string then you
> need to think more carefully about how to interpret the return value of
> strlen (or strlen might not terminate at all).

Doesn't the Standard say specifically that `strlen` takes a string?
If so, passing a char* that /isn't/ a string is undefined behaviour.
You don't have to think carefully about how to interpret the
return value; you have to ensure it's passed a string.

--
Far-Fetched Hedgehog
"Who do you serve, and who do you trust?" /Crusade/

Aug 18 '07 #16
fi******@invalid.com writes:
> One way I've seen strcmp(char *s1, char *s2) implemented is: return
> immediately if s1==s2 (equality of pointers); otherwise do the usual
> thing of searching through the memory at s1 and s2.
>
> Of course the reason for doing this is to save time in case equal
> pointers are passed to strcmp. But it seems to me that this could create
> an inconsistency in the degenerate case when s1 points to memory that is
> not null-terminated, i.e. by some freak chance, all of the memory from
> s1 till the computer reaches the end of all its memory pages (however
> that works) doesn't contain a single null byte. In this case, strcmp
> should not say that s1 and s2 are "equal strings" since neither is
> actually a string (because not null terminated).
>
> Is my thinking correct?

No.

If you take that view then all the string functions are wrong because
they too could have "non terminated strings".

In addition, the chance of s1==s2 is probably very, very low in a real
program and the check will in fact impede performance over the run time.
Aug 18 '07 #17
Chris Dollin <eh@electrichedgehog.net> writes:
> Antoninus Twink wrote:
>> Part of the confusion seems to be the names: for example, strlen takes a
>> char * and returns an int. If the parameter is a string, then the
>> integer is the length of the string and that makes perfect sense. But
>> what strlen actually takes is a general char *, not necessarily a
>> string, and if you pass strlen a char * that isn't a string then you
>> need to think more carefully about how to interpret the return value of
>> strlen (or strlen might not terminate at all).
>
> Doesn't the Standard say specifically that `strlen` takes a string?
> If so, passing a char* that /isn't/ a string is undefined behaviour.
> You don't have to think carefully about how to interpret the
> return value; you have to ensure it's passed a string.

No, it says that strlen takes a char* (actually a const char*).

A char* is a pointer; it logically *cannot* be a string. A char*
value may or may not *point to* a string. (Strictly speaking it may
or may not point to the first character of a string, but the standard
specifically defines a "pointer to a string" as a pointer to its first
character.)

And yes, passing to strlen() a char* value that doesn't point to a
string invokes undefined behavior. One special case of this is that
strlen(NULL) invokes UB.

Anyone who's still confused should read sections 4, 6, and 8 of the
comp.lang.c FAQ, <http://www.c-faq.com/> (and probably the rest of it
too).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 18 '07 #18
Antoninus Twink wrote:
> .... snip ...
>
> illustrates this more simply than strcmp: malloc(0) returns a
> pointer to some random place in memory, and there's no absolute
> guarantee that a 0-byte will occur later in memory, so what gets
> printed could be a random number or in theory the program could
> just never terminate.

No it doesn't, it can return a NULL. Check the standard,
carefully. This is probably in place to cover some unusual
implementations.
--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 18 '07 #19
In article <sl*******************@nospam.invalid>,
<fi******@invalid.com> wrote:
> You're right that a string has to be null terminated, but for random
> memory maybe by chance there just isn't any null byte.

In which case the program has a bug, because it's not allowed to call
strcmp on "random memory" that doesn't have a nul byte. Programs
with bugs of this kind are not required to behave in any particular way,
so it's quite legal for strcmp to return any value it likes.

> For example, for the program
>
> main() { printf("%d\n",strlen(malloc(0))); }
>
> this will print out a random number, but in principle the strlen call
> might never terminate, if the memory at the pointer returned by malloc
> doesn't have any null bytes.

The C library functions are only required to behave correctly if you call
them correctly. If you call them with random data, all bets are off.
(In this particular case, it may well produce a segmentation fault, because
malloc(0) may return null.)

As the post you were replying to said:

> In cases where the behavior of the code is not defined,
> the running program can do whatever it wants.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Aug 18 '07 #20
In article <11**********************@57g2000hsv.googlegroups.com>,
user923005 <dc*****@connx.com> wrote:
> How often are you really going to do this:
> if (strcmp(p,p)==0) call_captain_obvious();

Never, but I often call strcmp(p,q) where p and q might be the same
string, depending on user input.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Aug 18 '07 #21
Richard Tobin wrote:
> In article <11**********************@57g2000hsv.googlegroups.com>,
> user923005 <dc*****@connx.com> wrote:
>> How often are you really going to do this:
>> if (strcmp(p,p)==0) call_captain_obvious();
>
> Never, but I often call strcmp(p,q) where p and q might be the same
> string, depending on user input.

I don't think situations like that are common enough
to warrant strcmp checking for pointer equality.
I would prefer to have you write

if (p != q)

yourself, before calling strcmp,
for cases where it was shown to be significantly faster.

--
pete
Aug 18 '07 #22
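
The caller-side check suggested in this post can be sketched as a small wrapper. The name `same_string` is hypothetical, and both arguments are still assumed to point to strings; the pointer test only skips the scan, it does not make invalid arguments acceptable.

```c
#include <string.h>

/* Hypothetical caller-side wrapper: returns 1 if p and q point to equal
   strings, 0 otherwise. The pointer comparison is done here, by the
   caller who expects equal pointers to be common, rather than inside
   strcmp for every call in every program. */
static int same_string(const char *p, const char *q)
{
    return p == q || strcmp(p, q) == 0;
}
```

Whether the extra test pays off depends on how often p and q really are the same pointer in the program at hand, which is the measurement the post asks for.
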
Antoninus Twink wrote:
> Part of the confusion seems to be the names:
> for example, strlen takes a char * and returns an int.

The return type of strlen is size_t, not int.

--
pete
Aug 19 '07 #23
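
Since strlen returns size_t, the "%d" in the one-line program quoted earlier in the thread does not match the argument type. A minimal sketch of a matching conversion, assuming a C99 or later library where "%zu" is available; the helper name `format_length` is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the length of the string s into out as
   decimal text and returns what snprintf returns. s must point to a
   string. */
static int format_length(char *out, size_t outsz, const char *s)
{
    size_t n = strlen(s);                   /* size_t, not int */
    return snprintf(out, outsz, "%zu", n);  /* %zu matches size_t */
}
```
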
On 18 Aug 2007 at 21:59, Richard Tobin wrote:
> In article <sl*******************@nospam.invalid>,
> <fi******@invalid.com> wrote:
>> You're right that a string has to be null terminated, but for random
>> memory maybe by chance there just isn't any null byte.
>
> In which case the program has a bug, because it's not allowed to call
> strcmp on "random memory" that doesn't have a nul byte. Programs
> with bugs of this kind are not required to behave in any particular way,
> so it's quite legal for strcmp to return any value it likes.
>
>> For example, for the program
>>
>> main() { printf("%d\n",strlen(malloc(0))); }
>>
>> this will print out a random number, but in principle the strlen call
>> might never terminate, if the memory at the pointer returned by malloc
>> doesn't have any null bytes.
>
> The C library functions are only required to behave correctly if you call
> them correctly. If you call them with random data, all bets are off.
> (In this particular case, it may well produce a segmentation fault, because
> malloc(0) may return null.)

No I believe malloc(0) can never return null - after all, how could it
not be possible to allocate 0 bytes of memory!

> As the post you were replying to said:
>> In cases where the behavior of the code is not defined,
>> the running program can do whatever it wants.
>
> -- Richard
> --
> "Consideration shall be given to the need for as many as 32 characters
> in some alphabets" - X3.4, 1963.

--
<knghtbrd> Nintendo Declares GCN Most Popular Console Ever
<knghtbrd> Who are they kidding?
<Mercury> knghtbrd: Stock holders?
Aug 19 '07 #24
fi******@invalid.com wrote, On 19/08/07 13:03:
> On 18 Aug 2007 at 21:59, Richard Tobin wrote:
<snip>
>> (In this particular case, it may well produce a segmentation fault, because
>> malloc(0) may return null.)
>
> No I believe malloc(0) can never return null - after all, how could it
> not be possible to allocate 0 bytes of memory!

You believe wrong for at least three reasons.
1) All possible pointer values might have been used, and if it is
non-null it has to be unique.
2) There may be house-keeping structures required for which there is no
space.
3) The standard allows it.

<snip>
>> -- Richard
>> --
>> "Consideration shall be given to the need for as many as 32 characters
>> in some alphabets" - X3.4, 1963.

Please don't quote people's signatures, the bit typically after the "--
", unless you are commenting on them. In fact, you should trim anything
not relevant to your reply as I have done.
--
Flash Gordon
Aug 19 '07 #25
fi******@invalid.com wrote:
> No I believe malloc(0) can never return null

Nevertheless, it is permitted to do so. Since it is permitted,
an implementation could start with

if (requestedSize == 0) return 0;

Hence `malloc(0)` could return null. This could be a good idea
if the degenerate general case of `allocate N bytes` with N=0
took up more room than was justified.

> - after all, how could it
> not be possible to allocate 0 bytes of memory!

If there was no room for the /bookkeeping/ necessary to record
the allocation.

--
'M All OK Hedgehog
"I just wonder when we're going to have to sit down and re-evaluate
our decision-making paradigm." /Sahara/

Aug 19 '07 #26
fi******@invalid.com wrote:
> No I believe malloc(0) can never return null

You believe wrongly.

N869
7.20.3 Memory management functions
[#1]

If the size of the space requested is zero,
the behavior is implementation-defined:
either a null pointer is returned,
or the behavior is as if the size were some nonzero value,
except that the returned pointer shall
not be used to access an object.

> - after all, how could it
> not be possible to allocate 0 bytes of memory!

That's not what a null pointer return value means for malloc(0).

--
pete
Aug 19 '07 #27
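
A minimal sketch of code that respects the passage quoted in this post: it observes which of the two permitted behaviors occurs for malloc(0) without ever using the pointer to access an object. The enum and the function name `probe_malloc_zero` are hypothetical. A null result here could also mean an ordinary allocation failure, so the probe reports what happened on this call, not what the implementation documents.

```c
#include <stdlib.h>

enum zero_alloc { ZERO_ALLOC_NULL, ZERO_ALLOC_NONNULL };

/* Hypothetical probe: calls malloc(0) once and reports whether the
   result was a null pointer. The pointer is only compared and freed,
   never dereferenced. */
static enum zero_alloc probe_malloc_zero(void)
{
    void *p = malloc(0);
    if (p == NULL)
        return ZERO_ALLOC_NULL;
    free(p);
    return ZERO_ALLOC_NONNULL;
}
```

Portable code has to be correct for either result, which is why the outcome should not be relied on.
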
In article <sl********************@nospam.invalid>,
<fi******@invalid.com> wrote:
> No I believe malloc(0) can never return null - after all, how could it
> not be possible to allocate 0 bytes of memory!

The specification of library functions is a question best determined by
looking at the standard, not by arguing "it must be like this".

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
Aug 19 '07 #28
fi******@invalid.com wrote:
[...]
No I believe malloc(0) can never return null -
Would reading section 7.20.3 paragraph 1 of the language
Standard alter your belief?

"[...] If the size of the space requested is zero,
the behavior is implementation-defined: either a
null pointer is returned, or the behavior is as if
the size were some nonzero value, except that the
returned pointer shall not be used to access an object."
after all, how could it
not be possible to allocate 0 bytes of memory!
Most likely, because it fails to allocate the internal
bookkeeping space it uses for keeping track of the addresses
it has returned that have not yet been free()d.

There's a potentially interesting quibble here, for people
interested in quibbles. The Standard requires (same paragraph)
that memory obtained from malloc() be "disjoint" from all other
objects, which is not quite the same as requiring that it have
an address different from that of all other objects. Since the
value of malloc(0) cannot be used to access an object, it could
be argued that the program cannot test disjointness without
engaging in undefined behavior anyhow. This would seem to open
the door to a malloc(0) that returned the same non-null value
on every call (a value free() would ignore), avoiding the need
for bookkeeping and making it possible to call malloc(0) an
unlimited number of times without fear of failure.

But even if it did so, the proposed use
main() { printf("%d\n",strlen(malloc(0))); }
... would be in trouble anyhow, because the strlen() call tries
to use the returned pointer to access an object -- an access
the Standard forbids.
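
The "same non-null value on every call" idea can be sketched at user level, without claiming anything about whether a conforming malloc() may do this. The names `sentinel_alloc` and `sentinel_free` are hypothetical wrappers invented for illustration:

```c
#include <stdlib.h>

/* One byte of static storage whose address stands for every zero-size
   allocation. It is never read or written through the returned pointer. */
static char zero_size_sentinel;

/* Hypothetical wrapper, not a real library function. */
void *sentinel_alloc(size_t size)
{
    if (size == 0)
        return &zero_size_sentinel;   /* same non-null value on every call */
    return malloc(size);
}

/* Counterpart to sentinel_alloc: ignores the sentinel, frees the rest. */
void sentinel_free(void *ptr)
{
    if (ptr == &zero_size_sentinel)
        return;                       /* nothing was allocated, nothing to release */
    free(ptr);
}
```

No bookkeeping is needed for the zero-size case, so such a request cannot fail, which is the property the quibble is about.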

--
Eric Sosman
es*****@ieee-dot-org.invalid
Aug 19 '07 #29
fi******@invalid.com wrote:
Richard Tobin wrote:
.... snip ...
>
>The C library functions are only required to behave correctly if
you call them correctly. If you call them with random data, all
bets are off. (In this particular case, it may well produce a
segmentation fault, because malloc(0) may return null.)

No I believe malloc(0) can never return null - after all, how
could it not be possible to allocate 0 bytes of memory!
Easy. Read the C standard.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

--
Posted via a free Usenet account from http://www.teranews.com

Aug 19 '07 #30
Eric Sosman wrote, On 19/08/07 15:43:
fi******@invalid.com wrote:
>[...]
No I believe malloc(0) can never return null -

Would reading section 7.20.3 paragraph 1 of the language
Standard alter your belief?

"[...] If the size of the space requested is zero,
the behavior is implementation-defined: either a
null pointer is returned, or the behavior is as if
the size were some nonzero value, except that the
returned pointer shall not be used to access an object."
after all, how could it
not be possible to allocate 0 bytes of memory!

Most likely, because it fails to allocate the internal
bookkeeping space it uses for keeping track of the addresses
it has returned that have not yet been free()d.

There's a potentially interesting quibble here, for people
interested in quibbles. The Standard requires (same paragraph)
that memory obtained from malloc() be "disjoint" from all other
objects, which is not quite the same as requiring that it have
an address different from that of all other objects. Since the
value of malloc(0) cannot be used to access an object, it could
be argued that the program cannot test disjointness without
engaging in undefined behavior anyhow.
The following does not invoke undefined behaviour since you are always
allowed to test for equality (the first byte is 1 beyond the end which
is still OK or you could not free it)...

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    void *p1 = malloc(0);
    void *p2 = malloc(0);

    if (p1==NULL || p2==NULL)
        puts("At least one null pointer returned");
    else if (p1==p2)
        puts("Regions are not disjoint");
    else
        puts("Regions are disjoint");

    free(p1);
    free(p2);

    return 0;
}
This would seem to open
the door to a malloc(0) that returned the same non-null value
on every call (a value free() would ignore), avoiding the need
for bookkeeping and making it possible to call malloc(0) an
unlimited number of times without fear of failure.
No, I don't think so. See above.
But even if it did so, the proposed use
main() { printf("%d\n",strlen(malloc(0))); }

... would be in trouble anyhow, because the strlen() call tries
to use the returned pointer to access an object -- an access
the Standard forbids.
Agreed. That is not allowed whatever malloc(0) returns.
--
Flash gordon
Aug 19 '07 #31
fi******@invalid.com wrote:
>
No I believe malloc(0) can never return null - after all, how could it
not be possible to allocate 0 bytes of memory!
You've never used IBM's AIX C runtime obviously...
Aug 20 '07 #32
Keith Thompson wrote:
Chris Dollin <eh@electrichedgehog.net> writes:
>Doesn't the Standard say specifically that `strlen` takes a string?
If so, passing a char* that /isn't/ a string is undefined behaviour.
You don't have to think carefully about how to interpret the
return value; you have to ensure it's passed a string.

No, it says that strlen takes a char* (actually a const char*).
It also says (at least it does in this C90 draft here) "... computes the
length of the string pointed to by s". The Hedgehog presumably counts
that as saying specifically that `strlen` takes a string (as opposed
to just any `char*` value).

--
Chris "presumptive, since mind-reading known to be unsound" Dollin

Hewlett-Packard Limited registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN

Aug 20 '07 #33
Chris Dollin <ch**********@hp.com> writes:
Keith Thompson wrote:
>Chris Dollin <eh@electrichedgehog.net> writes:
>>Doesn't the Standard say specifically that `strlen` takes a string?
If so, passing a char* that /isn't/ a string is undefined behaviour.
You don't have to think carefully about how to interpret the
return value; you have to ensure it's passed a string.

No, it says that strlen takes a char* (actually a const char*).

It also says (at least it does in this C90 draft here) "... computes the
length of the string pointed to by s". The Hedgehog presumably counts
that as saying specifically that `strlen` takes a string (as opposed
to just any `char*` value).
No doubt. Unfortunately, the Hedgehog is mistaken (though I have no
doubt it was an innocent mistake). A pointer to a string is not a
string.
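
To make the distinction concrete: a `char *` is just an address, and it points to a string only if a terminator is actually reachable from it. When the size of the underlying buffer is known, that can be checked. A minimal sketch, with `holds_string` as a hypothetical helper name:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf[0..size-1] contains a null terminator, so that buf
   points to a string; returns 0 otherwise. */
int holds_string(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

Without a known size there is no portable way to make this check, which is why the library functions simply require a string and leave everything else undefined.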

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 20 '07 #34
Keith Thompson wrote:
Chris Dollin <ch**********@hp.com> writes:
>Keith Thompson wrote:
>>Chris Dollin <eh@electrichedgehog.net> writes:
Doesn't the Standard say specifically that `strlen` takes a string?
If so, passing a char* that /isn't/ a string is undefined behaviour.
You don't have to think carefully about how to interpret the
return value; you have to ensure it's passed a string.

No, it says that strlen takes a char* (actually a const char*).

It also says (at least it does in this C90 draft here) "... computes the
length of the string pointed to by s". The Hedgehog presumably counts
that as saying specifically that `strlen` takes a string (as opposed
to just any `char*` value).

No doubt. Unfortunately, the Hedgehog is mistaken (though I have no
doubt it was an innocent mistake). A pointer to a string is not a
string.
My sloppiness. I forget that a string is the null-terminated-char-sequence
and tend to use "string" to mean pointer-to-ditto.

--
Pointer To Hedgehog
"Our future looks secure, but it's all out of our hands"
- Magenta, /Man and Machine/

Aug 20 '07 #35
On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
main() { printf("%d\n",strlen(malloc(0))); }
Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 20 '07 #36
On Aug 20, 8:36 pm, Army1987 <army1...@NOSPAM.it> wrote:
On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
main() { printf("%d\n",strlen(malloc(0))); }

Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.
I'll have a go:
1) Uses variadic function with no prototype in scope.
2) Whether or not malloc(0) returns null, it is UB to try to
dereference it, and strlen is going to dereference it all right...
3) ...and there's no reason to expect it to point to a string, as
strlen requires.
4) Uses %d as a format specifier for a size_t... though as the
compiler assumes that strlen (used without a prototype) returns an
int, maybe these two bugs cancel each other out.
5) Similarly, if the compiler assumes malloc returns int, and doesn't
know anything about strlen's arguments as there's no prototype around,
the conversion void * -> int -> char * is, I believe, implementation
defined.
6) Execution falls off the end of a non-void function without
returning a value.
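
For contrast, here is a sketch of what a well-defined version of the same idea might look like: headers supply the prototypes, one byte is requested and terminated so that strlen really receives a string, and the size_t is printed with a matching format. The function name `print_empty_string_length` and its out-parameter are illustrative assumptions, not anything from the original one-liner:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocates a one-byte string, prints its length, and stores the length
   in *length_out. Returns 0 on success, -1 if the allocation failed. */
int print_empty_string_length(size_t *length_out)
{
    char *s = malloc(1);          /* one byte: room for the terminator */
    if (s == NULL)
        return -1;
    s[0] = '\0';                  /* now s points to a (zero-length) string */
    *length_out = strlen(s);
    printf("%zu\n", *length_out); /* %zu is C99; C90 would cast to unsigned long and use %lu */
    free(s);
    return 0;
}
```

Calling this from `int main(void)` and returning a status explicitly would take care of the last item in the list as well.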
--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower
Aug 20 '07 #37
fi******@invalid.com wrote:
>
On 18 Aug 2007 at 21:59, Richard Tobin wrote:
In article <sl*******************@nospam.invalid>,
[...]
>main() { printf("%d\n",strlen(malloc(0))); }
[...]
The C library functions are only required to behave correctly if you call
them correctly. If you call them with random data, all bets are off.
(In this particular case, it may well produce a segmentation fault, because
malloc(0) may return null.)

No I believe malloc(0) can never return null - after all, how could it
not be possible to allocate 0 bytes of memory!
[...]

What you believe is irrelevant. Quoting 7.20.3:

If the size of the space requested is zero, the behavior is
implementation defined: either a null pointer is returned, or
the behavior is as if the size were some nonzero value, except
that the returned pointer shall not be used to access an object.

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody        | www.hvcomputer.com | #include              |
| kenbrody/at\spamcop.net | www.fptech.com     | <std_disclaimer.h>    |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:Th*************@gmail.com>
Aug 20 '07 #38
Fr************@googlemail.com wrote, On 20/08/07 21:04:
On Aug 20, 8:36 pm, Army1987 <army1...@NOSPAM.it> wrote:
>On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
>>main() { printf("%d\n",strlen(malloc(0))); }
Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.

I'll have a go:
1) Uses variadic function with no prototype in scope.
2) Whether or not malloc(0) returns null, it is UB to try to
dereference it, and strlen is going to dereference it all right...
3) ...and there's no reason to expect it to point to a string, as
strlen requires.
4) Uses %d as a format specifier for a size_t... though as the
compiler assumes that strlen (used without a prototype) returns an
int, maybe these two bugs cancel each other out.
No, they definitely produce one instance of UB for using a function
which does not return int without a prototype.
5) Similarly, if the compiler assumes malloc returns int, and doesn't
know anything about strlen's arguments as there's no prototype around,
the conversion void * -> int -> char * is, I believe, implementation
defined.
No. Because there is no prototype in scope for malloc and it does not
return an int, calling it invokes UB.

Passing an int to strlen without a prototype in scope invokes UB (with
one in scope it requires a diagnostic).

No conversions were used.
6) Execution falls off the end of a non-void function without
returning a value.
That returns an undefined status, some people argue it does not invoke
undefined behaviour.
>--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower
Please don't quote people's signatures, the bit typically after the "-- ",
unless you are actually commenting on them.
--
Flash Gordon
Aug 20 '07 #39
Fr************@googlemail.com writes:
On Aug 20, 8:36 pm, Army1987 <army1...@NOSPAM.it> wrote:
>On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
main() { printf("%d\n",strlen(malloc(0))); }

Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.

4) Uses %d as a format specifier for a size_t... though as the
compiler assumes that strlen (used without a prototype) returns an
int, maybe these two bugs cancel each other out.
I would say that the %d is not in error. The compiler will arrange
that the function called strlen will return an int and an int will be
printed (it might be trap representation, but that is because of the
*other* problem you identified).

I think it amusing that one of the few correct things about this
one-line program will have to change if the rest of it is corrected!
6) Execution falls off the end of a non-void function without
returning a value.
You get to choose here. Falling off main is not a problem in C99 but
the implicit int in main's definition is -- take your pick based on
language standard.

--
Ben.
Aug 20 '07 #40
Flash Gordon <sp**@flash-gordon.me.uk> writes:
Fr************@googlemail.com wrote, On 20/08/07 21:04:
>On Aug 20, 8:36 pm, Army1987 <army1...@NOSPAM.it> wrote:
>>On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
main() { printf("%d\n",strlen(malloc(0))); }
Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.
[...]
>6) Execution falls off the end of a non-void function without
returning a value.

That returns an undefined status, some people argue it does not invoke
undefined behaviour.
In C90, it returns an undefined status. IMHO that's not undefined
behavior; it only affects the behavior of the environment, which is
outside the scope of the standard.

In C99, since the function in question is main, a special rule says
that falling off the end is equivalent to 'return 0;'. (But then, in
C99 the 'main()' declaration is a constraint violation.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 20 '07 #41
Ben Bacarisse <be********@bsb.me.uk> writes:
Fr************@googlemail.com writes:
>On Aug 20, 8:36 pm, Army1987 <army1...@NOSPAM.it> wrote:
>>On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
main() { printf("%d\n",strlen(malloc(0))); }

Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.

4) Uses %d as a format specifier for a size_t... though as the
compiler assumes that strlen (used without a prototype) returns an
int, maybe these two bugs cancel each other out.

I would say that the %d is not in error. The compiler will arrange
that the function called strlen will return an int and an int will be
printed (it might be trap representation, but that is because of the
*other* problem you identified).
Not necessarily. strlen actually returns a size_t. Calling it as if
it returned an int (regardless of what's done with the result) invokes
undefined behavior. One of the infinitely many possible consequences
of this undefined behavior is that the compiler uses its knowledge of
the standard library and treats strlen as if it returned a size_t
(which, of course, it does). Passing this size_t to printf with a
"%d" format invokes UB again (and the compiler could pretend that the
format is really "%zu"). This would be overly helpful in my opinion
(if I make a mistake, I want the compiler to tell me about it, not to
fix it), but it's legal.

[...]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 20 '07 #42
Keith Thompson <ks***@mib.org> writes:
Ben Bacarisse <be********@bsb.me.uk> writes:
>Fr************@googlemail.com writes:
>>On Aug 20, 8:36 pm, Army1987 <army1...@NOSPAM.it> wrote:
On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
main() { printf("%d\n",strlen(malloc(0))); }

Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.

4) Uses %d as a format specifier for a size_t... though as the
compiler assumes that strlen (used without a prototype) returns an
int, maybe these two bugs cancel each other out.

I would say that the %d is not in error. The compiler will arrange
that the function called strlen will return an int and an int will be
printed (it might be trap representation, but that is because of the
*other* problem you identified).

Not necessarily. strlen actually returns a size_t. Calling it as if
it returned an int (regardless of what's done with the result) invokes
undefined behavior. One of the infinitely many possible consequences
of this undefined behavior is that the compiler uses its knowledge of
the standard library and treats strlen as if it returned a size_t
(which, of course, it does). Passing this size_t to printf with a
"%d" format invokes UB again (and the compiler could pretend that the
format is really "%zu").
[...]

And the real point, I think, is that once you have a single instance of
undefined behavior, all bets are off. It makes some sense to go
through a piece of code and enumerate the instances of UB (since each
one is something that needs to be fixed), but don't expect to come up
with a definitive list.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 21 '07 #43
Keith Thompson <ks***@mib.org> writes:
Ben Bacarisse <be********@bsb.me.uk> writes:
>Fr************@googlemail.com writes:
>>On Aug 20, 8:36 pm, Army1987 <army1...@NOSPAM.it> wrote:
On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
main() { printf("%d\n",strlen(malloc(0))); }

Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.

4) Uses %d as a format specifier for a size_t... though as the
compiler assumes that strlen (used without a prototype) returns an
int, maybe these two bugs cancel each other out.

I would say that the %d is not in error. The compiler will arrange
that the function called strlen will return an int and an int will be
printed (it might be trap representation, but that is because of the
*other* problem you identified).

Not necessarily. strlen actually returns a size_t. Calling it as if
it returned an int (regardless of what's done with the result) invokes
undefined behavior. One of the infinitely many possible consequences
of this undefined behavior is that the compiler uses its knowledge of
the standard library and treats strlen as if it returned a size_t
(which, of course, it does).
Yes. I should have said no more than that the %d *may* not be wrong.
Some implementations might ignore what they might know of strlen thus
rendering the format, oddly, OK. This is probably no more than
Francine Neary was saying in the first place -- I have not added
anything!

Your other point, elsewhere, that once there is one UB all bets are
off makes this sort of exercise rather odd.

--
Ben.
Aug 21 '07 #44
On Aug 19, 11:05 am, Flash Gordon <s...@flash-gordon.me.uk> wrote:
Eric Sosman wrote, On 19/08/07 15:43:
fishp...@invalid.com wrote:
[...]
No I believe malloc(0) can never return null -
Would reading section 7.20.3 paragraph 1 of the language
Standard alter your belief?
"[...] If the size of the space requested is zero,
the behavior is implementation-defined: either a
null pointer is returned, or the behavior is as if
the size were some nonzero value, except that the
returned pointer shall not be used to access an object."
after all, how could it
not be possible to allocate 0 bytes of memory!
Most likely, because it fails to allocate the internal
bookkeeping space it uses for keeping track of the addresses
it has returned that have not yet been free()d.
There's a potentially interesting quibble here, for people
interested in quibbles. The Standard requires (same paragraph)
that memory obtained from malloc() be "disjoint" from all other
objects, which is not quite the same as requiring that it have
an address different from that of all other objects. Since the
value of malloc(0) cannot be used to access an object, it could
be argued that the program cannot test disjointness without
engaging in undefined behavior anyhow.

The following does not invoke undefined behaviour since you are always
allowed to test for equality (the first byte is 1 beyond the end which
is still OK or you could not free it)...

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    void *p1 = malloc(0);
    void *p2 = malloc(0);

    if (p1==NULL || p2==NULL)
        puts("At least one null pointer returned");
    else if (p1==p2)
        puts("Regions are not disjoint");
    else
        puts("Regions are disjoint");

    free(p1);
    free(p2);

    return 0;

}
All this does is to check if p1 is the same as p2.
The question is

Does the fact that p1 = p2 mean that the memory area
pointed to by p1 is not disjoint to the memory area pointed
to by p2?

This clearly depends on the meaning assigned to disjoint. If we take

A memory area A is disjoint to a memory area B iff there does not
exist a byte that belongs to both A and B

then any two memory areas of zero bytes are disjoint, in particular,
a memory area of zero bytes is disjoint to itself.
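
Under that definition the check can be written down directly. The sketch below models a region as a start offset plus a length, using plain integers so that no pointer comparison is involved; `regions_disjoint` is a hypothetical name chosen for illustration:

```c
#include <stddef.h>

/* A region is the half-open byte range [start, start + len).
   Returns 1 if no byte belongs to both regions, 0 otherwise. */
int regions_disjoint(size_t start_a, size_t len_a,
                     size_t start_b, size_t len_b)
{
    /* A zero-length region contains no bytes, so it shares none. */
    if (len_a == 0 || len_b == 0)
        return 1;
    /* Compare distances rather than end points to avoid overflow. */
    if (start_a <= start_b)
        return start_b - start_a >= len_a;
    return start_a - start_b >= len_b;
}
```

With this model, `regions_disjoint(100, 0, 100, 0)` reports the two empty regions as disjoint even though they start at the same place, whereas two four-byte regions at the same start are not.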

So knowing that p1 is equal to p2 does not allow you to conclude
"Regions are not disjoint".

- William Hughes

Aug 21 '07 #45
Keith Thompson wrote, On 21/08/07 00:09:
Flash Gordon <sp**@flash-gordon.me.uk> writes:
>Fr************@googlemail.com wrote, On 20/08/07 21:04:
>>On Aug 20, 8:36 pm, Army1987 <army1...@NOSPAM.it> wrote:
On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
main() { printf("%d\n",strlen(malloc(0))); }
Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.
[...]
>>6) Execution falls off the end of a non-void function without
returning a value.
That returns an undefined status, some people argue it does not invoke
undefined behaviour.

In C90, it returns an undefined status. IMHO that's not undefined
behavior; it only affects the behavior of the environment, which is
outside the scope of the standard.
By saying "some" I implied others did not think that :-)
In C99, since the function in question is main, a special rule says
that falling off the end is equivalent to 'return 0;'. (But then, in
C99 the 'main()' declaration is a constraint violation.)
Since it would not compile as C99 I did not bother with C99 rules.
--
Flash Gordon
Aug 21 '07 #46
On Mon, 20 Aug 2007 16:09:08 -0700, Keith Thompson wrote:
Flash Gordon <sp**@flash-gordon.me.uk> writes:
>Fr************@googlemail.com wrote, On 20/08/07 21:04:
>>On Aug 20, 8:36 pm, Army1987 <army1...@NOSPAM.it> wrote:
On Sat, 18 Aug 2007 11:19:49 +0200, Antoninus Twink wrote:
main() { printf("%d\n",strlen(malloc(0))); }
Enumerating all of the reasons why this causes UB is left as an
exercise to the reader.
[...]
>>6) Execution falls off the end of a non-void function without
returning a value.

That returns an undefined status, some people argue it does not invoke
undefined behaviour.

In C90, it returns an undefined status. IMHO that's not undefined
behavior; it only affects the behavior of the environment, which is
outside the scope of the standard.
Not completely. For example, it specifies when the functions
registered with atexit() etc. are called, when files are closed
etc.
A return from main() is not *completely* equivalent to directly
calling exit(). First, main() returns, then exit() is called.
This can be seen by using functions which use pointers to auto
variables of main() and register them with atexit().
So I think that in this case [main() without a return] the
implementation tries to call exit() with an indeterminate
argument. I don't have a copy of the C90 standard, but I think it
could be UB.
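
As an aside on the auto-variable point, the safe pattern is for an atexit() handler to touch only objects with static storage duration, since those are still alive whichever way the program ends. A minimal sketch, where `install_exit_report` and `report_at_exit` are names made up for the example:

```c
#include <stdio.h>
#include <stdlib.h>

/* Static storage duration: still alive when atexit handlers run,
   whether the program ends via exit() or by returning from main(). */
static const char *exit_message = "(no message set)";

static void report_at_exit(void)
{
    puts(exit_message);
}

/* Hypothetical helper: registers the handler and returns what atexit()
   returns, i.e. 0 on success. The caller must pass a pointer that
   outlives main(), such as a string literal. */
int install_exit_report(const char *message)
{
    exit_message = message;
    return atexit(report_at_exit);
}
```

Had the handler instead used a pointer to a local array in main(), it would be fine after a direct exit() call but not after main() has returned, which is the difference described above.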

--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 21 '07 #47
On 18 Aug, 10:19, Antoninus Twink <spam...@invalid.com> wrote:
<snip>
... malloc(0) returns a pointer to
some random place in memory
Not on some of the systems I work on. It returns (in total conformance
to the standard) NULL.

This makes for some difficulties when porting code which makes the
same assumption as you...

Aug 21 '07 #48
