UDB and pointer increments and decrements


I'm still battling with this causing UDB:

while(e-- > s);

if s points to the start of a string and e becomes less than s then e is
not really pointing to defined char. Fine.

But UDB?

Yes, e has an UDV (undefined value) but would this really cause a
program to misbehave? On any platform? Remember this value of e is never
used again in this case.

I ask because theoretically s can be pointing to the middle of a bigger
string. We then call a function with s as a parameter.

The function called can have no idea that s is the pointer to a middle
string. Therefore it can have no idea how to "do undefined things" when
e is decremented past the start of s. e and s are strictly char *s. It
would be so "not C" if the compiler generated code to check the contents
pointed to to determine the range of the object to the middle of which s
points. I mean then we may as well have array limits and exceptions
built into the language.

I'm not being difficult here. Explain how this works. My problem (and I
admit it's a problem) is that I feel too much of C is being elevated to
an almost ADA type status and (in this group) C is losing that "down and
dirty and efficient" feeling which it is famous for.
Sep 23 '08 #1
Richard <rg****@gmail.com> writes:
> I'm still battling with this causing UDB:
>
> while(e-- > s);
>
> if s points to the start of a string and e becomes less than s then e is
> not really pointing to defined char. Fine.
>
> But UDB?
>
> Yes, e has an UDV (undefined value) but would this really cause a
> program to misbehave? On any platform? Remember this value of e is never
> used again in this case.

1/ C has 4 levels of definition (defined behavior, implementation-defined
behavior (including locale-specific behavior), unspecified behavior, and
undefined behavior), no more. Spending effort to try to classify undefined
behavior more finely is probably not worthwhile. And it seems to me that's
what you want: different rules for the undefined value created by
decrementing a pointer than for all the others. There is precedent (the
similar one-past-the-end-of-an-array pointer comes immediately to mind), but
yours would be more limited than that one (or you'd have got opposition
from DOS folk, as allowing such pointers in comparisons would have
constrained them a lot, probably limiting the size of an object to 32767
instead of the 65535 they have got).

2/ Optimizers tend to use undefined behavior in creative ways. For example,
things like value propagation can optimize out the then-part of the if in
this code:

if (i == INT_MAX) {
    /* do something that does not modify i */
}
++i;

(Reasoning: incrementing i overflows if i is INT_MAX, which would be
undefined behavior, so the optimizer can assume i isn't INT_MAX and that
the result of the comparison is false.) Optimizations like this are one of
the reasons undefined behavior can be non-causal: its effects can show up
in code that runs before the offending operation (the optimizer just has to
be sure that the code causing the undefined behavior would have been
executed). And note that optimizers do such propagation across more than the
current function; they can potentially do it for the whole program, and
that's the way they are heading.
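
To make the value-propagation point concrete, here is a minimal sketch. The
names risky_increment, safe_increment and the hit flag are hypothetical and
only serve the illustration; whether a particular compiler really removes the
test depends on the compiler and its optimization settings.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical illustration of the pattern above. When i == INT_MAX the
 * addition overflows, which is undefined behavior, so a compiler is
 * allowed to assume i != INT_MAX and drop the test entirely. */
int risky_increment(int i, int *hit)
{
    assert(hit != NULL);
    if (i == INT_MAX) {
        *hit = 1;           /* may legally be optimized away */
    }
    return i + 1;           /* undefined if i == INT_MAX */
}

/* Variant without undefined behavior: the overflow case returns early,
 * so the addition is only reached when it cannot overflow. */
int safe_increment(int i, int *hit)
{
    assert(hit != NULL);
    if (i == INT_MAX) {
        *hit = 1;
        return i;           /* saturate instead of overflowing */
    }
    return i + 1;
}
```

Saturating at INT_MAX is just one choice made for this sketch; reporting an
error would do equally well. The point is only that the increment is never
executed in the overflowing case.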

Yours,

--
Jean-Marc
Sep 23 '08 #2
Richard <rg****@gmail.com> writes:
> I'm still battling with this causing UDB:
>
> while(e-- > s);
>
> if s points to the start of a string and e becomes less than s then e is
> not really pointing to defined char. Fine.
>
> But UDB?

A small note: You're the only person I've ever seen refer to undefined
behavior as "UDB". Most posters here (at least those who choose to
abbreviate it) refer to it as "UB". Why do you feel the need to
invent your own abbreviation when there's already a perfectly good one
in widespread use? (One could argue that "UB" could also mean
unspecified behavior, but I've never seen it used that way, and it's
generally clear enough from the context.)

Yes, the behavior is undefined, simply because the standard doesn't
define the behavior. That's all "undefined behavior" means.
> Yes, e has an UDV (undefined value) but would this really cause a
> program to misbehave? On any platform? Remember this value of e is never
> used again in this case.
Yes. I don't have a real-world example, but if the containing object
happens to be allocated at the beginning of a memory segment, it could
easily blow up. And, as has been mentioned elsethread, a compiler is
allowed to *assume* that undefined behavior does not occur, and
perform code transformations based on that assumption (after all, if
the behavior is already undefined, it can't make things worse); that
may be a more realistic risk for most modern systems.
> I ask because theoretically s can be pointing to the middle of a bigger
> string. We then call a function with s as a parameter.
Undefined behavior occurs if a pointer is decremented past the
beginning of an array object, not if it's decremented past the initial
value of a function parameter. Given this:

char s[100];

char *func(char *ptr) { return ptr - 1; }

calling func(s+10) has well-defined behavior, but calling func(s) has
undefined behavior. (I haven't compiled the above, so there may be
some dumb mistakes.)
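
For anyone who wants to compile it, a sketch of the example follows. func and
s are as above; prev_or_null is a hypothetical guarded variant added for
illustration, and it assumes the caller can supply the start of the array
that ptr points into (the comparison is only meaningful under that
assumption).

```c
#include <assert.h>
#include <stddef.h>

char s[100];

/* The function from the post: forms the pointer one element before ptr.
 * Defined for func(s + 10); undefined for func(s). */
char *func(char *ptr) { return ptr - 1; }

/* Hypothetical guarded variant. base must be the first element of the
 * array that ptr points into. Returns NULL instead of forming base - 1. */
char *prev_or_null(char *base, char *ptr)
{
    assert(base != NULL && ptr != NULL);
    return (ptr > base) ? ptr - 1 : NULL;
}
```

Only the well-defined call, func(s + 10), should ever be made; there is no
portable way to observe what func(s) does.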
> The function called can have no idea that s is the pointer to a middle
> string.

Right.

> therefore it can have no idea how to "do undefined things" when
> e is decremented past the start of s. e and s are strictly char *s.

It doesn't deliberately "do undefined things"; that's not the point.
The point is that the standard doesn't define what it does. In my
example above, I'm thinking of a hypothetical system on which
constructing the pointer value s-1 causes a hardware trap (because s
is allocated at the beginning of a segment, and the hardware
"decrement address" instruction traps in this case). The code
generated for the body of the function has no awareness of this.

For example, assume an implementation on which signed integer overflow
causes a trap.

int func(int n) { return n + 1; }

func(42) has well-defined behavior, and returns 43. func(INT_MAX) has
undefined behavior, and (on this particular implementation) causes a
trap (or does something arbitrarily strange if an optimizing compiler
rearranges code based on the assumption that no UB occurs). The
function has no awareness of this; it just returns the result of n +
1.
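
Since func itself cannot know whether n + 1 will overflow, any check has to
happen before the addition is performed. A minimal sketch, assuming a
hypothetical helper checked_add (not part of the standard library) that
reports failure instead of overflowing:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: stores a + b in *out and returns 1 when the sum
 * is representable as an int; otherwise leaves *out untouched and
 * returns 0. The range test is done before the addition, so the
 * overflowing addition itself is never executed. */
int checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return 0;
    }
    *out = a + b;
    return 1;
}
```

With this, checked_add(42, 1, &r) stores 43 in r, while
checked_add(INT_MAX, 1, &r) returns 0 without ever evaluating INT_MAX + 1.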
> It would be so "not C" if the compiler generated code to check the
> contents pointed to to determine the range of the object to the middle
> of which s points. I mean then we may as well have array limits and
> exceptions built into the language.
The compiler is *allowed* to perform such checks, but it's not
required to. That's why the behavior is undefined, rather than being
defined to do whatever a failing check would do.
> I'm not being difficult here. Explain how this works. My problem (and I
> admit it's a problem) is that I feel too much of C is being elevated to
> an almost ADA type status and (in this group) C is losing that "down and
> dirty and efficient" feeling which it is famous for.
(It's "Ada", not "ADA".)

C loses none of its "down and dirty and efficient" feeling because of
this. In fact, the generated code can gain in efficiency because the
compiler is allowed to trust the user to avoid undefined behavior and
to perform aggressive optimization based on that assumption.

A C implementation that does exactly what you seem to expect it to do
(treat addresses as simple integers, allow arbitrary addresses to be
computed, etc.) would be conforming. An implementation that performs
aggressive bounds checking can also be conforming.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 23 '08 #3
Richard <rg****@gmail.com> writes:
> ja*********@verizon.net writes:
> [...]
>> checks mandatory, the behavior could not be undefined - it would have
>> to be either standard-defined or implementation-defined. Because the
>> behavior is undefined, an implementation is currently free to deal
>> with array limits by ignoring them.
>
> And them remaining undefined? Unspecified would have been better surely?

Better how?

Unspecified behavior is "use of an unspecified value, or other
behavior where this International Standard provides two or more
possibilities and imposes no further requirements on which is chosen
in any instance".

For the behavior of, for example, attempting to access an array
outside its bounds to be unspecified rather than undefined, the
standard would have to provide a number of possible behaviors, and
anything other than one of those behaviors would be non-conforming.

Suppose I have an array object declared within a function, and I write
to element -1 of that array. I could clobber nearly anything,
including the function's stored return address or some other vital
piece of information. How would you restrict the possible
consequences of that to "two or more possibilities"?

[snip]
> I appreciate the time you have taken to explain. I would still love
> someone to explain the case I asked about above though. The one where s
> is pointing into the middle of an array. Or did you and I didn't
> understand?
See my other recent response in this thread.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Sep 23 '08 #4
Richard wrote:
> I'm still battling with this causing UDB:
>
> while(e-- > s);
>
> if s points to the start of a string and e becomes less than s then e is
> not really pointing to defined char. Fine.
>
> But UDB?
Yes. 6.5.6p8, penultimate sentence.
> Yes, e has an UDV (undefined value) but would this really cause a
> program to misbehave? On any platform? Remember this value of e is never
> used again in this case.
The "UDV" need not even exist. Undefined behavior is not limited
to generating an indeterminate value.
> I ask because theoretically s can be pointing to the middle of a bigger
> string. We then call a function with s as a parameter.
No problem. Decrementing a pointer to the first element of this
string is then well-defined, because the result points to an extant
element of the larger array.
> The function called can have no idea that s is the pointer to a middle
> string. therefore it can have no idea how to "do undefined things" when
> e is decremented past the start of s. e and s are strictly char *s. It
> would be so "not C" if the compiler generated code to check the contents
> pointed to to determine the range of the object to the middle of which s
> points. I mean then we may as well have array limits and exceptions
> built into the language.
It's not clear what you're getting at, or why you think any
checking is necessary or implied. One of the reasons the Standard
leaves things undefined is to *relieve* implementations of the burden
of checking for errors. The benefit is that the generated code can
be simpler and faster (dramatically so, in some cases), and the
penalty is that there's no way to guarantee what happens when an
error goes undetected.

The Standard *could* have required that the implementation detect
out-of-range pointer use and raise SIGSLOPPY, but that's the "so not C"
philosophy that you mention. Instead, the Standard says "Try to
generate an out-of-range pointer and all bets are off; I wash my hands
of you and refuse to make any promises about what will or won't happen.
Hasta la vista, baby." That's what "undefined behavior" means.
> I'm not being difficult here. Explain how this works. My problem (and I
> admit it's a problem) is that I feel too much of C is being elevated to
> an almost ADA type status and (in this group) C is losing that "down and
> dirty and efficient" feeling which it is famous for.
Portability is one of many aspects a program can have, in greater
or lesser degree. It is seldom if ever the only important aspect, nor
even at the front of the line. Sometimes portability is compromised
for a good reason, and I don't think you'll find anyone who says
otherwise.

But when portability is sacrificed for no reason, out of ignorance
("Right-shifting always propagates the sign bit"), or out of laziness
("It's easier to write `2' than `sizeof(int)'"), or out of sloppiness
("Don't worry where the pointer points; we'll only use it if it's OK"),
or even out of arrogance ("All systems are just like mine"), then it's
worth pointing out the sacrifice and suggesting safer alternatives.
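
As one possible safer alternative for the loop that started this thread, the
test can be made before the decrement, so that s - 1 is never formed. A
sketch, assuming a hypothetical helper find_last where e starts one past the
last character to examine and s and e point into the same array:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: searches [s, e) backward for ch and returns a
 * pointer to its last occurrence, or NULL if it is absent. e is compared
 * against s before it is decremented, so it never drops below s. */
const char *find_last(const char *s, const char *e, char ch)
{
    assert(s != NULL && e != NULL && s <= e);
    while (e > s) {
        --e;
        if (*e == ch) {
            return e;
        }
    }
    return NULL;
}
```

The generated code is no slower than the original loop on ordinary hardware;
the only change is the order of the comparison and the decrement.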

It's also worth noting that "efficient" is not an antonym of
"portable" and not a synonym of "dirty."

--
Er*********@sun.com

Sep 23 '08 #5
Richard wrote, On 23/09/08 16:44:
> I'm still battling with this causing UDB:
>
> while(e-- > s);
>
> if s points to the start of a string and e becomes less than s then e is
> not really pointing to defined char. Fine.
>
> But UDB?

<snip>

> I'm not being difficult here. Explain how this works. My problem (and I
> admit it's a problem) is that I feel too much of C is being elevated to
> an almost ADA type status and (in this group) C is losing that "down and
> dirty and efficient" feeling which it is famous for.

Another poster and I suggested an object starting at the beginning
of a page or segment and *hardware* that traps on trying to decrement to
before the start of the page/segment. No software checks need be involved!
--
Flash Gordon
If spamming me sent it to sm**@spam.causeway.com
If emailing me use my reply-to address
See the comp.lang.c Wiki hosted by me at http://clc-wiki.net/
Sep 23 '08 #6
On Sep 24, 3:44 am, Richard <rgr...@gmail.com> wrote:
> I'm still battling with this causing UDB:
>
> while(e-- > s);
>
> if s points to the start of a string and e becomes less than s then e is
> not really pointing to defined char. Fine.

What if the string is at the very start of
the address space? Where does 'e' point after
decrementing it?

There are CPUs or MMUs that will trap upon
loading of an obviously bogus pointer such
as this one that doesn't even describe a
memory location that exists.
Sep 23 '08 #7
"Richard" <rg****@gmail.com> wrote in message
news:gb**********@registered.motzarella.org...
> I'm still battling with this causing UDB:
>
> while(e-- > s);
>
> if s points to the start of a string and e becomes less than s then e is
> not really pointing to defined char. Fine.
>
> But UDB?
>
> Yes, e has an UDV (undefined value) but would this really cause a
> program to misbehave? On any platform? Remember this value of e is never
> used again in this case.
>
> I ask because theoretically s can be pointing to the middle of a bigger
> string. We then call a function with s as a parameter.

If you want a language about which there is little to say, use assembly.

From what I can see in this group, talking about the various "UB"
(undefinite behaviours) takes more time than programming.

Sep 26 '08 #8
Rosario wrote:
> ....
> From what I can see in this group, talking about the various "UB"
> (undefinite behaviours) takes more time than programming.

That's because undefined (not "undefinite") behavior is the single most
serious kind of problem C code can have. It's also because most code
that people bring to this group because they're having problems with it,
has undefined behavior. That's a selection effect; syntax errors and
constraint violations are easily caught by the compiler; the programs
that actually compile and fail tend to have subtler problems, usually
involving undefined behavior.
Sep 26 '08 #9
ja*********@verizon.net writes:
> Richard wrote:
>> I'm still battling with this causing UDB:
>>
>> while(e-- > s);
>>
>> if s points to the start of a string and e becomes less than s then e is
>> not really pointing to defined char. Fine.
>>
>> But UDB?
>>
>> Yes, e has an UDV (undefined value) but would this really cause a
>> program to misbehave? On any platform? Remember this value of e is never
>> used again in this case.
>>
>> I ask because theoretically s can be pointing to the middle of a bigger
>> string. We then call a function with s as a parameter.
>> The function called can have no idea that s is the pointer to a middle
>> string. therefore it can have no idea how to "do undefined things" when
>> e is decremented past the start of s. e and s are strictly char *s. It
>> would be so "not C" if the compiler generated code to check the contents
>> pointed to to determine the range of the object to the middle of which s
>> points. I mean then we may as well have array limits and exceptions
>> built into the language.
>
> It's too late - the language that makes the behavior undefined was
> inserted into the standard precisely for the purpose of allowing (but
> not mandating) array limit checks. [...]

Nonsense. Allowing a pointer to be decremented to before the
start of an array is still compatible with doing array limit
checks, just as allowing a pointer to be incremented past the end
of an array is compatible with doing array limit checks.
The rationale document makes clear that permitting a pointer to be
decremented to before the start of an array was rejected because it would
impose overly burdensome requirements on implementations.
Array limit checks are equally possible whether e-- is allowed
or not.
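
The asymmetry can be sketched as follows. sum_forward and sum_backward are
hypothetical names chosen for illustration: the first relies on the
one-past-the-end pointer, which may be formed and compared but not
dereferenced; the second walks backward by index, so no pointer before the
start of the array is ever needed.

```c
#include <assert.h>
#include <stddef.h>

/* Forward walk using the one-past-the-end pointer as the stop mark. */
long sum_forward(const int *a, size_t n)
{
    const int *p;
    const int *end;
    long total = 0;

    assert(a != NULL);
    end = a + n;            /* valid to form and compare, never dereferenced */
    for (p = a; p != end; ++p) {
        total += *p;
    }
    return total;
}

/* Backward walk by index: i counts down from n to 1 and a[i - 1] is
 * read, so a - 1 is never computed. */
long sum_backward(const int *a, size_t n)
{
    size_t i;
    long total = 0;

    assert(a != NULL);
    for (i = n; i > 0; --i) {
        total += a[i - 1];
    }
    return total;
}
```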
Oct 9 '08 #10
