Re: Parsing a PATH null-terminated variable

In an otherwise excellent article, Eric Sosman said:
> DiAvOl wrote:
<snip>
>> char *check_variable(const char *var, const char *program) {
>> char *lvar = NULL;
>
> This is all right, but I dislike initializing variables with
> values that will never be used: Your function will either set lvar
> to something else, or will return without ever touching it,

...or will read it without ever writing it, in which case the
initialisation value protects you against reading an indeterminate value
and invoking undefined behaviour.

> so this initial value is useless.

So is indentation. The compiler ignores it completely. Nevertheless, it can
make programming an easier task, and so can careful initialisation.

> Yes, I know, some people think it's a
> good idea to initialize every variable, especially every pointer
> variable, but those people are mistaken.

Not all of them. This one, for instance, thinks that, and is not mistaken.
It /is/ a good idea. There are reasonable arguments for it and reasonable
arguments against it. The converse - not initialising every object - is
also a good idea, and again there are reasonable arguments for it and
reasonable arguments against it. This makes it a style issue, not a
correctness issue, and matters of style are a matter for programmers'
personal choice.

> So Say I, The Authority On Whose Word You Should Always Rely.

Uh, yeah - right. :-)

>> char *p;
>> char *t;
>> const char *pname;
>> static char path[PATH_MAX] = {0};
>
> Why make this static? Also, see below for some thoughts about
> the initialization -- it's not useless this time,

Oddly, because it /is/ static, the initialisation /is/ useless - it only
becomes useful if you remove the staticness, which it seems you may be
planning on advising - so this defensive programming exercise turned out
to be a good idea not for your reason but for mine.
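
A minimal sketch of the point about static storage duration: in C, an object with static storage duration and no initialiser starts out zeroed, so `= {0}` changes nothing on a `static` array, whereas on an automatic array it is what zeroes the elements. `DEMO_PATH_MAX`, `all_zero`, and the two checking functions are hypothetical names chosen for this illustration (the first stands in for PATH_MAX).

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PATH_MAX 64   /* hypothetical stand-in for PATH_MAX */

/* Returns 1 if all n bytes of buf are zero, 0 otherwise. */
static int all_zero(const char *buf, size_t n)
{
    size_t i;

    assert(buf != NULL);

    for (i = 0; i < n; i++) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}

/* Static array with no initialiser: zero-initialised by the language. */
int static_path_is_zeroed(void)
{
    static char path[DEMO_PATH_MAX];

    return all_zero(path, sizeof path);
}

/* Automatic array: here "= {0}" is what makes every element zero. */
int auto_path_is_zeroed(void)
{
    char path[DEMO_PATH_MAX] = {0};

    return all_zero(path, sizeof path);
}
```

Both functions return 1. Dropping the initialiser from the automatic array would leave its contents indeterminate, which is the case where the defensive `= {0}` starts to matter.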

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 13 '08 #1
14 3058
Richard Heathfield wrote:
> In an otherwise excellent article, Eric Sosman said:
>> DiAvOl wrote:
> <snip>
>>> char *check_variable(const char *var, const char *program) {
>>> char *lvar = NULL;
>>
>> This is all right, but I dislike initializing variables with
>> values that will never be used: Your function will either set lvar
>> to something else, or will return without ever touching it,
>
> ...or will read it without ever writing it, in which case the
> initialisation value protects you against reading an indeterminate value
> and invoking undefined behaviour.

As far as I can see, the O.P.'s code never attempts to
read lval without first executing `lval = strdup(val);'.
So unless I've overlooked some execution path, there's no
possibility of reading before writing. What have you seen
that I've missed?

--
Eric Sosman
es*****@ieee-dot-org.invalid
Aug 13 '08 #2
Richard Heathfield wrote:
> In an otherwise excellent article, Eric Sosman said:
>> DiAvOl wrote:
> <snip>
>>> char *check_variable(const char *var, const char *program) {
>>> char *lvar = NULL;
>>
>> This is all right, but I dislike initializing variables with
>> values that will never be used: Your function will either set lvar
>> to something else, or will return without ever touching it,
>
> ...or will read it without ever writing it, in which case the
> initialisation value protects you against reading an indeterminate value
> and invoking undefined behaviour.

Keep in mind that the real bug is reading the value without having
written a value that was intended to be used. Initializing with a value
that is NOT intended to be used serves only to make the failure mode
less nasty; this can have the undesirable result of making the failure
much harder to detect.

I've had a lot of personal experience with this, because the SGI C
compiler will zero-initialize automatic variables that are not
explicitly initialized. As a result, several bugs in code that I did not
write, but am responsible for, went unnoticed for years. 0 was not the
correct value, but it was a value that caused behavior that differed
from the correct behavior only in ways that were too subtle to be easily
noticed. As a result, we did not detect those errors until we ported the
code to Linux machines using gcc, which does not initialize those variables.

Initializing variables with carefully chosen values could, in some
cases, cause the catastrophic failure to occur more reliably and in a
safer fashion than it would when they are uninitialized. However, many
compilers can diagnose the reading of uninitialized variables;
initializing them disables this feature, without actually solving the
underlying problem. I greatly prefer finding these problems by a
diagnostic message at compile time, than by a catastrophic failure at
run time. I prefer either of those, to not finding the bug at all,
because it has been masked by a "safe" initialization that leaves me
with a subtly-incorrect program.
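
A small sketch of how a "safe" dummy value can mask the real bug. The functions `min_masked` and `min_correct` are hypothetical and only illustrate the failure mode described above; they are not taken from the code under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Dummy initialisation: compiles without complaint, but 0 is not a
   meaningful starting value for a minimum. */
int min_masked(const int *v, size_t n)
{
    int min = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (v[i] < min)
            min = v[i];
    }
    return min;
}

/* Initialised with the value that is actually intended. Requires n > 0. */
int min_correct(const int *v, size_t n)
{
    int min;
    size_t i;

    assert(v != NULL && n > 0);

    min = v[0];
    for (i = 1; i < n; i++) {
        if (v[i] < min)
            min = v[i];
    }
    return min;
}
```

For the array {5, 3, 9}, `min_masked` returns 0 while `min_correct` returns 3. For any input that contains a negative number the two agree, which is the kind of subtle difference that can go unnoticed for a long time.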
Aug 13 '08 #3
On Aug 13, 3:16 pm, James Kuyper <jameskuy...@verizon.net> wrote:
<snip>
> Initializing variables with carefully chosen values could, in some
> cases, cause the catastrophic failure to occur more reliably and in a
> safer fashion than it would when they are uninitialized. However, many
> compilers can diagnose the reading of uninitialized variables;
> initializing them disables this feature, without actually solving the
> underlying problem. I greatly prefer finding these problems by a
> diagnostic message at compile time, than by a catastrophic failure at
> run time. I prefer either of those, to not finding the bug at all,
> because it has been masked by a "safe" initialization that leaves me
> with a subtly-incorrect program.

In case you miss the compiler warning though and use an uninitialized
variable the program "may" work because the uninitialized pointer can
point anywhere. On the other hand if you initialize the pointer to a
NULL value and use it the buggy program will always crash with a
segmentation fault.

Thanks for your replies
Aug 13 '08 #4
DiAvOl wrote:
> On Aug 13, 3:16 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> <snip>
>> Initializing variables with carefully chosen values could, in some
>> cases, cause the catastrophic failure to occur more reliably and in a
>> safer fashion than it would when they are uninitialized. However, many
>> compilers can diagnose the reading of uninitialized variables;
>> initializing them disables this feature, without actually solving the
>> underlying problem. I greatly prefer finding these problems by a
>> diagnostic message at compile time, than by a catastrophic failure at
>> run time. I prefer either of those, to not finding the bug at all,
>> because it has been masked by a "safe" initialization that leaves me
>> with a subtly-incorrect program.
>
> In case you miss the compiler warning though and use an uninitialized
> variable the program "may" work because the uninitialized pointer can
> point anywhere. On the other hand if you initialize the pointer to a
> NULL value and use it the buggy program will always crash with a
> segmentation fault.

That's an example of what I meant by a "carefully chosen value". In
practice, it doesn't make much of a difference. An uninitialized pointer
value is extremely unlikely to be dereferencable, at least on the
systems I've used. Other types are more forgiving.
Aug 13 '08 #5
Eric Sosman said:

<snip>
> As far as I can see, the O.P.'s code never attempts to
> read lval without first executing `lval = strdup(val);'.
> So unless I've overlooked some execution path, there's no
> possibility of reading before writing. What have you seen
> that I've missed?

Maintenance.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 13 '08 #6
DiAvOl wrote:
[... concerning uninitialized pointer variables ...]
> In case you miss the compiler warning though and use an uninitialized
> variable the program "may" work because the uninitialized pointer can
> point anywhere. On the other hand if you initialize the pointer to a
> NULL value and use it the buggy program will always crash with a
> segmentation fault.

That is a common outcome, but is by no means guaranteed. On some
systems, writing through a null pointer will crash but reading will
silently retrieve zeroes.

In any event, a crash is in no sense a "fix" even if it is
reproducible. The fix for using an uninitialized variable is not
to initialize it to a predictably wrong value, but to ensure that
it gets properly initialized before you try to use it. As James
Kuyper points out, compilers and lints have become pretty good at
diagnosing read-before-write errors, and by providing a bogus
initialization you defeat the tools' attempt to help you. You get
rid of a compile-time diagnostic in exchange for a run-time error,
and that's a bad trade.
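
A minimal sketch of the alternative argued for here: declare the variable without a dummy value and make sure every path assigns it before it is read. `level_from_name` and its level numbers are hypothetical, invented for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: maps a level name to a number, -1 if unknown. */
int level_from_name(const char *name)
{
    int level;   /* deliberately no dummy initialiser */

    assert(name != NULL);

    if (strcmp(name, "debug") == 0)
        level = 0;
    else if (strcmp(name, "info") == 0)
        level = 1;
    else
        level = -1;   /* every path assigns before the read below */

    return level;
}
```

If the final `else` branch were deleted, a compiler such as gcc, run with optimisation and `-Wall`, can report that `level` may be used uninitialized. With `int level = 0;` it would stay silent and unknown names would quietly be treated as "debug".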

Richard Heathfield disagrees with this line of reasoning, and
he's a smart egg. But on this matter, he's a cracked egg. That's
my story, and I'm sticking to it, and I'm not saying anything more
without my lawyer present.

--
Er*********@sun.com

Aug 13 '08 #7
James Kuyper <ja*********@verizon.net> writes:
> DiAvOl wrote:
> [...]
>> In case you miss the compiler warning though and use an uninitialized
>> variable the program "may" work because the uninitialized pointer can
>> point anywhere. On the other hand if you initialize the pointer to a
>> NULL value and use it the buggy program will always crash with a
>> segmentation fault.
>
> That's an example of what I meant by a "carefully chosen value". In
> practice, it doesn't make much of a difference. An uninitialized
> pointer value is extremely unlikely to be dereferencable, at least on
> the systems I've used. Other types are more forgiving.

Unless the pointer happens to occupy the same memory address as a
previous pointer object that had a valid, or at least dereferencable,
value.

Uninitialized garbage isn't mathematically random.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 13 '08 #8
Keith Thompson wrote:
> James Kuyper <ja*********@verizon.net> writes:
....
>> practice, it doesn't make much of a difference. An uninitialized
>> pointer value is extremely unlikely to be dereferencable, at least on
>> the systems I've used. Other types are more forgiving.
>
> Unless the pointer happens to occupy the same memory address as a
> previous pointer object that had a valid, or at least dereferencable,
> value.

You do have a point; this isn't even incredibly unlikely. On at least
some systems, repeated calls to the same function, with no intervening
calls to any other function, allocate the same exact pieces of memory
to hold all of the automatic variables, which still hold the values
they had at the end of the previous call to that function. Still, I
prefer relying upon compile-time detection, rather than run-time.
Aug 13 '08 #9
Eric Sosman said:

<snip>
> You get
> rid of a compile-time diagnostic in exchange for a run-time error,
> and that's a bad trade.
>
> Richard Heathfield disagrees with this line of reasoning, and
> he's a smart egg.

Right. :-)

> But on this matter, he's a cracked egg.

If reading an indeterminate value were a constraint violation or a syntax
error requiring the implementation to issue a diagnostic message, I'd
agree with you. Since it isn't and doesn't, I don't (and amn't).

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 13 '08 #10
>>> James Kuyper <ja*********@verizon.net> writes:
>>> practice, it doesn't make much of a difference. An uninitialized
>>> pointer value is extremely unlikely to be dereferencable, at least on
>>> the systems I've used. Other types are more forgiving.

>> Keith Thompson wrote:
>> Unless the pointer happens to occupy the same memory address as a
>> previous pointer object that had a valid, or at least dereferencable,
>> value.

In article <ce**********************************@x35g2000hsb.googlegroups.com>
<ja*********@verizon.net> wrote:
> You do have a point; this isn't even incredibly unlikely. On at least
> some systems, repeated calls to the same function, with no intervening
> calls to any other function, allocate the same exact pieces of memory
> to hold all of the automatic variables, which still hold the values
> they had at the end of the previous call to that function.

Local variables that act as if they were static. Interesting
failure mode. :-)

One other "interesting failure mode" that I have seen in real code,
namely mh (the RAND "mail handler" suite), many years ago, when code
for Unix systems ran on PDP-11s and VAXen (and no other machines).

The code structure was something like this:

int somefunc() {
    register MSG *p;
    ...
    p = get_current_msg();
    ...
    if (otherfunc())
        do something;
    else
        do something else;
    ...
}

int otherfunc() {
    register MSG *cur;
    ...
    use(cur->field); /* without ever setting "cur" */
    ...
    return some result or another;
}

This code actually worked, because the "register" keyword told the
C compiler to use a machine register, and it always did. (There were
only two C compilers -- the PDP-11 C Compiler and the VAX C Compiler
-- at the time, so it was easy to say what "the C compiler" did.
Actually there were two PDP-11 C compilers, dmr's and scj's, but
both behaved the same way with regard to "register" variables.)

The code thus passed the register's value by "register inference".
On the PDP-11, the first three "register" variables went in r5,
r4, and r3. On the VAX, the first six "register" variables went
in r11, r10, r9, r8, r7, and r6. As long as the two pointers were
both in the same machine register, and the register was already
set in the caller (somefunc() above), that register held the correct
value in the callee (otherfunc() above). (The fact that the
register was callee-save -- pushed at entry to otherfunc(), and
popped again at exit -- was irrelevant since it was not modified
inside otherfunc().)

We (actually Fred Blonder) found the problem when we compiled mh
for the Pyramid, a machine on which registers worked quite differently.
Finding the bug was not too difficult -- it was obvious from code
inspection that the pointer in otherfunc() was never initialized
-- but figuring out why it worked at all on the VAX was interesting.
>Still, I prefer relying upon compile-time detection, rather than run-time.
Indeed, had gcc existed at the time, compiling with gcc (with
optimization and warnings both turned on) would have found the bug
right away, even before running the code. (Initializing the pointer
to NULL might not have found the problem, since on the VAX and
PDP-11, *(MSG *)NULL would not fault. Depending on what the
sub-function was doing, it might have *seemed* to be working, up
until cur->field was supposed to have some useful value anyway.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html
Aug 13 '08 #11
On 2008-08-13, DiAvOl <di****@freemail.gr> wrote:
> On Aug 13, 3:16 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> <snip>
>> Initializing variables with carefully chosen values could, in some
>> cases, cause the catastrophic failure to occur more reliably and in a
>> safer fashion than it would when they are uninitialized. However, many
>> compilers can diagnose the reading of uninitialized variables;
>> initializing them disables this feature, without actually solving the
>> underlying problem. I greatly prefer finding these problems by a
>> diagnostic message at compile time, than by a catastrophic failure at
>> run time. I prefer either of those, to not finding the bug at all,
>> because it has been masked by a "safe" initialization that leaves me
>> with a subtly-incorrect program.
>
> In case you miss the compiler warning though and use an uninitialized
> variable the program "may" work because the uninitialized pointer can
> point anywhere. On the other hand if you initialize the pointer to a
> NULL value and use it the buggy program will always crash with a
> segmentation fault.

Actually, we just found a bug in our code for our embedded controllers
that potentially dereferenced a NULL pointer (some code was outside an
if(pPointer != NULL) {} block and shouldn't have been), but we never
saw the bug actually cause any issues in real life, since the processor
we use does not have any memory protection.

Therefore, '0' was a valid pointer value and dereferencing it did not
cause any visible problems. Nothing reset, nothing failed, and when I
looked through the code, it looked like the value was checked to make
sure it was one of two specific values, and in the case that it /was/
one of those values, everything would be peachy, and if not an error
code would have been returned just as it would've if the NULL value
had been checked.

So NULL isn't always "safe".
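
A hedged sketch of the shape of that fix, with the dereference kept inside the NULL check so that a missing pointer takes the same error path as a bad value. `check_mode`, `MODE_A`, `MODE_B`, and `ERR_BAD_MODE` are hypothetical names standing in for the real controller code, which is not shown in this thread.

```c
#include <stddef.h>

enum { MODE_A = 1, MODE_B = 2 };   /* hypothetical "two specific values" */
#define ERR_BAD_MODE (-1)          /* hypothetical error code */

/* Returns 0 if p_mode points at one of the two accepted values,
   ERR_BAD_MODE if it is NULL or points at anything else. */
int check_mode(const int *p_mode)
{
    if (p_mode != NULL) {
        /* The dereference happens only inside the NULL check. */
        if (*p_mode == MODE_A || *p_mode == MODE_B)
            return 0;
    }
    return ERR_BAD_MODE;
}
```

On a target without memory protection, this version does not depend on whatever happens to be stored at address 0.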

--
Andrew Poelstra ap*******@wpsoftware.com
To email me, use the above email address with .com set to .net
Aug 14 '08 #12
On Wed, 13 Aug 2008 06:01:03 -0700 (PDT), DiAvOl <di****@freemail.gr>
wrote:
> On Aug 13, 3:16 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> <snip>
>> Initializing variables with carefully chosen values could, in some
>> cases, cause the catastrophic failure to occur more reliably and in a
>> safer fashion than it would when they are uninitialized. However, many
>> compilers can diagnose the reading of uninitialized variables;
>> initializing them disables this feature, without actually solving the
>> underlying problem. I greatly prefer finding these problems by a
>> diagnostic message at compile time, than by a catastrophic failure at
>> run time. I prefer either of those, to not finding the bug at all,
>> because it has been masked by a "safe" initialization that leaves me
>> with a subtly-incorrect program.
>
> In case you miss the compiler warning though and use an uninitialized
> variable the program "may" work because the uninitialized pointer can
> point anywhere. On the other hand if you initialize the pointer to a
> NULL value and use it the buggy program will always crash with a
> segmentation fault.

Only for a very limited meaning of the word "always" and a very
expansive meaning of the phrase "segmentation fault". There is a lot
more variety in the real world than what you use on your home or
office system.

--
Remove del for email
Aug 14 '08 #13
Chris Torek <no****@torek.net> writes:
> This code actually worked, because the "register" keyword told the
> C compiler to use a machine register, and it always did. (There were
> only two C compilers -- the PDP-11 C Compiler and the VAX C Compiler
> -- at the time, so it was easy to say what "the C compiler" did.
> Actually there were two PDP-11 C compilers, dmr's and scj's, but
> both behaved the same way with regard to "register" variables.)

Historical precisions.

Reading DMR's "The development of the C Programming Language" in HOPL-II.
DMR's compiler seems to have been ported to Honeywell 635 and IBM 360/370.
Then Steve Johnson wrote pcc at the same time as he, dmr and Thompson
ported Unix to Interdata 8/32. The port induced changes in the language
(unsigned, cast, tying struct members to the struct). The success of the
port led to another port to the VAX. (It isn't mentioned if the VAX compiler
was based on pcc or not -- which makes me think it probably was).

Yours,

--
Jean-Marc
Aug 14 '08 #14
>> Chris Torek <no****@torek.net> writes:
>> This code actually worked, because the "register" keyword told the
>> C compiler to use a machine register, and it always did. (There were
>> only two C compilers -- the PDP-11 C Compiler and the VAX C Compiler
>> -- at the time, so it was easy to say what "the C compiler" did.
>> Actually there were two PDP-11 C compilers, dmr's and scj's, but
>> both behaved the same way with regard to "register" variables.)

In article <87************@news.bourguet.org>
Jean-Marc Bourguet <jm@bourguet.org> wrote:
> Historical precisions.
>
> Reading DMR's "The development of the C Programming Language" in HOPL-II.
> DMR's compiler seems to have been ported to Honeywell 635 and IBM 360/370.
> Then Steve Johnson wrote pcc at the same time as he, dmr and Thompson
> ported Unix to Interdata 8/32. The port induced changes in the language
> (unsigned, cast, tying struct members to the struct).

Right. Before that point, you could get unsigned integers, by
using pointers (in dmr's original C compiler, pointers and integers
were freely interconvertible and one did things like 0177440->csr
to access hardware). There was no unsigned char or short, though.
Even "long" was a relatively late addition (V6 Unix used arrays of
two "int"s, hence the odd calling sequence for C's time() function).

> The success of the port led to another port to the VAX. (It isn't
> mentioned if the VAX compiler was based on pcc or not -- which
> makes me think it probably was).

It was. (Hence "two compilers".) PCC, the Portable C Compiler,
used a hand-written lexer, yacc to parse, and a table-driven code
generator (and an optional separate optimizer, "c2", written by
John Reiser, at least the VAX version). There were PCC back-ends
(i.e., tables) for at least the PDP-11, Interdata, and VAX (and
probably IBM 360, but I think Honeywell was never done, as the VAX
PCC rejected GECOS-style backquote constants, which never formally
made it into the C language either). More machines were added
later, when Unix got commercialized.

I never used or modified dmr's compiler, but it had a hand-written
recursive-descent parser and ad-hoc code generator.

I am not sure if there were several different "cpp" preprocessor
programs; the only one I ever dealt with was the Reiser version
(which, like c2, was nearly unmaintainable -- jfr's code was
atrocious).
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html
Aug 15 '08 #15
