Assume the main function is defined with
int main(int argc, char *argv[]) { /*...*/ }
So, is it permitted to modify the argv array? The standard says
"The parameters argc and argv and the strings pointed to by the
argv array shall be modifiable by the program,[...]". According to
my reading of the standard, for example, ++argv and ++argv[0][0]
are both permitted, but not ++argv[0] because it says nothing about
the argv array itself. Is my interpretation correct?
mnaydin said:
[...] According to my reading of the standard, for example, ++argv and ++argv[0][0] are both permitted, but not ++argv[0] because it says nothing about the argv array itself. Is my interpretation correct?
<caveat class="this is from memory, not the Standard">
I believe so, yes. You can modify argv because you get a
copy of the caller's value, so why should the caller care
what you do with it? You can modify the contents of each
string because there's no particular reason to forbid you
to, so long as you don't try to stretch the string - i.e.
scribble over or past the null terminator. But for all you
know, the implementation might have used dynamic allocation
to get the memory it needs for storing those strings, and
might have no spare copy of the pointer values returned by
the allocator - so (if I recall correctly) the Standard
doesn't offer any behaviour guarantees whatsoever if you
mess with those pointers.
</caveat>
--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999 http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
On 2005-12-15, Richard Heathfield <in*****@invalid.invalid> wrote:
[...] the Standard doesn't offer any behaviour guarantees whatsoever if you mess with those pointers.
Can you swap two of them? [suppose you want to bring all arguments
starting with '-' to the beginning of the array]
Jordan Abel wrote:
Can you swap two of them? [suppose you want to bring all arguments starting with '-' to the beginning of the array]
Not reliably. There are three different things one might
be talking about when one says `argv':
- The function parameter variable: This is modifiable.
- The individual pointers argv[0], argv[1], ... The
Standard says nothing about whether these are modifiable.
- The strings whose first characters are *argv[0],
*argv[1], ... The Standard says these are modifiable.
Section 5.1.2.2.1, paragraph 2, final constraint.
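A minimal sketch of the three cases listed above, assuming a hypothetical helper named capitalize_args (not from the thread) that is handed main's argc and argv. It changes its own copies of the parameters and the characters of the strings, both of which the quoted paragraph covers, and it only reads each argv[i] into a local pointer rather than storing into the array slots:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-cases the first character of every argument.
   Returns the number of strings that were changed. */
int capitalize_args(int argc, char **argv)
{
    int changed = 0;

    /* Case 1: argc and argv are this function's own copies of the
       parameters, so counting one down and stepping the other is fine. */
    for (; argc > 0 && *argv != NULL; --argc, ++argv) {
        /* Case 2: the pointer stored in the array is only READ here;
           nothing is ever assigned to an argv[i] slot. */
        char *s = *argv;

        /* Case 3: the characters of the string are written in place,
           without touching or moving the terminating '\0'. */
        if (islower((unsigned char)*s)) {
            *s = (char)toupper((unsigned char)*s);
            ++changed;
        }
    }
    return changed;
}
```

Called as capitalize_args(argc, argv) from main, this stays within what the Standard spells out; whether a store such as argv[i] = something is also covered is exactly the open question in this thread.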
--
Eric Sosman es*****@acm-dot-org.invalid
Given that there are no const keywords in use, one would expect that
argv is modifiable in any and all senses. Naturally, main is something
of an exception case, but even so, I trust that the people who
established the standard were reasonably sensible and rigorous people,
and if they had meant for something to be const, they would have used
the const keyword to so indicate.
As for compiler designers...
-bluejack
Jordan Abel wrote:
Can you swap two of them? [suppose you want to bring all arguments starting with '-' to the beginning of the array]
Yes, my primary intention is to bring some arguments to the beginning
of the array. But swapping two of them in the argv array is not a
solution, because the assignment argv[i] = argv[j] is not guaranteed to
work: the argv array may not be modifiable, as Richard and Eric said
in this thread. On the other hand, I thought this was common
practice. At least in K&R2 there is an example on page 117,
section 5.10, where argv[0] is modified, though for a different
purpose from mine. Interestingly, in the K&R1 version of the same
example, on page 113, section 5.11, argv[0] was not modified
and a pointer to char, named s, was used to loop through the string.
In any case, I think one easy and guaranteed solution is to
clone the original argv array and work on the clone,
something like this:
char **arglist = malloc((argc + 1) * sizeof *arglist);
if (arglist == NULL) ... Ouch ! ...
memcpy(arglist, argv, (argc + 1) * sizeof *arglist);
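Fleshing out the clone idea above as a sketch: clone_argv and options_first are hypothetical names, not taken from K&R or the Standard. The copy holds argc + 1 pointers (including the terminating null pointer), so reordering it never stores into argv[i]. Only the pointer array is copied; the strings are shared with the original.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the argv pointer array
   (argc pointers plus the terminating null pointer), or NULL on failure.
   The caller releases it with free(). */
char **clone_argv(int argc, char *argv[])
{
    size_t n = (size_t)argc + 1;
    char **copy = malloc(n * sizeof *copy);

    if (copy != NULL)
        memcpy(copy, argv, n * sizeof *copy);
    return copy;
}

/* Hypothetical helper: moves every argument that starts with '-' in front
   of the other arguments, keeping list[0] in place and preserving the
   relative order within both groups. Returns the number of options. */
int options_first(int argc, char **list)
{
    int next = 1;   /* slot where the next option will be placed */

    for (int i = 1; i < argc; ++i) {
        if (list[i][0] == '-') {
            char *opt = list[i];

            /* shift the non-options in list[next..i-1] up by one slot */
            memmove(&list[next + 1], &list[next],
                    (size_t)(i - next) * sizeof *list);
            list[next++] = opt;
        }
    }
    return next - 1;
}
```

In main this would be used along the lines of: char **arglist = clone_argv(argc, argv); check for NULL; then options_first(argc, arglist) and work with arglist from there on.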
bluejack wrote: Given that there are no const keywords in use, one would expect that argv is modifiable in any and all senses. Naturally, main is something of an exception case, but even so, I trust that the people who established the standard were reasonably sensible and rigorous people, and if they had meant for something to be const, they would have used the const keyword to so indicate.
On the other hand, the authors of the Standard stated
explicitly that the pointed-to strings are modifiable, even
though the "no `const' appears" argument would apply to them
with equal force. Why did they bother?
Keep in mind the large body of C code already in existence
before `const' entered the language. The ANSI committee could
not invalidate two-plus decades' worth of existing code because
they'd thought of a better way. They codified existing practice,
even though (with the new tools) more explicit practice was
possible.
It seems to me not unlike the situation with string literals:
They are not `const', yet you are forbidden to try to alter them.
The Rationale explains that they were not made `const' because a
lot of existing code would break; instead, they are non-`const'
and the Standard has special language warning you not to modify
them.
The argv question seems similar (although the Rationale does
not confirm it): Pre-`const' code declared argv as `char**', and
the Standard adopted that use but added special language describing
the writeability of argv[i][j]. I think it a "curious incident"
that the Standard says nothing about the writeability of argv[i].
--
Eric Sosman es*****@acm-dot-org.invalid
bluejack wrote: Given that there are no const keywords in use, one would expect that argv is modifiable in any and all senses. [...] if they had meant for something to be const, they would have used the const keyword to so indicate.
But, by the same logic, one could argue that the standard explicitly
states that the parameters argc and argv, and the strings pointed to
by the argv array, shall be modifiable, even though there is no const
keyword qualifying them, while nothing is stated about the modifiability
of the argv array itself (i.e., argv[0],...,argv[argc]). That is a
strong indication that the argv array is not supposed to be modifiable.
I think relying on the absence of the const keyword is not a valid
argument.
[You might like to quote some context. If your message was not related
to Eric Sosman's then perhaps you should reply to the OP's message
rather than somewhere downthread.]
bluejack wrote: Given that there are no const keywords in use, one would expect that argv is modifiable in any and all senses. I trust that the people who established the standard were reasonably sensible and rigorous people, and if they had meant for something to be const, they would have used the const keyword to so indicate.
That is a naive, even dangerous, form of reasoning. C has many quirks
which are counter-intuitive. Some of them are far from sensible, e.g.
gets.
Trusting (or blaming) the Committee is an irrelevance. At the end of
the day, the language is that written in the Standard. It is up to
programmers to educate themselves on what that language is.
C is one of the worst languages for programming by intuition and hope!
--
Peter
Peter Nilsson wrote: That is a naive, even dangerous, form of reasoning. C has many quirks which are counter-intuitive. Some of them are far from sensible, e.g. gets.
Granted.
Trusting (or blaming) the Committee is an irrelevance. At the end of the day, the language is that written in the Standard. It is up to programmers to educate themselves on what that language is.
And, while there are several good approaches to educating yourself
on what the language is ... and I realize this is going to endear me
to nobody ... my preferred method is "trial and error" -- despite my
"naive and dangerous" form of reasoning, it's a perfectly effective
approach, assuming you start out by trusting nobody. I don't trust
the standard (in part because there's no guarantee it has been
implemented correctly, but mostly because I don't have a copy),
I don't trust compiler designers (because they don't necessarily
implement correctly), I don't trust secondary documentation (it's
like a photocopy of a photocopy), I *certainly* don't trust usenet,
and I trust my own memory *least of all*. What I trust are demonstrable
results.
Naturally, with that mentality, I tend to code defensively. It would
never even occur to me to *want* to change argv (or use gets). Still I
do find these conversations fascinating, and I always enjoy the cranky
attitude found on usenet!
-bluejack
bluejack wrote: Given that there are no const keywords in use, one would expect that argv is modifiable in any and all senses. Naturally, main is something of an exception case, but even so, I trust that the people who established the standard were reasonably sensible and rigorous people, and if they had meant for something to be const, they would have used the const keyword to so indicate.
This is fairly meaningless due to the total lack of context. See
my sig below for a way to use the broken google interface sanely.
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
bluejack wrote:
Peter Nilsson wrote: That is a naive, even dangerous, form of reasoning. C has many quirks which are counter-intuitive. Some of them are far from sensible, e.g. gets.
Granted.
These quirks won't be learnt by trial and error. The *most* you will
learn is how the specific version of the specific implementation you are
using works.
Trusting (or blaming) the Committee is an irrelevance. At the end of the day, the language is that written in the Standard. It is up to programmers to educate themselves on what that language is.
And, while there are several good approaches to educating yourself on what the language is ... and I realize this is going to endear me to nobody ... my preferred method is "trial and error" -- despite my "naive and dangerous" form of reasoning, it's a perfectly effective approach,
No, it is most definitely NOT a perfectly effective method. All sorts of
things that you might think are correct, and might work on your compiler
this week, might fail abysmally when it actually matters to you.
assuming you start out by trusting nobody.
Start by not trusting trial and error, because it has repeatedly
been shown that people posting here who relied on it to learn C
have learnt to do things which are definitely wrong.
I don't trust the standard (in part because there's no guarantee it has been implemented correctly,
In that case build your own chip factory, design and build your own
chips, and write your own compiler.
but mostly because I don't have a copy),
Google for n1124.pdf to get a free public draft of the next version, or
buy a copy of the current version from a standards body (you can get it
for $18 last I heard).
I don't trust compiler designers (because they don't necessarily implement correctly),
In that case don't use any you have not implemented. You also can't
trust assemblers, text editors or the OS by that reasoning.
I don't trust secondary documentation (it's like a photocopy of a photocopy),
It is easy to find reviews of books to see if they are reliable, and you
can cross-reference to the standard if you are not sure.
I *certainly* don't trust usenet, and I trust my own memory *least of all*.
What I trust are demonstrable results.
I can demonstrate with one compiler that you can safely modify string
literals and get the expected result. I can also demonstrate with a
later version of the *same* compiler that you can't modify string
literals because it causes a SIGSEGV (I might be wrong on the exact
signal, but definitely a crash). The reality is that anything can happen
because it is undefined behaviour. However, had I relied on your method
of trial and error all my code could have suddenly gone from "working"
to "crashing".
If I could be bothered I could come up with lots of other examples, but
the above is one I know to be demonstrably true.
Naturally, with that mentality, I tend to code defensively.
Coding defensively REQUIRES understanding how the language is DEFINED to
work. What you are doing by relying on trial and error rather than a
reliable source of information is coding stupidly.
It would never even occur to me to *want* to change argv (or use gets). Still I do find these conversations fascinating, and I always enjoy the cranky attitude found on usenet!
Well, if you think trial and error is a substitute for a good text book
expect responses a lot more cranky than mine.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
On 2005-12-15, Eric Sosman <es*****@acm-dot-org.invalid> wrote:
bluejack wrote: Given that there are no const keywords in use, one would expect that argv is modifiable in any and all senses. [...]
On the other hand, the authors of the Standard stated explicitly that the pointed-to strings are modifiable, even though the "no `const' appears" argument would apply to them with equal force. Why did they bother?
Keep in mind the large body of C code already in existence before `const' entered the language. The ANSI committee could not invalidate two-plus decades' worth of existing code because they'd thought of a better way. They codified existing practice, even though (with the new tools) more explicit practice was possible.
They could have permitted an additional prototype:
int main(int argc, char * const *argv);
which I think they would have done if they had intended that the
pointers may not be modifiable.
It seems to me not unlike the situation with string literals: They are not `const', yet you are forbidden to try to alter them.
Except, of course, that you are inferring that by lack of analogy to the
explicit permission to write their targets, not from any actual language
in the standard.
The Rationale explains that they were not made `const' because a lot of existing code would break; instead, they are non-`const' and the Standard has special language warning you not to modify them.
The standard does not have such special language for the argv pointers.
The behavior in modifying a non-const variable that is not a string
literal and was not cast from the address of a const variable is
well-defined.
The argv question seems similar (although the Rationale does not confirm it): Pre-`const' code declared argv as `char**', and the Standard adopted that use but added special language describing the writeability of argv[i][j]. I think it a "curious incident" that the Standard says nothing about the writeability of argv[i].
I think it's more curious that it does add such language for the
writeability of argv[i][j], given that it's non-const (and not a string
literal) and hence "should" be modifiable anyway.
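For illustration only, here is what a function declared in the style of the char * const *argv prototype mentioned above would allow. count_options is a hypothetical name; the point is that a compiler must diagnose stores to argv[i] made through such a parameter, while stores to argv[i][j] are still accepted:

```c
#include <stddef.h>

/* Hypothetical helper: counts the arguments that start with '-'.
   The pointer array is seen as read-only through this parameter type;
   the characters the pointers refer to are still plain char. */
size_t count_options(int argc, char *const *argv)
{
    size_t n = 0;

    for (int i = 1; i < argc; ++i) {
        if (argv[i][0] == '-')
            ++n;
        /* argv[i] = argv[0];   would not compile: argv[i] is const-qualified */
        /* argv[i][0] = '+';    would compile: the pointed-to char is not const */
    }
    return n;
}
```

An ordinary char ** converts to char *const * implicitly, so main can pass its own argv to such a function without a cast.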
"Chuck F. " <cb********@yahoo.com> writes: bluejack wrote: Given that there are no const keywords in use, one would expect that argv is modifyable in any and all senses. Naturally, main is something of an exception case, but even so, I trust that the people who established the standard were reasonably sensible and rigorous people, and if they had meant for something to be const, they would have used the const keyword to so indicate.
This is fairly meaningless due to the total lack of context. See my sig below for a way to use the broken google interface sanely.
-- "If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers." - Keith Thompson
Or, better yet, read the more detailed description at
<http://cfaj.freeshell.org/google/>.
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
On Fri, 16 Dec 2005 01:11:09 +0000, Jordan Abel wrote: On 2005-12-15, Eric Sosman <es*****@acm-dot-org.invalid> wrote:
[on string literals as an analogy to argv] The Rationale explains that they were not made `const' because a lot of existing code would break; instead, they are non-`const' and the Standard has special language warning you not to modify them.
The standard does not have such special language for the argv pointers. The behavior in modifying a non-const variable that is not a string literal and was not cast from the address of a const variable is well-defined.
The ultimate declaration of the argv variable passed into the program is
not specified though, all the program gets is the declaration of the
function parameter.
It's legal to cast a const-qualified variable to a non-const version of
the same and pass it into a function, it's just not legal to write to it
within the function.
The argv question seems similar (although the Rationale does not confirm it): Pre-`const' code declared argv as `char**', and the Standard adopted that use but added special language describing the writeability of argv[i][j]. I think it a "curious incident" that the Standard says nothing about the writeability of argv[i].
I think it's more curious that it does add such language for the writeability of argv[i][j], given that it's non-const (and not a string literal) and hence "should" be modifiable anyway.
Without that language it would be implicitly undefined behaviour to
attempt to modify argv and argv[i][j], as it is now for argv[i]. The
curiosity is that the Standard left it implicit rather than making it
explicit.
-- http://members.dodo.com.au/~netocrat
On 2005-12-16, Netocrat <ne******@dodo.com.au> wrote:
[...]
The ultimate declaration of the argv variable passed into the program is not specified though, all the program gets is the declaration of the function parameter.
It's legal to cast a const-qualified variable to a non-const version of the same and pass it into a function, it's just not legal to write to it within the function.
There is, however, no basis in the text for supposing that this is the
case for *argv (...etc).
I think it's more curious that it does add such language for the writeability of argv[i][j], given that it's non-const (and not a string literal) and hence "should" be modifiable anyway.
Without that language it would be implicitly undefined behaviour
It would not. Without that language, **argv (...etc) would still be of
type char, not const char, and since it's not a string literal (a listed
exception to an object of type char being modifiable), there's no basis
for supposing that it would be non-modifiable.
to attempt to modify argv and argv[i][j], as it is now for argv[i]. The curiosity is that the Standard left it implicit rather than making it explicit.
There is no basis in the text for believing that it might be the case,
other than your interpretation of a conspicuous lack of a similar
statement for argv[i] as for argv[i][j].
On Tue, 20 Dec 2005 04:23:24 +0000, Jordan Abel wrote: On 2005-12-16, Netocrat <ne******@dodo.com.au> wrote:
[...] It's legal to cast a const-qualified variable to a non-const version of the same and pass it into a function, it's just not legal to write to it within the function.
[I worded the above sloppily. More correctly the first sentence should
begin: "It's legal to take the address of a const-declared variable, cast
it to a pointer to a non-const qualified version of the variable's type,
and pass that pointer into a function, ..."]
There is, however no basis in the text for supposing that this is the case for *argv (...etc).
(Assuming that you interpreted my sloppy wording as intended) I'd express
that in reverse: there's no basis in the text for supposing that the
variables passed into main() are uniquely unaffected by this possibility.
The mention that argc and argv are modifiable does seem redundant, but
useful clarification given that they are coming from an external
environment. The claim in my last post that it would be implicit UB to
attempt to modify argv without this mention may have been too strong, but
I'm not convinced that modifying argv[i][j] would be legal and defined
without mention.
-- http://members.dodo.com.au/~netocrat
comp.std.c added, it seems appropriate: for those who haven't been
following along, the issue is whether the text of the standard supports
a view that modification of the elements of argv [i.e. the individual
pointers] results in undefined behavior.
On 2005-12-20, Netocrat <ne******@dodo.com.au> wrote:
[...] It's legal to cast a const-qualified variable to a non-const version of the same and pass it into a function, it's just not legal to write to it within the function.
[I worded the above sloppily. More correctly the first sentence should begin: "It's legal to take the address of a const-declared variable, cast it to a pointer to a non-const qualified version of the variable's type, and pass that pointer into a function, ..."]
But there's no reason to think that this has been done by whatever calls
main.
There is, however no basis in the text for supposing that this is the case for *argv (...etc).
(Assuming that you interpreted my sloppy wording as intended) I'd express that in reverse: there's no basis in the text for supposing that the variables passed into main() are uniquely unaffected by this possibility.
It's hardly unique.
Jordan Abel wrote: comp.std.c added, it seems appropriate: for those who haven't been following along, the issue is whether the text of the standard supports a view that modification of the elements of argv [i.e. the individual pointers] results in undefined behavior.
Yes.
Section 5.1.2.2.1 of ISO/IEC 9899:1999 seems quite explicit:
The parameters argc and argv and the strings pointed to by the argv
array shall be modifiable by the program, and retain their last-stored
values between program startup and program termination.
There's a note about the conventional but non-mandatory use of argc and
argv as the names of the parameters. Looks pretty clear to me...
--
Jonathan Leffler #include <disclaimer.h>
Email: jl******@earthlink.net, jl******@us.ibm.com
Guardian of DBD::Informix v2005.02 -- http://dbi.perl.org/
>Jordan Abel wrote: comp.std.c added, it seems appropriate: for those who haven't been following along, the issue is whether the text of the standard supports a view that modification of the elements of argv [i.e. the individual pointers] results in undefined behavior.
In article <Uu*****************@newsread3.news.pas.earthlink.net>
Jonathan Leffler <jl******@earthlink.net> wrote:
Yes.
I think this claim is a bit premature....
Section 5.1.2.2.1 of the ISO/IEC 9899:1999 seems quite explicit:
The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.
That text guarantees that, in the "code" part of:
int main(int argc, char **argv) {
... code ...
}
the programmer may change argv itself (though this is hardly
controversial) and, for appropriate values of i and j, the programmer
may change argv[i][j] by ordinary assignment. Thus, e.g., the code
fragment below is fine given suitable i, p, and q:
/* suppose at this point, strcmp(argv[i], "this:that") == 0 */
p = argv[i];
p[4] = '\0';
q = p + 5;
/* now strcmp(p, "this") == 0 && strcmp(q, "that") == 0 */
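The fragment above can be wrapped in a small function; split_at_colon is a hypothetical name used only for illustration. It writes a single character inside the string (an argv[i][j]) and never stores into argv[i] itself, and it does not write at or beyond the existing terminator:

```c
#include <string.h>

/* Hypothetical helper: splits "key:value" in place at the first ':'.
   Returns a pointer to the part after the colon, or NULL if the
   string contains no colon (in which case nothing is modified). */
char *split_at_colon(char *arg)
{
    char *colon = strchr(arg, ':');

    if (colon == NULL)
        return NULL;
    *colon = '\0';      /* overwrite one existing character */
    return colon + 1;   /* the value part shares the same storage */
}
```

With p = argv[i] and q = split_at_colon(p), an argument "this:that" leaves p comparing equal to "this" and q comparing equal to "that".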
The question in question (it is late, pardon the phrasing :-) ) is
whether this is also proper, given suitable i, k, and p:
p = argv[i];
argv[i] = argv[k];
argv[k] = p;
This writes on argv[i] and argv[k], rather than argv[i][j]. The
fact that the Standard explicitly allows the programmer to write
on argv[i][j] should make one wonder why it fails to mention whether
the programmer may write on argv[i] itself. The lack of a "const"
qualifier is not in itself permission, since:
void f(void) {
char *x = "this:that";
x[4] = '\0';
...
}
violates a "shall" outside a constraints section, rendering the
behavior undefined, yet no part of the declaration of x uses "const".
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
[within an arbitrary function, it can't be determined whether the
pointed-to objects of any non-const-qualified pointer parameters were
originally declared const]
Jordan Abel wrote: But there's no reason to think that this has been done by whatever calls main.
Irrelevant unless you can point to a specific prohibition against it or
something similar (i.e. analogous to string literals) for argv[i].
-- http://members.dodo.com.au/~netocrat
On 2005-12-20, Netocrat <ne******@dodo.com.au> wrote:
[...]
Jordan Abel wrote: But there's no reason to think that this has been done by whatever calls main.
Irrelevant unless you can point to a specific prohibition against it or something similar (i.e. analogous to string literals) for argv[i].
Eh? I was asserting that you _can_ modify the array [as there is no
specific prohibition against doing so - as far as I can tell you are the
one who needs to find a specific prohibition analogous to string
literals]
On Tue, 20 Dec 2005 10:48:32 +0000, Jordan Abel wrote:
On 2005-12-20, Netocrat <ne******@dodo.com.au> wrote:
[...]
Jordan Abel wrote: But there's no reason to think that this has been done by whatever calls main.
Irrelevant unless you can point to a specific prohibition against it or something similar (i.e. analogous to string literals) for argv[i].
Eh?
For "it", substitute "passing a pointer to an object originally defined
with a const qualifier". I'm guessing as to where the misunderstanding
lies - that may clarify it. I interpreted the sentence to which I was
responding in a way that the same fragment could (inelegantly) also
substitute for "this".
I was asserting that you _can_ modify the array [as there is no specific prohibition against doing so - as far as I can tell you are the one who needs to find a specific prohibition analogous to string literals]
Try this code snippet for a thought experiment:
/* allocates memory and assigns argument strings and count */
char *const *init_new_args(int *new_argc);

int main(int argc, char **argv) {
    int new_argc;
    char *const *new_argv = init_new_args(&new_argc);
    static int first_time = 1;

    if (first_time--)
        main(new_argc, (char **)new_argv);
    /* else argv[0] = a_char_pointer; */
    /* illegal attempt to write to an object that was
       defined with a const qualifier */
    return 0;
}
On the internal call, main is passed an argv parameter that was defined
with const-qualified members (i.e. argv[i] is non-modifiable). This const
qualifier has been (legally) cast away; nevertheless as commented in the
code it is illegal to write to the originally const-qualified pointer
objects.
The rules of the Standard allow this situation for an internal call. I
see nothing to indicate that the function main() is conceptually unique,
other than that it's the program entry point. So there's nothing to
prevent an equivalent situation for an external call; hence without a
mention otherwise by the Standard, modifying argv[i] is UB.
-- http://members.dodo.com.au/~netocrat
On 2005-12-21, Netocrat <ne******@dodo.com.au> wrote: On Tue, 20 Dec 2005 10:48:32 +0000, Jordan Abel wrote: On 2005-12-20, Netocrat <ne******@dodo.com.au> wrote: [within an arbitrary function, it can't be determined whether the pointed-to objects of any non-const-qualified pointer parameters were originally declared const]
Jordan Abel wrote: But there's no reason to think that this has been done by whatever calls main.
Irrelevant unless you can point to a specific prohibition against it or something similar (i.e. analogous to string literals) for argv[i]. Eh?
For "it", substitute "passing a pointer to an object originally defined with a const qualifier". I'm guessing as to where the misunderstanding lies - that may clarify it. I interpreted the sentence to which I was responding in a way that the same fragment could (inelegantly) also substitute for "this".
I was asserting that you _can_ modify the array [as there is no specific prohibition against doing so - as far as I can tell, you are the one who needs to find a specific prohibition analogous to string literals]
Try this code snippet for a thought experiment:
/* allocates memory and assigns argument strings and count */ char *const *init_new_args(int *new_argc);
int main(int argc, char **argv) { int new_argc; char *const *new_argv = init_new_args(&new_argc); static int first_time = 1; if (first_time--) main(new_argc, (char **)new_argv); /* else argv[0] = a_char_pointer; */ /* illegal attempt to write to an object that was defined with a const qualifier */ return 0; }
On the internal call, main is passed an argv parameter that was defined with const-qualified members (i.e. argv[i] is non-modifiable). This const qualifier has been (legally) cast away; nevertheless as commented in the code it is illegal to write to the originally const-qualified pointer objects.
The rules of the Standard allow this situation for an internal call.
I believe that for this to be allowed for values that "come from" the
depths of the library requires explicit permission [that is, for the
standard to say that the result, though a char *, may not be modified] -
there is such an explicit statement on getenv, for example. Anywhere
that a pointer non-const object of unknown origin is permitted to have
been cast from a const one, the standard has explicit language that
says this may be the case.
I see nothing to indicate that the function main() is conceptually unique, other than that it's the program entry point. So there's nothing to prevent an equivalent situation for an external call; hence without a mention otherwise by the Standard, modifying argv[i] is UB.
On Wed, 21 Dec 2005 09:11:48 +0000, Jordan Abel wrote:
[...] I believe that for this to be allowed for values that "come from" the depths of the library requires explicit permission [that is, for the standard to say that the result, though a char *, may not be modified] - there is such an explicit statement on getenv, for example.
That's definitely a useful statement. If it weren't present though, then
modifying the string returned by getenv would be (implicit) UB anyway,
because the Standard would not be prohibiting an implementation from
returning an immutable string, therefore portable programs couldn't rely
on the string's mutability.
This doesn't apply to most other char* returning library functions (e.g.
in string.h) since the returned pointer points within an object whose
definition is under the programmer's control - or that was at least
provided to the function by the programmer.
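As a rough illustration of what this means in practice: a program that wants to edit an environment value can copy it first and leave the string returned by getenv() alone. This is only a sketch; the helper name copy_env_value is hypothetical and not part of any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd, writable copy of the value of
   the environment variable `name`, or NULL if it is unset or allocation
   fails. The caller owns the copy and must free() it. */
char *copy_env_value(const char *name)
{
    const char *value;   /* treat getenv's result as read-only */
    char *copy;
    size_t len;

    assert(name != NULL);
    value = getenv(name);
    if (value == NULL)
        return NULL;

    len = strlen(value);
    copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, value, len + 1);   /* includes the terminating null */
    return copy;
}
```

The copy is an object the program allocated itself, so writing to it raises none of the mutability questions discussed here.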
-- http://members.dodo.com.au/~netocrat
On 2005-12-21, Netocrat <ne******@dodo.com.au> wrote: On Wed, 21 Dec 2005 09:11:48 +0000, Jordan Abel wrote: [...] I believe that for this to be allowed for values that "come from" the depths of the library requires explicit permission [that is, for the standard to say that the result, though a char *, may not be modified] - there is such an explicit statement on getenv, for example.
That's definitely a useful statement. If it weren't present though, then modifying the string returned by getenv would be (implicit) UB anyway, because the Standard would not be prohibiting an implementation from returning an immutable string, therefore portable programs couldn't rely on the string's mutability.
But it only has to explicitly state it because it doesn't return const
bluejack wrote: Naturally, with that mentality, I tend to code defensively. It would never even occur to me to *want* to change argv (or use gets). Still I do find these conversations fascinating, and I always enjoy the cranky attitude found on usenet!
-bluejack
I code defensively too. However, the client I am working for is
implementing many C programs that access a database. The vendor
designed these programs to accept the database user login and password
on the command line. This is unfortunate because using the UNIX (it's a
UNIX operating system) ps command you can see the arguments, and the
client is security concious. As an experiment I wrote a function that
allocated a replacement array of pointers. To each element of the array
I allocated memory and copied the string, with the exception of argv[0].
I then went through the original array an null terminated each string
pointed to by argv with the exception of argv[0]. I then overlayed argv
with the copy of that I had made. Presto, all of the command line
arguments disappear when using the ps command, and the program still has
its own private copy of the arguments. Management was estatic.
However, I warned that this technique might not work on other UNIXes or
work with later releases of the UNIX they are using now. They didn't
care. Their immediate problem was solved.
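The technique described above might look roughly like the following. This is a sketch, not the original code: the function name scrub_args is hypothetical, and it assumes that "null-terminated each string" means overwriting the original characters with nulls. It writes only to the strings (which 5.1.2.2.1 says are modifiable) and never assigns to the elements argv[i].

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds a private copy of argv[1..argc-1], then
   overwrites the original argument strings with null characters.
   Returns a new NULL-terminated array whose element 0 is the original
   argv[0] pointer, or NULL on allocation failure (in which case the
   originals are left untouched). */
char **scrub_args(int argc, char **argv)
{
    char **copy;
    int i;

    assert(argc >= 1 && argv != NULL);   /* sketch assumes argv[0] exists */
    copy = malloc(((size_t)argc + 1) * sizeof *copy);
    if (copy == NULL)
        return NULL;

    copy[0] = argv[0];
    for (i = 1; i < argc; i++) {
        size_t len = strlen(argv[i]);
        copy[i] = malloc(len + 1);
        if (copy[i] == NULL) {
            while (--i >= 1)             /* release the copies made so far */
                free(copy[i]);
            free(copy);
            return NULL;
        }
        memcpy(copy[i], argv[i], len + 1);
    }
    copy[argc] = NULL;

    /* Wipe the originals only after every copy has succeeded. */
    for (i = 1; i < argc; i++)
        memset(argv[i], 0, strlen(argv[i]));

    return copy;
}
```

In main this would be used as `char **mine = scrub_args(argc, argv); if (mine != NULL) argv = mine;`, which modifies only the parameter argv. Whether ps then stops showing the arguments depends entirely on the system.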
Regards,
Stan Milam.
On Wed, 21 Dec 2005 23:12:56 +0000, Jordan Abel wrote: On 2005-12-21, Netocrat <ne******@dodo.com.au> wrote: On Wed, 21 Dec 2005 09:11:48 +0000, Jordan Abel wrote: [...] I believe that for this to be allowed for values that "come from" the depths of the library requires explicit permission [that is, for the standard to say that the result, though a char *, may not be modified] - there is such an explicit statement on getenv, for example.
That's definitely a useful statement. If it weren't present though, then modifying the string returned by getenv would be (implicit) UB anyway, because the Standard would not be prohibiting an implementation from returning an immutable string, therefore portable programs couldn't rely on the string's mutability.
But it only has to explicitly state it because it doesn't return const
It doesn't have to explicitly state it in either case, although it would
be more visibly redundant to state it if getenv() did return const.
Probably it doesn't return const for historical reasons and/or to make it
easy for an implementation to define the returned string as mutable if it
wants to.
You seem to genuinely object to the concept of implicit UB in general and
I can't see us finding mutual understanding without something more than
the brief back-and-forths thus far. So here's a "from first principles"
view of mutability which you may be able to use to identify your objection
more clearly:
Firstly we have 6.3.2.1#1 - which defines a "modifiable lvalue" as any
non-const-qualified object expression - and 6.5.16#2 - which defines the
assignment operators to require a modifiable lvalue as a target. The
other Standard-defined method of writing to an object is by passing its
address to a library function.
At this point[*], the programmer can without restriction write to
(almost[**]) any object by legally casting away any constness - there is
nothing to prevent the write methods from doing their (well-defined) jobs.
So if you were here to raise your claim that argv[i] is modifiable in the
absence of other mention by the Standard, I'd say you'd have an arguable
point: the concept that an object may be immutable is thus far not even a
part of the universe of discourse.
Let's then discover 6.7.3#5 which makes it explicit UB to attempt to write
to an object whose definition is const-qualified, even if done through a
modifiable lvalue, and 6.4.5#6 which does the same for string literals.
/Now/ there exists the concept that "some objects are guaranteed to be
mutable, and others are not guaranteed (but may be anyway on any given
implementation)". At this point, we can only be assured that the write
methods will work portably for objects whose definition was neither
const-qualified nor a string literal. So we cannot be sure that it is
safe to write to any object whose definition we do not have access to, or
whose definition we are not supplied with - if the Standard doesn't
explicitly make it clear what applies for such objects then writing to
them is implicit UB.
Two such (Standard-mandated) objects are the elements argv[i] and the
string returned as getenv()'s result, although as you've noted the
Standard helpfully makes the undefinedness of writing to the latter
explicit. It is, as stated earlier in the thread, curious that the
Standard does not do the same for argv[i] - even though it wouldn't affect
the end result.
[*] meaning in reference to a version of the Standard with the clauses
later "discovered" excised
[**] the exception is register-qualified objects, whose address may not be
taken, so that:
*(T *)&const_register_object_of_type_T
will not work as it will for any non-register qualified object.
-- http://members.dodo.com.au/~netocrat
In article <ig*******************@newssvr29.news.prodigy.net>
Stan Milam <st*****@swbell.net> wrote (in part): ... the client [wanted the arguments not to show up in "ps" output, and saving-then-overwriting the argv data accomplished that]. Management was ecstatic. However, I warned that this technique might not work on other UNIXes or work with later releases of the UNIX they are using now. They didn't care. Their immediate problem was solved.
For what it is worth (not all that much perhaps), your warning is
well-taken: it does not in fact work on some Unix variants. (There
is also a more subtle problem: even where it works, there is still
a race condition. The arguments can be captured by a "ps" command
at the right time.)
Still, for non-portable problems, non-portable solutions are
often appropriate.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
The answer to the original question is that of course the elements
of the array (of length equal to the value of main's int parameter)
pointed to by main's char** parameter can be modified. If we had
intended otherwise we would have used const qualification at the
appropriate position (between the *s) when specifying the interface.
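To make "between the *s" concrete, here is a sketch contrasting the two parameter types. The function names advance_first and bump_first_char are hypothetical. The sketch shows only what the compiler accepts for each declaration; it does not settle what an implementation guarantees about the array actually passed to main.

```c
#include <assert.h>
#include <stddef.h>

/* The interface as specified for main: the elements are plain char *. */
void advance_first(char **args)
{
    assert(args != NULL && args[0] != NULL && args[0][0] != '\0');
    ++args[0];              /* compiles: args[0] is a modifiable lvalue */
}

/* const "between the *s": the elements are read-only through args. */
char bump_first_char(char *const *args)
{
    assert(args != NULL && args[0] != NULL);
    /* ++args[0]; */        /* would not compile: args[0] is const here */
    return ++args[0][0];    /* allowed: the characters are not const    */
}
```

With the second declaration, ++argv[0] would be rejected at compile time, while ++argv and ++argv[0][0] would still be accepted.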
"DAGwyn" <DA****@null.net> writes: The answer to the original question is that of course the elements of the array (of length equal to the value of main's int parameter) pointed to by main's char** parameter can be modified. If we had intended otherwise we would have used const qualification at the appropriate position (between the *s) when specifying the interface.
Then why does 5.1.2.2.1p2 explicitly state that the program can modify
argc, argv, and the strings pointed to by the argv array, but not make
the same statement about the elements of the argv array? Was it just
an oversight?
There is precedent (string literals) for making something not
explicitly const, but not allowing it to be modified.
If you're correct, then an implementation that makes argv a pointer to
an array (of char*) in read-only memory (or at least memory that can't
be modified by the program) would be non-conforming, but it's not
quite obvious (to me) from the standard.
And Doug, if you're going to post through Google Groups, please read
<http://cfaj.freeshell.org/google/>. Thanks, and Merry Christmas.
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Doug Gwyn writes: The answer to the original question is that of course the elements of the array (of length equal to the value of main's int parameter) pointed to by main's char** parameter can be modified. If we had intended otherwise we would have used const qualification at the appropriate position (between the *s) when specifying the interface.
Keith Thompson wrote: Then why does 5.1.2.2.1p2 explicitly state that the program can modify argc, argv, and the strings pointed to by the argv array, but not make the same statement about the elements of the argv array? Was it just an oversight?
If you're correct, then an implementation that makes argv a pointer to an array (of char*) in read-only memory (or at least memory that can't be modified by the program) would be non-conforming, but it's not quite obvious (to me) from the standard.
FWIW, I submitted a public comment on this issue (Feb 1998): http://david.tribble.com/text/c9xc004.txt
I suggested that since no mention of the argv elements was made,
to add a sentence making it implementation-defined whether they
are modifiable or not.
At the time I submitted the comment, I believe there were systems
that did not allow argv[i] to be modified, and it was not clear that
such systems were or were not conforming.
My comment was rejected, IIRC, because such verbiage was "not
necessary".
-drt
Doug Gwyn writes: The answer to the original question is that of course the elements of the array (of length equal to the value of main's int parameter) pointed to by main's char** parameter can be modified. If we had intended otherwise we would have used const qualification at the appropriate position (between the *s) when specifying the interface.
Keith Thompson wrote: Then why does 5.1.2.2.1p2 explicitly state that the program can modify argc, argv, and the strings pointed to by the argv array, but not make the same statement about the elements of the argv array? Was it just an oversight?
FWIW, I submitted a public comment about that very issue
(Feb 1998): http://david.tribble.com/text/c9xc004.txt
I suggested that it be deemed implementation-defined whether
or not argv[i] is modifiable. My comment was rejected, IIRC,
because the extra verbiage was not necessary.
If you're correct, then an implementation that makes argv a pointer to an array (of char*) in read-only memory (or at least memory that can't be modified by the program) would be non-conforming, but it's not quite obvious (to me) from the standard.
At the time I submitted my comment, there did exist systems that
did not allow modifying argv[i], and it was not clear whether or not
those systems were conforming.
-drt