Is the argv array modifiable?

mnaydin

Assume the main function is defined with
int main(int argc, char *argv[]) { /*...*/ }

So, is it permitted to modify the argv array? The standard says
"The parameters argc and argv and the strings pointed to by the
argv array shall be modifiable by the program,[...]". According to
my reading of the standard, for example, ++argv and ++argv[0][0]
are both permitted, but not ++argv[0] because it says nothing about
the argv array itself. Is my interpretation correct?

Dec 15 '05 #1


Richard Heathfield

mnaydin said:

Assume the main function is defined with
int main(int argc, char *argv[]) { /*...*/ }

So, is it permitted to modify the argv array? The standard says
"The parameters argc and argv and the strings pointed to by the
argv array shall be modifiable by the program,[...]". According to
my reading of the standard, for example, ++argv and ++argv[0][0]
are both permitted, but not ++argv[0] because it says nothing about
the argv array itself. Is my interpretation correct ?

<caveat class="this is from memory, not the Standard">
I believe so, yes. You can modify argv because you get a
copy of the caller's value, so why should the caller care
what you do with it? You can modify the contents of each
string because there's no particular reason to forbid you
to, so long as you don't try to stretch the string - i.e.
scribble over or past the null terminator. But for all you
know, the implementation might have used dynamic allocation
to get the memory it needs for storing those strings, and
might have no spare copy of the pointer values returned by
the allocator - so (if I recall correctly) the Standard
doesn't offer any behaviour guarantees whatsoever if you
mess with those pointers.
</caveat>

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Dec 15 '05 #2

Jordan Abel

On 2005-12-15, Richard Heathfield <in*****@invalid.invalid> wrote:

mnaydin said:
Assume the main function is defined with
int main(int argc, char *argv[]) { /*...*/ }

So, is it permitted to modify the argv array? The standard says
"The parameters argc and argv and the strings pointed to by the
argv array shall be modifiable by the program,[...]". According to
my reading of the standard, for example, ++argv and ++argv[0][0]
are both permitted, but not ++argv[0] because it says nothing about
the argv array itself. Is my interpretation correct ?

<caveat class="this is from memory, not the Standard">
I believe so, yes. You can modify argv because you get a
copy of the caller's value, so why should the caller care
what you do with it? You can modify the contents of each
string because there's no particular reason to forbid you
to, so long as you don't try to stretch the string - i.e.
scribble over or past the null terminator. But for all you
know, the implementation might have used dynamic allocation
to get the memory it needs for storing those strings, and
might have no spare copy of the pointer values returned by
the allocator - so (if I recall correctly) the Standard
doesn't offer any behaviour guarantees whatsoever if you
mess with those pointers.
</caveat>

Can you swap two of them? [suppose you want to bring all arguments
starting with '-' to the beginning of the array]

Dec 15 '05 #3

Eric Sosman

Jordan Abel wrote:

On 2005-12-15, Richard Heathfield <in*****@invalid.invalid> wrote:
mnaydin said:

Assume the main function is defined with
int main(int argc, char *argv[]) { /*...*/ }

So, is it permitted to modify the argv array? The standard says
"The parameters argc and argv and the strings pointed to by the
argv array shall be modifiable by the program,[...]". According to
my reading of the standard, for example, ++argv and ++argv[0][0]
are both permitted, but not ++argv[0] because it says nothing about
the argv array itself. Is my interpretation correct ?

<caveat class="this is from memory, not the Standard">
I believe so, yes. You can modify argv because you get a
copy of the caller's value, so why should the caller care
what you do with it? You can modify the contents of each
string because there's no particular reason to forbid you
to, so long as you don't try to stretch the string - i.e.
scribble over or past the null terminator. But for all you
know, the implementation might have used dynamic allocation
to get the memory it needs for storing those strings, and
might have no spare copy of the pointer values returned by
the allocator - so (if I recall correctly) the Standard
doesn't offer any behaviour guarantees whatsoever if you
mess with those pointers.
</caveat>

Can you swap two of them? [suppose you want to bring all arguments
starting with '-' to the beginning of the array]

Not reliably. There are three different things one might
be talking about when one says `argv':

- The function parameter variable: This is modifiable.

- The individual pointers argv[0], argv[1], ... The
Standard says nothing about whether these are modifiable.

- The strings whose first characters are *argv[0],
*argv[1], ... The Standard says these are modifiable.

Section 5.1.2.2.1, paragraph 2, final constraint.
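A minimal sketch of the three cases listed above. The helper names `skip_program_name` and `upcase_first_letters` are hypothetical and come from neither the Standard nor this thread. The helpers touch only the parameter copy and the characters of the strings; the element assignment that the Standard is silent about is left as a comment.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case 1: the parameter itself. argv here is a local copy of the
   caller's pointer, so advancing it affects nobody else. */
char **skip_program_name(int *argc, char **argv)
{
    assert(argc != NULL && argv != NULL);
    if (*argc > 0) {
        ++argv;      /* modifies only this function's copy */
        --*argc;
    }
    return argv;
}

/* Case 3: the characters of the strings. The Standard says these are
   modifiable; this only overwrites characters already in the string. */
void upcase_first_letters(int argc, char **argv)
{
    for (int i = 0; i < argc; i++) {
        if (argv[i][0] != '\0')
            argv[i][0] = (char)toupper((unsigned char)argv[i][0]);
    }
}

/* Case 2: the pointers themselves, e.g. argv[0] = argv[1];
   Nothing in 5.1.2.2.1 promises this works on the array handed to
   main, so this sketch deliberately avoids it. */
```

Called from main as `argv = skip_program_name(&argc, argv);` followed by `upcase_first_letters(argc, argv);`, this stays within what the quoted sentence explicitly allows.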

--
Eric Sosman
es*****@acm-dot-org.invalid

Dec 15 '05 #4

bluejack

Given that there are no const keywords in use, one would expect that
argv is modifiable in any and all senses. Naturally, main is something
of an exception case, but even so, I trust that the people who
established the standard were reasonably sensible and rigorous people,
and if they had meant for something to be const, they would have used
the const keyword to so indicate.

As for compiler designers...

-bluejack

Dec 15 '05 #5

mnaydin

Jordan Abel wrote:

On 2005-12-15, Richard Heathfield <in*****@invalid.invalid> wrote:
mnaydin said:
Assume the main function is defined with
int main(int argc, char *argv[]) { /*...*/ }

So, is it permitted to modify the argv array? The standard says
"The parameters argc and argv and the strings pointed to by the
argv array shall be modifiable by the program,[...]". According to
my reading of the standard, for example, ++argv and ++argv[0][0]
are both permitted, but not ++argv[0] because it says nothing about
the argv array itself. Is my interpretation correct ?

<caveat class="this is from memory, not the Standard">
I believe so, yes. You can modify argv because you get a
copy of the caller's value, so why should the caller care
what you do with it? You can modify the contents of each
string because there's no particular reason to forbid you
to, so long as you don't try to stretch the string - i.e.
scribble over or past the null terminator. But for all you
know, the implementation might have used dynamic allocation
to get the memory it needs for storing those strings, and
might have no spare copy of the pointer values returned by
the allocator - so (if I recall correctly) the Standard
doesn't offer any behaviour guarantees whatsoever if you
mess with those pointers.
</caveat>

Can you swap two of them? [suppose you want to bring all arguments
starting with '-' to the beginning of the array]

Yes, my primary intention is to bring some arguments to the beginning
of the array. But swapping two of them in the argv array is not a
solution, because the assignment argv[i] = argv[j] is not guaranteed to
work: the argv array may not be modifiable, as Richard and Eric said
in this thread. On the other hand, I thought this was common
practice. At least in K&R2 there is an example on page 117,
section 5.10, where argv[0] is modified, though for a different
purpose from mine. Interestingly, in the K&R1 version of the same
example, on page 113, section 5.11, argv[0] was not modified
and a pointer to char, named s, was used to loop through the string.

In any case, I think one easy and guaranteed solution is to
clone the original argv array and work on the clone,
something like this:
/* argc + 1 elements, so the terminating null pointer is copied too */
char **arglist = malloc((argc + 1) * sizeof *arglist);
if (arglist == NULL)
    return EXIT_FAILURE; /* Ouch! */
memcpy(arglist, argv, (argc + 1) * sizeof *arglist);
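A fuller sketch of that idea, under some assumptions: the function names `clone_arg_pointers` and `options_first` are made up for illustration, and only the pointer array is copied, so the strings are still shared with argv. Reordering then happens entirely in memory the program allocated itself, which avoids the question of whether argv[i] may be written.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed copy of the argc + 1 pointers in argv (the strings
   themselves are shared, not copied), or NULL on allocation failure.
   The caller releases the result with free(). */
char **clone_arg_pointers(int argc, char **argv)
{
    assert(argc >= 0 && argv != NULL);
    size_t count = (size_t)argc + 1;          /* include terminating NULL */
    char **copy = malloc(count * sizeof *copy);
    if (copy != NULL)
        memcpy(copy, argv, count * sizeof *copy);
    return copy;
}

/* Stable in-place partition of list[1..n-1]: entries starting with '-'
   move to the front, keeping their relative order. list[0] stays put. */
void options_first(int n, char **list)
{
    int next = 1;                             /* slot for the next option */
    for (int i = 1; i < n; i++) {
        if (list[i][0] == '-') {
            char *opt = list[i];
            /* shift the non-options between next and i up by one slot */
            memmove(&list[next + 1], &list[next],
                    (size_t)(i - next) * sizeof *list);
            list[next++] = opt;
        }
    }
}
```

The partition is stable on purpose: options keep their command-line order, which matters if later options are meant to override earlier ones.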

Dec 15 '05 #6

Eric Sosman

bluejack wrote:

Given that there are no const keywords in use, one would expect that
argv is modifyable in any and all senses. Naturally, main is something
of an exception case, but even so, I trust that the people who
established the standard were reasonably sensible and rigorous people,
and if they had meant for something to be const, they would have used
the const keyword to so indicate.

On the other hand, the authors of the Standard stated
explicitly that the pointed-to strings are modifiable, even
though the "no `const' appears" argument would apply to them
with equal force. Why did they bother?

Keep in mind the large body of C code already in existence
before `const' entered the language. The ANSI committee could
not invalidate two-plus decades' worth of existing code because
they'd thought of a better way. They codified existing practice,
even though (with the new tools) more explicit practice was
possible.

It seems to me not unlike the situation with string literals:
They are not `const', yet you are forbidden to try to alter them.
The Rationale explains that they were not made `const' because a
lot of existing code would break; instead, they are non-`const'
and the Standard has special language warning you not to modify
them.

The argv question seems similar (although the Rationale does
not confirm it): Pre-`const' code declared argv as `char**', and
the Standard adopted that use but added special language describing
the writeability of argv[i][j]. I think it a "curious incident"
that the Standard says nothing about the writeability of argv[i].

--
Eric Sosman
es*****@acm-dot-org.invalid

Dec 15 '05 #7

mnaydin

bluejack wrote:

Given that there are no const keywords in use, one would expect that
argv is modifyable in any and all senses. Naturally, main is something
of an exception case, but even so, I trust that the people who
established the standard were reasonably sensible and rigorous people,
and if they had meant for something to be const, they would have used
the const keyword to so indicate.

As for compiler designers...

-bluejack

But, by the same logic, one could argue that it is explicitly stated in
the standard that the parameters argc, argv, and the strings pointed to
by argv array shall be modifiable, even though there is no const
keyword qualifying them, but nothing is stated on the modifiability
of the argv array itself (ie, argv[0],...,argv[argc]), so there is a
strong indication that the argv array is not supposed to be modifiable.
I think relying on the absence of the const keyword is not a valid
argument.

Dec 15 '05 #8

Peter Nilsson

[You might like to quote some context. If your message was not related
to Eric Sosman's then perhaps you should reply to the OP's message
rather than somewhere downthread.]

bluejack wrote:

Given that there are no const keywords in use, one would expect
that argv is modifyable in any and all senses. I trust that the
people who established the standard were reasonably sensible and
rigorous people, and if they had meant for something to be const,
they would have used the const keyword to so indicate.

That is a naive, even dangerous, form of reasoning. C has many quirks
which are counter-intuitive. Some of them are far from sensible, e.g.
gets.

Trusting (or blaming) the Committee is an irrelevance. At the end of
the day, the language is that written in the Standard. It is up to
programmers to educate themselves on what that language is.

C is one of the worst languages for programming by intuition and hope!

--
Peter

Dec 15 '05 #9

bluejack

Peter Nilsson wrote:

That is a naive, even dangerous, form of reasoning. C has many quirks
which are counter-intuitive. Some of them are far from sensible, e.g.
gets.

Granted.

Trusting (or blaming) the Committee is an irrelevance. At the end of
the day, the language is that written in the Standard. It is up to
programmers to educate themselves on what that language is.

And, while there are several good approaches to educating yourself
on what the language is ... and I realize this is going to endear me
to nobody ... my preferred method is "trial and error" -- despite my
"naive and dangerous" form of reasoning, it's a perfectly effective
approach, assuming you start out by trusting nobody. I don't trust
the standard (in part because there's no guarantee it has been
implemented correctly, but mostly because I don't have a copy),
I don't trust compiler designers (because they don't necessarily
implement correctly), I don't trust secondary documentation (it's
like a photocopy of a photocopy), I *certainly* don't trust usenet,
and I trust my own memory *least of all*. What I trust are demonstrable
results.

Naturally, with that mentality, I tend to code defensively. It would
never even occur to me to *want* to change argv (or use gets). Still I
do find these conversations fascinating, and I always enjoy the cranky
attitude found on usenet!

-bluejack

Dec 15 '05 #10

Chuck F.

bluejack wrote:

Given that there are no const keywords in use, one would expect
that argv is modifyable in any and all senses. Naturally, main
is something of an exception case, but even so, I trust that the
people who established the standard were reasonably sensible and
rigorous people, and if they had meant for something to be const,
they would have used the const keyword to so indicate.

This is fairly meaningless due to the total lack of context. See
my sig below for a way to use the broken google interface sanely.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Dec 15 '05 #11

Flash Gordon

bluejack wrote:

Peter Nilsson wrote:
That is a naive, even dangerous, form of reasoning. C has many quirks
which are counter-intuitive. Some of them are far from sensible, e.g.
gets.

Granted.

These quirks won't be learnt by trial and error. The *most* you will
learn is how the specific version of the specific implementation you are
using works.

Trusting (or blaming) the Committee is an irrelevance. At the end of
the day, the language is that written in the Standard. It is up to
programmers to educate themselves on what that language is.

And, while there are several good approaches to educating yourself
on what the language is ... and I realize this is going to endear me
to nobody ... my preferred method is "trial and error" -- despite my
"naive and dangerous" form of reasoning, it's a perfectly effective
approach,

No, it is most definitely NOT a perfectly effective method. All sorts of
things that you might think are correct, and might work on your compiler
this week, might fail abysmally when it actually matters to you.

assuming you start out by trusting nobody.

Start by not trusting trial and error, because it has repeatedly been
shown that people posting here who relied on it to learn C have learnt
to do things which are definitely wrong.

I don't trust
the standard (in part because there's no guarantee it has been
implemented correctly,

In that case build your own chip factory, design and build your own
chips, and write your own compiler.

but mostly because I don't have a copy),

Google for n1124.pdf to get a free public draft of the next version, or
buy a copy of the current version from a standards body (you can get it
for $18 last I heard).

I don't trust compiler designers (because they don't necessarily
implement correctly),

In that case don't use any you have not implemented. You also can't
trust assemblers, text editors or the OS by that reasoning.

I don't trust secondary documentation (it's
like a photocopy of a photocopy),

It is easy to find reviews of books to see if they are reliable, and you
can cross-reference to the standard if you are not sure.

I *certainly* don't trust usenet,
and I trust my own memory *least of all*. What I trust are demonstrable
results.

I can demonstrate with one compiler that you can safely modify string
literals and get the expected result. I can also demonstrate with a
later version of the *same* compiler that you can't modify string
literals because it causes a SIGSEGV (I might be wrong on the exact
signal, but definitely a crash). The reality is that anything can happen
because it is undefined behaviour. However, had I relied on your method
of trial and error all my code could have suddenly gone from "working"
to "crashing".

If I could be bothered I could come up with lots of other examples, but
the above is one I know to be demonstrably true.

Naturally, with that mentality, I tend to code defensively.

Coding defensively REQUIRES understanding how the language is DEFINED to
work. What you are doing by relying on trial and error rather than a
reliable source of information is coding stupidly.

It would
never even occur to me to *want* to change argv (or use gets). Still I
do find these conversations fascinating, and I always enjoy the cranky
attitude found on usenet!

Well, if you think trial and error is a substitute for a good text book
expect responses a lot more cranky than mine.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.

Dec 16 '05 #12

Jordan Abel

On 2005-12-15, Eric Sosman <es*****@acm-dot-org.invalid> wrote:

bluejack wrote:
Given that there are no const keywords in use, one would expect that
argv is modifyable in any and all senses. Naturally, main is something
of an exception case, but even so, I trust that the people who
established the standard were reasonably sensible and rigorous people,
and if they had meant for something to be const, they would have used
the const keyword to so indicate.

On the other hand, the authors of the Standard stated
explicitly that the pointed-to strings are modifiable, even
though the "no `const' appears" argument would apply to them
with equal force. Why did they bother?

Keep in mind the large body of C code already in existence
before `const' entered the language. The ANSI committee could not
invalidate two-plus decades' worth of existing code because they'd
thought of a better way. They codified existing practice, even though
(with the new tools) more explicit practice was possible.

They could have permitted an additional prototype:

int main(int argc, char * const *argv);

which I think they would have done if they had intended that the
pointers may not be modifiable.
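To make the effect of that const placement concrete, here is a small sketch with a hypothetical function `upcase_all` (not from the thread or the Standard). With a `char *const *` parameter the compiler accepts writes to the characters but rejects writes to the pointers, which is the distinction such a prototype would have expressed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* With char *const *args, each args[i] is a const pointer to
   modifiable char: the characters may change, the pointers may not. */
void upcase_all(int n, char *const *args)
{
    assert(n >= 0 && args != NULL);
    for (int i = 0; i < n; i++) {
        for (char *c = args[i]; *c != '\0'; c++)
            *c = (char)toupper((unsigned char)*c);   /* allowed */
        /* args[i] = NULL;  <- constraint violation: args[i] is const */
    }
}
```

A plain `char **` such as main's argv converts to `char *const *` implicitly, so a call like `upcase_all(argc, argv)` needs no cast.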

It seems to me not unlike the situation with string literals:
They are not `const', yet you are forbidden to try to alter them.

Except, of course, that you are inferring that by lack of analogy to the
explicit permission to write their targets, not from any actual language
in the standard.

The
Rationale explains that they were not made `const' because a lot of
existing code would break; instead, they are non-`const' and the
Standard has special language warning you not to modify them.

The standard does not have such special language for the argv pointers.
The behavior in modifying a non-const variable that is not a string
literal and was not cast from the address of a const variable is
well-defined.

The argv question seems similar (although the Rationale does not
confirm it): Pre-`const' code declared argv as `char**', and the
Standard adopted that use but added special language describing the
writeability of argv[i][j]. I think it a "curious incident" that the
Standard says nothing about the writeability of argv[i].

I think it's more curious that it does add such language for the
writeability of argv[i][j], given that it's non-const (and not a string
literal) and hence "should" be modifiable anyway.

Dec 16 '05 #13

Keith Thompson

"Chuck F. " <cb********@yahoo.com> writes:

bluejack wrote:
Given that there are no const keywords in use, one would expect
that argv is modifyable in any and all senses. Naturally, main
is something of an exception case, but even so, I trust that the
people who established the standard were reasonably sensible and
rigorous people, and if they had meant for something to be const,
they would have used the const keyword to so indicate.

This is fairly meaningless due to the total lack of context. See my
sig below for a way to use the broken google interface sanely.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Or, better yet, read the more detailed description at
<http://cfaj.freeshell.org/google/>.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Dec 16 '05 #14

Netocrat

On Fri, 16 Dec 2005 01:11:09 +0000, Jordan Abel wrote:

On 2005-12-15, Eric Sosman <es*****@acm-dot-org.invalid> wrote:

[on string literals as an analogy to argv]

The Rationale explains that they were not made `const' because a lot of
existing code would break; instead, they are non-`const' and the
Standard has special language warning you not to modify them.

The standard does not have such special language for the argv pointers.
The behavior in modifying a non-const variable that is not a string
literal and was not cast from the address of a const variable is
well-defined.

The ultimate declaration of the argv variable passed into the program is
not specified though, all the program gets is the declaration of the
function parameter.

It's legal to cast a const-qualified variable to a non-const version of
the same and pass it into a function, it's just not legal to write to it
within the function.

The argv question seems similar (although the Rationale does not
confirm it): Pre-`const' code declared argv as `char**', and the
Standard adopted that use but added special language describing the
writeability of argv[i][j]. I think it a "curious incident" that the
Standard says nothing about the writeability of argv[i].

I think it's more curious that it does add such language for the
writeability of argv[i][j], given that it's non-const (and not a string
literal) and hence "should" be modifiable anyway.

Without that language it would be implicitly undefined behaviour to
attempt to modify argv and argv[i][j], as it is now for argv[i]. The
curiosity is that the Standard left it implicit rather than making it
explicit.

--
http://members.dodo.com.au/~netocrat

Dec 16 '05 #15

Jordan Abel

On 2005-12-16, Netocrat <ne******@dodo.com.au> wrote:

On Fri, 16 Dec 2005 01:11:09 +0000, Jordan Abel wrote:
On 2005-12-15, Eric Sosman <es*****@acm-dot-org.invalid> wrote:

[on string literals as an analogy to argv]
The Rationale explains that they were not made `const' because a lot of
existing code would break; instead, they are non-`const' and the
Standard has special language warning you not to modify them.

The standard does not have such special language for the argv pointers.
The behavior in modifying a non-const variable that is not a string
literal and was not cast from the address of a const variable is
well-defined.

The ultimate declaration of the argv variable passed into the program is
not specified though, all the program gets is the declaration of the
function parameter.

It's legal to cast a const-qualified variable to a non-const version of
the same and pass it into a function, it's just not legal to write to it
within the function.

There is, however, no basis in the text for supposing that this is the
case for *argv (...etc).

I think it's more curious that it does add such language for the
writeability of argv[i][j], given that it's non-const (and not a string
literal) and hence "should" be modifiable anyway.

Without that language it would be implicitly undefined behaviour

It would not. Without that language, **argv (...etc) would still be of
type char, not const char, and since it's not a string literal (a listed
exception to an object of type char being modifiable), there's no basis
for supposing that it would be non-modifiable.

to attempt to modify argv and argv[i][j], as it is now for argv[i].
The curiosity is that the Standard left it implicit rather than making
it explicit.

There is no basis in the text for believing that it might be the case,
other than your interpretation of a conspicuous lack of a similar
statement for argv[i] as for argv[i][j].

Dec 20 '05 #16

Netocrat

On Tue, 20 Dec 2005 04:23:24 +0000, Jordan Abel wrote:

On 2005-12-16, Netocrat <ne******@dodo.com.au> wrote: [...]
It's legal to cast a const-qualified variable to a non-const version of
the same and pass it into a function, it's just not legal to write to
it within the function.

[I worded the above sloppily. More correctly the first sentence should
begin: "It's legal to take the address of a const-declared variable, cast
it to a pointer to a non-const qualified version of the variable's type,
and pass that pointer into a function, ..."]

There is, however no basis in the text for supposing that this is the
case for *argv (...etc).

(Assuming that you interpreted my sloppy wording as intended) I'd express
that in reverse: there's no basis in the text for supposing that the
variables passed into main() are uniquely unaffected by this possibility.

The mention that argc and argv are modifiable does seem redundant, but
useful clarification given that they are coming from an external
environment. The claim in my last post that it would be implicit UB to
attempt to modify argv without this mention may have been too strong, but
I'm not convinced that modifying argv[i][j] would be legal and defined
without mention.

--
http://members.dodo.com.au/~netocrat

Dec 20 '05 #17

Jordan Abel

comp.std.c added, it seems appropriate: for those who haven't been
following along, the issue is whether the text of the standard supports
a view that modification of the elements of argv [i.e. the individual
pointers] results in undefined behavior.

On 2005-12-20, Netocrat <ne******@dodo.com.au> wrote:

On Tue, 20 Dec 2005 04:23:24 +0000, Jordan Abel wrote:
On 2005-12-16, Netocrat <ne******@dodo.com.au> wrote:

[...]
It's legal to cast a const-qualified variable to a non-const version
of the same and pass it into a function, it's just not legal to
write to it within the function.

[I worded the above sloppily. More correctly the first sentence
should begin: "It's legal to take the address of a const-declared
variable, cast it to a pointer to a non-const qualified version of the
variable's type, and pass that pointer into a function, ..."]

But there's no reason to think that this has been done by whatever calls
main.

There is, however no basis in the text for supposing that this is the
case for *argv (...etc).

(Assuming that you interpreted my sloppy wording as intended) I'd
express that in reverse: there's no basis in the text for supposing
that the variables passed into main() are uniquely unaffected by this
possibility.

It's hardly unique.

Dec 20 '05 #18

Jonathan Leffler

Jordan Abel wrote:

comp.std.c added, it seems appropriate: for those who haven't been
following along, the issue is whether the text of the standard supports
a view that modification of the elements of argv [i.e. the individual
pointers] results in undefined behavior.

Yes.

Section 5.1.2.2.1 of the ISO/IEC 9899:1999 seems quite explicit:

The parameters argc and argv and the strings pointed to by the argv
array shall be modifiable by the program, and retain their last-stored
values between program startup and program termination.

There's a note about the conventional but non-mandatory use of argc and
argv as the names of the parameters. Looks pretty clear to me...
--
Jonathan Leffler #include <disclaimer.h>
Email: jl******@earthlink.net, jl******@us.ibm.com
Guardian of DBD::Informix v2005.02 -- http://dbi.perl.org/

Dec 20 '05 #19

Chris Torek

>Jordan Abel wrote:

comp.std.c added, it seems appropriate: for those who haven't been
following along, the issue is whether the text of the standard supports
a view that modification of the elements of argv [i.e. the individual
pointers] results in undefined behavior.

In article <Uu*****************@newsread3.news.pas.earthlink.net>
Jonathan Leffler <jl******@earthlink.net> wrote:

Yes.

I think this claim is a bit premature....

Section 5.1.2.2.1 of the ISO/IEC 9899:1999 seems quite explicit:

The parameters argc and argv and the strings pointed to by the argv
array shall be modifiable by the program, and retain their last-stored
values between program startup and program termination.

That text guarantees that, in the "code" part of:

int main(int argc, char **argv) {
    ... code ...
}

the programmer may change argv itself (though this is hardly
controversial) and, for appropriate values of i and j, the programmer
may change argv[i][j] by ordinary assignment. Thus, e.g., the code
fragment below is fine given suitable i, p, and q:

/* suppose at this point, strcmp(argv[i], "this:that") == 0 */
p = argv[i];
p[4] = '\0';
q = p + 5;
/* now strcmp(p, "this") == 0 && strcmp(q, "that") == 0 */
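The fragment above can be wrapped into a small helper. This is only a sketch: `split_in_place` is a hypothetical name, and it assumes the caller passes a string the program is allowed to write, such as one of the strings argv points to. It only ever overwrites an existing character with a null, so it never writes past the end of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split s in place at the first occurrence of sep. On success the
   separator is overwritten with '\0' and a pointer to the text after
   it is returned; if sep is absent, s is untouched and NULL is returned. */
char *split_in_place(char *s, char sep)
{
    assert(s != NULL && sep != '\0');
    char *hit = strchr(s, sep);
    if (hit == NULL)
        return NULL;
    *hit = '\0';        /* shortens the string; never writes past its end */
    return hit + 1;
}
```

With `p = argv[i]; q = split_in_place(p, ':');` this reproduces the "this" / "that" result described in the comments of the fragment.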

The question in question (it is late, pardon the phrasing :-) ) is
whether this is also proper, given suitable i, k, and p:

p = argv[i];
argv[i] = argv[k];
argv[k] = p;

This writes on argv[i] and argv[k], rather than argv[i][j]. The
fact that the Standard explicitly allows the programmer to write
on argv[i][j] should make one wonder why it fails to mention whether
the programmer may write on argv[i] itself. The lack of a "const"
qualifier is not in itself permission, since:

void f(void) {
    char *x = "this:that";

    x[4] = '\0';
    ...
}

violates a "shall" outside a constraints section, rendering the
behavior undefined, yet no part of the declaration of x uses "const".
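For contrast, a sketch of the well-defined counterpart to f(). The names `truncate_at` and `demo_writable_copy` are invented for this illustration. The only difference from f() is that the text lives in an array initialised from the literal, which the program owns and may write, rather than in the literal itself.

```c
#include <assert.h>
#include <string.h>

/* Truncate the string s at index n. Only defined when s points to
   storage the program may write, not to a string literal. */
void truncate_at(char *s, size_t n)
{
    assert(s != NULL && n <= strlen(s));
    s[n] = '\0';
}

/* Returns the length left after truncating a writable copy. */
size_t demo_writable_copy(void)
{
    char buf[] = "this:that";    /* array initialised from the literal */
    /* char *x = "this:that"; truncate_at(x, 4);  <- undefined behaviour */
    truncate_at(buf, 4);
    return strlen(buf);
}
```

Both `buf` and `x` would have a type with no const in it, which is the point being made: the declaration alone does not say whether the write is permitted.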
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Dec 20 '05 #20

Netocrat

[within an arbitrary function, it can't be determined whether the
pointed-to objects of any non-const-qualified pointer parameters were
originally declared const]

Jordan Abel wrote:

But there's no reason to think that this has been done by whatever calls
main.

Irrelevant unless you can point to a specific prohibition against it or
something similar (i.e. analogous to string literals) for argv[i].

--
http://members.dodo.com.au/~netocrat

Dec 20 '05 #21

Jordan Abel

On 2005-12-20, Netocrat <ne******@dodo.com.au> wrote:

[within an arbitrary function, it can't be determined whether the
pointed-to objects of any non-const-qualified pointer parameters were
originally declared const]

Jordan Abel wrote:
But there's no reason to think that this has been done by whatever calls
main.

Irrelevant unless you can point to a specific prohibition against it or
something similar (i.e. analogous to string literals) for argv[i].

Eh? I was asserting that you _can_ modify the array [as there is no
specific prohibition against doing so - as far as I can tell you are the
one who needs to find a specific prohibition analogous to string
literals]

Dec 20 '05 #22

Netocrat

On Tue, 20 Dec 2005 10:48:32 +0000, Jordan Abel wrote:

On 2005-12-20, Netocrat <ne******@dodo.com.au> wrote:
[within an arbitrary function, it can't be determined whether the
pointed-to objects of any non-const-qualified pointer parameters were
originally declared const]

Jordan Abel wrote:
But there's no reason to think that this has been done by whatever calls
main.
Irrelevant unless you can point to a specific prohibition against it or
something similar (i.e. analogous to string literals) for argv[i].

Eh?

For "it", substitute "passing a pointer to an object originally defined
with a const qualifier". I'm guessing as to where the misunderstanding
lies - that may clarify it. I interpreted the sentence to which I was
responding in a way that the same fragment could (inelegantly) also
substitute for "this".

I was asserting that you _can_ modify the array [as there is no
specific prohibition against doing so - as far as I can tell you are the
one who needs to find a specific prohibition analogous to string
literals]

Try this code snippet for a thought experiment:

/* allocates memory and assigns argument strings and count */
char *const *init_new_args(int *new_argc);

int main(int argc, char **argv) {
    int new_argc;
    char *const *new_argv = init_new_args(&new_argc);
    static int first_time = 1;
    if (first_time--)
        main(new_argc, (char **)new_argv);
    /* else argv[0] = a_char_pointer; */ /* illegal attempt
                                            to write to an object that was
                                            defined with a const qualifier */
    return 0;
}

On the internal call, main is passed an argv parameter that was defined
with const-qualified members (i.e. argv[i] is non-modifiable). This const
qualifier has been (legally) cast away; nevertheless as commented in the
code it is illegal to write to the originally const-qualified pointer
objects.

The rules of the Standard allow this situation for an internal call. I
see nothing to indicate that the function main() is conceptually unique,
other than that it's the program entry point. So there's nothing to
prevent an equivalent situation for an external call; hence without a
mention otherwise by the Standard, modifying argv[i] is UB.

--
http://members.dodo.com.au/~netocrat

Dec 21 '05 #23

Jordan Abel

On 2005-12-21, Netocrat <ne******@dodo.com.au> wrote:

On Tue, 20 Dec 2005 10:48:32 +0000, Jordan Abel wrote:
On 2005-12-20, Netocrat <ne******@dodo.com.au> wrote:
[within an arbitrary function, it can't be determined whether the
pointed-to objects of any non-const-qualified pointer parameters were
originally declared const]

Jordan Abel wrote:
But there's no reason to think that this has been done by whatever calls
main.

Irrelevant unless you can point to a specific prohibition against it or
something similar (i.e. analogous to string literals) for argv[i].

Eh?

For "it", substitute "passing a pointer to an object originally defined
with a const qualifier". I'm guessing as to where the misunderstanding
lies - that may clarify it. I interpreted the sentence to which I was
responding in a way that the same fragment could (inelegantly) also
substitute for "this".

I was asserting that you _can_ modify the array [as there is no
specific prohibition against doing so - as far as I can tell you are the
one who needs to find a specific prohibition analogous to string
literals]

Try this code snippet for a thought experiment:

/* allocates memory and assigns argument strings and count */
char *const *init_new_args(int *new_argc);

int main(int argc, char **argv) {
    int new_argc;
    char *const *new_argv = init_new_args(&new_argc);
    static int first_time = 1;
    if (first_time--)
        main(new_argc, (char **)new_argv);
    /* else argv[0] = a_char_pointer; */ /* illegal attempt
                                            to write to an object that was
                                            defined with a const qualifier */
    return 0;
}

On the internal call, main is passed an argv parameter that was defined
with const-qualified members (i.e. argv[i] is non-modifiable). This const
qualifier has been (legally) cast away; nevertheless as commented in the
code it is illegal to write to the originally const-qualified pointer
objects.

The rules of the Standard allow this situation for an internal call.

I believe that for this to be allowed for values that "come from" the
depths of the library requires explicit permission [that is, for the
standard to say that the result, though a char *, may not be modified] -
there is such an explicit statement on getenv, for example. Anywhere
that a pointer to a non-const object of unknown origin is permitted to have
been cast from a const one, the standard has explicit language that
says this may be the case.

I see nothing to indicate that the function main() is conceptually
unique, other than that it's the program entry point. So there's
nothing to prevent an equivalent situation for an external call; hence
without a mention otherwise by the Standard, modifying argv[i] is UB.

Dec 21 '05 #24

Netocrat

On Wed, 21 Dec 2005 09:11:48 +0000, Jordan Abel wrote:
[...]

I believe that for this to be allowed for values that "come from" the
depths of the library requires explicit permission [that is, for the
standard to say that the result, though a char *, may not be modified] -
there is such an explicit statement on getenv, for example.

That's definitely a useful statement. If it weren't present though, then
modifying the string returned by getenv would be (implicit) UB anyway,
because the Standard would not be prohibiting an implementation from
returning an immutable string, therefore portable programs couldn't rely
on the string's mutability.
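As an illustration of what that means for portable code, here is a minimal sketch (not from the thread) that copies the getenv() result before changing anything. The helper name getenv_copy is hypothetical; getenv, strlen, malloc and memcpy are the only library calls used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd, writable copy of the value of
   the environment variable `name`, or NULL if the variable is unset or
   the allocation fails. The caller frees the result. */
static char *getenv_copy(const char *name)
{
    assert(name != NULL);

    const char *value = getenv(name);   /* must not be written through */
    if (value == NULL)
        return NULL;

    size_t len = strlen(value) + 1;     /* include the terminator */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, value, len);
    return copy;
}
```

The copy is an object the program allocated itself, so writing to it is well defined whatever the implementation does with the original string.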

This doesn't apply to most other char* returning library functions (e.g.
in string.h) since the returned pointer points within an object whose
definition is under the programmer's control - or that was at least
provided to the function by the programmer.
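A small sketch of that case, assuming a hypothetical helper replace_first: strchr returns a pointer into the buffer the caller supplied, so whether the write is safe depends only on how the caller defined that buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: replaces the first occurrence of `from` in the
   caller's own writable buffer. */
static void replace_first(char *buf, char from, char to)
{
    assert(buf != NULL);

    if (from == '\0')
        return;                      /* never overwrite the terminator */

    char *hit = strchr(buf, from);   /* points inside buf, or is NULL */
    if (hit != NULL)
        *hit = to;
}
```

Called with an array such as char buf[] = "a-b-c"; the write is fine. Called with a string literal it would be undefined, but that is visible at the call site, unlike the argv[i] case.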

--
http://members.dodo.com.au/~netocrat

Dec 21 '05 #25

Jordan Abel

On 2005-12-21, Netocrat <ne******@dodo.com.au> wrote:

On Wed, 21 Dec 2005 09:11:48 +0000, Jordan Abel wrote:
[...]
I believe that for this to be allowed for values that "come from" the
depths of the library requires explicit permission [that is, for the
standard to say that the result, though a char *, may not be modified] -
there is such an explicit statement on getenv, for example.

That's definitely a useful statement. If it weren't present though, then
modifying the string returned by getenv would be (implicit) UB anyway,
because the Standard would not be prohibiting an implementation from
returning an immutable string, therefore portable programs couldn't rely
on the string's mutability.

But it only has to explicitly state it because it doesn't return const

Dec 21 '05 #26

Stan Milam

bluejack wrote:

Naturally, with that mentality, I tend to code defensively. It would
never even occur to me to *want* to change argv (or use gets). Still I do
find these conversations fascinating, and I always enjoy the cranky
attitude found on usenet!

-bluejack

I code defensively too. However, the client I am working for is
implementing many C programs that access a database. The vendor
designed these programs to accept the database user login and password
on the command line. This is unfortunate because using the UNIX (it's a
UNIX operating system) ps command you can see the arguments, and the
client is security conscious. As an experiment I wrote a function that
allocated a replacement array of pointers. To each element of the array
I allocated memory and copied the string, with the exception of argv[0].
I then went through the original array and null-terminated each string
pointed to by argv with the exception of argv[0]. I then overlaid argv
with the copy that I had made. Presto, all of the command line
arguments disappear when using the ps command, and the program still has
its own private copy of the arguments. Management was ecstatic.
However, I warned that this technique might not work on other UNIXes or
work with later releases of the UNIX they are using now. They didn't
care. Their immediate problem was solved.
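A rough sketch of the approach described above, not the original code. The names copy_args and scrub_args are hypothetical, and "null-terminated each string" is read here as overwriting every character with '\0', which is one possible interpretation. It writes only to the strings, never past their terminators, and leaves the argv[i] pointers themselves alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a private, NULL-terminated copy of
   argv[0..argc-1], or NULL if any allocation fails. */
static char **copy_args(int argc, char **argv)
{
    assert(argc >= 0 && argv != NULL);

    char **copy = malloc(((size_t)argc + 1) * sizeof *copy);
    if (copy == NULL)
        return NULL;

    for (int i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]) + 1;
        copy[i] = malloc(len);
        if (copy[i] == NULL) {
            while (i-- > 0)          /* release what was allocated so far */
                free(copy[i]);
            free(copy);
            return NULL;
        }
        memcpy(copy[i], argv[i], len);
    }
    copy[argc] = NULL;
    return copy;
}

/* Hypothetical helper: blanks every original argument except argv[0],
   staying within each string's existing length. */
static void scrub_args(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
        memset(argv[i], 0, strlen(argv[i]));
}
```

In main this would be used as: call copy_args(argc, argv), and if it succeeds call scrub_args(argc, argv) and then assign the copy to the argv parameter. Whether ps then stops showing the arguments is entirely system-specific.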

Regards,
Stan Milam.

Dec 22 '05 #27

Netocrat

On Wed, 21 Dec 2005 23:12:56 +0000, Jordan Abel wrote:

On 2005-12-21, Netocrat <ne******@dodo.com.au> wrote:
On Wed, 21 Dec 2005 09:11:48 +0000, Jordan Abel wrote:
[...]
I believe that for this to be allowed for values that "come from" the
depths of the library requires explicit permission [that is, for the
standard to say that the result, though a char *, may not be modified]
- there is such an explicit statement on getenv, for example.

That's definitely a useful statement. If it weren't present though,
then modifying the string returned by getenv would be (implicit) UB
anyway, because the Standard would not be prohibiting an implementation
from returning an immutable string, therefore portable programs
couldn't rely on the string's mutability.

But it only has to explicitly state it because it doesn't return const

It doesn't have to explicitly state it in either case, although it would
be more visibly redundant to state it if getenv() did return const.
Probably it doesn't return const for historical reasons and/or to make it
easy for an implementation to define the returned string as mutable if it
wants to.

You seem to genuinely object to the concept of implicit UB in general and
I can't see us finding mutual understanding without something more than
the brief back-and-forths thus far. So here's a "from first principles"
view of mutability which you may be able to use to identify your objection
more clearly:

Firstly we have 6.3.2.1#1 - which defines a "modifiable lvalue" as any
non-const-qualified object expression - and 6.5.16#2 - which defines the
assignment operators to require a modifiable lvalue as a target. The
other Standard-defined method of writing to an object is by passing its
address to a library function.

At this point[*], the programmer can without restriction write to
(almost[**]) any object by legally casting away any constness - there is
nothing to prevent the write methods from doing their (well-defined) jobs.
So if you were here to raise your claim that argv[i] is modifiable in the
absence of other mention by the Standard, I'd say you'd have an arguable
point: the concept that an object may be immutable is thus far not even a
part of the universe of discourse.

Let's then discover 6.7.3#5 which makes it explicit UB to attempt to write
to an object whose definition is const-qualified, even if done through a
modifiable lvalue, and 6.4.5#6 which does the same for string literals.

/Now/ there exists the concept that "some objects are guaranteed to be
mutable, and others are not guaranteed (but may be anyway on any given
implementation)". At this point, we can only be assured that the write
methods will work portably for objects whose definition was neither
const-qualified nor a string literal. So we cannot be sure that it is
safe to write to any object whose definition we do not have access to, or
whose definition we are not supplied with - if the Standard doesn't
explicitly make it clear what applies for such objects then writing to
them is implicit UB.

Two such (Standard-mandated) objects are the elements argv[i] and the
string returned as getenv()'s result, although as you've noted the
Standard helpfully makes the undefinedness of writing to the latter
explicit. It is, as stated earlier in the thread, curious that the
Standard does not do the same for argv[i] - even though it wouldn't affect
the end result.

[*] meaning in reference to a version of the Standard with the clauses
later "discovered" excised

[**] the exception is register-qualified objects, whose address may not be
taken, so that:
*(T *)&const_register_object_of_type_T
will not work as it will for any non-register qualified object.

--
http://members.dodo.com.au/~netocrat

Dec 22 '05 #28

Chris Torek

In article <ig*******************@newssvr29.news.prodigy.net>
Stan Milam <st*****@swbell.net> wrote (in part):

... the client [wanted the arguments not to show up in "ps" output,
and saving-then-overwriting the argv data accomplished that].
Management was ecstatic. However, I warned that this technique
might not work on other UNIXes or work with later releases
of the UNIX they are using now. They didn't care. Their
immediate problem was solved.

For what it is worth (not all that much perhaps), your warning is
well-taken: it does not in fact work on some Unix variants. (There
is also a more subtle problem: even where it works, there is still
a race condition. The arguments can be captured by a "ps" command
at the right time.)

Still, for non-portable problems, non-portable solutions are
often appropriate.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Dec 22 '05 #29

DAGwyn

The answer to the original question is that of course the elements
of the array (of length equal to the value of main's int parameter)
pointed to by main's char** parameter can be modified. If we had
intended otherwise we would have used const qualification at the
appropriate position (between the *s) when specifying the interface.

Dec 25 '05 #30

Keith Thompson

"DAGwyn" <DA****@null.net> writes:

The answer to the original question is that of course the elements
of the array (of length equal to the value of main's int parameter)
pointed to by main's char** parameter can be modified. If we had
intended otherwise we would have used const qualification at the
appropriate position (between the *s) when specifying the interface.

Then why does 5.1.2.2.1p2 explicitly state that the program can modify
argc, argv, and the strings pointed to by the argv array, but not make
the same statement about the elements of the argv array? Was it just
an oversight?

There is precedent (string literals) for making something not
explicitly const, but not allowing it to be modified.

If you're correct, then an implementation that makes argv a pointer to
an array (of char*) in read-only memory (or at least memory that can't
be modified by the program) would be non-conforming, but it's not
quite obvious (to me) from the standard.

And Doug, if you're going to post through Google Groups, please read
<http://cfaj.freeshell.org/google/>. Thanks, and Merry Christmas.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Dec 25 '05 #31

David R Tribble

Doug Gwyn writes:

The answer to the original question is that of course the elements
of the array (of length equal to the value of main's int parameter)
pointed to by main's char** parameter can be modified. If we had
intended otherwise we would have used const qualification at the
appropriate position (between the *s) when specifying the interface.

Keith Thompson wrote:

Then why does 5.1.2.2.1p2 explicitly state that the program can modify
argc, argv, and the strings pointed to by the argv array, but not make
the same statement about the elements of the argv array? Was it just
an oversight?

If you're correct, then an implementation that makes argv a pointer to
an array (of char*) in read-only memory (or at least memory that can't
be modified by the program) would be non-conforming, but it's not
quite obvious (to me) from the standard.

FWIW, I submitted a public comment on this issue (Feb 1998):
http://david.tribble.com/text/c9xc004.txt

I suggested that since no mention of the argv elements was made,
to add a sentence making it implementation-defined whether they
are modifiable or not.

At the time I submitted the comment, I believe there were systems
that did not allow argv[i] to be modified, and it was not clear that
such systems were or were not conforming.

My comment was rejected, IIRC, because such verbiage was "not
necessary".

-drt

Dec 27 '05 #32

David R Tribble

Doug Gwyn writes:

The answer to the original question is that of course the elements
of the array (of length equal to the value of main's int parameter)
pointed to by main's char** parameter can be modified. If we had
intended otherwise we would have used const qualification at the
appropriate position (between the *s) when specifying the interface.

Keith Thompson wrote:

Then why does 5.1.2.2.1p2 explicitly state that the program can modify
argc, argv, and the strings pointed to by the argv array, but not make
the same statement about the elements of the argv array? Was it just
an oversight?

FWIW, I submitted a public comment about that very issue
(Feb 1998):
http://david.tribble.com/text/c9xc004.txt

I suggested that it be deemed implementation-defined whether
or not argv[i] is modifiable. My comment was rejected, IIRC,
because the extra verbiage was not necessary.

If you're correct, then an implementation that makes argv a pointer to
an array (of char*) in read-only memory (or at least memory that can't
be modified by the program) would be non-conforming, but it's not
quite obvious (to me) from the standard.

At the time I submitted my comment, there did exist systems that
did not allow modifying argv[i], and it was not clear whether or not
those systems were conforming.

-drt

Dec 27 '05 #33
