Bytes IT Community

Crazy stuff

P: n/a
If I have this(please bear with me - this is a case of a former java
guy going back to C ;-)

int main () {
char *tokconvert(char*);
char str[] = "dot.delimited.str";
char *result;

result = tokconvert(str);
return 0;
}

char *tokconvert(char *strToConvert) {

char *token;
char *tokDelim =",.";

token=strtok(strToConvert, tokDelim);

while(token != NULL) {
printf("Token -> %s \n", token);
token=strtok(NULL, tokDelim);
}
}

the thing compiles and runs fine, however if I declare char *str =
"dot.delimited.str" (instead of char str[] = "dot.delimited.str") the
whole thing falls on its ass real bad (bus error) - strtok is defined
as char *strtok(char *, const char *) - what's going on?

more craziness: if I declare/initialise

char *str;
str = strdup("dot.delimited.str");

inside tokconvert (instead of main) and pass that to strtok - it runs
fine!!

any thoughts are most welcome..

chumbo.

P.S.
btw this is on FreeBSD 5.1.2 (using gcc 3.3.3)
Nov 14 '05 #1
32 Replies


P: n/a
Chumbo wrote:
the thing compiles and runs fine, however if I declare char *str =
"dot.delimited.str" (instead of char str[] = "dot.delimited.str") the
whole thing falls on its ass real bad (bus error) - strtok is defined
as char *strtok(char *, const char *) - what's going on?


char *str = "blah";

is allowed to be placed in read-only memory. The data cannot be modified.

char str[], however, must be placed in memory modifiable by you (i.e. the
stack).

--
Jason Whitehurst
Nov 14 '05 #2

P: n/a
Jason Whitehurst wrote:
is allowed to be placed in read-only memory. The data cannot be
modified.


Err, and in case you don't know, strtok(3) modifies the input string. Thus,
your problem.
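
To see that modification directly, here is a minimal sketch. The helper name delimiter_was_overwritten is made up for illustration; it tokenizes a writable buffer on '.' and then checks whether the byte at a position that held a delimiter is now '\0'.

```c
#include <string.h>

/* Tokenizes buf in place on '.' and reports whether the byte at index
 * pos, which held a delimiter before the call, is now '\0'.
 * buf must point to writable storage (an array, not a string literal). */
int delimiter_was_overwritten(char *buf, size_t pos)
{
    char *tok = strtok(buf, ".");

    while (tok != NULL)
        tok = strtok(NULL, ".");   /* NULL means "continue with the same buffer" */

    return buf[pos] == '\0';
}
```

Called on a char array initialized with "dot.delimited.str" and pos 3, this should return 1, and the array then reads as just "dot" because the first '.' has been replaced by a terminator.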

--
Jason Whitehurst
Nov 14 '05 #3

P: n/a
In article <50**************************@posting.google.com >,
Chumbo <ch*********@hotmail.com> wrote:
If I have this(please bear with me - this is a case of a former java
guy going back to C ;-)

int main () {
char *tokconvert(char*);
It will confuse people less if you put function prototypes at file scope
instead of inside functions. (In this particular case, you can avoid
needing the prototype altogether by putting tokconvert before main(),
but that's not always possible or reasonable.)
char str[] = "dot.delimited.str";
This allocates an array of char and populates it with the characters
in the string "dot.delimited.str" (including the terminating '\0').
The array is in the automatic-allocation space (we can call this "the
stack", but we prefer not to, because the function-invocation stack that
it exists in is not the same stack as the "processor's stack segment"
stack (which need not even exist) that people typically assume we're
talking about), and therefore that we can do pretty much whatever we
want with it (the relevant bit of that here is that we can write to it)
until the function returns.
char *result;

result = tokconvert(str);
When you pass str (an array) to a function (or do most other things with
it, notable exceptions being applying & or sizeof to it), the array name
decays to a pointer to the array's first element. In this case, this
is exactly what you want - a pointer to the first character of the string.
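
A small sketch of the sizeof exception mentioned above, with hypothetical function names: where the array is declared, sizeof sees the whole array; inside a function that receives it, only a pointer is left.

```c
#include <stddef.h>

/* The parameter is a pointer, so sizeof reports the size of a char *,
 * not the size of whatever array the caller passed. */
size_t size_seen_by_callee(char *s)
{
    return sizeof s;
}

/* In the scope where the array is declared, sizeof is one of the
 * contexts where no decay happens, so it yields the full array size. */
size_t size_seen_by_owner(void)
{
    char str[] = "dot.delimited.str";

    return sizeof str;   /* 17 characters plus the terminating '\0' */
}
```

size_seen_by_owner() gives 18 here, while size_seen_by_callee() gives sizeof(char *) no matter how long the caller's array is.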
return 0;
}

char *tokconvert(char *strToConvert) {

char *token;
char *tokDelim =",.";

token=strtok(strToConvert, tokDelim);

while(token != NULL) {
printf("Token -> %s \n", token);
token=strtok(NULL, tokDelim);
}
You're not returning anything here. Your compiler should have warned
you about that.
(I'm not sure what you'd've wanted to return; possibly this is a leftover
from doing something with the tokens other than just printing them?)
}
the thing compiles and runs fine, however if I declare char *str =
"dot.delimited.str" (instead of char str[] = "dot.delimited.str") the
whole thing falls on its ass real bad (bus error) - strtok is defined
as char *strtok(char *, const char *) - what's going on?
When you say `char *str="a string literal"', the string literal (like
any other string literal in the program[1]) refers to an anonymous
not-const-but-not-writeable array of characters containing the string.
In English, that means you're allowed to point a pointer that you're
allowed to write through at it (remember, arrays decay to pointers),
but you're not actually allowed to write to it.

So, you have a pointer pointing at a string literal that you're not
allowed to write to... and then you try to write to it (indirectly,
by passing it to strtok, which writes to its first argument). That's
what's causing your problem. The solution is to Don't Do That, Then:
Allocate writeable space for the string you give strtok as its first
argument, either as an automatic variable (like in your code above)
or as dynamically allocated memory (f'rexample, memory from strdup as
below), and put the string into that.

more craziness: if I declare/initialise

char *str;
str = strdup("dot.delimited.str");

inside tokconvert (instead of main) and pass that to strtok - it runs
fine!!
Note that strdup is a unixism and not part of the C language.[2]

What strdup does is allocate (with malloc) enough memory to hold the
string you give it, and copy the string into that memory, and return a
pointer to the copy of the string. This memory is writeable, so giving
strtok a pointer to it isn't a problem, for the same reason that giving
it a pointer to the automatically allocated array isn't a problem.

P.S.
btw this is on FreeBSD 5.1.2 (using gcc 3.3.3)


If I'd needed to know that, then your question would have been
inappropriate for comp.lang.c and would have been better off asked in
a FreeBSD or GCC newsgroup.

But given that you're using GCC: If you'd compiled with:
gcc -W -Wall -ansi -pedantic -O myprog.c
then GCC would have warned you about failing to return a value from
tokconvert, along with a bunch of other warnings about things that you
typically don't want to do. This makes debugging some problems a lot
easier - they go away when the compiler stops warning you about them.
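
For reference, one way the posted code might look with those points addressed. This is a sketch, not the original poster's intent: returning a token count is an assumption (the post never says what the char * result was for), and demo() is a hypothetical stand-in for the body of main().

```c
#include <stdio.h>
#include <string.h>

/* Prototype at file scope rather than inside main(). */
int tokconvert(char *strToConvert);

/* Prints each token and returns how many were found.
 * strToConvert must point to writable storage, because strtok
 * writes '\0' over the delimiters it finds. */
int tokconvert(char *strToConvert)
{
    const char *tokDelim = ",.";
    char *token;
    int count = 0;

    token = strtok(strToConvert, tokDelim);
    while (token != NULL) {
        printf("Token -> %s\n", token);
        count++;
        token = strtok(NULL, tokDelim);
    }
    return count;
}

/* Hypothetical stand-in for the body of the original main(). */
int demo(void)
{
    char str[] = "dot.delimited.str";   /* writable array, not a literal */

    return tokconvert(str);
}
```

With the input from the original post, demo() should print three tokens and return 3.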
dave

[1] In language-lawyer-ese, the string in the array initialization
`char buf[]="a string"' is an initializer, not a string literal;
this is, as far as I know, the only place you can have a string in
the source code that doesn't represent an anonymous array of char.

[2] It's in the implementation's namespace, though, which means you're
not allowed to define it yourself. The solution in comp.lang.c is
to use my_strdup, which can portably be defined and have the same
behavior as the unix strdup; the usual solution outside comp.lang.c
is to use the library's strdup if it exists, and otherwise to apply
knowledge beyond that found in the language definition to establish
that the programmer is allowed to define a function by that name on
that implementation and do so.
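
A minimal sketch of what such a my_strdup could look like, using only standard C. The exact body is an assumption; only the name comes from the footnote above.

```c
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for the unix strdup: returns a malloc'd copy of s,
 * or NULL if the allocation fails. The caller must free the result. */
char *my_strdup(const char *s)
{
    size_t n = strlen(s) + 1;      /* include the terminating '\0' */
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```

The returned memory is writable, so a copy of a string literal made this way can be handed to strtok safely, provided the NULL case is checked first.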

--
Dave Vandervies dj******@csclub.uwaterloo.ca
Well, it's rare. The usual reaction to "let me just show you the bit..."
is "Rich! Rich! We ***believe*** you, okay?!?!?!?!?!"
--Richard Heathfield in comp.lang.c
Nov 14 '05 #4

P: n/a

"Chumbo" <ch*********@hotmail.com> wrote

token=strtok(strToConvert, tokDelim);
the thing compiles and runs fine, however if I declare char *str =
"dot.delimited.str" (instead of char str[] = "dot.delimited.str") the
whole thing falls on its ass real bad (bus error) - strtok is defined
as char *strtok(char *, const char *) - what's going on?

strtok()s first argument is a char *, and is overwritten to produce the
token (yes, this is a terrible way of doing things, as an ex-Java man you
probably expect something like the Java string tokeniser).
char *str = "My string"; creates a constant string in read-only memory.
char str[] = "My string"; creates a string in read-write memory.

The result of passing a constant string to strtok is undefined.
Nov 14 '05 #5

P: n/a
So I am a little confused... how does the compiler actually go about creating
read-only memory and read-write memory? How is the distinction made?

"Malcolm" <ma*****@55bank.freeserve.co.uk> wrote in message
news:cm**********@newsg2.svr.pol.co.uk...

"Chumbo" <ch*********@hotmail.com> wrote

token=strtok(strToConvert, tokDelim);
the thing compiles and runs fine, however if I declare char *str =
"dot.delimited.str" (instead of char str[] = "dot.delimited.str") the
whole thing falls on its ass real bad (bus error) - strtok is defined
as char *strtok(char *, const char *) - what's going on?

strtok()s first argument is a char *, and is overwritten to produce the
token (yes, this is a terrible way of doing things, as an ex-Java man you
probably expect something like the Java string tokeniser).
char *str = "My string"; creates a constant string in read-only memory.
char str[] = "My string"; creates a string in read-write memory.

The result of passing a constant string to strtok is undefined.

Nov 14 '05 #6

P: n/a
"Siddharth Taneja" <ta****@usc.edu> wrote in message
news:cm**********@gist.usc.edu...
so I am a little confused...how does the compiler actually go about creating read-only memory and read-write memory?
However it wants.
How is the distinction made?


Depends upon the compiler. Each one does things its own
way. If you want to know how yours does it, consult
its support resources.

BTW please don't top-post. Thanks.

-Mike

Nov 14 '05 #7

P: n/a

"Siddharth Taneja" <ta****@usc.edu> wrote

So I am a little confused... how does the compiler actually go about creating read-only memory and read-write memory? How is the distinction made?

This is one of the problems of C.

On a desktop system, typically the whole program will be loaded into
physical RAM. On many systems, pages may be marked as "read only" or
"executable only", so attempts to write to those pages cause runtime faults.
However on other systems, such as older microcomputers, there was no
mechanism for doing this, so a write to a constant area of memory would
change the contents and cause mysterious malfunctions.

On embedded systems, it is quite common for the executable code and the
constant data to be held in physical ROM. Obviously, an attempt to write to
this cannot alter the contents.
Nov 14 '05 #8

P: n/a
On 6 Nov 2004 21:22:16 -0800, ch*********@hotmail.com (Chumbo) wrote:
If I have this(please bear with me - this is a case of a former java
guy going back to C ;-)

int main () {
char *tokconvert(char*);
char str[] = "dot.delimited.str";
char *result;

result = tokconvert(str);
return 0;
}

char *tokconvert(char *strToConvert) {

char *token;
char *tokDelim =",.";
This is an example of where using static might have an impact on
performance if the function were called often.

token=strtok(strToConvert, tokDelim);

while(token != NULL) {
printf("Token -> %s \n", token);
token=strtok(NULL, tokDelim);
}
}
How can this function compile fine? It is required to return a
pointer to char yet there is no return statement in the function at
all.

the thing compiles and runs fine, however if I declare char *str =
"dot.delimited.str" (instead of char str[] = "dot.delimited.str") the
whole thing falls on its ass real bad (bus error) - strtok is defined
as char *strtok(char *, const char *) - what's going on?

more craziness: if I declare/initialise

char *str;
str = strdup("dot.delimited.str");
This is not a standard function.

inside tokconvert (instead of main) and pass that to strtok - it runs
fine!!


Others have explained why.
<<Remove the del for email>>
Nov 14 '05 #9

P: n/a
On Sun, 7 Nov 2004 09:13:08 -0000, "Malcolm"
<ma*****@55bank.freeserve.co.uk> wrote in comp.lang.c:

"Chumbo" <ch*********@hotmail.com> wrote

token=strtok(strToConvert, tokDelim);
the thing compiles and runs fine, however if I declare char *str =
"dot.delimited.str" (instead of char str[] = "dot.delimited.str") the
whole thing falls on its ass real bad (bus error) - strtok is defined
as char *strtok(char *, const char *) - what's going on?
strtok()s first argument is a char *, and is overwritten to produce the
token (yes, this is a terrible way of doing things, as an ex-Java man you
probably expect something like the Java string tokeniser).


So far, so good.
char *str = "My string"; creates a constant string in read-only memory.
The line above is completely wrong. "My string" is a string literal,
and the C standard states that it has the type array of char.
Specifically NOT array of const char. So it is not a constant string.
Furthermore, C does not define any such thing as "read-only memory".
char str[] = "My string"; creates a string in read-write memory.
The line above creates an array of chars that may be modified,
provided that its bounds are not exceeded. C does not define any such
thing as "read-write memory".
The result of passing a constant string to strtok is undefined.


That is true, but does not apply here. The result of attempting to
modify a string literal is undefined behavior. This is true not
because the type of the string literal is array of const char, but
because the C standard specifically states that it is so.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #10

P: n/a
>> strtok()s first argument is a char *, and is overwritten to produce the
token (yes, this is a terrible way of doing things, as an ex-Java man you
probably expect something like the Java string tokeniser).


So far, so good.
char *str = "My string"; creates a constant string in read-only memory.


The line above is completely wrong. "My string" is a string literal,
and the C standard states that it has the type array of char.
Specifically NOT array of const char. So it is not a constant string.
Furthermore, C does not define any such thing as "read-only memory".


C *DOES* define certain memory areas, such as string literals, which
cannot be written on without invoking undefined behavior. This
is something the implementation is free to put in "read-only memory".
(Or it might not, on implementations with no memory protection.)
char str[] = "My string"; creates a string in read-write memory.


The line above creates an array of chars that may be modified,
provided that its bounds are not exceeded. C does not define any such
thing as "read-write memory".


C *DOES* define certain memory areas, such as non-const variables,
that can be written on without invoking undefined behavior.
It is reasonable to conclude that the implementation must put
this in memory that can be read and written, hence "read-write memory".

C does not define a (human) name, address, social security number,
or quantity of currency, but it is still possible to talk about
programs that handle data described like this.

Gordon L. Burditt
Nov 14 '05 #11

P: n/a

"Chumbo" <ch*********@hotmail.com> wrote in message
the thing compiles and runs fine, however if I declare char *str =
"dot.delimited.str" (instead of char str[] = "dot.delimited.str") the
whole thing falls on its ass real bad (bus error) - strtok is defined
as char *strtok(char *, const char *) - what's going on?


This is because doing the declaration as
char *str = "dot.delimited.str";
embeds a CONSTANT string somewhere, and then sets the pointer to
point to it. You can not WRITE to constant strings. (Well, you shouldn't
and shouldn't be able to, but some old C implementations allowed this).

Doing it as
char str[]="dot.delimited.str";
creates a character array on the stack, of exactly the right
length, and initialises it with the given string (by something
similar to strcpy()). There are thus TWO copies of "dot.delimited.str":
one in the CONSTANT STRING space, and a second in the data space. You
can freely write the latter, but not the former.

Pointers and arrays are not interchangeable.

Richard [in PE12]
Nov 14 '05 #12

P: n/a
On Mon, 8 Nov 2004 19:47:25 -0000
"Endymion Ponsonby-Withermoor III"
<m_a_r_v_i_n@para----and----.want-to-do.coe.ukk> wrote:
"Chumbo" <ch*********@hotmail.com> wrote in message
the thing compiles and runs fine, however if I declare char *str =
"dot.delimited.str" (instead of char str[] = "dot.delimited.str")
the whole thing falls on its ass real bad (bus error) - strtok is
defined as char *strtok(char *, const char *) - what's going on?
This is because doing the declaration as
char *str = "dot.delimited.str";
embeds a CONSTANT string somewhere, and then sets the pointer to
point to it. You can not WRITE to constant strings. (Well, you
shouldn't and shouldn't be able to, but some old C implementations
allowed this).


The C standard says it is not const. The reason you should not write to
it is that the C standard says that writing to a string literal invokes
undefined behaviour. This means that the compiler is allowed to let you
write to it, and one common effect of doing this would be to also change
the string literal "blah blah dot.delimited.str"
Doing it as
char str[]="dot.delimited.str";
creates a character array on the stack,
1) C does not require a stack.
2) that line could be placed outside of any function definitions in
which case the string is unlikely to be stored on the stack if the
machine has a stack.
of exactly the right
length, and initialises it with the given string
True.
(by something
similar to strcpy()). There are thus TWO copies of
"dot.delimited.str": one in the CONSTANT STRING space, and a second in
the data space. You can freely write the latter, but not the former.
Possibly true if it is an automatic array, definitely not always true
if it is not an automatic array.

All you know is that the object gets initialised, this could be done
with code equivalent to
str[0]='d';
str[1]='o';
...

For a non-automatic other options are available.
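
A short sketch of the two cases, with hypothetical names. Both arrays get exactly strlen + 1 elements and both are ordinary modifiable objects; how the implementation fills them in is not visible to the program.

```c
#include <string.h>

/* Static storage duration: initialized before program startup, so no
 * run-time copy is required. Still an ordinary modifiable array. */
static char file_scope_str[] = "dot.delimited.str";

/* Returns 1 if both arrays have size strlen + 1 and accept writes. */
int arrays_are_writable(void)
{
    char automatic_str[] = "dot.delimited.str";  /* initialized on each call */

    file_scope_str[0] = 'D';
    automatic_str[0] = 'D';

    return sizeof file_scope_str == strlen(file_scope_str) + 1
        && sizeof automatic_str == strlen(automatic_str) + 1
        && file_scope_str[0] == 'D'
        && automatic_str[0] == 'D';
}
```

Neither write is undefined behaviour, because neither object is a string literal or defined const.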
Pointers and arrays are not interchangeable.


True.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.
Nov 14 '05 #13

P: n/a
Barry Schwarz <sc******@deloz.net> wrote:
(Chumbo) wrote:
char *tokconvert(char *strToConvert) {

char *token;
char *tokDelim =",.";
token=strtok(strToConvert, tokDelim);

while(token != NULL) {
printf("Token -> %s \n", token);
token=strtok(NULL, tokDelim);
}
}


How can this function compile fine? It is required to return a
pointer to char yet there is no return statement in the function at
all.


It must compile fine. There is only undefined behaviour if
control actually reaches the end of the function at runtime.
Nov 14 '05 #14

P: n/a
On 8 Nov 2004 06:57:17 GMT, go***********@burditt.org (Gordon Burditt)
wrote in comp.lang.c:
strtok()s first argument is a char *, and is overwritten to produce the
token (yes, this is a terrible way of doing things, as an ex-Java man you
probably expect something like the Java string tokeniser).
So far, so good.
char *str = "My string"; creates a constant string in read-only memory.


The line above is completely wrong. "My string" is a string literal,
and the C standard states that it has the type array of char.
Specifically NOT array of const char. So it is not a constant string.
Furthermore, C does not define any such thing as "read-only memory".


C *DOES* define certain memory areas, such as string literals, which
cannot be written on without invoking undefined behavior. This
is something the implementation is free to put in "read-only memory".
(Or it might not, on implementations with no memory protection.)


C does not define "memory areas" at all. Modifying a string literal,
or modifying any object defined with the const qualifier, is undefined
behavior. There is not even a requirement that such an operation
fails, merely that the behavior is undefined if the attempt is made.
char str[] = "My string"; creates a string in read-write memory.


The line above creates an array of chars that may be modified,
provided that its bounds are not exceeded. C does not define any such
thing as "read-write memory".


C *DOES* define certain memory areas, such as non-const variables,
that can be written on without invoking undefined behavior.
It is reasonable to conclude that the implementation must put
this in memory that can be read and written, hence "read-write memory".


C still does not define "memory areas" at all. It defines objects
that may be freely modified by a program. Almost all of them, in
fact, other than those defined const and string literals. Nowhere
does the standard mention that these must be stored in a special
"memory area". It also doesn't specify whether any particular
modifiable object is in SRAM, DRAM, or virtual memory that might be
swapped out to a page file at any particular time.
C does not define a (human) name, address, social security number,
or quantity of currency, but it is still possible to talk about
programs that handle data described like this.
It is indeed quite possible for a C program to use data objects as
representations of "real world" concepts.

But it is indeed quite impossible to force a C implementation to
provide "read-only memory" just because you define a string literal.
Gordon L. Burditt
The poster to whom I replied wrote a sentence that contained two very
specific factual errors, directly in contradiction to the C language
standard:
char *str = "My string"; creates a constant string in read-only memory.


The two errors are:

1. The type of "My string" in the snippet is 'array of char', most
specifically not 'array of constant char'. See 6.4.5 P5.

2. The standard states that an implementation may (note MAY) place
certain const objects and string literals in "a read-only region of
storage" (footnote 112). Note that footnotes are not normative, and
the term "read-only" is not specifically defined in the standard.
Most certainly, there is no requirement or guarantee that "My string"
in the snippet above will be placed in any sort of specially qualified
memory.

If you think either of my corrections is inaccurate, kindly quote
chapter and verse from the standard to contradict them.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
Nov 14 '05 #15

P: n/a

"Flash Gordon" <sp**@flash-gordon.me.uk> wrote in message
news:vk************@brenda.flash-

The C standard says it is not const. The reason you should not write to
it is that the C standard says that writing to a string literal invokes
undefined behaviour.
OK. I sit corrected.
write to it, and one common effect of doing this would be to also change
the string literal "blah blah dot.delimited.str"

Yes. That is because "dot.delimited.str" can be duplicate-merged with any
quoted-string that ends with it.
Richard [in PE12]
Nov 14 '05 #16

P: n/a

"Jack Klein" <ja*******@spamcop.net> wrote
Furthermore, C does not define any such thing as "read-only memory".
C does not define any such thing as "read-write memory".

That is true, but does not apply here. The result of attempting to
modify a string literal is undefined behavior. This is true not
because the type of the string literal is array of const char, but
because the C standard specifically states that it is so.

So C defines various terms, but how does it define them? In the Standard,
which is an English-language document. So sometimes C will also define the
words used to define these terms.
See the problem?
Ultimately we have to use terms like "read-only memory", which are not
defined by the C standard, but which have meanings given to them by usage.

If the OP doesn't understand why an attempt to modify a string literal
causes a crash, then it's unlikely that he will be familiar with the term
"string literal". He may also be shaky on "undefined behaviour". So a
technically more accurate explanation is in fact more confusing.
Nov 14 '05 #17

P: n/a
On 8 Nov 2004 16:13:46 -0800, ol*****@inspire.net.nz (Old Wolf) wrote:
Barry Schwarz <sc******@deloz.net> wrote:
(Chumbo) wrote:
>char *tokconvert(char *strToConvert) {
>
> char *token;
> char *tokDelim =",.";
> token=strtok(strToConvert, tokDelim);
>
> while(token != NULL) {
> printf("Token -> %s \n", token);
> token=strtok(NULL, tokDelim);
> }
>}


How can this function compile fine? It is required to return a
pointer to char yet there is no return statement in the function at
all.


It must compile fine. There is only undefined behaviour if
control actually reaches the end of the function at runtime.


I couldn't find in the standard where the missing return requires a
diagnostic. All the compilers I have used did provide one so I guess
I've been spoiled.
<<Remove the del for email>>
Nov 14 '05 #18

P: n/a
On Wed, 10 Nov 2004 18:55:27 -0800, Barry Schwarz wrote:
On 8 Nov 2004 16:13:46 -0800, ol*****@inspire.net.nz (Old Wolf) wrote:
....
It must compile fine. There is only undefined behaviour if
control actually reaches the end of the function at runtime.


I couldn't find in the standard where the missing return requires a
diagnostic. All the compilers I have used did provide one so I guess
I've been spoiled.


It doesn't. I guess this goes back to the days when C didn't have a void
type.

Also just reaching the end of the function doesn't invoke undefined
behaviour, in the words of C99 6.9.1:

"If the } that terminates a function is reached, and the value of the
function call is used by the caller, the behavior is undefined."

In C90 you can also use return; i.e. with no expression in a function with
a non-void return type. C99 makes this a constraint violation but still
allows falling off the end of the function.
Nov 14 '05 #19

P: n/a
On Sun, 7 Nov 2004 05:58:35 +0000 (UTC), dj******@csclub.uwaterloo.ca
(Dave Vandervies) wrote:
In article <50**************************@posting.google.com >,
Chumbo <ch*********@hotmail.com> wrote:


<snip: modification, namely strtok'enizing, of string literal value vs
array initialized to contain string>

FAQ 1.32, 16.6 at the usual places and
http://www.eskimo.com/~scs/C-faq/top.html
char str[] = "dot.delimited.str";


This allocates an array of char and populates it with the characters
in the string "dot.delimited.str" (including the terminating '\0').
The array is in the automatic-allocation space (we can call this "the
stack", but we prefer not to, because the function-invocation stack that
it exists in is not the same stack as the "processor's stack segment"
stack (which need not even exist) that people typically assume we're
talking about), and therefore that we can do pretty much whatever we


Huh? Unless I completely misunderstand what you are saying:
while the C standard does not specify implementation techniques, and
so can't topically rely on "the stack", on every machine I know of
that has a "processor stack segment", or indeed just a "processor
(memory) stack" anywhere, it was designed to be and in fact is used
for C (and other HLL) function-invocation frames in a stack fashion.

<snip rest>

- David.Thompson1 at worldnet.att.net
Nov 14 '05 #20

P: n/a
On Mon, 15 Nov 2004 04:09:47 +0000, Dave Thompson wrote:

....
Huh? Unless I completely misunderstand what you are saying:
while the C standard does not specify implementation techniques, and
so can't topically rely on "the stack", on every machine I know of
that has a "processor stack segment", or indeed just a "processor
(memory) stack" anywhere, it was designed to be and in fact is used
for C (and other HLL) function-invocation frames in a stack fashion.

Well, the 6502 processor, which certainly does have C compilers targeting
it, has a 256 byte processor stack. I suspect the C compilers don't use
that to hold automatic variables.

Lawrence
Nov 14 '05 #21

P: n/a
In article <jm********************************@4ax.com>,
Dave Thompson <da*************@worldnet.att.net> wrote:
On Sun, 7 Nov 2004 05:58:35 +0000 (UTC), dj******@csclub.uwaterloo.ca
(Dave Vandervies) wrote:
The array is in the automatic-allocation space (we can call this "the
stack", but we prefer not to, because the function-invocation stack that
it exists in is not the same stack as the "processor's stack segment"
stack (which need not even exist) that people typically assume we're
talking about),

Huh? Unless I completely misunderstand what you are saying:
while the C standard does not specify implementation techniques, and
so can't topically rely on "the stack", on every machine I know of
that has a "processor stack segment", or indeed just a "processor
(memory) stack" anywhere, it was designed to be and in fact is used
for C (and other HLL) function-invocation frames in a stack fashion.


This is true, but not what I was saying (or at least not what I intended
to say).

The C (and other HLL) function-invocation stack is an abstract thing on
which continuations[1], automatic variables, and possibly a few things I'm
not thinking of live. The fact that, where they exist, hardware-assisted
stack mechanisms are universally used to (and, in fact, designed to)
implement this doesn't mean they're the same thing; one is a construct
of the language being implemented, and one is the back-end mechanism
used to implement it.

Note that (relevant to the subject, but not to your comments if
I'm understanding them correctly) even on systems without such a
hardware-assisted mechanism (I believe older IBM systems are the
canonical example, though I'd probably get the details wrong if I tried
to remember them) the invocation records (implemented as, say, a linked
list of freestore-allocated invocation frames) need to look, feel, and
smell like a stack - this is an obvious example of a function-invocation
stack not being implemented with a hardware stack.
dave

[1] That is, what to do when the function finishes (including what to
do with the return value, if there is one). Typically this is
implemented as a machine address referring to the point in the calling
code immediately after the actual subroutine call, where the return
value is handled appropriately and the rest of the caller continues
to run.

--
Dave Vandervies dj******@csclub.uwaterloo.ca
Perhaps the original version of the program worked.
OK, this takes us *way* off topic for any computer related newsgroup, but
you've got to admit its a theoretical possibility. --Ken Hagan in comp.arch
Nov 14 '05 #22

P: n/a
"Malcolm" <ma*****@55bank.freeserve.co.uk> wrote in message
news:cm*********@news7.svr.pol.co.uk...

"Jack Klein" <ja*******@spamcop.net> wrote
Furthermore, C does not define any such thing as "read-only memory".
C does not define any such thing as "read-write memory".

That is true, but does not apply here. The result of attempting to
modify a string literal is undefined behavior. This is true not
because the type of the string literal is array of const char, but
because the C standard specifically states that it is so.

So C defines various terms, but how does it define them? In the Standard,
which is an English-language document. So sometimes C will also define the
words used to define these terms.
See the problem?
Ultimately we have to use terms like "read-only memory", which are not
defined by the C standard, but which have meanings given to them by usage.

If the OP doesn't understand why an attempt to modify a string literal
causes a crash, then it's unlikely that he will be familiar with the term
"string literal". He may also be shaky on "undefined behaviour". So a
technically more accurate explanation is in fact more confusing.


I agree 100%
Furthermore, although strdup() is not specified in the Standard, it is defined by
POSIX and is widely available on the systems used by beginners. Its semantics
are easy to understand.
By contrast, strtok() has always been part of the Standard, but is a
constant cause of misunderstandings and implementation bugs. IMHO programmers
should be instructed not to use this bastard, nor its friends (gets, strncpy,
strncat...)

strtok() modifies the string pointed to by its first argument (if not NULL) and
keeps a pointer into it in a hidden static variable.
Passing it a string literal such as "this is a string" may cause undefined
behaviour, such as a crash.
Passing it a pointer to automatic storage, as you do in the example,
is error prone.
Initializing an automatic character array with a string literal is inefficient.
strtok() is flawed in ways beyond the understanding of newbie developers; it
should not be used.

Chqrlie.

Nov 14 '05 #23

P: n/a
"Endymion Ponsonby-Withermoor III"
<m_a_r_v_i_n@para----and----.want-to-do.coe.ukk> wrote in message
news:cm**********@newsg4.svr.pol.co.uk...

"Flash Gordon" <sp**@flash-gordon.me.uk> wrote in message
news:vk************@brenda.flash-

The C standard says it is not const. The reason you should not write to
it is that the C standard says that writing to a string literal invokes
undefined behaviour.


OK. I sit corrected.


you pedants are so picky and righteous !

The C standard did not make it const char, to avoid breaking countless sloppy
programs with compile-time errors...
And more often than not, such programs make inconsistent use of char and string
literals, hiding subtle bugs waiting to bite at runtime.
So it is not "const", but if you try to modify it, you get UB ! It has the
flavor and behaviour, but not the type...
Gimme a break, I'd rather newbies think it *is* const and understand why they
get a crash.

The case of the missing return statement is an even worse hack to accommodate
historical trash. It is about time some of those remnants from the 70s
were fixed.

Just exactly what prevented the inclusion of strdup() into the Standard eludes
me !

The case against strtok() is also worth fighting : please stop teaching newbies
to use functions with flawed semantics like this one or its friends : gets(),
strncpy(), strncat()...

Chqrlie.

Nov 14 '05 #24

P: n/a
"Charlie Gordon" <ne**@chqrlie.org> wrote:
> you pedants are so picky and righteous !

With good reason.

> Just exactly what prevented the inclusion of strdup() into the Standard eludes
> me !

It's easily emulated and would sit awkwardly in whichever header you put
it in.

> The case against strtok() is also worth fighting :

strtok() has its problems, but where it's useful, it's very useful.

> please stop teaching newbies to use functions with flawed semantics like
> this one or its friends : gets(), strncpy(), strncat()...


Obviously I agree about gets(), and strncpy() is rarely if ever the
right function, but if you won't allow strncat(), which function _do_
you use when you want a length-limited string copy?
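
To illustrate the "easily emulated" remark about strdup(), here is a sketch of a work-alike in standard C. The name dup_string is hypothetical (picked to stay out of the reserved str* namespace), and the caller is assumed to free() the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical strdup() work-alike: returns a freshly allocated,
 * writable copy of s, or NULL if the allocation fails. */
char *dup_string(const char *s)
{
    size_t size;
    char *copy;

    assert(s != NULL);

    size = strlen(s) + 1;             /* room for the terminating '\0' */
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}
```

A copy obtained this way is ordinary modifiable memory, which is why the strdup() variant in the original question ran fine with strtok().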

Richard
Nov 14 '05 #25

P: n/a
Hello,
> > The case against strtok() is also worth fighting :

> strtok() has its problems, but where it's useful, it's very useful.

> > please stop teaching newbies to use functions with flawed semantics like
> > this one or its friends : gets(), strncpy(), strncat()...

> Obviously I agree about gets(), and strncpy() is rarely if ever the
> right function, but if you won't allow strncat(), which function _do_
> you use when you want a length-limited string copy?


Why not write one that does what is needed?

Or use a test and write appropriate code for each case.

strncat() is not the worst of these, but it is consistently misused in code I
come across :

strncpy(dest, source, sizeof(dest)); // dangerous !
strncat(dest, source, sizeof(dest)); // worse even!

Chqrlie.

PS: the glove don't fit, you must acquit !
Nov 14 '05 #26

P: n/a
On Tue, 16 Nov 2004 12:36:55 +0100, Charlie Gordon wrote:

....
The case against strtok() is also worth fighting : please stop teaching newbies
to use functions with flawed semantics like this one or its friends : gets(),
strncpy(), strncat()...


What's wrong with strncat()?

Lawrence

Nov 14 '05 #27

P: n/a
"Lawrence Kirby" <lk****@netactive.co.uk> wrote in message
news:pa****************************@netactive.co.uk...
On Tue, 16 Nov 2004 12:36:55 +0100, Charlie Gordon wrote:

...
The case against strtok() is also worth fighting : please stop teaching newbies
to use functions with flawed semantics like this one or its friends : gets(),
strncpy(), strncat()...


What's wrong with strncat()?


It is often misunderstood, and misused :

strncpy(dest, source, sizeof(dest)); // dangerous !
strncat(dest, source, sizeof(dest)); // worse even!

Chqrlie.
Nov 14 '05 #28

P: n/a


Lawrence Kirby wrote:
On Tue, 16 Nov 2004 12:36:55 +0100, Charlie Gordon wrote:

...

The case against strtok() is also worth fighting : please stop teaching newbies
to use functions with flawed semantics like this one or its friends : gets(),
strncpy(), strncat()...


What's wrong with strncat()?


He probably refers to Criminally Stupid Behaviour (TM), such as
ignoring the difference between the size of the destination
and the number of characters that still fit in, that is using
strncat(dst, src, dst_size);
instead of
strncat(dst, src, dst_size-strlen(dst)-1);
However, if you have to work with people who cannot or do not
want to read and comprehend the function documentation, this is
really just one of the many things which will go wrong.
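
For reference, a small sketch of the corrected call wrapped in a helper. append_bounded is a made-up name, and the code assumes dst already holds a terminated string that lies within dst_size bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: appends as much of src as still fits in dst,
 * where dst_size is the total size of the destination array.
 * Assumes dst already contains a '\0'-terminated string. */
void append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used;

    assert(dst != NULL && src != NULL);

    used = strlen(dst);
    if (used + 1 >= dst_size)         /* already full: nothing can be added */
        return;

    /* The third argument is the number of characters that still fit,
     * not the size of the buffer; strncat() adds the '\0' itself. */
    strncat(dst, src, dst_size - used - 1);
}
```

With an 8-byte buffer holding "abc", appending "defghij" leaves "abcdefg": four characters fit, plus the terminator.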

I guess that he would rather like to pass the overall size of the
destination which is IMO no bad idea at all.
The only question to answer is then whether to make the last
character '\0' if no string terminator was encountered or not.
Cheers
Michael
--
E-Mail: Mine is a gmx dot de address.

Nov 14 '05 #29

P: n/a


Michael Mair wrote:


> Lawrence Kirby wrote:
> > On Tue, 16 Nov 2004 12:36:55 +0100, Charlie Gordon wrote:
> >
> > > ...
> > >
> > > The case against strtok() is also worth fighting : please stop teaching newbies
> > > to use functions with flawed semantics like this one or its friends : gets(),
> > > strncpy(), strncat()...
> >
> > What's wrong with strncat()?
>
> He probably refers to Criminally Stupid Behaviour (TM), such as
> ignoring the difference between the size of the destination
> and the number of characters that still fit in, that is using
> strncat(dst, src, dst_size);
> instead of
> strncat(dst, src, dst_size-strlen(dst)-1);
> However, if you have to work with people who cannot or do not
> want to read and comprehend the function documentation, this is
> really just one of the many things which will go wrong.
>
> I guess that he would rather like to pass the overall size of the
> destination which is IMO no bad idea at all.
> The only question to answer is then whether to make the last
> character '\0' if no string terminator was encountered or not.

Misleading: I mean in the destination (that is
strlen(dst)>= the passed size)


Cheers
Michael

--
E-Mail: Mine is a gmx dot de address.

Nov 14 '05 #30

P: n/a
"Charlie Gordon" <ne**@chqrlie.org> wrote:
> > > please stop teaching newbies to use functions with flawed semantics like
> > > this one or its friends : gets(), strncpy(), strncat()...

> > Obviously I agree about gets(), and strncpy() is rarely if ever the
> > right function, but if you won't allow strncat(), which function _do_
> > you use when you want a length-limited string copy?

> Why not write one that does what is needed.

Such as strncat()?

> Or use a test and write appropriate code for each case.
>
> strncat() is not the worst of these, but it is consistently misused in code I
> come across :

So is printf().

Richard
Nov 14 '05 #31

P: n/a
> > He probably refers to Criminally Stupid Behaviour (TM), such as
> > ignoring the difference between the size of the destination
> > and the number of characters that still fit in, that is using
> > strncat(dst, src, dst_size);
> > instead of
> > strncat(dst, src, dst_size-strlen(dst)-1);

Exactly, and many variants with more subtle issues (such as what if
strlen(dst) = dst_size)

> > However, if you have to work with people who cannot or do not
> > want to read and comprehend the function documentation, this is
> > really just one of the many things which will go wrong.

Quite true. But I make it part of house rules to avoid "booby trap programming
techniques" and part of house teachings to learn about them and detect them in
external code.

> > I guess that he would rather like to pass the overall size of the
> > destination which is IMO no bad idea at all.
> > The only question to answer is then whether to make the last
> > character '\0' if no string terminator was encountered or not.
>
> Misleading: I mean in the destination (that is
> strlen(dst)>= the passed size)


Insightful remark !

The function we are talking about does not modify the destination buffer in this
case, nor in the case of dest_size <= 0.
It does not dereference dest beyond dest_size, and thus does not compute
strlen(dest).
It is arguable whether it should accept NULL pointers either as source or
destination.
It returns a useful result : not the address of the destination buffer, but an
indication of what happened, such as the number of characters copied before the
'\0', or something negative in case of truncation, malformed destination, or
invalid dest_size.

This is really a family of 5 functions :

int pstrlen(const char *buf, int buf_size);
int pstrcpy(char *buf, int buf_size, const char *str);
int pstrncpy(char *buf, int buf_size, const char *str, int n);
int pstrcat(char *buf, int buf_size, const char *str);
int pstrncat(char *buf, int buf_size, const char *str, int n);

I use int instead of size_t for the sizes on purpose, though ssize_t might be
more appropriate.

Chqrlie.

Nov 14 '05 #32

P: n/a
On Tue, 16 Nov 2004 15:34:51 +0100, Charlie Gordon wrote:
"Lawrence Kirby" <lk****@netactive.co.uk> wrote in message
news:pa****************************@netactive.co.uk...
On Tue, 16 Nov 2004 12:36:55 +0100, Charlie Gordon wrote:

...
The case against strtok() is also worth fighting : please stop teaching newbies
to use functions with flawed semantics like this one or its friends : gets(),
strncpy(), strncat()...


What's wrong with strncat()?


It is often misunderstood, and misused :

strncpy(dest, source, sizeof(dest)); // dangerous !
strncat(dest, source, sizeof(dest)); // worse even!


I can understand strncpy() being misunderstood, its behaviour
is oddball. strncat() is pretty simple though. You might just
as well say strcpy() is dangerous (and yes some people will
say that) but it isn't difficult to use safely.

Lawrence
Nov 14 '05 #33
