Alf P. Steinbach wrote:
* Tu*********@gmail.com: Why does the following give a segmentation fault?
void breakme(char* st) {
    char* cp = st;
    *cp = 'x'; // This is the problem line.
}
int main() {
    char* mine = "teststringfortestingpurposes";
    breakme(mine);
}
Why not?
The program has undefined behavior.
Therefore, anything can happen, including a SIGSEGV.
If instead you ask why C++ allows that particular nasty, that's to
maintain backwards compatibility with C.
The idea that modification of literals is undefined in C++ for the sake
of compatibility with C is quite ridiculous.
Other languages have similar restrictions against what is essentially
self-modifying code.
Modification of literals is also undefined in Lisp, for instance. Is
that for compatibility with C also?
(setf (car '(a b c)) 42) ;; Nasal demons!
(setf (char "abc" 1) #\z) ;; ditto
This type of liberty with respect to literals allows for ROM-able code:
both the program code and its literal data can be compiled into a
single program image, which can then be put into write-protected
memory. The program can reference that data using a direct pointer into
that memory.
A literal should be thought of as a component of the program itself,
and not some external data. Through a literal quoting mechanism, a
program gains access to a piece of itself which it can use as data in a
computation.
If such objects had to be modifiable, then only their initial values
could be stored in that read-only memory. Modifiable storage would have
to be allocated for them at program startup, and the initial values
copied there. This is a waste of time and storage for objects which are
treated as immutable.
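
In C++ terms, the two cases look roughly like the sketch below. The
names breakme_fixed and demo are made up for this example. An array
initialized from a literal is the "modifiable storage with the initial
values copied in" case; a const char* points straight at the literal,
and the compiler rejects writes through it.

```cpp
#include <cassert>

// Same operation as breakme() in the original post: overwrite the
// first character of the string passed in.
void breakme_fixed(char* st) {
    char* cp = st;
    *cp = 'x';
}

// Hypothetical helper for this sketch: returns the first character of
// the modifiable copy after it has been changed.
char demo() {
    // The literal only supplies the initial value. buf is a separate,
    // writable array that the contents are copied into.
    char buf[] = "teststringfortestingpurposes";
    breakme_fixed(buf);  // well-defined: modifies the copy

    // A pointer directly at the literal should be const-qualified.
    const char* lit = "teststringfortestingpurposes";
    // *lit = 'x';       // would not compile: assignment through const
    assert(lit[0] == 't');  // the literal itself is untouched

    return buf[0];
}
```

Calling demo() from main() is enough to try it. The price of the
writable version is exactly the copy described above.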
And let's not forget that immutable objects can also be compressed
together to save space through substructure sharing. If one string
literal matches a suffix of another, or possibly the entire string,
then it can be represented as a pointer to that suffix. If those
objects could be modified, there would be surprising behaviors.
Changing one literal would change some unrelated string that shares
storage with it.
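
The sharing hazard can be sketched without relying on what any
particular compiler does. Below, an ordinary writable array stands in
for a pool of merged literals; shares_tail_storage and
simulated_sharing_surprise are hypothetical names invented for this
illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: true if `shorter` is stored as the tail of
// `longer`, i.e. it begins exactly where that suffix of `longer` does.
bool shares_tail_storage(const char* longer, const char* shorter) {
    std::size_t n = std::strlen(longer);
    std::size_t m = std::strlen(shorter);
    return m <= n && longer + (n - m) == shorter;
}

// Simulates suffix sharing with a writable array, so the behavior is
// well-defined. A real implementation would do this with literals in
// read-only storage.
bool simulated_sharing_surprise() {
    char pool[] = "testing";  // stands in for the merged literal
    char* whole = pool;       // "testing"
    char* tail = pool + 4;    // "ing", represented as a pointer into it

    assert(shares_tail_storage(whole, tail));

    tail[0] = 'o';            // "modify" the short string...
    return std::strcmp(whole, "testong") == 0;  // ...the long one changed
}
```

Whether shares_tail_storage("testing", "ing") is true for two real
literals is up to the implementation, so no particular result should
be expected from that call.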
Many C and C++ implementations for virtual memory systems provide a
nice, accurate diagnostic for this error. The compiler places string
literals into a read-only section of the object file (the same section
as code on some toolchains, a separate read-only data section on
others), so literals end up alongside the code in the image. The dynamic
loader maps that part of the program's executable image file into the
process address space as read only, so any self modification (code or
literal data) results in an instant access violation. It's a nice setup.