Can I Trust Pointer Arithmetic In Re-Allocated Memory?

Bill Reid

Bear with me, as I am not a "professional" programmer, but I was
working on part of program that reads parts of four text files into
a buffer which I re-allocate the size as I read each file. I read some
of the items from the bottom up of the buffer, and some from the
top down, moving the bottom items back to the new re-allocated
bottom on every file read.

Then when I've read all four files, I sort the top and bottom items
separately using qsort(), which takes a pointer to a list of items, and
write the two sorted lists to two new files.

Problem is, I worry that if I just supply a pointer to the first item
in the bottom list to qsort(), it might point out to bozo-land during
the sort because I thought that dynamically re-allocated memory
is not necessarily contiguous. So I've done a little two step where
I write the bottom list to another buffer to do the sorting and writing,
and everything works great, but I'm wondering if I'm wasting time
and worrying about nothing...after all, if I can't trust a pointer to an
arbitrary point in the list, how can I trust a pointer to the start of
the list?

Any light you can shed on how pointers are handled in dynamically
allocated memory would be interesting and helpful...thanks.

---
William Ernest Reid

Aug 11 '06 #1


Barry Schwarz

On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

>Bear with me, as I am not a "professional" programmer, but I was
working on part of program that reads parts of four text files into
a buffer which I re-allocate the size as I read each file. I read some
of the items from the bottom up of the buffer, and some from the
top down, moving the bottom items back to the new re-allocated
bottom on every file read.

I don't quite follow this description.

>
Then when I've read all four files, I sort the top and bottom items
separately using qsort(), which takes a pointer to a list of items, and
write the two sorted lists to two new files.

Problem is, I worry that if I just supply a pointer to the first item
in the bottom list to qsort(), it might point out to bozo-land during
the sort because I thought that dynamically re-allocated memory
is not necessarily contiguous. So I've done a little two step where

The block of memory whose non-NULL address is returned from
malloc/realloc/calloc is guaranteed to be contiguous. Your memory is
allocated from address to address+size-1. Furthermore, calculating
the value address+size is always allowed but you may not dereference
this address.

>I write the bottom list to another buffer to do the sorting and writing,
and everything works great, but I'm wondering if I'm wasting time
and worrying about nothing...after all, if I can't trust a pointer to an
arbitrary point in the list, how can I trust a pointer to the start of
the list?

Any light you can shed on how pointers are handled in dynamically
allocated memory would be interesting and helpful...thanks.

A pointer value between the limits mentioned above is within range of
the allocated memory. You have to ensure alignment but if the pointer
has the correct type the compiler will do this for you.
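
To make the contiguity guarantee concrete, here is a minimal sketch. It is not from the thread: the helper name grow_and_fill and the use of int items are assumptions made purely for illustration. It grows a block with realloc(), then walks it with ordinary pointer arithmetic, computing the one-past-the-end address but never dereferencing it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: grow *buf to hold n ints and fill it with 0..n-1.
   Returns 0 on success, -1 if realloc() fails (the old block stays valid). */
int grow_and_fill(int **buf, size_t n)
{
    assert(buf != NULL && n > 0);

    int *tmp = realloc(*buf, n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    *buf = tmp;   /* the block may have moved; old pointers into it are stale */

    int *end = tmp + n;                 /* one past the end: legal to compute */
    int value = 0;
    for (int *p = tmp; p != end; ++p)   /* p always stays inside the block */
        *p = value++;
    return 0;
}
```

The block can move when it is resized, which is probably the source of the "not contiguous" worry, but whatever address realloc() hands back always refers to a single unbroken run of bytes.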
Remove del for email

Aug 11 '06 #2

Bill Reid

Barry Schwarz <sc******@doezl.net> wrote in message
news:k9********************************@4ax.com...

On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

Bear with me, as I am not a "professional" programmer, but I was
working on part of program that reads parts of four text files into
a buffer which I re-allocate the size as I read each file. I read some
of the items from the bottom up of the buffer, and some from the
top down, moving the bottom items back to the new re-allocated
bottom on every file read.

I don't quite follow this description.

Yeah, it's a little confusing, and not that relevant to what I'm
asking...the bottom line is I want to separately sort two parts of a list...

Then when I've read all four files, I sort the top and bottom items
separately using qsort(), which takes a pointer to a list of items, and
write the two sorted lists to two new files.

Problem is, I worry that if I just supply a pointer to the first item
in the bottom list to qsort(), it might point out to bozo-land during
the sort because I thought that dynamically re-allocated memory
is not necessarily contiguous. So I've done a little two step where

The block of memory whose non-NULL address is returned from
malloc/realloc/calloc is guaranteed to be contiguous.

OK, that's the answer, I was just plain wrong that the memory
might not be contiguous...I've probably only read that guarantee
about 100000000000 times but just forgot it.

I think I got that confused with the idea that the re-allocated
block may have a different location than the original malloc, which
would mean...

Your memory is
allocated from address to address+size-1. Furthermore, calculating
the value address+size is always allowed but you may not dereference
this address.

....you wouldn't want to dereference an address, right.

I write the bottom list to another buffer to do the sorting and writing,
and everything works great, but I'm wondering if I'm wasting time
and worrying about nothing...after all, if I can't trust a pointer to an
arbitrary point in the list, how can I trust a pointer to the start of
the list?

Any light you can shed on how pointers are handled in dynamically
allocated memory would be interesting and helpful...thanks.

A pointer value between the limits mentioned above is within range of
the allocated memory. You have to ensure alignment but if the pointer
has the correct type the compiler will do this for you.

OK, so this should be completely legal and flawless:

/* sort the symbol list alphabetically */
qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

then...

/* sort the no-symbol list alphabetically */
qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);

First qsort() sorts down to the end of the symbols part of the list,
the second sorts down from the start of the no-symbols part of the
list to the end of the list. I guess it was the (void *) cast that scared
me...thanks.

---
William Ernest Reid

Aug 11 '06 #3

Keith Thompson

"Bill Reid" <ho********@happyhealthy.net> writes:

Barry Schwarz <sc******@doezl.net> wrote in message
news:k9********************************@4ax.com...
On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

>Bear with me, as I am not a "professional" programmer, but I was
working on part of program that reads parts of four text files into
a buffer which I re-allocate the size as I read each file. I read some
of the items from the bottom up of the buffer, and some from the
top down, moving the bottom items back to the new re-allocated
bottom on every file read.

I don't quite follow this description.

Yeah, it's a little confusing, and not that relevant to what I'm
asking...the bottom line is I want to separately sort two parts of a list...

>
Then when I've read all four files, I sort the top and bottom items
separately using qsort(), which takes a pointer to a list of items, and
write the two sorted lists to two new files.

Problem is, I worry that if I just supply a pointer to the first item
in the bottom list to qsort(), it might point out to bozo-land during
the sort because I thought that dynamically re-allocated memory
is not necessarily contiguous. So I've done a little two step where

The block of memory whose non-NULL address is returned from
malloc/realloc/calloc is guaranteed to be contiguous.

OK, that's the answer, I was just plain wrong that the memory
might not be contiguous...I've probably only read that guarantee
about 100000000000 times but just forgot it.

I think I got that confused with the idea that the re-allocated
block may have a different location than the original malloc, which
would mean...

One thing that I found a little confusing in your original message is
that you talked about "re-allocated" memory, but you didn't mention
the "realloc" function. The more specific your description, the more
likely it is that we can help.

[...]

OK, so this should be completely legal and flawless:

/* sort the symbol list alphabetically */
qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

then...

/* sort the no-symbol list alphabetically */
qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);

Um, no.

Don't be afraid of whitespace. I put blanks around most operator
symbols, and after every comma. If I have to split something across
lines, that's ok. So I'd write your qsort call as:

qsort((void *)curr_instrs + num_symbols,
num_no_symbols,
128,
sort_alpha_list);

The third argument, 128, is a "magic number". It's very difficult to
tell what it means or whether it's even correct. Define a constant:
#define WHATEVER 128
so you only need to change it in one place (but pick a better name, of
course).

The first argument to qsort is:

(void *)curr_instrs + num_symbols

You can't do pointer arithmetic on a void* value. (Some compilers may
allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
replace "-ansi" with "-std=c99").

If you're trying to get the address pointed to by curr_instrs plus an
offset of num_symbols bytes, you'll need to do the arithmetic using
char*:

qsort((char*)curr_instrs + num_symbols,
/* other args */);

assuming that curr_instrs isn't already a char*. Note that I didn't
cast the expression to void*; any pointer-to-object type can be
converted to void*, or vice versa.
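
As an illustration of the magic-number point, here is a rough sketch of a qsort() call that takes the element size from the type instead of a literal. The names NAME_LEN, cmp_names and sort_names are made up for this example, and it assumes each record is a 128-byte, NUL-terminated string, which the code posted so far does not confirm.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 128   /* hypothetical name for the record width */

/* Comparison function for qsort(): each element is a NAME_LEN-byte,
   NUL-terminated string. */
static int cmp_names(const void *a, const void *b)
{
    const char *sa = a;   /* void* converts to char* without a cast */
    const char *sb = b;
    return strcmp(sa, sb);
}

/* Sort `count` fixed-width names in place. */
void sort_names(char (*names)[NAME_LEN], size_t count)
{
    assert(names != NULL);
    /* sizeof names[0] follows the type, so the width is stated only once */
    qsort(names, count, sizeof names[0], cmp_names);
}
```

If the record width ever changes, only the #define needs editing; the qsort() call picks up the new size through sizeof.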

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Aug 11 '06 #4

Bill Reid

Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

"Bill Reid" <ho********@happyhealthy.net> writes:
Barry Schwarz <sc******@doezl.net> wrote in message
news:k9********************************@4ax.com...
On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

Bear with me, as I am not a "professional" programmer, but I was
working on part of program that reads parts of four text files into
a buffer which I re-allocate the size as I read each file. I read some
of the items from the bottom up of the buffer, and some from the
top down, moving the bottom items back to the new re-allocated
bottom on every file read.

I don't quite follow this description.

Yeah, it's a little confusing, and not that relevant to what I'm
asking...the bottom line is I want to separately sort two parts of a list...

Then when I've read all four files, I sort the top and bottom items
separately using qsort(), which takes a pointer to a list of items, and
write the two sorted lists to two new files.

Problem is, I worry that if I just supply a pointer to the first item
in the bottom list to qsort(), it might point out to bozo-land during
the sort because I thought that dynamically re-allocated memory
is not necessarily contiguous. So I've done a little two step where

The block of memory whose non-NULL address is returned from
malloc/realloc/calloc is guaranteed to be contiguous.
OK, that's the answer, I was just plain wrong that the memory
might not be contiguous...I've probably only read that guarantee
about 100000000000 times but just forgot it.

I think I got that confused with the idea that the re-allocated
block may have a different location than the original malloc, which
would mean...

One thing that I found a little confusing in your original message is
that you talked about "re-allocated" memory, but you didn't mention
the "realloc" function. The more specific your description, the more
likely it is that we can help.

Well, OK, maybe, here's canonical specificity:

/* now re-allocate memory for the instrument strings */
if((curr_instrs=(instr_strs *)
realloc(curr_instrs,num_instrs*sizeof(instr_strs)) )==NULL) {
printf("Not enough memory for instruments buffer\n");
goto CloseFiles;
}

Does that help you help me?

[...]

OK, so this should be completely legal and flawless:

/* sort the symbol list alphabetically */
qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

then...

/* sort the no-symbol list alphabetically */
qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);

>
Um, no.

By "legal and flawless" I DID mean "100% guaranteed functional",
not "pleasing to thine eyes"...

Don't be afraid of whitespace. I put blanks around most operator
symbols, and after every comma. If I have to split something across
lines, that's ok. So I'd write your qsort call as:

qsort((void *)curr_instrs + num_symbols,
num_no_symbols,
128,
sort_alpha_list);

That's the way YOU'D do it, I do it differently, and since I'm the only
one reading it (except in this one rare instance, or occasionally I'll post
some code somewhere on the net), I can read it just fine, and of
course it compiles all the same...

The third argument, 128, is a "magic number". It's very difficult to
tell what it means or whether it's even correct. Define a constant:
#define WHATEVER 128

In qsort(), it's basically 128 (character) bytes.

I've actually got "128" defined globally (and I do mean globally, for
several hundred thousand lines of code) for the purposes of reading
and writing strings of certain lengths. And those damned defines
have managed to screw me up royally several times, including a
really irritating "intermittent" problem I had when I first wrote this
particular section of code. So lately I've been using them less
and less...

so you only need to change it in one place (but pick a better name, of
course).

Even at file scope right now I'm more comfortable with the way it
is...

The first argument to qsort is:

(void *)curr_instrs + num_symbols

You can't do pointer arithmetic on a void* value. (Some compilers may
allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
replace "-ansi" with "-std=c99").

Then how does qsort() do it? I'm assuming now that it must just
use pointer arithmetic internally, because it doesn't seem to want or
recognize my typedef of a 128-character string:

typedef char instr_strs[128];
instr_strs *curr_instrs;

If you're trying to get the address pointed to by curr_instrs plus an
offset of num_symbols bytes, you'll need to do the arithmetic using
char*:

qsort((char*)curr_instrs + num_symbols,
/* other args */);

assuming that curr_instrs isn't already a char*.

Nope, a pointer to the first of many 128-character strings, as above, so
are you saying the pointer cast should be (instr_strs *)? I have no problem
with that, as long as it works, and I must stress again at this point that
the current code:

/* sort the symbol list alphabetically */
qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

Has worked flawlessly for months now; it's part of a particular section
of code that downloads about 3/4 meg of raw data from the net every
day at a specific time, parses out about 100,000 data items, and writes
them to a custom database in a matter of seconds.

The only reason I asked the original question was because I went
back and reviewed the code and wondered if I could shave a few
more milliseconds off the execution time...

Note that I didn't

cast the expression to void*; any pointer-to-object type can be
converted to void*, or vice versa.

Yeah, I noticed that, I just use (void *) because that's what
I thought qsort() wanted, and it definitely WORKS that way
(I've used qsort() dozens of times EXACTLY that way without
problems).

Now to get back to this:

If you're trying to get the address pointed to by curr_instrs plus an
offset of num_symbols bytes, you'll need to do the arithmetic using
char*:

qsort((char*)curr_instrs + num_symbols,
/* other args */);

I think I see what you're saying, maybe...and maybe not...

If curr_instrs is pointer to a 128-character string type, wouldn't
curr_instrs+num_symbols then point to a location offset from
curr_instrs by (num_symbols*128 bytes)? And if so, what's
the point of cast (char *) if qsort() already works by sorting
some specified number of sequences of some specified
number of character bytes?

I thought I had the answer to my original question, and then it
slipped away from me...

---
William Ernest Reid

Aug 11 '06 #5

Keith Thompson

"Bill Reid" <ho********@happyhealthy.net> writes:

Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

[...]

>One thing that I found a little confusing in your original message is
that you talked about "re-allocated" memory, but you didn't mention
the "realloc" function. The more specific your description, the more
likely it is that we can help.

Well, OK, maybe, here's canonical specificity:

/* now re-allocate memory for the instrument strings */
if((curr_instrs=(instr_strs *)
realloc(curr_instrs,num_instrs*sizeof(instr_strs)) )==NULL) {
printf("Not enough memory for instruments buffer\n");
goto CloseFiles;
}

Does that help you help me?

A little, but there are still a bunch of identifiers whose
declarations I haven't seen.

I will make one comment: Don't cast the result of malloc() or
realloc(). See section 7 of the comp.lang.c FAQ,
<http://www.c-faq.com/>, particularly question 7.7b.
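
For what it's worth, a cast-free version of that realloc() call might look like the sketch below. The wrapper resize_instrs is hypothetical; only the typedef comes from the posted code. It also stores the result in a temporary first, because assigning realloc()'s return value straight back to curr_instrs loses the only pointer to the old block when the call fails.

```c
#include <assert.h>
#include <stdlib.h>

typedef char instr_strs[128];   /* the typedef from the posted code */

/* Hypothetical helper: resize *list to hold num_instrs entries.
   Returns 0 on success; on failure returns -1 and leaves *list untouched. */
int resize_instrs(instr_strs **list, size_t num_instrs)
{
    assert(list != NULL && num_instrs > 0);

    /* No cast: void* converts to instr_strs* implicitly in C. */
    instr_strs *tmp = realloc(*list, num_instrs * sizeof **list);
    if (tmp == NULL)
        return -1;   /* *list still points at the old, valid block */
    *list = tmp;
    return 0;
}
```

With <stdlib.h> included, the assignment compiles cleanly without the cast; if the header were forgotten, the missing cast would let the compiler complain instead of hiding the problem.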

>[...]

OK, so this should be completely legal and flawless:

/* sort the symbol list alphabetically */
qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

then...

/* sort the no-symbol list alphabetically */
qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);
>>
Um, no.

By "legal and flawless" I DID mean "100% guaranteed functional",
not "pleasing to thine eyes"...

The code isn't "100% guaranteed functional". You're performing
arithmetic on a void*. That's not allowed in standard C.

>Don't be afraid of whitespace. I put blanks around most operator
symbols, and after every comma. If I have to split something across
lines, that's ok. So I'd write your qsort call as:

qsort((void *)curr_instrs + num_symbols,
num_no_symbols,
128,
sort_alpha_list);

That's the way YOU'D do it, I do it differently, and since I'm the only
one reading it (except in this one rare instance, or occasionally I'll post
some code somewhere on the net), I can read it just fine, and of
course it compiles all the same...

Ok, but I find it more difficult to read without the whitespace.
Whenever you post code here, you can expect comments on its style.
You're under no obligation to pay attention.

>The third argument, 128, is a "magic number". It's very difficult to
tell what it means or whether it's even correct. Define a constant:
#define WHATEVER 128

In qsort(), it's basically 128 (character) bytes.

Ok, but why 128 rather than 127, or 100, or 256? That's a rhetorical
question; you don't need to answer it, but ideally your code should.
(And yes, it's a style issue.)

I've actually got "128" defined globally (and I do mean globally, for
several hundred thousand lines of code) for the purposes of reading
and writing strings of certain lengths. And those damned defines
have managed to screw me up royally several times, including a
really irritating "intermittent" problem I had when I first wrote this
particular section of code. So lately I've been using them less
and less...

>so you only need to change it in one place (but pick a better name, of
course).

Even at file scope right now I'm more comfortable with the way it
is...

Ok, it's your code, but I'm quite surprised that defining symbolic
constants would cause more problems than it would solve.

If someone else needs to maintain your code (and "someone else" could
be you a year from now), it's not going to be obvious that the 128 in
this function corresponds to the 128 (or 127) in another function, but
the 128 in that function over there is just coincidental. There's a
good discussion at <http://c-faq.com/~scs/cclass/notes/sx9b.html>.

>The first argument to qsort is:

(void *)curr_instrs + num_symbols

You can't do pointer arithmetic on a void* value. (Some compilers may
allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
replace "-ansi" with "-std=c99").

Then how does qsort() do it? I'm assuming now that it must just
use pointer arithmetic internally, because it doesn't seem to want or
recognize my typedef of a 128-character string:

typedef char instr_strs[128];
instr_strs *curr_instrs;

qsort behaves in a manner consistent with its specification. That's
all you really need to know. It needn't even be implemented in C, and
if it is, it's free to use compiler-specific extensions.

But if it's implemented in standard C (which is entirely possible), it
presumably would convert the void* arguments to char* before
performing arithmetic on them. (Since char* and void* have the same
representation, the conversion doesn't cost anything at run time.)
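
As a rough illustration only, and certainly not how any real library implements qsort(), here is a toy insertion sort with the same interface, written in standard C. The names toy_sort, swap_bytes and cmp_int are invented for the example. The point is simply that the void* base gets converted to char* before any arithmetic is done on it.

```c
#include <assert.h>
#include <stddef.h>

/* Swap two elements of `size` bytes, one byte at a time. */
static void swap_bytes(char *a, char *b, size_t size)
{
    while (size-- > 0) {
        char t = *a;
        *a++ = *b;
        *b++ = t;
    }
}

/* Hypothetical qsort()-style insertion sort, written in standard C. */
void toy_sort(void *base, size_t nmemb, size_t size,
              int (*compar)(const void *, const void *))
{
    assert(base != NULL && size > 0 && compar != NULL);

    char *bytes = base;   /* void* converts to char* without a cast */
    for (size_t i = 1; i < nmemb; ++i) {
        for (size_t j = i; j > 0; --j) {
            char *cur  = bytes + j * size;   /* element j, by byte offset */
            char *prev = cur - size;         /* element j - 1 */
            if (compar(prev, cur) <= 0)
                break;
            swap_bytes(prev, cur, size);
        }
    }
}

/* Example comparison function for int elements. */
int cmp_int(const void *a, const void *b)
{
    const int *x = a, *y = b;
    return (*x > *y) - (*x < *y);
}
```

The function never needs to know the element type: the caller supplies the size in bytes, and all addressing is done as byte offsets from the start of the array.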

>If you're trying to get the address pointed to by curr_instrs plus an
offset of num_symbols bytes, you'll need to do the arithmetic using
char*:

qsort((char*)curr_instrs + num_symbols,
/* other args */);

assuming that curr_instrs isn't already a char*.

Nope, a pointer to the first of many 128-character strings, as above, so
are you saying the pointer cast should be (instr_strs *)? I have no problem
with that, as long as it works, and I must stress again at this point that
the current code:

/* sort the symbol list alphabetically */
qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

Has worked flawlessly for months now; it's part of a particular section
of code that downloads about 3/4 meg of raw data from the net every
day at a specific time, parses out about 100,000 data items, and writes
them to a custom database in a matter of seconds.

That qsort() call isn't the one with the problem.

Incidentally, a piece of code is either correct or not. The number of
times it "works" really doesn't prove anything. If your compiler
accepts some non-portable code, it's probably going to keep working
the same way indefinitely -- but it will fail the first time you
compile it with a different compiler, or with the same compiler and
different options. Correctness is not statistical.

One pitfall of C is that there are a lot of errors that your compiler
isn't required to tell you about. Many things invoke undefined
behavior; they may appear to work, but the language doesn't guarantee
anything. Other things may be compiler-specific extensions. The
language requires a conforming implementation to issue a diagnostic
message for many of these -- but many compilers (including gcc) are
not conforming in their default mode. Typically you can use
command-line options to enable a conforming mode and provide
additional warnings.

The only reason I asked the original question was because I went
back and reviewed the code and wondered if I could shave a few
more milliseconds off the execution time...

Note that I didn't
>cast the expression to void*; any pointer-to-object type can be
converted to void*, or vice versa.

Yeah, I noticed that, I just use (void *) because that's what
I thought qsort() wanted, and it definitely WORKS that way
(I've used qsort() dozens of times EXACTLY that way without
problems).

Yes, it works, but it's not necessary. As a general rule, casts
should be avoided unless they're actually required. A cast is, among
other things, a promise to the compiler that you know what you're
doing, and will often inhibit warnings and error messages. In this
case, the argument will be implicitly converted to void* without the
cast (assuming you have a visible prototype for qsort() -- i.e., you
haven't forgotten the "#include <stdlib.h>".) The code is perfectly
correct either way, but the form with the cast is more "brittle". If
the cast had specified the wrong type, for example, the compiler
likely wouldn't have told you about the error.

Now to get back to this:

>If you're trying to get the address pointed to by curr_instrs plus an
offset of num_symbols bytes, you'll need to do the arithmetic using
char*:

qsort((char*)curr_instrs + num_symbols,
/* other args */);

I think I see what you're saying, maybe...and maybe not...

If curr_instrs is pointer to a 128-character string type, wouldn't
curr_instrs+num_symbols then point to a location offset from
curr_instrs by (num_symbols*128 bytes)? And if so, what's
the point of cast (char *) if qsort() already works by sorting
some specified number of sequences of some specified
number of character bytes?

I haven't seen the full context of your code (or if I have, I've
forgotten it). Your original code had

(void*)curr_instrs + num_symbols

which is illegal, because you can't perform pointer arithmetic on
void* (the cast applies to "curr_instrs", not to "curr_instrs +
num_symbols"). Pointer arithmetic, as you probably know, is scaled by
the size of the pointed-to type.
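
A small sketch of that scaling, using the instr_strs typedef from the posted code (the two function names are made up for illustration). The first function measures how far "p + n" moves when p points to 128-byte elements; the second measures how far "(char *)p + n" moves.

```c
#include <assert.h>
#include <stddef.h>

typedef char instr_strs[128];   /* the typedef from the posted code */

/* Bytes covered by "p + n": scaled by sizeof *p, which is 128 here. */
size_t scaled_offset(instr_strs *p, size_t n)
{
    assert(p != NULL);
    return (size_t)((char *)(p + n) - (char *)p);
}

/* Bytes covered by "(char *)p + n": char* arithmetic moves n bytes. */
size_t byte_offset(instr_strs *p, size_t n)
{
    assert(p != NULL);
    return (size_t)(((char *)p + n) - (char *)p);
}
```

For an array of at least four elements, scaled_offset(buf, 3) gives 384 and byte_offset(buf, 3) gives 3. The second result is also what gcc's void* extension would produce.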

Are you using gcc? If so, it supports arithmetic on void* as an
extension; it acts like arithmetic on char*. (IMHO, this extension is
a bad idea.) By casting curr_instrs to void*, you cause the "+
num_symbols" to denote an offset of num_symbols *bytes*. I had
guessed that that's what you wanted, but apparently it isn't.

I think what you *really* wanted was for the addition to be scaled by
sizeof *curr_instrs (128 bytes?). If so, you probably meant to use

(void*)(curr_instrs + num_symbols)

which should work. But since the argument will be implicitly
converted to void* anyway, all you need is

curr_instrs + num_symbols

In other words, all you need to do is drop the cast. This avoids
depending on a compiler-specific extension *and* corrects a bug. It's
also a very nice demonstration of why unnecessary casts should be
avoided.
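
Put together, the two sorts might look something like the sketch below. The typedef and the variable names are from the posted code, but the body of sort_alpha_list was never posted, so the strcmp-based version here is a stand-in, and the wrapper sort_both_lists is invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef char instr_strs[128];   /* the typedef from the posted code */

/* Stand-in for sort_alpha_list, whose real body was not posted. */
static int sort_alpha_list(const void *a, const void *b)
{
    const char *sa = a, *sb = b;
    return strcmp(sa, sb);
}

/* Sort the first num_symbols entries and the following num_no_symbols
   entries as two independent lists within one allocated block. */
void sort_both_lists(instr_strs *curr_instrs,
                     size_t num_symbols, size_t num_no_symbols)
{
    assert(curr_instrs != NULL);

    qsort(curr_instrs, num_symbols, sizeof *curr_instrs, sort_alpha_list);

    /* No cast: the addition is scaled by sizeof *curr_instrs (128 bytes),
       and the result converts to void* implicitly. */
    qsort(curr_instrs + num_symbols, num_no_symbols,
          sizeof *curr_instrs, sort_alpha_list);
}
```

No second buffer is needed: both calls sort in place, and neither touches the other's part of the block.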

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Aug 11 '06 #6

Bill Reid

Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

"Bill Reid" <ho********@happyhealthy.net> writes:
Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
[...]

One thing that I found a little confusing in your original message is
that you talked about "re-allocated" memory, but you didn't mention
the "realloc" function. The more specific your description, the more
likely it is that we can help.

Well, OK, maybe, here's canonical specificity:

/* now re-allocate memory for the instrument strings */
if((curr_instrs=(instr_strs *)
realloc(curr_instrs,num_instrs*sizeof(instr_strs)) )==NULL) {
printf("Not enough memory for instruments buffer\n");
goto CloseFiles;
}

Does that help you help me?

A little, but there are still a bunch of identifiers whose
declarations I haven't seen.

Exactly why I didn't want to post any code in the first place,
just wanted to ask a verbal question (see Subject). I'm calling
on all types of custom libraries for this data downloading function,
and the more you see, the more you won't recognize...

I will make one comment: Don't cast the result of malloc() or
realloc(). See section 7 of the comp.lang.c FAQ,
<http://www.c-faq.com/>, particularly question 7.7b.

OK, I think I've heard some type of debate about this, I thought
(based on the DOCUMENTATION THAT CAME WITH MY
FRIGGIN' DEVELOPMENT PACKAGE) that was what you
were supposed to do; it has seemed to work OK...

[...]

OK, so this should be completely legal and flawless:

/* sort the symbol list alphabetically */
qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

then...

/* sort the no-symbol list alphabetically */
qsort((void *)curr_instrs+num_symbols,num_no_symbols,128,sort_alpha_list);
>
Um, no.

By "legal and flawless" I DID mean "100% guaranteed functional",
not "pleasing to thine eyes"...

The code isn't "100% guaranteed functional". You're performing
arithmetic on a void*. That's not allowed in standard C.

Yes, sort of, I recognized my mistake after I last hit "send", and
way below, you hit on the actual error I made...

Don't be afraid of whitespace. I put blanks around most operator
symbols, and after every comma. If I have to split something across
lines, that's ok. So I'd write your qsort call as:

qsort((void *)curr_instrs + num_symbols,
num_no_symbols,
128,
sort_alpha_list);

That's the way YOU'D do it, I do it differently, and since I'm the only
one reading it (except in this one rare instance, or occasionally I'll post
some code somewhere on the net), I can read it just fine, and of
course it compiles all the same...

Ok, but I find it more difficult to read without the whitespace.
Whenever you post code here, you can expect comments on its style.
You're under no obligation to pay attention.

The third argument, 128, is a "magic number". It's very difficult to
tell what it means or whether it's even correct. Define a constant:
#define WHATEVER 128
In qsort(), it's basically 128 (character) bytes.

Ok, but why 128 rather than 127, or 100, or 256? That's a rhetorical
question; you don't need to answer it, but ideally your code should.
(And yes, it's a style issue.)

I have my reasons, the most important of which I would think would
be obvious, and a secondary reason which should also be both apparent
and not really important at the same time...

I've actually got "128" defined globally (and I do mean globally, for
several hundred thousand lines of code) for the purposes of reading
and writing strings of certain lengths. And those damned defines
have managed to screw me up royally several times, including a
really irritating "intermittent" problem I had when I first wrote this
particular section of code. So lately I've been using them less
and less...

so you only need to change it in one place (but pick a better name, of
course).

Even at file scope right now I'm more comfortable with the way it
is...

Ok, it's your code, but I'm quite surprised that defining symbolic
constants would cause more problems than it would solve.

As I've alluded, it partly depends on the scope. What I've found
the hard way is that you should really know EXACTLY what you
need AT THE MOST LOCAL LEVEL OF SCOPE. And I
found I kept forgetting what my global defines were, and I would
use an inappropriate one where I knew EXACTLY what I needed
RIGHT THERE. Among other problems...

In this case, maybe a sizeof() would be better...

If someone else needs to maintain your code (and "someone else" could
be you a year from now), it's not going to be obvious that the 128 in
this function corresponds to the 128 (or 127) in another function, but
the 128 in that function over there is just coincidental. There's a
good discussion at <http://c-faq.com/~scs/cclass/notes/sx9b.html>.

Everything is local to a single 1000-line function for this data downloading
operation; any (many) calls out to my custom libraries don't care about
string lengths because they do a strlen() on the passed string pointer
on entry.

The first argument to qsort is:

(void *)curr_instrs + num_symbols

You can't do pointer arithmetic on a void* value. (Some compilers may
allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
replace "-ansi" with "-std=c99").

Then how does qsort() do it? I'm assuming now that it must just
use pointer arithmetic internally, because it doesn't seem to want or
recognize my typedef of a 128-character string:

typedef char instr_strs[128];
instr_strs *curr_instrs;

qsort behaves in a manner consistent with its specification. That's
all you really need to know. It needn't even be implemented in C, and
if it is, it's free to use compiler-specific extensions.

But if it's implemented in standard C (which is entirely possible), it
presumably would convert the void* arguments to char* before
performing arithmetic on them. (Since char* and void* have the same
representation, the conversion doesn't cost anything at run time.)

Yeah, apparently it only processes a char* at a time, and the
declaration of void* just prevents somebody from stupidly passing
the wrong starting point, or something...

If you're trying to get the address pointed to by curr_instrs plus an
offset of num_symbols bytes, you'll need to do the arithmetic using
char*:

qsort((char*)curr_instrs + num_symbols,
/* other args */);

assuming that curr_instrs isn't already a char*.

Nope, a pointer to the first of many 128-character strings, as above, so
are you saying the pointer cast should be (instr_strs *)? I have no problem
with that, as long as it works, and I must stress again at this point that
the current code:

/* sort the symbol list alphabetically */
qsort((void *)curr_instrs,num_symbols,128,sort_alpha_list);

Has worked flawlessly for months now; it's part of a particular section
of code that downloads about 3/4 meg of raw data from the net every
day at a specific time, parses out about 100,000 data items, and writes
them to a custom database in a matter of seconds.

That qsort() call isn't the one with the problem.

Exactly. It was the one I was going to add that had a problem...

Incidentally, a piece of code is either correct or not. The number of
times it "works" really doesn't prove anything. If your compiler
accepts some non-portable code, it's probably going to keep working
the same way indefinitely -- but it will fail the first time you
compile it with a different compiler, or with the same compiler and
different options. Correctness is not statistical.

Try telling that to a third-grade teacher grading tests...

One pitfall of C is that there are a lot of errors that your compiler
isn't required to tell you about. Many things invoke undefined
behavior; they may appear to work, but the language doesn't guarantee
anything. Other things may be compiler-specific extensions. The
language requires a conforming implementation to issue a diagnostic
message for many of these -- but many compilers (including gcc) are
not conforming in their default mode. Typically you can use
command-line options to enable a conforming mode and provide
additional warnings.

Most importantly in my case, a compiler is not a mind-reader...

The only reason I asked the original question was because I went
back and reviewed the code and wondered if I could shave a few
more milliseconds off the execution time...

Note that I didn't
cast the expression to void*; any pointer-to-object type can be
converted to void*, or vice versa.

Yeah, I noticed that, I just use (void *) because that's what
I thought qsort() wanted, and it definitely WORKS that way
(I've used qsort() dozens of times EXACTLY that way without
problems).

Yes, it works, but it's not necessary.

Are you sure? Somehow, it seems like I tried it without the cast, and
got an error, but if that actually happened, it was years, hell, decades
ago.

Like most people, I'm a victim of experience: I just keep doing what
works...

As a general rule, casts
should be avoided unless they're actually required.

The question here would be: does qsort() require it? Here's the
documentation:

Syntax

#include <stdlib.h>
void qsort(void *base, size_t nelem, size_t width,
int (_USERENTRY *fcmp)(const void *, const void *));

and the example from the documentation:

int sort_function( const void *a, const void *b);
char list[5][4] = { "cat", "car", "cab", "cap", "can" };

int main(void)
{
int x;

qsort((void *)list, 5, sizeof(list[0]), sort_function);
for (x = 0; x < 5; x++)
printf("%s\n", list[x]);
return 0;
}

int sort_function( const void *a, const void *b)
{
return( strcmp((char *)a,(char *)b) );
}

Unlike some example documentation that I can think of, THAT one
actually works as advertised...but doesn't mean that the cast is
required...

A cast is, among
other things, a promise to the compiler that you know what you're
doing, and will often inhibit warnings and error messages. In this
case, the argument will be implicitly converted to void* without the
cast (assuming you have a visible prototype for qsort() -- i.e., you
haven't forgotten the "#include <stdlib.h>".) The code is perfectly
correct either way, but the form with the cast is more "brittle". If
the cast had specified the wrong type, for example, the compiler
likely wouldn't have told you about the error.

Well, OK, I just compiled it without the cast, and it came
up clean. Then I immediately pasted the cast back in place, since
this is "production code", and from a perfectly practical standpoint,
the void* cast is 100% functional IN THIS CASE, so I'm loath
to mess anything up...

Now to get back to this:

If you're trying to get the address pointed to by curr_instrs plus an
offset of num_symbols bytes, you'll need to do the arithmetic using
char*:

qsort((char*)curr_instrs + num_symbols,
/* other args */);

I think I see what you're saying, maybe...and maybe not...

If curr_instrs is pointer to a 128-character string type, wouldn't
curr_instrs+num_symbols then point to a location offset from
curr_instrs by (num_symbols*128 bytes)? And if so, what's
the point of cast (char *) if qsort() already works by sorting
some specified number of sequences of some specified
number of character bytes?

I haven't seen the full context of your code (or if I have, I've
forgotten it). Your original code had

(void*)curr_instrs + num_symbols

which is illegal, because you can't perform pointer arithmetic on
void* (the cast applies to "curr_instrs", not to "curr_instrs +
num_symbols"). Pointer arithmetic, as you probably know, is scaled by
the size of the pointed-to type.

Actually, SEQUENCE POINTS!!!! Yes, I know now this is
wrong...

Are you using gcc? If so, it supports arithmetic on void* as an
extension; it acts like arithmetic on char*. (IMHO, this extension is
a bad idea.) By casting curr_instrs to void*, you cause the "+
num_symbols" to denote an offset of num_symbols *bytes*. I had
guessed that that's what you wanted, but apparently it isn't.

Nope, 128-character strings...

I think what you *really* wanted was for the addition to be scaled by
sizeof *curr_instrs (128 bytes?). If so, you probably meant to use

(void*)(curr_instrs + num_symbols)

which should work.

EXACTLY!

But since the argument will be implicitly
converted to void* anyway, all you need is

curr_instrs + num_symbols

In other words, all you need to do is drop the cast. This avoids
depending on a compiler-specific extension *and* corrects a bug. It's
also a very nice demonstration of why unnecessary casts should be
avoided.
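
A minimal sketch of sorting only the bottom part of the array in place, assuming the element type described in the thread (an array of 128-character strings). The names instr_str, cmp_instr and sort_tail are made up for illustration and are not from the original code.

```c
#include <stdlib.h>
#include <string.h>

typedef char instr_str[128];            /* assumed element type, as in the thread */

/* Comparison function for qsort(): each argument points at one element. */
static int cmp_instr(const void *a, const void *b)
{
    const char *x = a;                  /* implicit conversion, no cast needed */
    const char *y = b;
    return strcmp(x, y);
}

/* Sort items[first] .. items[total - 1] in place; items[0 .. first - 1]
   are left untouched. */
void sort_tail(instr_str *items, size_t first, size_t total)
{
    if (first < total)
        /* items + first is scaled by sizeof *items (128 bytes per step) */
        qsort(items + first, total - first, sizeof *items, cmp_instr);
}
```

With this shape, a call such as sort_tail(curr_instrs, num_symbols, num_instrs) would sort the no-symbols entries where they sit, with no second buffer and no copy loop.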

OK, I WILL try that to replace this nonsense, along with the unneeded
malloc and free for the no-symbols list:

/* put bottom of instruments list into no-symbols list */
swap_idx=0;
no_symbol_idx=num_symbols;
while(no_symbol_idx<num_instrs) {
strcpy(curr_no_symbls[swap_idx],
curr_instrs[no_symbol_idx]);
no_symbol_idx++;
swap_idx++;
}

/* sort the no-symbol list alphabetically */
qsort((void *)curr_no_symbls,num_no_symbls,128,sort_alpha_list );

Hopefully everything will go well at 6pm EST when it downloads
the data...

---
William Ernest Reid

Aug 11 '06 #7

Keith Thompson

"Bill Reid" <ho********@happyhealthy.net> writes:

Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
>"Bill Reid" <ho********@happyhealthy.net> writes:

[...]

>I will make one comment: Don't cast the result of malloc() or
realloc(). See section 7 of the comp.lang.c FAQ,
<http://www.c-faq.com/>, particularly questions 7.7b.

OK, I think I've heard some type of debate about this, I thought
(based on the DOCUMENTATION THAT CAME WITH MY
FRIGGIN' DEVELOPMENT PACKAGE) that was what you
were supposed to do; it has seemed to work OK...

Then the DOCUMENTATION THAT CAME WITH YOUR FRIGGIN' DEVELOPMENT
PACKAGE is advising you to do something that's unnecessary and
potentially dangerous. (Unless it's intended to be called from C++,
which doesn't do implicit conversions to and from void* as freely as C
does, but that's a different language.)

[...]

>Ok, it's your code, but I'm quite surprised that defining symbolic
constants would cause more problems than it would solve.

As I've alluded, it partly depends on the scope. What I've found
the hard way is that you should really know EXACTLY what you
need AT THE MOST LOCAL LEVEL OF SCOPE. And I
found I kept forgetting what my global defines were, and I would
use an inappropriate one where I knew EXACTLY what I needed
RIGHT THERE. Among other problems...

Yes, one problem with macros is that they're not scoped.

If you want an integer constant (within the range of type int), there
is a trick you can use in C to limit it to the scope you want:

enum { WHATEVER = 128 };

It's arguably an abuse of the "enum" feature (you're doing it for the
sake of the constant, and not actually using the type), but it does
work, and it's not an uncommon idiom.

Or you can use a macro and be careful about how you use it.
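
A small sketch of the enum idiom, assuming a function that needs a private buffer size. The names clipped_length and BUF_LEN are invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into a fixed-size local buffer, truncating if necessary,
   and return the length of what was stored. */
size_t clipped_length(const char *src)
{
    enum { BUF_LEN = 128 };             /* visible only inside this function */
    char buf[BUF_LEN];

    strncpy(buf, src, BUF_LEN - 1);
    buf[BUF_LEN - 1] = '\0';            /* strncpy does not always terminate */
    return strlen(buf);
}
```

Unlike a #define, BUF_LEN here cannot collide with, or be mistaken for, a constant of the same name in another function.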

In this case, maybe a sizeof() would be better...

Probably so.

>If someone else needs to maintain your code (and "someone else" could
be you a year from now), it's not going to be obvious that the 128 in
this function corresponds to the 128 (or 127) in another function, but
the 128 in that function over there is just coincidental. There's a
good discussion at <http://c-faq.com/~scs/cclass/notes/sx9b.html>.

Everything is local to a single 1000-line function for this data downloading
operation; any (many) calls out to my custom libraries don't care about
string lengths because they do a strlen() on the passed string pointer
on entry.

>The first argument to qsort is:

(void *)curr_instrs + num_symbols

You can't do pointer arithmetic on a void* value. (Some compilers may
allow it; if you're using gcc, try "-ansi -pedantic -Wall -W", or
replace "-ansi" with "-std=c99").

Then how does qsort() do it? I'm assuming now that it must just
use pointer arithmetic internally, because it doesn't seem to want or
recognize my typedef of a 128-character string:

typedef char instr_strs[128];
instr_strs *curr_instrs;

qsort behaves in a manner consistent with its specification. That's
all you really need to know. It needn't even be implemented in C, and
if it is, it's free to use compiler-specific extensions.

But if it's implemented in standard C (which is entirely possible), it
presumably would convert the void* arguments to char* before
performing arithmetic on them. (Since char* and void* have the same
representation, the conversion doesn't cost anything at run time.)

Yeah, apparently it only processes a char* at a time, and the
declaration of void* just prevents somebody from stupidly passing
the wrong starting point, or something...

void* is a generic pointer type. In fact, it's *the* generic pointer
type (pointer-to-object, actually; you can't portably use it for
pointers to functions). That's why qsort() uses it. (Earlier
versions of qsort(), before the 1989 ANSI standard, probably would
have used char*.)

I'm not sure what you mean by "it only processes a char* at a time".
qsort() works with whatever size of data you tell it to. It likely
uses memcpy() or something similar to copy data around within the
array.
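
To make that concrete, here is a sketch of how a qsort()-style routine could be written in standard C using char* arithmetic and memcpy(). It is an insertion sort chosen for brevity, not a claim about how any real library implements qsort(); generic_insertion_sort and cmp_int are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Generic insertion sort with the same parameter list as qsort().
   Returns 0 on success, -1 if the scratch element cannot be allocated. */
int generic_insertion_sort(void *base, size_t nelem, size_t width,
                           int (*cmp)(const void *, const void *))
{
    char *bytes = base;     /* char* permits byte-wise pointer arithmetic */
    char *tmp;
    size_t i, j;

    if (nelem < 2 || width == 0)
        return 0;
    tmp = malloc(width);    /* scratch space for one element */
    if (tmp == NULL)
        return -1;

    for (i = 1; i < nelem; i++) {
        memcpy(tmp, bytes + i * width, width);
        /* shift larger elements up by one slot */
        for (j = i; j > 0 && cmp(bytes + (j - 1) * width, tmp) > 0; j--)
            memcpy(bytes + j * width, bytes + (j - 1) * width, width);
        memcpy(bytes + j * width, tmp, width);
    }
    free(tmp);
    return 0;
}

/* Example comparison function for int elements. */
int cmp_int(const void *a, const void *b)
{
    const int *x = a;
    const int *y = b;
    return (*x > *y) - (*x < *y);
}
```

The routine never needs to know the element type: the caller's width says how many bytes to move, and the caller's comparison function interprets them.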

[...]

>Incidentally, a piece of code is either correct or not. The number of
times it "works" really doesn't prove anything. If your compiler
accepts some non-portable code, it's probably going to keep working
the same way indefinitely -- but it will fail the first time you
compile it with a different compiler, or with the same compiler and
different options. Correctness is not statistical.

Try telling that to a third-grade teacher grading tests...

I'm not sure I see the point.

[...]

Yeah, I noticed that, I just use (void *) because that's what
I thought qsort() wanted, and it definitely WORKS that way
(I've used qsort() dozens of times EXACTLY that way without
problems).

Yes, it works, but it's not necessary.

Are you sure? Somehow, it seems like I tried it without the cast, and
got an error, but if that actually happened, it was years, hell, decades
ago.

Yes, I'm sure.

Like most people, I'm a victim of experience: I just keep doing what
works...

>As a general rule, casts
should be avoided unless they're actually required.

The question here would be: does qsort() require it? Here's the
documentation:

Syntax

#include <stdlib.h>
void qsort(void *base, size_t nelem, size_t width,
int (_USERENTRY *fcmp)(const void *, const void *));

and the example from the documentation:

int sort_function( const void *a, const void *b);
char list[5][4] = { "cat", "car", "cab", "cap", "can" };

int main(void)
{
int x;

qsort((void *)list, 5, sizeof(list[0]), sort_function);
for (x = 0; x < 5; x++)
printf("%s\n", list[x]);
return 0;
}

int sort_function( const void *a, const void *b)
{
return( strcmp((char *)a,(char *)b) );
}

Unlike some example documentation that I can think of, THAT one
actually works as advertised...but doesn't mean that the cast is
required...

Yes, it works with a cast. It also works without a cast, and there's
just no reason to use one.

What you quoted above is not *the* documentation for qsort(). You'll
find that in the C standard, and it doesn't say anything about casting
arguments.
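
For comparison, a sketch of the vendor example with every cast removed. It assumes the same 4-byte string elements; sort_words is an invented wrapper name, and sort_function is kept from the quoted example.

```c
#include <stdlib.h>
#include <string.h>

/* Compare two string elements. const void * converts implicitly to
   const char * in C, so no cast is written. */
static int sort_function(const void *a, const void *b)
{
    const char *x = a;
    const char *y = b;
    return strcmp(x, y);
}

/* Sort an array of n four-character strings in place. */
void sort_words(char (*list)[4], size_t n)
{
    qsort(list, n, sizeof list[0], sort_function);  /* no (void *) cast */
}
```

If the wrong kind of argument were passed by mistake, the cast-free form gives the compiler a chance to complain, which is the "brittleness" argument made above.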

>A cast is, among
other things, a promise to the compiler that you know what you're
doing, and will often inhibit warnings and error messages. In this
case, the argument will be implicitly converted to void* without the
cast (assuming you have a visible prototype for qsort() -- i.e., you
haven't forgotten the "#include <stdlib.h>".) The code is perfectly
correct either way, but the form with the cast is more "brittle". If
the cast had specified the wrong type, for example, the compiler
likely wouldn't have told you about the error.

Well, OK, I just compiled it without the cast, and it came
up clean. Then I immediately pasted the cast back in place, since
this is "production code", and from a perfectly practical standpoint,
the void* cast is 100% functional IN THIS CASE, so I'm loath
to mess anything up...

Sure, if it already works, any change you make has a chance of
breaking something. But keep this in mind for any new code you write,
and when tracking down bugs in existing code. And if you're fixing a
piece of code anyway, you might as well remove any unnecessary casts
while you're at it; it will make the code more robust in the long run.

[...]

>I haven't seen the full context of your code (or if I have, I've
forgotten it). Your original code had

(void*)curr_instrs + num_symbols

which is illegal, because you can't perform pointer arithmetic on
void* (the cast applies to "curr_instrs", not to "curr_instrs +
num_symbols"). Pointer arithmetic, as you probably know, is scaled by
the size of the pointed-to type.

Actually, SEQUENCE POINTS!!!! Yes, I know now this is
wrong...

No, sequence points aren't involved. It's just a matter of operator
precedence (how an expression is parsed, and which operations apply to
which operands).
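
A sketch that shows the precedence effect in numbers, assuming the 128-byte element type from the thread. Both helper names are made up; each returns how many bytes the resulting pointer is from the start.

```c
#include <stddef.h>

typedef char instr_str[128];

/* p + n is computed first, on instr_str*, so it is scaled by
   sizeof *p: the result is n * 128 bytes past p. */
size_t offset_scaled(instr_str *p, size_t n)
{
    return (size_t)((char *)(p + n) - (char *)p);
}

/* The cast binds tighter than +, so the addition happens on char*:
   the result is only n bytes past p. */
size_t offset_bytes(instr_str *p, size_t n)
{
    return (size_t)(((char *)p + n) - (char *)p);
}
```

For an array of at least four elements, offset_scaled(a, 3) gives 384 and offset_bytes(a, 3) gives 3, which is the difference between the intended start of the bottom list and an address in the middle of the first string.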

[snip]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Aug 11 '06 #8

Barry Schwarz

On Fri, 11 Aug 2006 06:09:21 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

>
Barry Schwarz <sc******@doezl.net> wrote in message
news:k9********************************@4ax.com...
>On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

snip

>Your memory is
allocated from address to address+size-1. Furthermore, calculating
the value address+size is always allowed but you may not dereference
this address.

...you wouldn't want to dereference an address, right.

It's a very common thing to do. How else do you get the value at that
address? All subscripts involve an implied dereference.
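
A minimal sketch of the usual way the one-past-the-end address is used: it is computed and compared against, but never dereferenced. The function name sum_ints is invented for the example.

```c
#include <stddef.h>

/* Sum the n ints starting at a. */
long sum_ints(const int *a, size_t n)
{
    const int *end = a + n;     /* legal to compute, not to dereference */
    const int *p;
    long total = 0;

    for (p = a; p != end; p++)
        total += *p;            /* p always lies in [a, end) here */
    return total;
}
```

Writing a[n] instead would be *(a + n), which is the dereference that is not allowed.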

Remove del for email

Aug 12 '06 #9

Bill Reid

Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

"Bill Reid" <ho********@happyhealthy.net> writes:
Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
"Bill Reid" <ho********@happyhealthy.net> writes:

[...]

I will make one comment: Don't cast the result of malloc() or
realloc(). See section 7 of the comp.lang.c FAQ,
<http://www.c-faq.com/>, particularly questions 7.7b.

OK, I think I've heard some type of debate about this, I thought
(based on the DOCUMENTATION THAT CAME WITH MY
FRIGGIN' DEVELOPMENT PACKAGE) that was what you
were supposed to do; it has seemed to work OK...

Then the DOCUMENTATION THAT CAME WITH YOUR FRIGGIN' DEVELOPMENT
PACKAGE is advising you to do something that's unnecessary and
potentially dangerous. (Unless it's intended to be called from C++,
which doesn't do implicit conversions to and from void* as freely as C
does, but that's a different language.)

Mmmmm, well it's actually a C++ package (with a lot of "Object Pascal"
crap laying around apparently just in a vain attempt to create a Microsoft
style monopoly--three guesses who made it), and I do call back and
forth between C and C++ and D----i, so maybe I DO want to keep
the "unneeded" casts...

To its credit, it seems to always issue warnings for any declarations
not in scope, so 7.7b has little to no practical relevance...

>

Ok, it's your code, but I'm quite surprised that defining symbolic
constants would cause more problems than it would solve.

As I've alluded, it partly depends on the scope. What I've found
the hard way is that you should really know EXACTLY what you
need AT THE MOST LOCAL LEVEL OF SCOPE. And I
found I kept forgetting what my global defines were, and I would
use an inappropriate one where I knew EXACTLY what I needed
RIGHT THERE. Among other problems...

Yes, one problem with macros is that they're not scoped.

If you want an integer constant (within the range of type int), there
is a trick you can use in C to limit it to the scope you want:

enum { WHATEVER = 128 };

It's arguably an abuse of the "enum" feature (you're doing it for the
sake of the constant, and not actually using the type), but it does
work, and it's not an uncommon idiom.

I'm not sure about using that particular trick, but I will say I did a
major overhaul of my code a few years back where I ditched about
80% of my defines and replaced them with enums and have saved
tremendous amounts of wasted effort as a result.

Or you can use a macro and be careful about how you use it.

The real point is always that you always have to be careful and
there is no magic trick that will completely relieve you of the duty
to know what the hell you are doing.

In this case, maybe a sizeof() would be better...

Probably so.

If someone else needs to maintain your code (and "someone else" could
be you a year from now), it's not going to be obvious that the 128 in
this function corresponds to the 128 (or 127) in another function, but
the 128 in that function over there is just coincidental. There's a
good discussion at <http://c-faq.com/~scs/cclass/notes/sx9b.html>.

No offense but that is sooooo "old school" and "Mickey Mouse"...it
might have impressed me in 1975 writing a "hello world!" program in
my diappies, but I have much bigger fish to fry these days...I try to use
what tools are available in the best way possible, defines still have a
place in my code and always will, but I'm not kidding when I say
I got sick and tired of dealing with them, with one very important
exception; this is from the top of my c_inclds.h file that is included
in every C program I write:

#ifndef c_incldsH
#define c_incldsH

/* boolean boo-yah */
#define TRUE 1 /* what about negative logic? */
#define FALSE 0 /* not to mention situational ethics... */

After that there are about another 50 defines, including line length
maxs and crap like that, that I'd just as soon flush down the bit-crapper
than ever use again...

>
qsort behaves in a manner consistent with its specification. That's
all you really need to know.

Again, I may become a "victim" of the "documentation"...but what're
ya goin' to do? As I've said, if it gets the job done flawlessly after
being compiled, I don't care what anything does...

>

Incidentally, a piece of code is either correct or not. The number of
times it "works" really doesn't prove anything. If your compiler
accepts some non-portable code, it's probably going to keep working
the same way indefinitely -- but it will fail the first time you
compile it with a different compiler, or with the same compiler and
different options.

Yeah, but I can do everything as "correctly" as possible and will
still have portability issues, so again, what're ya goin' to do?

Correctness is not statistical.

>
Try telling that to a third-grade teacher grading tests...

I'm not sure I see the point.

In all walks of life, and in so much of my own work, everything is
"graded". Some things are measurably "better" than others, you know,
like Japanese cars are better than American cars, because, you know,
they actually use this thing called "statistical quality control" and
other disciplines, while Americans don't so much, even though it was
invented here...

I value speed and flawless execution in computer programs, and
have implemented a methodology for some level of portability, modularity,
and maintainability, but those are secondary concerns...

Yeah, I noticed that, I just use (void *) because that's what
I thought qsort() wanted, and it definitely WORKS that way
(I've used qsort() dozens of times EXACTLY that way without
problems).

Yes, it works, but it's not necessary.

How about if I call it from C++ like you mentioned about malloc()?
I believe I actually do call malloc() in some xxx.cpp files...

>
Like most people, I'm a victim of experience: I just keep doing what
works...

As a general rule, casts
should be avoided unless they're actually required.

The question here would be: does qsort() require it? Here's the
documentation:

Syntax

#include <stdlib.h>
void qsort(void *base, size_t nelem, size_t width,
int (_USERENTRY *fcmp)(const void *, const void *));

and the example from the documentation:

int sort_function( const void *a, const void *b);
char list[5][4] = { "cat", "car", "cab", "cap", "can" };

int main(void)
{
int x;

qsort((void *)list, 5, sizeof(list[0]), sort_function);
for (x = 0; x < 5; x++)
printf("%s\n", list[x]);
return 0;
}

int sort_function( const void *a, const void *b)
{
return( strcmp((char *)a,(char *)b) );
}

Unlike some example documentation that I can think of, THAT one
actually works as advertised...but doesn't mean that the cast is
required...

Yes, it works with a cast. It also works without a cast, and there's
just no reason to use one.

What you quoted above is not *the* documentation for qsort(). You'll
find that in the C standard, and it doesn't say anything about casting
arguments.

Again, might be the C++ thing, or an urban legend or something...

>
Well, OK, I just compiled it without the cast, and it came
up clean. Then I immediately pasted the cast back in place, since
this is "production code", and from a perfectly practical standpoint,
the void* cast is 100% functional IN THIS CASE, so I'm loath
to mess anything up...

Sure, if it already works, any change you make has a chance of
breaking something. But keep this in mind for any new code you write,
and when tracking down bugs in existing code. And if you're fixing a
piece of code anyway, you might as well remove any unnecessary casts
while you're at it; it will make the code more robust in the long run.

Unless I call it from C++?

I haven't seen the full context of your code (or if I have, I've
forgotten it). Your original code had

(void*)curr_instrs + num_symbols

which is illegal, because you can't perform pointer arithmetic on
void* (the cast applies to "curr_instrs", not to "curr_instrs +
num_symbols"). Pointer arithmetic, as you probably know, is scaled by
the size of the pointed-to type.

Actually, SEQUENCE POINTS!!!! Yes, I know now this is
wrong...

No, sequence points aren't involved. It's just a matter of operator
precedence (how an expression is parsed, and which operations apply to
which operands).

Oh, I thought that was "sequence points", but yeah, what I wrote
wouldn't work right.

Oh, while I've got you here, here's another issue I noticed that I'm
not sure about concerning realloc(). Here's the NON-documentation:

Syntax

#include <stdlib.h>
void *realloc(void *block, size_t size);

....

If block is a NULL pointer, realloc works just like malloc.

....

I read this years ago, and thought "Great, I don't necessarily have to
malloc something first, I can use realloc in a loop and the first pass
through the loop it'll just be like malloc."

Problem is, it didn't seem to work out that way, and I'm not sure
what I did wrong, but I think I tried a number of things, such as
explicitly initializing my memory pointer to NULL, and always got
an error...is it actually possible to use realloc() to act like malloc
with a NULL pointer?

---
William Ernest Reid

Aug 12 '06 #10

Keith Thompson

"Bill Reid" <ho********@happyhealthy.net> writes:

Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

[...]

Mmmmm, well it's actually a C++ package (with a lot of "Object Pascal"
crap laying around apparently just in a vain attempt to create a Microsoft
style monopoly--three guesses who made it), and I do call back and
forth between C and C++ and D----i, so maybe I DO want to keep
the "unneeded" casts...

If you have a genuine need to compile the same code as both C and C++,
that's a valid reason to cast the result of the *alloc() functions.

Very very few people have such a genuine need. We can count the ones
we've seen here on the fingers of P.J. Plauger's right hand (and even
that's overkill).

C++ provides mechanisms for interfacing to C code. Unless you're
providing a library to be used with either C or C++ code, you're
probably better off picking a language for each piece of your program
and using the appropriate compiler for it.

[...]

How about if I call it from C++ like you mentioned about malloc()?
I believe I actually do call malloc() in some xxx.cpp files...

Why? C++ has "new" and "delete". But in any case, C++ is a different
language, and comp.lang.c++ is down the hall on the left, just past the
water cooler.

[...]

>Yes, it works with a cast. It also works without a cast, and there's
just no reason to use one.

What you quoted above is not *the* documentation for qsort(). You'll
find that in the C standard, and it doesn't say anything about casting
arguments.

Again, might be the C++ thing, or an urban legend or something...

The Solaris man page has similar wording.

[...]

Oh, while I've got you here, here's another issue I noticed that I'm
not sure about concerning realloc(). Here's the NON-documentation:

Syntax

#include <stdlib.h>
void *realloc(void *block, size_t size);

...

If block is a NULL pointer, realloc works just like malloc.

...

I read this years ago, and thought "Great, I don't necessarily have to
malloc something first, I can use realloc in a loop and the first pass
through the loop it'll just be like malloc."

Yes. If it doesn't work that way, your implementation is broken.
(But that's an unlikely bug, since the behavior is clearly documented
in the standard.)

Problem is, it didn't seem to work out that way, and I'm not sure
what I did wrong, but I think I tried a number of things, such as
explicitly initializing my memory pointer to NULL, and always got
an error...is it actually possible to use realloc() to act like malloc
with a NULL pointer?

Yes. I can't guess why you were unable to get it to work.
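
A sketch of the loop-friendly pattern being asked about, assuming a growing array of int. The pointer starts out NULL, so the first call behaves like malloc(). append_int is a hypothetical helper, not anything from the poster's code.

```c
#include <stdlib.h>

/* Append value to a growing int array. *arr may start as NULL with
   *count equal to 0. Returns 0 on success, -1 on allocation failure,
   in which case the old block is still valid and unchanged. */
int append_int(int **arr, size_t *count, int value)
{
    /* Assign to a temporary so a failed realloc() does not lose *arr. */
    int *tmp = realloc(*arr, (*count + 1) * sizeof **arr);

    if (tmp == NULL)
        return -1;
    tmp[*count] = value;
    *arr = tmp;
    (*count)++;
    return 0;
}
```

A common way for this to go wrong is leaving the pointer uninitialized rather than NULL, or writing p = realloc(p, ...) and losing the block when the call fails; whether either applies to the failure described above cannot be told from the post.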

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Aug 12 '06 #11

Bill Reid

Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

"Bill Reid" <ho********@happyhealthy.net> writes:
Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
[...]
Mmmmm, well it's actually a C++ package (with a lot of "Object Pascal"
crap laying around apparently just in a vain attempt to create a Microsoft
style monopoly--three guesses who made it), and I do call back and
forth between C and C++ and D----i, so maybe I DO want to keep
the "unneeded" casts...

If you have a genuine need to compile the same code as both C and C++,
that's a valid reason to cast the result of the *alloc() functions.

Nope, I don't think I ever do that. However, I do occasionally
call malloc in a xxx.cpp file, which is compiled by C++...

Very very few people have such a genuine need.

Yes, hard to imagine what the point of that would be, 'cept maybe
even greater programming confusion than I have!

We can count the ones
we've seen here on the fingers of P.J. Plauger's right hand (and even
that's overkill).

C++ provides mechanisms for interfacing to C code. Unless you're
providing a library to be used with either C or C++ code, you're
probably better off picking a language for each piece of your program
and using the appropriate compiler for it.

The only libraries I provide are for myself, and as you note it is
generally fairly painless to call into C++ object files from C and vice
versa.

How about if I call it from C++ like you mentioned about malloc()?
I believe I actually do call malloc() in some xxx.cpp files...

Why? C++ has "new" and "delete".

Good question, maybe there was a good reason, maybe not, but
since I'm not looking at that particular code right now, it probably
had to do with keeping certain data structures as similar as possible
when used in C++ as they are when used in C, and something about
"new" just "scared" me...

But in any case, C++ is a different
language, and comp.lang.c++ is down the hall on the left, just past the
water cooler.

Well, I didn't bring it up, but my code base is about 50/50...

>

Yes, it works with a cast. It also works without a cast, and there's
just no reason to use one.

What you quoted above is not *the* documentation for qsort(). You'll
find that in the C standard, and it doesn't say anything about casting
arguments.

Again, might be the C++ thing, or an urban legend or something...

The Solaris man page has similar wording.

Well, the Solaris man page would be just the old ucb man page,
right? In any event, I am highly displeased with this particular development
package, and high on my list of specific displeasures is the documentation.
It is in some cases wrong, many cases stupidly written, incomplete,
and just plain difficult to use. So I'm not at all surprised that they
included an unnecessary cast in the example, but at least the
example works, as I said...

Oh, while I've got you here, here's another issue I noticed that I'm
not sure about concerning realloc(). Here's the NON-documentation:

Syntax

#include <stdlib.h>
void *realloc(void *block, size_t size);

...

If block is a NULL pointer, realloc works just like malloc.

...

I read this years ago, and thought "Great, I don't necessarily have to
malloc something first, I can use realloc in a loop and the first pass
through the loop it'll just be like malloc."

Yes. If it doesn't work that way, your implementation is broken.
(But that's an unlikely bug, since the behavior is clearly documented
in the standard.)

I would think it unlikely it is broken, the package is irritatingly bad
in many ways but seems to generally put out clean functioning programs
after fighting the "tools", but who knows. I may have just done something
stupid, wouldn't be the first time...

Problem is, it didn't seem to work out that way, and I'm not sure
what I did wrong, but I think I tried a number of things, such as
explicitly initializing my memory pointer to NULL, and always got
an error...is it actually possible to use realloc() to act like malloc
with a NULL pointer?

Yes. I can't guess why you were unable to get it to work.

Maybe I'll try it again. I made the changes to my data downloading
code yesterday, including deleting the "unnecessary casts", ran some tests,
everything worked fine, put it "into production", 6:15pm EST rolled
around and it did its thing apparently flawlessly, only about three
milliseconds quicker...
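
For what it's worth, the realloc(NULL, size) behavior discussed above can be sketched roughly like this. It is only an illustration, not code from the thread: append_bytes and its parameters are made-up names, and the sketch assumes nothing beyond a conforming C library. The buffer pointer starts out as NULL, so the first call through realloc() acts like malloc().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append n bytes from src to a growable buffer.
   *buf may be NULL on the first call; realloc(NULL, size) then behaves
   like malloc(size). Returns 0 on success, -1 on failure (in which case
   the existing buffer is left untouched). */
int append_bytes(char **buf, size_t *len, const char *src, size_t n)
{
    char *tmp;

    if (n == 0)
        return 0;                 /* nothing to add; avoid a zero-size realloc */

    tmp = realloc(*buf, *len + n);
    if (tmp == NULL)
        return -1;                /* *buf is still valid here */

    memcpy(tmp + *len, src, n);   /* copy the new bytes after the old ones */
    *buf = tmp;
    *len += n;
    return 0;
}
```

The caller would declare `char *buf = NULL; size_t len = 0;` and call append_bytes() in a loop, with a single free(buf) at the end. The explicit NULL initialization is the important part.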

---
William Ernest Reid

Aug 12 '06 #12

Bill Reid

Barry Schwarz <sc******@doezl.net> wrote in message
news:nm********************************@4ax.com...

On Fri, 11 Aug 2006 06:09:21 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:
Barry Schwarz <sc******@doezl.net> wrote in message
news:k9********************************@4ax.com...
On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

Your memory is
allocated from address to address+size-1. Furthermore, calculating
the value address+size is always allowed but you may not dereference
this address.

...you wouldn't want to dereference an address, right.

It's a very common thing to do. How else do you get the value at that
address? All subscripts involve an implied dereference.

OK, you were talking about dereferencing an address one element
past the end of the block, I thought you were talking about something like
saving the pointer, then trying to use it again after another realloc().
That WOULD be a recipe for disaster, right?

So I'm not sure what distinction you're trying to make about
subscript "implied" dereferencing. Isn't "address+size" equivalent to
"address[size]"? Again, the only problem in doing anything with a
dereference of that address is that you're one element past the
end of the block...but that might actually work for you if you're Russian...

---
William Ernest Reid

Aug 12 '06 #13

Flash Gordon

Bill Reid wrote:

Barry Schwarz <sc******@doezl.net> wrote in message
news:nm********************************@4ax.com...
On Fri, 11 Aug 2006 06:09:21 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:
Barry Schwarz <sc******@doezl.net> wrote in message
news:k9********************************@4ax.com...
On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:
Your memory is
allocated from address to address+size-1. Furthermore, calculating
the value address+size is always allowed but you may not dereference
this address.

...you wouldn't want to dereference an address, right.
It's a very common thing to do. How else do you get the value at that
address? All subscripts involve an implied dereference.

OK, you were talking about dereferencing an address one element
past the end of the block, I thought you were talking about something like
saving the pointer, then trying to use it again after another realloc().
That WOULD be a recipe for disaster, right?

So I'm not sure what distinction you're trying to make about
subscript "implied" dereferencing. Isn't "address+size" equivalent to
"address[size]"?

No. "address[size]" and "*(address+size)" are equivalent. So the first
form does a dereference. "address+size" on the other hand does *not* do
a dereference, implied or otherwise.

Again, the only problem in doing anything with a
dereference of that address is that you're one element past the
end of the block...but that might actually work for you if you're Russian...

Never dereference beyond the end of the block. It is "not allowed" by
the standard, i.e. anything can happen including, unfortunately, what
you happen to expect.
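
To make the distinction concrete, here is a small sketch (sum_ints is a made-up name, not something from the thread). The pointer one past the last element is computed and compared against, which is allowed, but it is never dereferenced.

```c
#include <stddef.h>

/* Hypothetical example: sum n ints, using a one-past-the-end pointer
   as the loop bound. `end` is computed and compared, never dereferenced. */
long sum_ints(const int *a, size_t n)
{
    const int *end = a + n;   /* legal to compute: one past the last element */
    const int *p;
    long total = 0;

    for (p = a; p != end; ++p)
        total += *p;          /* p is always inside [a, end), so *p is valid */

    return total;
}
```

Writing `a[n]` or `*end` here would be the dereference the standard does not allow, even though forming `a + n` is fine.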

Aug 13 '06 #14

Bill Reid

Bill Reid <ho********@happyhealthy.net> wrote in message
news:uT********************@bgtnsc04-news.ops.worldnet.att.net...

Keith Thompson <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
"Bill Reid" <ho********@happyhealthy.net> writes:

Oh, while I've got you here, here's another issue I noticed that I'm
not sure about concerning realloc(). Here's the NON-documentation:

Syntax

#include <stdlib.h>
void *realloc(void *block, size_t size);

...

If block is a NULL pointer, realloc works just like malloc.

...

I read this years ago, and thought "Great, I don't necessarily have to
malloc something first, I can use realloc in a loop and the first pass
through the loop it'll just be like malloc."
Yes. If it doesn't work that way, your implementation is broken.
(But that's an unlikely bug, since the behavior is clearly documented
in the standard.)
I would think it unlikely it is broken, the package is irritatingly bad
in many ways but seems to generally put out clean functioning programs
after fighting the "tools", but who knows. I may have just done something
stupid, wouldn't be the first time...

Problem is, it didn't seem to work out that way, and I'm not sure
what I did wrong, but I think I tried a number of things, such as
explicitly initializing my memory pointer to NULL, and always got
an error...is it actually possible to use realloc() to act like malloc
with a NULL pointer?
Yes. I can't guess why you were unable to get it to work.
Maybe I'll try it again.

Oooooh, that was gnarly...

What I forgot was that if I don't malloc() the block first, if I
realloc() in a loop I get a memory access exception. I hate it
when that happens...

Maybe it IS a bug in the compiler, if it wasn't so easy to work
around, I might actually worry about it more. As it is, I did a
search on the compiler maker's web-site for any information
on known bugs, came up with nothing, and left a question on
the discussion forum about it, see if anybody knows anything...

---
William Ernest Reid

Aug 13 '06 #15

Barry Schwarz

On Sat, 12 Aug 2006 23:55:11 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

Barry Schwarz <sc******@doezl.net> wrote in message
news:nm********************************@4ax.com...
On Fri, 11 Aug 2006 06:09:21 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:
Barry Schwarz <sc******@doezl.net> wrote in message
news:k9********************************@4ax.com...
On Fri, 11 Aug 2006 03:54:19 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

Your memory is
allocated from address to address+size-1. Furthermore, calculating
the value address+size is always allowed but you may not dereference
this address.

...you wouldn't want to dereference an address, right.

It's a very common thing to do. How else do you get the value at that
address? All subscripts involve an implied dereference.

OK, you were talking about dereferencing an address one element
past the end of the block, I thought you were talking about something like
saving the pointer, then trying to use it again after another realloc().
That WOULD be a recipe for disaster, right?

The point I was trying to make was: After the successful allocation,
you could dereference any address in the range address to
address+size-1. While it is legal to compute the value address+size
it is not legal to dereference it.

After calling realloc, any address based on the "before" location is
probably invalid. The only time it would be valid is if:

The address returned from realloc was the same as the address
passed to the function in argument 1 and

The offset into the area (address of interest - starting address
of area) <= size argument passed to realloc.
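
One way to sidestep the question of whether an old address survives a realloc is to remember an offset rather than a pointer, and rebuild the pointer afterwards. A minimal sketch, assuming a char buffer and a position inside it (grow_keep_position is a made-up name):

```c
#include <stdlib.h>

/* Hypothetical helper: resize *buf to new_size bytes and return a pointer
   to the same logical position that `pos` referred to in the old block.
   Assumes new_size > 0, pos points into the old block, and its offset is
   smaller than new_size. Returns NULL on failure, with *buf unchanged. */
char *grow_keep_position(char **buf, size_t new_size, const char *pos)
{
    size_t offset = (size_t)(pos - *buf);  /* taken BEFORE realloc */
    char *tmp = realloc(*buf, new_size);

    if (tmp == NULL)
        return NULL;

    *buf = tmp;
    return tmp + offset;   /* rebuilt from the new base address */
}
```

The offset stays meaningful whether or not realloc() moved the block; the old pointer value does not.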

>
So I'm not sure what distinction you're trying to make about
subscript "implied" dereferencing. Isn't "address+size" equivalent to

You asked why someone would want to dereference an address. I tried
to give an example of why it is a very common thing to do.

>"address[size]"? Again, the only problem in doing anything with a

In my discussion, I used the phrase address+size in its non-C
arithmetic meaning. In C, the meaning is equivalent only for pointers
where the sizeof the object pointed to is 1.

In C address[size] is defined to be *(address+size), remembering that
pointer arithmetic includes implied scaling by the sizeof the object
pointed to.
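
The implied scaling can be checked directly. This is just a sketch with a made-up function name: it measures, in bytes, how far `a + n` is from `a` for an int pointer.

```c
#include <stddef.h>

/* Hypothetical example: byte distance between a and a + n.
   Pointer arithmetic on int* moves in units of sizeof(int), so the
   result is n * sizeof(int), not n. The caller must keep a + n within
   the array or one past its end. */
size_t byte_distance(const int *a, size_t n)
{
    const int *p = a + n;   /* scaled by sizeof(int) */
    return (size_t)((const char *)p - (const char *)a);
}
```

For a char pointer the same computation would give exactly n, which is the "sizeof is 1" case mentioned above.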

>dereference of that address is that you're one element past the
end of the block...but that might actually work for you if you're Russian...

The "problem" is that dereferencing the address invokes undefined
behavior, even before you attempt to do something with the object that
may be retrieved from that address.
Remove del for email

Aug 13 '06 #16

Herbert Rosenau

On Sun, 13 Aug 2006 16:14:19 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

What I forgot was that if I don't malloc() the block first, if I
realloc() in a loop I get a memory access exception. I hate it
when that happens...

void *p = NULL;   /* we have no memory yet */
void *temp;       /* realloc will set it */

size_t size = 0;  /* we calculate the size in the loop before we call
                     realloc */
....
for (....) {
    ....
    if ((temp = realloc(p, size)) == NULL) {
        /* realloc failed; p still points to the old block */
        free(p);
        return NULL; /* or some other error code */
    }
    p = temp;
    ....
}
free(p);

will always work, unless your implementation is really broken. But
then throw your compiler in the trash and get another one.

Maybe it IS a bug in the compiler, if it wasn't so easy to work
around, I might actually worry about it more. As it is, I did a
search on the compiler maker's web-site for any information
on known bugs, came up with nothing, and left a question on
the discussion forum about it, see if anybody knows anything...

I would say you have forgotten to initialise the pointer given to
realloc to NULL, which signals that there is currently nothing to
reallocate, so realloc acts like malloc.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!

Aug 14 '06 #17

Bill Reid

Herbert Rosenau <os****@pc-rosenau.de> wrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...

On Sun, 13 Aug 2006 16:14:19 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

What I forgot was that if I don't malloc() the block first, if I
realloc() in a loop I get a memory access exception. I hate it
when that happens...

void *p = NULL;   /* we have no memory yet */
void *temp;       /* realloc will set it */

size_t size = 0;  /* we calculate the size in the loop before we call
                     realloc */
...
for (....) {
    ....
    if ((temp = realloc(p, size)) == NULL) {
        /* realloc failed; p still points to the old block */
        free(p);
        return NULL; /* or some other error code */
    }
    p = temp;
    ....
}
free(p);

will always work, unless your implementation is really broken. But
then throw your compiler in the trash and get another one.

Maybe it IS a bug in the compiler, if it wasn't so easy to work
around, I might actually worry about it more. As it is, I did a
search on the compiler maker's web-site for any information
on known bugs, came up with nothing, and left a question on
the discussion forum about it, see if anybody knows anything...

I would say you have forgotten to initialise the pointer given to
realloc to NULL, which signals that there is currently nothing to
reallocate, so realloc acts like malloc.

You are correct sir!

I made yet a further fool of myself and posted the question on
the message board for the compiler. Sure enough, I "forgot" that
local pointers are not initialized, therefore are not NULL and
could be anything, so realloc() tries to allocate memory in
god-knows-where.

Of course, I could have sworn that I explicitly set the pointer
to NULL as the FIRST thing I tried to fix the problem when
it first cropped up years ago, but I must have mucked that up
somehow.

As to your code, I'm not sure why you do the two-step
with "*p" and "*temp"...it seems like all is required is to set
*p=NULL when declared, at least that's what I did, and
it worked just fine. Am I (again!) missing something?

---
William Ernest Reid

Aug 16 '06 #18

Bill Reid

Barry Schwarz <sc******@doezl.net> wrote in message
news:3l********************************@4ax.com...

On Sat, 12 Aug 2006 23:55:11 GMT, "Bill Reid"
<ho********@happyhealthy.net> wrote:

dereference of that address is that you're one element past the
end of the block...but that might actually work for you if you're
Russian...

The "problem" is that dereferencing the address invokes undefined
behavior, even before you attempt to do something with the object that
may be retrieved from that address.

Oh really. Well, throw that onto the giant pile of stuff I did not
know about C programming...

In any event, I did solve a bunch of "problems" in my code as a result
of asking these stupid questions. My code basically runs exactly as it
did before, would compile just as cleanly on any compiler as before, but
it now no longer has certain "problems"!

Thanks guys!

---
William Ernest Reid

Aug 16 '06 #19

Herbert Rosenau

On Wed, 16 Aug 2006 01:01:06 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

Everybody makes a fool of her/himself as best s/he can. :-) You will learn
that mostly, when you try to blame the compiler, you end up blaming
yourself instead.

>
As to your code, I'm not sure why you do the two-step
with "*p" and "*temp"...it seems like all is required is to set
*p=NULL when declared, at least that's what I did, and
it worked just fine. Am I (again!) missing something?

p is the pointer holding the address of the memory block. As realloc
can fail (returning NULL), you need another pointer to hold the
result of realloc until you know that realloc returned a (new) memory
address.

When realloc fails you need to either work with the memory (p) you
have already or to clean up (free(p)). Overwriting p with NULL gives you
a memory leak, as you lose the address of the memory you have already
allocated.
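
The two-pointer pattern described above might look like this in a small helper. This is a sketch, not Herbert's code: resize_ints is a made-up name, and the overflow check on the element count is an extra precaution that the thread does not discuss.

```c
#include <stdlib.h>

/* Hypothetical helper: resize *arr to hold new_count ints.
   Returns 0 on success. On failure returns -1 and leaves *arr pointing
   at the old, still-valid block, so the caller can keep using it or
   free() it. */
int resize_ints(int **arr, size_t new_count)
{
    int *tmp;

    /* Reject zero-size requests and counts whose byte size would overflow. */
    if (new_count == 0 || new_count > (size_t)-1 / sizeof **arr)
        return -1;

    tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL)
        return -1;        /* *arr has NOT been overwritten */

    *arr = tmp;           /* only now is it safe to replace the old pointer */
    return 0;
}
```

Writing `*arr = realloc(*arr, ...)` directly would lose the only copy of the old address whenever realloc() returns NULL.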

Another tip:

You should initialise every variable when you define it, so that a
missing assignment of a known valid value fails in a recognisable way. So
initialise data to 0 (or to a value you can easily identify as invalid)
and pointers to NULL. Then learn how to use a debugger: set a
breakpoint immediately before the failure occurs, analyse the data found
there and then, if everything seems OK, step forward a single step and
analyse again until the failure occurs.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!

Aug 17 '06 #20

Bill Reid

Herbert Rosenau <os****@pc-rosenau.de> wrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...

On Wed, 16 Aug 2006 01:01:06 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

Everybody makes a fool of her/himself as best s/he can. :-) You will learn
that mostly, when you try to blame the compiler, you end up blaming
yourself instead.

I blame society.

As to your code, I'm not sure why you do the two-step
with "*p" and "*temp"...it seems like all is required is to set
*p=NULL when declared, at least that's what I did, and
it worked just fine. Am I (again!) missing something?

p is the pointer holding the address of the memory block. As realloc
can fail (returning NULL), you need another pointer to hold the
result of realloc until you know that realloc returned a (new) memory
address.

When realloc fails you need to either work with the memory (p) you
have already or to clean up (free(p)). Overwriting p with NULL gives you
a memory leak, as you lose the address of the memory you have already
allocated.

Hmmm...if this is the case this should be in the NON-documentation.
Actually, I guess it kind of is, but after a quick read of this I always
thought realloc() freed the original memory block and returned NULL
if it couldn't reallocate the new block:

....

Syntax

#include <stdlib.h>
void *realloc(void *block, size_t size);

....

If the block cannot be reallocated, realloc returns NULL.

If the value of size is 0, the memory block is freed and realloc returns
NULL.

---end of NON-documentation

I guess I got the last two conditions conflated in my mind; it just seemed
logical to me that if realloc() failed it would free the previous block.
It SEEMS like it should.

As an orthogonal point, what horrible things happen if you try
to free() a NULL pointer?

Another tip:

You should initialise every variable when you define it, so that a
missing assignment of a known valid value fails in a recognisable way. So
initialise data to 0 (or to a value you can easily identify as invalid)
and pointers to NULL. Then learn how to use a debugger: set a
breakpoint immediately before the failure occurs, analyse the data found
there and then, if everything seems OK, step forward a single step and
analyse again until the failure occurs.

I'm not quite sure how this is functionally different from SOP using
a debugger; initialized or not, I can "mouse over" ALL the variables
after an exception point or break point, so if I see garbage at an
exception point I step back and put in a break, run it again, check
the values, step forward, etc.

---
William Ernest Reid

Aug 17 '06 #21

Keith Thompson

"Bill Reid" <ho********@happyhealthy.net> writes:
[...]

As an orthogonal point, what horrible things happen if you try
to free() a NULL pointer?

Nothing; free(NULL) is guaranteed to be a no-op.

Either your system's documentation or any decent C textbook should
tell you this. Failing that, you can download a copy of the latest
draft of the C standard; search for "n1124.pdf".
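
Because free(NULL) does nothing, cleanup code can free every pointer unconditionally on a single exit path. A minimal sketch of that idea, with made-up names and an arbitrary pair of buffers:

```c
#include <stdlib.h>

/* Hypothetical example: allocate two scratch buffers of n bytes each
   (n > 0 assumed), then release them. Returns 1 if both allocations
   succeeded, 0 otherwise. Pointers start as NULL so the cleanup code
   can call free() on them without checking which ones were allocated. */
int make_two_buffers(size_t n)
{
    char *a = NULL;
    char *b = NULL;
    int ok = 0;

    a = malloc(n);
    if (a == NULL)
        goto done;

    b = malloc(n);
    if (b == NULL)
        goto done;

    ok = 1;               /* real work with a and b would go here */

done:
    free(b);              /* a no-op if b is still NULL */
    free(a);              /* a no-op if a is still NULL */
    return ok;
}
```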

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Aug 18 '06 #22

Herbert Rosenau

On Thu, 17 Aug 2006 23:54:41 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

When realloc fails you need to either work with the memory (p) you
have already or to clean up (free(p)). Overwriting p with NULL gives you
a memory leak, as you lose the address of the memory you have already
allocated.

Hmmm...if this is the case this should be in the NON-documentation.
Actually, I guess it kind of is, but after a quick read of this I always
thought realloc() freed the original memory block and returned NULL
if it couldn't reallocate the new block:

Yes, but it does NOT free() the old block then. That block is left
unchanged.

...

Syntax

#include <stdlib.h>
void *realloc(void *block, size_t size);

...

If the block cannot be reallocated, realloc returns NULL.

If the value of size is 0, the memory block is freed and realloc returns
NULL.

---end of NON-documentation

I guess I got the last two conditions conflated in my mind; it just seemed
logical to me that if realloc() failed it would free the previous block.
It SEEMS like it should.

But it does not, because you may need to continue your work with the
old block.
When you have no further need for the old block you have to free() it
yourself; otherwise you free() it later, when you are done with it. In
any case you need its address.

As an orthogonal point, what horrible things happen if you try
to free() a NULL pointer?

Nothing. free(NULL); works like a no-op. Nothing occurs. That is
guaranteed.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!

Aug 18 '06 #23

Bill Reid

Herbert Rosenau <os****@pc-rosenau.de> wrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...

On Thu, 17 Aug 2006 23:54:41 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

I guess I got the last two conditions conflated in my mind; it just seemed
logical to me that if realloc() failed it would free the previous block.
It SEEMS like it should.

But it does not, because you may need to continue your work with the
old block.

Sure, but right off-hand I can't think of a single case where I would
actually want to do anything with "half a loaf". When memory
allocation fails in whole or in part, I just want to get out of there
as quickly as possible, maybe just break out of that particular
module operation, quite often just to exit the whole program.

I suspect that is true of the vast majority of programs out there.
So not automatically freeing the block seems to be a case of "the
needs of the few outweighing the needs of the many".

When you have no further need for the old block you have to free() it
yourself; otherwise you free() it later, when you are done with it. In
any case you need its address.

In this imperfect world, I guess so...what a friggin' hassle...

As an orthogonal point, what horrible things happen if you try
to free() a NULL pointer?

Nothing. free(NULL); works like a no-op. Nothing occurs. That is
guaranteed.

So I've kind of been wasting a little time with these types of
generalized pre-exit memory cleanup routines:

void free_time_series_mem(void) {
    unsigned ts_idx;

    for(ts_idx=0;ts_idx<TS_MAX;ts_idx++) {
        if(time_series[ts_idx]!=NULL) {
            free(time_series[ts_idx]);
            time_series[ts_idx]=NULL;
        }
    }

    num_series=0;
}

Not only don't I need:
if(time_series[ts_idx]!=NULL)

What the hell is the point of:
time_series[ts_idx]=NULL;

??? Who knows...what the hell was I thinking...oh, wait, I WASN'T
thinking...

However, note the clever use of the "TS_MAX" define; NO "MAGIC
NUMBER" HERE FOR THIS BOY!!!

---
William Ernest Reid

Aug 18 '06 #24

Ben Pfaff

"Bill Reid" <ho********@happyhealthy.net> writes:

Herbert Rosenau <os****@pc-rosenau.de> wrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...
On Thu, 17 Aug 2006 23:54:41 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

I guess I got the last two conditions conflated in my mind; it just seemed
logical to me that if realloc() failed it would free the previous block.
It SEEMS like it should.

But it does not, because you may need to continue your work with the
old block.

Sure, but right off-hand I can't think of a single case where I would
actually want to do anything with "half a loaf". When memory
allocation fails in whole or in part, I just want to get out of there
as quickly as possible, maybe just break out of that particular
module operation, quite often just to exit the whole program.

I suspect that is true of the vast majority of programs out there.
So not automatically freeing the block seems to be a case of "the
needs of the few outweighing the needs of the many".

You can write a wrapper function for the standard realloc that
does what you want. It's not possible to wrap a function that
has your desired behavior to do what the standard realloc does.
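
A wrapper along those lines could be sketched as follows. The name realloc_or_free is invented for illustration, and treating a size of 0 as "free and return NULL" is a choice made here to avoid relying on what realloc(ptr, 0) does on a given implementation.

```c
#include <stdlib.h>

/* Hypothetical wrapper: behaves like realloc(), except that on failure
   it frees the old block before returning NULL. A size of 0 is treated
   as a request to free the block. */
void *realloc_or_free(void *ptr, size_t size)
{
    void *tmp;

    if (size == 0) {
        free(ptr);
        return NULL;
    }

    tmp = realloc(ptr, size);
    if (tmp == NULL)
        free(ptr);        /* give up the old block as well */

    return tmp;
}
```

With this wrapper, `p = realloc_or_free(p, n);` no longer leaks on failure, at the price of losing the old data. The reverse is not possible: once a function has freed the old block, no wrapper can bring it back.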
--
"Your correction is 100% correct and 0% helpful. Well done!"
--Richard Heathfield

Aug 18 '06 #25

Herbert Rosenau

On Fri, 18 Aug 2006 23:27:03 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

Herbert Rosenau <os****@pc-rosenau.de> wrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...
On Thu, 17 Aug 2006 23:54:41 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

I guess I got the last two conditions conflated in my mind; it just seemed
logical to me that if realloc() failed it would free the previous block.
It SEEMS like it should.
But it does not, because you may need to continue your work with the
old block.

Sure, but right off-hand I can't think of a single case where I would
actually want to do anything with "half a loaf". When memory
allocation fails in whole or in part, I just want to get out of there
as quickly as possible, maybe just break out of that particular
module operation, quite often just to exit the whole program.

.... and leaving the database in an undefined state. Maybe you need a
lot of single steps on the incomplete data to unwind any action
you've already carried out. So the data is needed by your program. Most
programs are more complex in their data handling than you are currently
aware of.

So the best realloc can do for you is to flag the error that it is out of
memory and leave the data it was given unchanged. Sometimes you get a
chance to continue when you free() some other data area.
Sometimes you'll split the already occupied block into one or more
smaller ones, write older, smaller blocks out and reuse them.

There are lots of possibilities to continue even without - or with -
asking the user. Exit is seldom the choice in real programs. Often you
will have a solution for "out of memory" on a higher level.

exit() is good for ingenious programs but the real world is more
complex. So realloc gives you the chance for a real cleanup of any
kind you may need. It is simply a bug to overwrite the pointer to a data
area you have already allocated with a null pointer. You have to clean up.

I suspect that is true of the vast majority of programs out there.
So not automatically freeing the block seems to be a case of "the
needs of the few outweighing the needs of the many".

You have not yet written a real program using realloc. Once you do, you
will quickly revise your standpoint.

When you have no further need for the old block you have to free() it
yourself; otherwise you free() it later, when you are done with it. In
any case you need its address.
In this imperfect world, I guess so...what a friggin' hassle...

As an orthogonal point, what horrible things happen if you try
to free() a NULL pointer?
Nothing. free(NULL); works like a no-op. Nothing occurs. That is
guaranteed.
So I've kind of been wasting a little time with these types of
generalized pre-exit memory cleanup routines:

void free_time_series_mem(void) {
    unsigned ts_idx;

    for(ts_idx=0;ts_idx<TS_MAX;ts_idx++) {
        if(time_series[ts_idx]!=NULL) {
            free(time_series[ts_idx]);
            time_series[ts_idx]=NULL;
        }
    }

    num_series=0;
}

Not only don't I need:
if(time_series[ts_idx]!=NULL)

What the hell is the point of:
time_series[ts_idx]=NULL;

That is needed to save yourself from accessing data you've given back
to the system. The only case where you can safely NOT set the
free()'d pointer to NULL is when the pointer and the data it points to
are alive only inside the function that calls free() AND one of the
next statements is return. In any other case, setting it to NULL will,
on most systems, give you a clean "access through a null pointer" fault
instead of silent use of freed memory.
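
The free-then-clear habit can be wrapped up so it is hard to forget. The sketch below uses an invented macro and function name, and a plain array of char pointers as a stand-in for whatever table of allocations a program keeps.

```c
#include <stdlib.h>

/* Hypothetical macro: free the block and clear the pointer in one step.
   The argument is evaluated twice, so pass a plain variable or array
   element, not an expression with side effects. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Hypothetical example: release every slot of a pointer table.
   No NULL check is needed because free(NULL) is a no-op, and clearing
   each slot makes a second call harmless. */
void clear_slots(char **slots, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        FREE_AND_NULL(slots[i]);
}
```

Since every slot ends up NULL, code that later looks for an unused slot can rely on a NULL test, and an accidental second clear_slots() call does no harm.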
--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!

Aug 19 '06 #26

Bill Reid

Herbert Rosenau <os****@pc-rosenau.de> wrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...

On Fri, 18 Aug 2006 23:27:03 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:
Herbert Rosenau <os****@pc-rosenau.de> wrote in message
news:wm***************************@JUPITER1.PC-ROSENAU.DE...
On Thu, 17 Aug 2006 23:54:41 UTC, "Bill Reid"
<ho********@happyhealthy.net> wrote:

I guess I got the last two conditions conflated in my mind; it just seemed
logical to me that if realloc() failed it would free the previous block.
It SEEMS like it should.

But it does not, because you may need to continue your work with the
old block.
Sure, but right off-hand I can't think of a single case where I would
actually want to do anything with "half a loaf". When memory
allocation fails in whole or in part, I just want to get out of there
as quickly as possible, maybe just break out of that particular
module operation, quite often just to exit the whole program.

... and leaving the database in an undefined state.

Maybe YOUR database. I started this whole thread while looking
at a specific operation in one specific module. Each operation in the
module downloads several raw data files from the net, parses out the
data items, and writes the data to a database. Unless I get ALL of
the data, ALL of the files downloaded properly in the correct format
and with the correct data, NOTHING, none of the about 100,000
data items in this specific operation gets written to the database.

I specifically DESIGNED these download modules to work
this way, even though the database can never be "undefined" (if data
is written to the database, attempts to "over-write" it can't occur and
don't matter anyway), out of force of habit writing the routines that
analyze the data. In those modules, it is a complete waste of time to
in any way analyze partial data, so if I can't download it all from the
database, I just give up (except that's actually never happened,
probably because I run that stuff at night with no other processes
running).

Maybe you need a
lot of single steps on the incomplete data to unwind any action
you've already carried out.

Well, maybe YOU do. It's not like I'm not aware of how I could
corrupt my results or my database, it's just that I design the various
parts of the program to avoid these types of situations.

So the data is needed by your program. Most programs are more complex
in their data handling than you are currently aware of.

Possibly...and maybe not. I've seen some REALLY "complex"
programs in my time, and on a LOC basis alone (or any other basis)
I can't really classify what I do as "simple"...but I'm definitely not
multi-threading or some other things that might dramatically increase
the "complexity"...

So the best realloc can do for you is to flag the error that it is out of
memory and leave the data it was given unchanged. Sometimes you get a
chance to continue when you free() some other data area.
Sometimes you'll split the already occupied block into one or more
smaller ones, write older, smaller blocks out and reuse them.

Hmmmm, yeah, I was afraid of this type of response...yeah, there's
about a million things I COULD do, I just DON'T...

You may have missed the part where I said I only care that my
stuff WORKS, works flawlessly, and blindingly-fast for the amount
of data being processed. I'm not TRYING to generate bugs,
cuz I don't get paid to fix them, I just LOSE money...

There are lots of possibilities to continue even without - or with -
asking the user.

I thought the general drill was to ask the user (in this case, I'm the
only user) to close out some other applications. But I guess another
general strategy for the "commercial market" is to let the thing sit
there and grind away pointlessly; at least it keeps the room warm
on a cold day...

Exit is seldom the choice in real programs.

Yeah, I think I'm getting your point here, it's just that it doesn't
really have anything to do with "complexity", but something else...

Often you
will have a solution for "out of memory" on a higher level.

Yeah, but I would suspect these would all slow you down to
a crawl...for my purposes, I'll pass, but again, for the "commercial"
market it might make sense...and since most software is written for
the "commercial" market, maybe it DOES make "statistical" sense
for realloc() to work that way...

exit() is good for ingenious programs but the real world is more
complex.

Well, I don't generally break all the way out to exit(), except in the case
of the database initialization stuff whenever I start up, but I probably
could since each function tends to clean up after itself on error and
return success or failure in some way...

So realloc gives you the chance for a real cleanup of any
kind you may need. It is simply a bug to overwrite the pointer to a data
area you have already allocated with a null pointer. You have to clean up.

SOME people have to clean up in that fashion. I'm already clean,
I just have to close some open files, release some related memory,
I'm good...

I suspect that is true of the vast majority of programs out there.
So not automatically freeing the block seems to be a case of "the
needs of the few outweighing the needs of the many".

You have not yet written a real program using realloc. Once you do, you
will quickly revise your standpoint.

If I'm lucky I never WILL write a "real" program. I would hate
to be "corrupted"...

When you have no further need for the old block you have to free() it
yourself; otherwise you free() it later, when you are done with it. In
any case you need its address.

In this imperfect world, I guess so...what a friggin' hassle...

As an orthogonal point, what horrible things happen if you try
to free() a NULL pointer?

Nothing. free(NULL); works like a no-op. Nothing occurs. That is
guaranteed.

So I've kind of been wasting a little time with these types of
generalized pre-exit memory cleanup routines:

void free_time_series_mem(void) {
    unsigned ts_idx;

    for(ts_idx=0;ts_idx<TS_MAX;ts_idx++) {
        if(time_series[ts_idx]!=NULL) {
            free(time_series[ts_idx]);
            time_series[ts_idx]=NULL;
        }
    }

    num_series=0;
}

Not only don't I need:
if(time_series[ts_idx]!=NULL)

What the hell is the point of:
time_series[ts_idx]=NULL;

That is needed to save yourself from accessing data you've given back
to the system.

Again, you are correct sir! Well, sort of...

I mean, that's a good reason TO do it; it's just that I incorrectly
described that particular function as a "pre-exit" memory clean-up
routine when I quickly looked at it. If it was truly "pre-exit", then
there would be no need to worry about an actual address in
the pointer, the next call is to exit() (or return to main(), exit())...

The only case where you can safely NOT set the
free()'d pointer to NULL is when the pointer and the data it points to
are alive only inside the function that calls free() AND one of the
next statements is return. In any other case, setting it to NULL will,
on most systems, give you a clean "access through a null pointer" fault
instead of silent use of freed memory.

....but I realized after I posted that I also call it when "the user"
leaves one particular module for another. Then I could POSSIBLY
come back later and have the old pointer laying around to screw
me up; it would specifically mess up the purpose of the function
I use to populate the array on an "as-needed" basis as any module
requires those particular structures:

TIME_SERIES *create_time_series(void) {
    unsigned ts_idx;

    for(ts_idx=0;ts_idx<TS_MAX;ts_idx++) {
        if(time_series[ts_idx]==NULL) {
            if((time_series[ts_idx]=
                (TIME_SERIES *)malloc(sizeof(TIME_SERIES)))==NULL) {
#ifdef __CONSOLE__
                printf("\nNot enough memory to create time series");
#endif
#ifdef __WINGUI__
                if(MessageDlg("Not enough memory to create time series",
                    mtConfirmation,TMsgDlgButtons()<<mbOK,0)==mrOk) ;
#endif
                break;
            }

            time_series[ts_idx]->ts_vars=init_tsv;
            time_series[ts_idx]->ds_vars=init_dsv;
            num_series++;
            break;
        }
    }

    /* ts_idx==TS_MAX here means every slot was already in use */
    return ts_idx<TS_MAX ? time_series[ts_idx] : NULL;
}

Of course, that's all screwed up in the first place because of the
ridiculous cast of malloc()...
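
For comparison, a cast-free version of the same slot-filling idea might look like the sketch below. Everything in it is a stand-in: the TIME_SERIES struct is reduced to a single placeholder member, TS_MAX is given an arbitrary value, and the error reporting is left out. It relies on static storage being zero-initialized, so every slot starts out as NULL, and it returns NULL explicitly when every slot is taken.

```c
#include <stdlib.h>

#define TS_MAX 4                                  /* assumed value */

typedef struct { int placeholder; } TIME_SERIES;  /* stand-in for the real struct */

static TIME_SERIES *time_series[TS_MAX];          /* static storage: all NULL at start */
static unsigned num_series;

/* Hypothetical sketch: allocate a TIME_SERIES in the first unused slot.
   Returns the new object, or NULL if allocation fails or the table is full. */
TIME_SERIES *create_time_series(void)
{
    unsigned i;

    for (i = 0; i < TS_MAX; i++) {
        if (time_series[i] == NULL) {
            /* No cast needed in C; sizeof *ptr follows the pointer's type. */
            time_series[i] = malloc(sizeof *time_series[i]);
            if (time_series[i] != NULL) {
                time_series[i]->placeholder = 0;
                num_series++;
            }
            return time_series[i];   /* NULL if malloc failed */
        }
    }

    return NULL;                     /* table full: never index slot TS_MAX */
}
```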

---
William Ernest Reid

Aug 20 '06 #27
