Dynamic buffer library


I hacked this together this morning so that I could shift my out-of-
space code away from the rest of my logic. I wanted to allow array
syntax on my dynamic buffers, so I manually created a struct with
malloc() and judicious use of void* pointers.

I decided to post it because stuff like this is breeding ground for
UB, and this group is good at finding it. (I had a dentist analogy
for that, but it didn't work out.)

Here's my code:

/* Start. */
#include <stdlib.h>
#include <string.h>

void *create_buffer (size_t size, size_t max)
{
/* We need to allocate three size_t before the actual buffer to hold
* the size of each item, the maximum number of items, and the actual
* number of items. We do it this way instead of using a struct so that
* array syntax can still be used on the returned structure. */
size_t *tmp = malloc ( (3 * sizeof *tmp) + (size * max) );

if (tmp != NULL)
{
tmp[0] = size; /* Size of each item */
tmp[1] = max; /* Maximum # of items */
tmp[2] = 0; /* Actual # of items */
}

return (tmp != NULL) ? tmp + 3 : NULL;
}

/* This function will return three possible things:
* 1. On ordinary usage, it will return the new maximum size of the buffer.
* 2. If it runs out of memory, it will return 0.
* 3. If passed NULL, it will do nothing but return maximum size of buf.
* The third case is used after the second case to recover lost size info.
* If buf is NULL, Bad Things will happen. */
size_t append_item (void *buf, void *itm)
{
size_t *tmp = (size_t *) buf - 3;
size_t size = tmp[0];
size_t max = tmp[1];
size_t num = ++tmp[2];

/* Early return on special case. */
if (itm == NULL)
return num;

/* If we don't have enough space, try and get some more. */
if (num == max) {
size_t *rtmp;
max *= 2;
while (!(rtmp = realloc (tmp, max)) && (max > tmp[1]))
--max;
}

/* If we now have enough space, we're good. Otherwise, we're cooked. */
if (max > num) {
memmove (buf, itm, size);
return num;
} else return 0;
}
/* End. */

It compiles in gcc with all warnings set.

--
Andrew Poelstra <http://www.wpsoftware.net/projects>
To reach me by email, use `apoelstra' at the above domain.
"Do BOTH ends of the cable need to be plugged in?" -Anon.
Aug 30 '06 #1
26 Replies


Andrew Poelstra wrote:
I hacked this together this morning so that I could shift my out-of-
space code away from the rest of my logic. I wanted to allow array
syntax on my dynamic buffers, so I manually created a struct with
malloc() and judicious use of void* pointers.

I decided to post it because stuff like this is breeding ground for
UB, and this group is good at finding it. (I had a dentist analogy
for that, but it didn't work out.)

Here's my code:

/* Start. */
#include <stdlib.h>
#include <string.h>

void *create_buffer (size_t size, size_t max)
{
/* We need to allocate three size_t before the actual buffer to hold
* the size of each item, the maximum number of items, and the actual
* number of items. We do it this way instead of using a struct so that
* array syntax can still be used on the returned structure. */
size_t *tmp = malloc ( (3 * sizeof *tmp) + (size * max) );

if (tmp != NULL)
{
tmp[0] = size; /* Size of each item */
tmp[1] = max; /* Maximum # of items */
tmp[2] = 0; /* Actual # of items */
}

return (tmp != NULL) ? tmp + 3 : NULL;
}

/* This function will return three possible things:
* 1. On ordinary usage, it will return the new maximum size of the buffer.
* 2. If it runs out of memory, it will return 0.
* 3. If passed NULL, it will do nothing but return maximum size of buf.
* The third case is used after the second case to recover lost size info.
* If buf is NULL, Bad Things will happen. */
size_t append_item (void *buf, void *itm)
{
size_t *tmp = (size_t *) buf - 3;
size_t size = tmp[0];
size_t max = tmp[1];
size_t num = ++tmp[2];

/* Early return on special case. */
if (itm == NULL)
return num;
This doesn't return the maximum size of buffer.
>
/* If we don't have enough space, try and get some more. */
if (num == max) {
size_t *rtmp;
max *= 2;
while (!(rtmp = realloc (tmp, max)) && (max > tmp[1]))
--max;
}

/* If we now have enough space, we're good. Otherwise, we're cooked. */
if (max > num) {
memmove (buf, itm, size);
buf still points to the old memory block.
And in any case you want the new item to
be placed after the already existing ones , right ?
return num;
} else return 0;
}
/* End. */

It compiles in gcc with all warnings set.
>From the code it seems that your intention is to store
the "items" starting at buf[3]. Are the items of type
size_t ? If not how do you know that buf[3] is properly
aligned ?

Aug 30 '06 #2

Spiros Bousbouras wrote:
Andrew Poelstra wrote:

It compiles in gcc with all warnings set.
From the code it seems that your intention is to store
the "items" starting at buf[3]. Are the items of type
size_t ? If not how do you know that buf[3] is properly
aligned ?
Ok , Google's interface is indeed broken. The line reading
"From the code it seems that your intention is to store" was
written by me but Google added a '>' right before "From" !
It thinks it's a mail file !

I'll now start searching for a different web based interface
to usenet.

Aug 30 '06 #3

"Spiros Bousbouras" <sp****@gmail.com> writes:
>>From the code it seems that your intention is to store
the "items" starting at buf[3]. Are the items of type
size_t ? If not how do you know that buf[3] is properly
aligned ?
The items may not be of size_t. I don't know that they're properly
aligned (this is an example of the UB I knew that I missed). What
would be the best way for me to guarantee alignment? Am I going to
need to use a union?

Thanks for your comments: I've corrected my append_item function
below (most of it was just typos and missed lines):

/* We now need to get the address of the buffer, because realloc()
* may change the value of the pointer. */
size_t append_item (void **buf, void *itm)
{
size_t *tmp = (size_t *) *buf - 3;
size_t size = tmp[0];
size_t max = tmp[1];
size_t num = ++tmp[2];

/* Early return on special case. */
if (itm == NULL)
return max;

/* If we don't have enough space, try and get some more. */
if (num == max) {
size_t *rtmp;
max *= 2;

rtmp = realloc (tmp, max);
while ((rtmp == NULL) && (max > tmp[1]))
rtmp = realloc (tmp, --max);
}

/* If we now have enough space, we're good. Otherwise, we're cooked. */
if (max > num) {
*buf = rtmp;
memmove (buf, itm, size);
return num;
} else return 0;
}

--
Andrew Poelstra <http://www.wpsoftware.net/projects>
To reach me by email, use `apoelstra' at the above domain.
"Do BOTH ends of the cable need to be plugged in?" -Anon.
Aug 30 '06 #4

Spiros Bousbouras wrote:
Spiros Bousbouras wrote:
.... snip ...
>>
>From the code it seems that your intention is to store
the "items" starting at buf[3]. Are the items of type
size_t ? If not how do you know that buf[3] is properly
aligned ?

Ok , Google's interface is indeed broken. The line reading
"From the code it seems that your intention is to store" was
written by me but Google added a '>' right before "From" !
It thinks it's a mail file !

I'll now start searching for a different web based interface
to usenet.
Now this is one area where Google is NOT broken. A blank line,
followed by the word 'From', signals the start of a mail/news
message. Adding the '>' prevents that, and will be done by all
good newsreaders.

Just get yourself a proper newsreader. Thunderbird comes to mind.

--
Some informative links:
news:news.announce.newusers
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html

Aug 31 '06 #5

On Wed, 30 Aug 2006 21:48:37 GMT, Andrew Poelstra
<ap*******@false.site> wrote:
>
I hacked this together this morning so that I could shift my out-of-
space code away from the rest of my logic. I wanted to allow array
syntax on my dynamic buffers, so I manually created a struct with
malloc() and judicious use of void* pointers.

I decided to post it because stuff like this is breeding ground for
UB, and this group is good at finding it. (I had a dentist analogy
for that, but it didn't work out.)

Here's my code:

/* Start. */
#include <stdlib.h>
#include <string.h>

void *create_buffer (size_t size, size_t max)
{
/* We need to allocate three size_t before the actual buffer to hold
* the size of each item, the maximum number of items, and the actual
* number of items. We do it this way instead of using a struct so that
* array syntax can still be used on the returned structure. */
size_t *tmp = malloc ( (3 * sizeof *tmp) + (size * max) );

if (tmp != NULL)
{
tmp[0] = size; /* Size of each item */
tmp[1] = max; /* Maximum # of items */
tmp[2] = 0; /* Actual # of items */
}

return (tmp != NULL) ? tmp + 3 : NULL;
}

/* This function will return three possible things:
* 1. On ordinary usage, it will return the new maximum size of the buffer.
* 2. If it runs out of memory, it will return 0.
* 3. If passed NULL, it will do nothing but return maximum size of buf.
* The third case is used after the second case to recover lost size info.
* If buf is NULL, Bad Things will happen. */
size_t append_item (void *buf, void *itm)
{
size_t *tmp = (size_t *) buf - 3;
size_t size = tmp[0];
size_t max = tmp[1];
size_t num = ++tmp[2];

/* Early return on special case. */
if (itm == NULL)
return num;

/* If we don't have enough space, try and get some more. */
if (num == max) {
size_t *rtmp;
max *= 2;
while (!(rtmp = realloc (tmp, max)) && (max > tmp[1]))
You forgot the 3*sizeof(size_t) in the second parameter to realloc. If
create_buffer is called with a small value of max, your realloc could
actually reduce the amount of allocated space.

After you have reallocated the space, how does your caller know where
it is?
--max;
}

/* If we now have enough space, we're good. Otherwise, we're cooked. */
if (max > num) {
memmove (buf, itm, size);
If you reallocated the space, buf need not be a valid address.
return num;
} else return 0;
}
/* End. */

It compiles in gcc with all warnings set.
Syntax is only one of many contributors to effectiveness. Necessary
but not sufficient.
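
The three points raised in this reply (the header bytes missing from the realloc() size, the caller not learning the new address, and buf going stale) can be sketched roughly as follows. This is only an illustration, not the poster's code: grow_buffer and HDR_WORDS are hypothetical names, the three-size_t header layout is taken from the original post, and the alignment question raised elsewhere in the thread is left unsolved here.

```c
#include <assert.h>
#include <stdlib.h>

#define HDR_WORDS 3   /* item size, capacity, count: layout from the original post */

/* Hypothetical helper, not part of the posted code: grow the block behind
 * *buf so that it can hold new_max items.  Returns 1 on success and 0 on
 * failure; on failure the old block and *buf are left untouched. */
int grow_buffer (void **buf, size_t new_max)
{
    size_t *hdr, *grown;

    assert (buf != NULL && *buf != NULL);
    hdr = (size_t *) *buf - HDR_WORDS;

    /* Refuse requests whose byte count would overflow size_t. */
    if (hdr[0] != 0
        && new_max > ((size_t) -1 - HDR_WORDS * sizeof *hdr) / hdr[0])
        return 0;

    /* realloc() takes a size in bytes, and that size must include the header. */
    grown = realloc (hdr, HDR_WORDS * sizeof *hdr + hdr[0] * new_max);
    if (grown == NULL)
        return 0;

    grown[1] = new_max;          /* record the new capacity */
    *buf = grown + HDR_WORDS;    /* tell the caller where the items live now */
    return 1;
}
```

Passing the address of the caller's pointer is one way to report the move; returning the new pointer, as realloc() itself does, would be another.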
Remove del for email
Aug 31 '06 #6

Andrew Poelstra wrote:
"Spiros Bousbouras" <sp****@gmail.com> writes:
>>>From the code it seems that your intention is to store
the "items" starting at buf[3]. Are the items of type
size_t ? If not how do you know that buf[3] is properly
aligned ?

The items may not be of size_t. I don't know that they're properly
aligned (this is an example of the UB I knew that I missed). What
would be the best way for me to guarantee alignment? Am I going to
need to use a union?
There is no portable way that does all the work; some time ago,
I wrote a "memsize" solution that does store the size in a standard
conforming albeit ugly and wasteful way. I asked for a code review
back then but did not receive any response; you are welcome to take
out the useful part of the ideas:
<36*************@individual.net>
Download seems still possible.
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Aug 31 '06 #7

Michael Mair <Mi**********@invalid.invalid> writes:
Andrew Poelstra wrote:
>"Spiros Bousbouras" <sp****@gmail.com> writes:
>>>>From the code it seems that your intention is to store
the "items" starting at buf[3]. Are the items of type
size_t ? If not how do you know that buf[3] is properly
aligned ?

The items may not be of size_t. I don't know that they're properly
aligned (this is an example of the UB I knew that I missed). What
would be the best way for me to guarantee alignment? Am I going to
need to use a union?

There is no portable way that does all the work; some time ago,
I wrote a "memsize" solution that does store the size in a standard
conforming albeit ugly and wasteful way. I asked for a code review
back then but did not receive any response; you are welcome to take
out the useful part of the ideas:
<36*************@individual.net>
Download seems still possible.
Thanks! That'll be useful, and probably end up being my solution.
A quick question, though, is struct padding consistent?

That is, for every struct { int x, size_t y, float z }, will (&z - &x)
always be the same number? Because if so, a more efficient solution
could be found.

--
Andrew Poelstra <http://www.wpsoftware.net/projects>
To reach me by email, use `apoelstra' at the above domain.
"Do BOTH ends of the cable need to be plugged in?" -Anon.
Aug 31 '06 #8

Andrew Poelstra wrote:
Michael Mair <Mi**********@invalid.invalid> writes:
>>Andrew Poelstra wrote:
>>>"Spiros Bousbouras" <sp****@gmail.com> writes:
>From the code it seems that your intention is to store
the "items" starting at buf[3]. Are the items of type
size_t ? If not how do you know that buf[3] is properly
aligned ?

The items may not be of size_t. I don't know that they're properly
aligned (this is an example of the UB I knew that I missed). What
would be the best way for me to guarantee alignment? Am I going to
need to use a union?

There is no portable way that does all the work; some time ago,
I wrote a "memsize" solution that does store the size in a standard
conforming albeit ugly and wasteful way. I asked for a code review
back then but did not receive any response; you are welcome to take
out the useful part of the ideas:
<36*************@individual.net>
Download seems still possible.

Thanks! That'll be useful, and probably end up being my solution.

A quick question, though, is struct padding consistent?

That is, for every struct { int x, size_t y, float z }, will (&z - &x)
always be the same number? Because if so, a more efficient solution
could be found.
If you always use the same identifiers for members of structs
with this layout and this sequence of types in your source: Yes
(see "compatible type" in the standard).
If not: The standard does not guarantee it but in every
implementation worth its salt, yes.
Read the bits about compatible types w.r.t. structure types and
about the "common initial sequence" for unions in the standard
to get a feeling about what is guaranteed (C90 or C99).
The DS900x series probably uses a member name based struct layout
for struct types not used in a union...
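
As a rough illustration of the question, the distance between two members can be read with offsetof from <stddef.h>. The struct name sample and both function names below are made up for this sketch. The expression (&z - &x) from the question subtracts pointers of different types, so the sketch compares byte addresses instead.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative struct with the member types from the question. */
struct sample {
    int    x;
    size_t y;
    float  z;
};

/* Distance in bytes from x to z, computed from the type alone.  Every
 * object of type struct sample shares this value on a given implementation. */
size_t offset_distance (void)
{
    return offsetof (struct sample, z) - offsetof (struct sample, x);
}

/* The same distance measured on an actual object, via byte pointers. */
size_t measured_distance (const struct sample *s)
{
    assert (s != NULL);
    return (size_t) ((const unsigned char *) &s->z
                     - (const unsigned char *) &s->x);
}
```

Within one struct type the two functions agree for every object. What the standard leaves open, as discussed above, is whether a second, separately declared struct with the same member list gets the same layout.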

Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Aug 31 '06 #9

Michael Mair wrote:
Andrew Poelstra wrote:
"Spiros Bousbouras" <sp****@gmail.com> writes:
>>From the code it seems that your intention is to store
the "items" starting at buf[3]. Are the items of type
size_t ? If not how do you know that buf[3] is properly
aligned ?
The items may not be of size_t. I don't know that they're properly
aligned (this is an example of the UB I knew that I missed). What
would be the best way for me to guarantee alignment? Am I going to
need to use a union?

There is no portable way that does all the work; some time ago,
I wrote a "memsize" solution that does store the size in a standard
conforming albeit ugly and wasteful way. I asked for a code review
back then but did not receive any response; you are welcome to take
out the useful part of the ideas:
<36*************@individual.net>
Download seems still possible.
This may be interesting as an academic exercise but
personally I don't see the point of storing the size of
the buffer in the buffer. Whoever needs to use the buffer
will need to know the location of it so if they can remember
the location then why not also the size ?

The way I'd do it would be to have a structure
struct buffer{size_t size ; size_t max ; size_t num ;
void * beg_of_bf ;}
and pass pointers to buffer to the functions
create_buffer and append_item

In the case of create_buffer the caller would
initialize fields size and max and the function
the fields num (set to 0) and beg_of_bf

In the case of append_item the return value
will determine if the appending was done
successfully and if yes it will update num and
possibly also max and beg_of_bf if reallocation
was required.
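
A minimal sketch of the design described above, assuming int return values (1 for success, 0 for failure) and capacity doubling on reallocation. Neither detail is specified in the post, so treat both as placeholders.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct buffer {
    size_t size;        /* size of each item in bytes */
    size_t max;         /* capacity in items */
    size_t num;         /* items stored so far */
    void  *beg_of_bf;   /* start of the item storage */
};

/* The caller fills in size and max; this sets num and beg_of_bf.
 * Returns 1 on success, 0 on failure. */
int create_buffer (struct buffer *b)
{
    assert (b != NULL);
    b->num = 0;
    b->beg_of_bf = NULL;
    if (b->size == 0 || b->max == 0 || b->max > (size_t) -1 / b->size)
        return 0;
    b->beg_of_bf = malloc (b->size * b->max);
    return b->beg_of_bf != NULL;
}

/* Appends a copy of *itm.  Returns 1 on success, 0 if more room was
 * needed and could not be obtained; the buffer is unchanged on failure. */
int append_item (struct buffer *b, const void *itm)
{
    assert (b != NULL && b->beg_of_bf != NULL && itm != NULL);

    if (b->num == b->max) {
        size_t new_max;
        void *grown;

        if (b->max > (size_t) -1 / 2 / b->size)
            return 0;                       /* doubling would overflow */
        new_max = b->max * 2;
        grown = realloc (b->beg_of_bf, new_max * b->size);
        if (grown == NULL)
            return 0;
        b->beg_of_bf = grown;
        b->max = new_max;
    }

    /* Place the new item after the ones already stored. */
    memcpy ((unsigned char *) b->beg_of_bf + b->num * b->size, itm, b->size);
    b->num++;
    return 1;
}
```

The price of this layout is the one raised later in the thread: the caller reaches the items through beg_of_bf rather than indexing the buffer handle directly.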

Sep 1 '06 #10

"Spiros Bousbouras" <sp****@gmail.com> writes:
This may be interesting as an academic exercise but
personally I don't see the point of storing the size of
the buffer in the buffer. Whoever needs to use the buffer
will need to know the location of it so if they can remember
the location then why not also the size ?
If I wanted the user to need to keep track of the buffer's
size, I wouldn't have bothered hiding everything else. Why
not make the user do his own reallocations?
The way I'd do it would be to have a structure
struct buffer{size_t size ; size_t max ; size_t num ;
void * beg_of_bf ;}
and pass pointers to buffer to the functions
create_buffer and append_item
Because that would require the user to access his buffer
as mybuf->beg_of_bf instead of just mybuf[].
In the case of create_buffer the caller would
initialize fields size and max and the function
the fields num (set to 0) and beg_of_bf
I think that I should initialize the entire buffer to 0 on
start. Thanks for the suggestion!
In the case of append_item the return value
will determine if the appending was done
successfully and if yes it will update num and
possibly also max and beg_of_bf if reallocation
was required.
--
Andrew Poelstra <http://www.wpsoftware.net/projects>
To reach me by email, use `apoelstra' at the above domain.
"Do BOTH ends of the cable need to be plugged in?" -Anon.
Sep 1 '06 #11

Thanks to Michael Mair, Barry Schwartz, and Spiros Bousbouras for their
suggestions.

OT: It appears that Google's interface has indeed been fixed, and it's
actually very impressive now.

Here's my new version:

#include <stdlib.h>
#include <string.h>
#include "dynamic_buffer.h"

struct dbuffer_s {
size_t size;
size_t max;
size_t num;
void *content;
};

void *create_buffer (size_t size, size_t max)
{
struct dbuffer_s *tmp = malloc (sizeof *tmp);

if (tmp != NULL)
{
tmp->size = size; /* Size of each item */
tmp->max = max; /* Maximum # of items */
tmp->num = 0; /* Actual # of items */

tmp->content = calloc (size * max);
if (tmp->content == NULL) {
free (tmp);
tmp = NULL;
}
}

return (tmp != NULL) ? tmp->content : NULL;
}

/* This function will return three possible things:
* The new maximum on success
* The old maximum on failure (or itm == NULL)
* 0 on buf == NULL
*/

size_t append_item (void **buf, void *itm)
{
static struct dbuffer_s stmp; /* Used for length calculations. */
size_t distance_back = (size_t) &stmp.content - (size_t) &stmp;

struct dbuffer_s *tmp;

/* Initial error conditions */
if (buf == NULL)
return 0;

tmp = (struct dbuffer_s *) ((unsigned char *) buf - distance_back);

if (itm == NULL)
return tmp->size;

/* Here's where the (attempted) reallocation starts: */
if (tmp->num == tmp->max)
{
size_t max = tmp->max * 2;
void *ptmp = tmp->content; /* tmp->content is *buf. */

ptmp = realloc (tmp->content, max * tmp->size);
while (ptmp == NULL)
{
ptmp = realloc (tmp->content, --max * tmp->size);

/* If we failed, we failed. */
if (max == tmp->max)
break;
}

/* If we pulled anything off, reflect it in tmp: */
if (max > tmp->max)
tmp->content = ptmp;
}

/* Now the actual addition code: */
if (tmp->num < tmp->max)
memmove (itm, (unsigned char *) tmp->content + tmp->num++,
tmp->size);

return tmp->size;
}

--
Andrew Poelstra <http://www.wpsoftware.net/projects>
To reach me by email, use `apoelstra' at the above domain.
"Do BOTH ends of the cable need to be plugged in?" -Anon.

Sep 1 '06 #12

Andrew Poelstra wrote:
"Spiros Bousbouras" <sp****@gmail.com> writes:
This may be interesting as an academic exercise but
personally I don't see the point of storing the size of
the buffer in the buffer. Whoever needs to use the buffer
will need to know the location of it so if they can remember
the location then why not also the size ?

If I wanted the user to need to keep track of the buffer's
size, I wouldn't have bothered hiding everything else.
The user doesn't *need* to keep track of the size but
he can find out the size if he so wishes.
Why
not make the user do his own reallocations?
For the reason you explained in your first post:
"I hacked this together this morning so that I could
shift my out-of-space code away from the rest of my
logic."

Sep 1 '06 #13


Michael Mair wrote:
Andrew Poelstra wrote:
Michael Mair <Mi**********@invalid.invalid> writes:
>Andrew Poelstra wrote:

"Spiros Bousbouras" <sp****@gmail.com> writes:
From the code it seems that your intention is to store
the "items" starting at buf[3]. Are the items of type
size_t ? If not how do you know that buf[3] is properly
aligned ?

The items may not be of size_t. I don't know that they're properly
aligned (this is an example of the UB I knew that I missed). What
would be the best way for me to guarantee alignment? Am I going to
need to use a union?

There is no portable way that does all the work; some time ago,
I wrote a "memsize" solution that does store the size in a standard
conforming albeit ugly and wasteful way. I asked for a code review
back then but did not receive any response; you are welcome to take
out the useful part of the ideas:
<36*************@individual.net>
Download seems still possible.
Thanks! That'll be useful, and probably end up being my solution.

A quick question, though, is struct padding consistent?

That is, for every struct { int x, size_t y, float z }, will (&z - &x)
always be the same number? Because if so, a more efficient solution
could be found.

If you always use the same identifiers for members of structs
with this layout and this sequence of types in your source: Yes
(see "compatible type" in the standard).
To be compatible the struct tags also need to match.

Sep 3 '06 #14

en******@yahoo.com wrote:
Michael Mair wrote:
>>Andrew Poelstra wrote:
>>>Michael Mair <Mi**********@invalid.invalid> writes:
Andrew Poelstra wrote:
>"Spiros Bousbouras" <sp****@gmail.com> writes:
>
>
>
>>>From the code it seems that your intention is to store
>>the "items" starting at buf[3]. Are the items of type
>>size_t ? If not how do you know that buf[3] is properly
>>aligned ?
>
>The items may not be of size_t. I don't know that they're properly
>aligned (this is an example of the UB I knew that I missed). What
>would be the best way for me to guarantee alignment? Am I going to
>need to use a union?

There is no portable way that does all the work; some time ago,
I wrote a "memsize" solution that does store the size in a standard
conforming albeit ugly and wasteful way. I asked for a code review
back then but did not receive any response; you are welcome to take
out the useful part of the ideas:
<36*************@individual.net>
Download seems still possible.

Thanks! That'll be useful, and probably end up being my solution.

A quick question, though, is struct padding consistent?

That is, for every struct { int x, size_t y, float z }, will (&z - &x)
always be the same number? Because if so, a more efficient solution
could be found.

If you always use the same identifiers for members of structs
with this layout and this sequence of types in your source: Yes
(see "compatible type" in the standard).

To be compatible the struct tags also need to match.
Could you please give a C90 source for that?

In my C89 draft, I cannot find mention of struct tags in 3.1.2.6
but an example for typedef in 3.5.6 which suggests the same.
I do not recall reading the bit about the tags in the standard
(which lies around at work) either.
In the C99 standard, this can be found where I expect it, namely
in 6.2.7.

I am especially interested whether the relationship between
"with tag" and "without tag" is the same in C90 as suggested by
C99. This would upset something we cooked up recently relying on
the C90 standard...
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Sep 3 '06 #15


Michael Mair wrote:
en******@yahoo.com wrote:
Michael Mair wrote:
>Andrew Poelstra wrote:

Michael Mair <Mi**********@invalid.invalid> writes:
Andrew Poelstra wrote:
"Spiros Bousbouras" <sp****@gmail.com> writes:

>>From the code it seems that your intention is to store
>the "items" starting at buf[3]. Are the items of type
>size_t ? If not how do you know that buf[3] is properly
>aligned ?

The items may not be of size_t. I don't know that they're properly
aligned (this is an example of the UB I knew that I missed). What
would be the best way for me to guarantee alignment? Am I going to
need to use a union?

There is no portable way that does all the work; some time ago,
I wrote a "memsize" solution that does store the size in a standard
conforming albeit ugly and wasteful way. I asked for a code review
back then but did not receive any response; you are welcome to take
out the useful part of the ideas:
<36*************@individual.net>
Download seems still possible.

Thanks! That'll be useful, and probably end up being my solution.

A quick question, though, is struct padding consistent?

That is, for every struct { int x, size_t y, float z }, will (&z - &x)
always be the same number? Because if so, a more efficient solution
could be found.

If you always use the same identifiers for members of structs
with this layout and this sequence of types in your source: Yes
(see "compatible type" in the standard).
To be compatible the struct tags also need to match.

Could you please give a C90 source for that?

In my C89 draft, I cannot find mention of struct tags in 3.1.2.6
but an example for typedef in 3.5.6 which suggests the same.
I do not recall reading the bit about the tags in the standard
(which lies around at work) either.
In the C99 standard, this can be found where I expect it, namely
in 6.2.7.

I am especially interested whether the relationship between
"with tag" and "without tag" is the same in C90 as suggested by
C99. This would upset something we cooked up recently relying on
the C90 standard...
Be sure you're reading the requirements carefully. Even
in C89/C90, the compatibility between structure types
with isomorphic members applies only for types declared
_in separate translation units_. In one translation
unit, two struct types must be the same type for them
to be compatible; the isomorphic member provision
doesn't apply.

Requiring struct or union types to have the same tag for
types in separate compilation units to be compatible was
introduced in C99.

I'm resisting the temptation to editorialize on the
advisability of using different struct tags in different
translation units to produce interchangeable types. :)

Sep 3 '06 #16

en******@yahoo.com wrote:
Michael Mair wrote:
>>en******@yahoo.com wrote:
>>>Michael Mair wrote:
Andrew Poelstra wrote:
>Michael Mair <Mi**********@invalid.invalid> writes:
>>Andrew Poelstra wrote:
>>>"Spiros Bousbouras" <sp****@gmail.com> writes:
>>>>>From the code it seems that your intention is to store
>>>>the "items" starting at buf[3]. Are the items of type
>>>>size_t ? If not how do you know that buf[3] is properly
>>>>aligned ?
>>>
>>>The items may not be of size_t. I don't know that they're properly
>>>aligned (this is an example of the UB I knew that I missed). What
>>>would be the best way for me to guarantee alignment? Am I going to
>>>need to use a union?
>>
>>There is no portable way that does all the work; some time ago,
>>I wrote a "memsize" solution that does store the size in a standard
>>conforming albeit ugly and wasteful way. I asked for a code review
>>back then but did not receive any response; you are welcome to take
>>out the useful part of the ideas:
>><36*************@individual.net>
>>Download seems still possible.
>
>Thanks! That'll be useful, and probably end up being my solution.
>
>A quick question, though, is struct padding consistant?
>
>That is, for every struct { int x, size_t y, float z }, will (&z - &x)
>always be the same number? Because if so, a more efficient solution
>could be found.

If you always use the same identifiers for members of structs
with this layout and this sequence of types in your source: Yes
(see "compatible type" in the standard).

To be compatible the struct tags also need to match.

Could you please give a C90 source for that?

In my C89 draft, I cannot find mention of struct tags in 3.1.2.6
but an example for typedef in 3.5.6 which suggests the same.
I do not recall reading the bit about the tags in the standard
(which lies around at work) either.
In the C99 standard, this can be found where I expect it, namely
in 6.2.7.

I am especially interested whether the relationship between
"with tag" and "without tag" is the same in C90 as suggested by
C99. This would upset something we cooked up recently relying on
the C90 standard...

Be sure you're reading the requirements carefully. Even
in C89/C90, the compatibility between structure types
with isomorphic members applies only for types declared
_in separate translation units_. In one translation
unit, two struct types must be the same type for them
to be compatible; the isomorphic member provision
doesn't apply.
Thank you for that reminder -- and yes, I was aware of that :-)

Requiring struct or union types to have the same tag for
types in separate compilation units to be compatible was
introduced in C99.
Okay, so this is not an obvious oversight in C90 but came up
with C99, probably to avoid awkward situations w.r.t. type
definition included or not.

I'm resisting the temptation to editorialize on the
advisability of using different struct tags in different
translation units to produce interchangeable types. :)
*g* It is not necessary -- the question for me primarily
was between "no tag" and "unique tag throughout all
translation units knowing the tag". As it is, we have
to make sure that the generated include dependencies suffice.
Regards
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Sep 4 '06 #17


Michael Mair wrote:
en******@yahoo.com wrote:
I'm resisting the temptation to editorialize on the
advisability of using different struct tags in different
translation units to produce interchangeable types. :)

*g* It is not necessary -- the question for me primarily
was between "no tag" and "unique tag throughout all
translation units knowing the tag". As it is, we have
to make sure that the generated include dependencies suffice.
It doesn't seem like it would be that hard to make
the tags be unique, even across the entire program.
What am I missing?

Sep 4 '06 #18

On Wed, 30 Aug 2006 21:04:55 -0400, CBFalconer <cb********@yahoo.com>
wrote:
Spiros Bousbouras wrote:
Spiros Bousbouras wrote:
Ok , Google's interface is indeed broken. The line reading
"From the code it seems that your intention is to store" was
written by me but Google added a '>' right before "From" !
It thinks it's a mail file !
Now this is one area where Google is NOT broken. A blank line,
followed by the word 'From', signals the start of a mail/news
Only in 'mbox' format local storage on _some_ systems, and maybe uucp
transport (can't remember, too long). The current transports, SMTP and
NNTP, use a dot-only terminator (which traces back to ed and Multics
at least, and I've heard to CTSS) and dot-stuffing.
message. Adding the '>' prevents that, and will be done by all
good newsreaders.
I don't think all good. But some good. Some even escape From at BOL
(maybe even F at BOL) when using q-p encoding. (Which shouldn't be
needed and shouldn't be used in text newsgroups like this one.) You
certainly shouldn't be surprised or too upset when it is done.
Just get yourself a proper newsreader. Thunderbird comes to mind.
_and_ newsserver aka NSP, if you don't already have one. The one
arguable benefit I can see to dejagoo is that it is its own NSP.
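
For readers unfamiliar with the two escaping schemes mentioned above, here is a small sketch. The function names are invented for illustration, lines are assumed to arrive without their trailing newline, and real mail software handles more cases than this (for example, some mbox variants also quote lines that already start with ">From ").

```c
#include <assert.h>
#include <string.h>

/* Copy line into out, prepending prefix when the line starts with trigger.
 * Returns 1 on success, 0 if out is too small. */
static int copy_with_escape (const char *line, const char *trigger,
                             char prefix, char *out, size_t out_size)
{
    size_t len, extra;

    assert (line != NULL && trigger != NULL && out != NULL);
    len = strlen (line);
    extra = (strncmp (line, trigger, strlen (trigger)) == 0) ? 1 : 0;
    if (out_size < len + extra + 1)
        return 0;
    if (extra)
        out[0] = prefix;
    memcpy (out + extra, line, len + 1);   /* includes the terminating NUL */
    return 1;
}

/* SMTP/NNTP transport: a line that begins with '.' gets a second '.',
 * so it cannot be mistaken for the dot-only terminator. */
int dot_stuff_line (const char *line, char *out, size_t out_size)
{
    return copy_with_escape (line, ".", '.', out, out_size);
}

/* mbox-style storage: a body line that begins with "From " gets a '>',
 * so it cannot be mistaken for the start of the next message. */
int mbox_quote_line (const char *line, char *out, size_t out_size)
{
    return copy_with_escape (line, "From ", '>', out, out_size);
}
```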

- David.Thompson1 at worldnet.att.net
Sep 7 '06 #19

P: n/a
On Wed, 30 Aug 2006 23:27:19 GMT, Andrew Poelstra
<ap*******@false.site> wrote:

(In the original post, re-added:)
> /* We need to allocate three size_t before the actual buffer to hold
>  * the size of each item, the maximum number of items, and the actual
>  * number of items. We do it this way instead of using a struct so that
>  * array syntax can still be used on the returned structure. */

That's not so. You can use this same hack with a header struct and
'pseudo-array' body just as well, and I have done so. The UB
(manifested as platform dependence) is the same (neither solved nor
aggravated). But in this simple case there's not much benefit.

> "Spiros Bousbouras" <sp****@gmail.com> writes:
>> From the code it seems that your intention is to store
>> the "items" starting at buf[3]. Are the items of type
>> size_t ? If not how do you know that buf[3] is properly
>> aligned ?
>
> The items may not be of size_t. I don't know that they're properly
> aligned (this is an example of the UB I knew that I missed). What
> would be the best way for me to guarantee alignment? Am I going to
> need to use a union?

Either that, and then you can only handle the set of types you have
statically compiled, plus any types that the given implementation
makes 'sufficiently similar' (in particular, although structs can
require alignment independent of their contents, mostly they don't).

Or else determine the maximum required alignment outside of the code
and feed it in; the simplest way is to read the documentation and give
the value on a #define or -D or whatever. But this is prone to human
error, especially if you want to let other people use your code.
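
A minimal sketch of the union approach, for illustration only. The names `align_u`, `hdr`, `HDR_SPACE`, `aligned_create`, `aligned_header` and `aligned_destroy` are hypothetical, and the member list of the union is an assumption about which types have the strictest alignment; it is not guaranteed to be exhaustive on every platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical union of the types assumed to have the strictest
 * alignment; this list is a guess, not a guarantee for every platform. */
union align_u {
    long l;
    double d;
    long double ld;
    void *p;
    void (*fp)(void);
};

struct hdr {
    size_t size; /* size of each item */
    size_t max;  /* capacity, in items */
    size_t num;  /* items in use */
};

/* Header space rounded up to whole align_u units, so the data that
 * follows it starts on an align_u boundary. */
#define HDR_SPACE \
    ((sizeof(struct hdr) + sizeof(union align_u) - 1) \
     / sizeof(union align_u) * sizeof(union align_u))

/* Allocate header plus max items in one block; return the data part. */
void *aligned_create(size_t size, size_t max)
{
    unsigned char *raw;
    struct hdr *h;

    assert(size != 0);
    if (max > ((size_t)-1 - HDR_SPACE) / size)
        return NULL; /* HDR_SPACE + size * max would overflow */

    raw = malloc(HDR_SPACE + size * max);
    if (raw == NULL)
        return NULL;

    h = (struct hdr *)raw; /* malloc's result is suitably aligned */
    h->size = size;
    h->max = max;
    h->num = 0;
    return raw + HDR_SPACE;
}

/* Step back from the data pointer to the header. */
struct hdr *aligned_header(void *data)
{
    assert(data != NULL);
    return (struct hdr *)((unsigned char *)data - HDR_SPACE);
}

void aligned_destroy(void *data)
{
    if (data != NULL)
        free((unsigned char *)data - HDR_SPACE);
}
```

A caller would write something like `double *d = aligned_create(sizeof *d, 4);` and index `d[0]` to `d[3]` directly. The header here sits at the start of the block with the padding after it, which is one possible layout among several.
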
> Thanks for your comments: I've corrected my append_item function
> below (most of it was just typos and missed lines):
>
> /* We now need to get the address of the buffer, because realloc()

The address of the _pointer_ (or buffer pointer or pointer to buffer).

>  * may change the value of the pointer. */
> size_t append_item (void **buf, void *itm)

Note that this means the caller cannot have an int* or struct foo* for
accessing the items and give you & of that. void* can be converted
from and to all data pointer types, but void** canNOT validly point to
all data pointer types; FAQ 4.9. Either each caller must have, and
keep in sync, _two_ pointers; or must keep only a void* and cast it on
every use to a mydata* (which is a pain); or you need a different API.
I have never found a really attractive solution to this problem, which
is one reason I don't often bother with these generic allocators.
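
To make the void** point concrete, here is a small sketch of the "two pointers" workaround. `resize_items` is a hypothetical stand-in for any function with a `void **` parameter (it is not the append_item from this thread), and `grow_ints` is an equally hypothetical caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical void**-style API: resizes *buf to hold n items of the
 * given size. Returns 1 on success, 0 on failure (*buf unchanged). */
int resize_items(void **buf, size_t n, size_t size)
{
    void *p;

    assert(buf != NULL && size != 0);
    if (n == 0 || n > (size_t)-1 / size)
        return 0;
    p = realloc(*buf, n * size);
    if (p == NULL)
        return 0;
    *buf = p;
    return 1;
}

/* Caller side: go through a real void* object instead of passing
 * (void **)&ip, which is not portable. */
int *grow_ints(int *ip, size_t n)
{
    void *vp = ip; /* int* to void* always converts */

    if (!resize_items(&vp, n, sizeof *ip))
        return NULL; /* the old block is still valid */
    return vp;       /* void* back to int* */
}
```

The cost is exactly the one described above: the caller has to keep `ip` and `vp` in sync around every call.
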
> {
>     size_t *tmp = (size_t *) *buf - 3;
>     size_t size = tmp[0];
>     size_t max = tmp[1];
>     size_t num = ++tmp[2];
>
>     /* Early return on special case. */
>     if (itm == NULL)
>         return max;

Your comments said you use this case after a previous 'no memory'
return, but that call already incremented tmp[2], and this call
increments it again, and you don't have usable memory at either of
them. I'm not sure what you want to do but I don't think this is it.

>     /* If we don't have enough space, try and get some more. */
>     if (num == max) {
>         size_t *rtmp;
>         max *= 2;
>
>         rtmp = realloc (tmp, max);

If max is a number of items each of size size, as appears to me to be
the case, you want /*new*/max * size + 3 * sizeof(size_t) here, and
similarly in the next chunk.

>         while ((rtmp == NULL) && (max > tmp[1]))
>             rtmp = realloc (tmp, --max);
>     }

In the limit case this will try to realloc to the old max i.e. tmp[1]
(times size, per above) which I believe (although there has been some
controversy) is actually allowed to choose to move the data and return
a new pointer, which you then fail to store below effectively
destroying your data structure, although I doubt any implementation
actually does so since it's just Really Silly. Since trying to realloc
to the old max is useless even if it 'succeeds', since it doesn't give
you the room needed for the new item, I would recode the limit
condition. The canonical form IMO is:
    max *= 2; /* or max = whatever */
    while( max > tmp[1]
           && (rtmp = realloc (tmp, hsize+max*size)) == NULL )
        --max;
This does do one unneeded comparison, but I feel the gain in clarity
and regularity outweighs that. By a lot.
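
The loop above can be wrapped into a self-contained helper roughly like this. It is only a sketch: the function name `grow_backoff` and its parameters are hypothetical, `hsize` stands for the header size in bytes as in the fragment above, and the clamp against overflow is an addition that the fragment leaves out.

```c
#include <assert.h>
#include <stdlib.h>

/* Try to grow a block holding old_max items of `size` bytes (after a
 * header of hsize bytes) to twice the capacity, backing off one item at
 * a time. On success returns the new block and stores the capacity in
 * *new_max; on failure returns NULL and the old block is untouched. */
void *grow_backoff(void *block, size_t hsize, size_t size,
                   size_t old_max, size_t *new_max)
{
    size_t limit, max;
    void *rtmp = NULL;

    assert(block != NULL && size != 0 && new_max != NULL);

    /* Largest item count for which hsize + max * size cannot overflow. */
    limit = ((size_t)-1 - hsize) / size;
    if (old_max >= limit)
        return NULL;

    max = (old_max > limit / 2) ? limit : old_max * 2;
    if (max == 0)
        max = 1; /* growing from an empty buffer */

    /* Never retries the old size: that would not make room anyway. */
    while (max > old_max
           && (rtmp = realloc(block, hsize + max * size)) == NULL)
        --max;

    if (rtmp == NULL)
        return NULL;
    *new_max = max;
    return rtmp;
}
```

Testing `rtmp` rather than the old header after the loop matters: once realloc has succeeded, the old block may no longer be read.
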
>     /* If we now have enough space, we're good. Otherwise, we're cooked. */
>     if (max > num) {
>         *buf = rtmp;

You also need to update tmp[1] to the new max. (And I would name tmp
something more like hdrp. It's not just a temporary.)

>         memmove (buf, itm, size);

With the name append_item I would think you want to store the new item
at the newly allocated slot #num:
    memcpy ((char*)buf + size*num, itm, size);
The caller should never be giving you a pointer into space that isn't
logically allocated yet, much less into reallocated space, so it
shouldn't be able to overlap with the new slot and memcpy is OK.

But you are also starting with slot #1 and thus never using slot #0.
Is that really what you want? It's more common and arguably more
idiomatic in C to use zero-origin and _post_inc like new = sofar++ ;

Also, if the caller wants to duplicate an existing slot in the same
array _and_ you need to reallocate and the realloc does (choose to)
move the data, you are now illegally memcpy'ing (or even memmove'ing)
from space that has been freed and may have been trashed.
You have to document that callers must not do this, and of course must
never save pointers into your array over a call to append_item or any
other routine that may reallocate or any routine that may call a
routine that may reallocate etc. etc. rather like the way C++
documents that certain operations on certain container types
invalidate (certain) preexisting iterators into those containers.
Still think it's such a wonderful idea? <G>
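
One way a caller can stay clear of the invalidation problem is to copy the element out by index before appending, rather than passing a pointer into the array. The sketch below uses a deliberately simple, hypothetical `int` vector (`struct ivec`, `ivec_push`, `ivec_dup`); it is not the generic buffer from this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of int, just to show the aliasing issue. */
struct ivec {
    int *data;
    size_t num; /* items in use */
    size_t max; /* capacity */
};

/* Append x; returns 1 on success, 0 on failure (vector unchanged). */
int ivec_push(struct ivec *v, int x)
{
    assert(v != NULL);
    if (v->num == v->max) {
        size_t max = v->max ? v->max * 2 : 4;
        int *p;

        if (max <= v->max || max > (size_t)-1 / sizeof *p)
            return 0; /* capacity arithmetic would overflow */
        p = realloc(v->data, max * sizeof *p);
        if (p == NULL)
            return 0;
        v->data = p; /* any pointer saved into the old block is now stale */
        v->max = max;
    }
    v->data[v->num++] = x;
    return 1;
}

/* Duplicate slot i safely: take the value before any reallocation. */
int ivec_dup(struct ivec *v, size_t i)
{
    int copy;

    assert(v != NULL && i < v->num);
    copy = v->data[i];
    return ivec_push(v, copy);
}
```

With a generic byte buffer the same idea means copying the item into a temporary of the right size first, or remembering its index and recomputing the address after the call.
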
>         return num;
>     } else return 0;
> }
- David.Thompson1 at worldnet.att.net
Sep 7 '06 #20

P: n/a
On 1 Sep 2006 07:30:59 -0700, "Andrew Poelstra"
<jo*********@gmail.com> wrote:
> Thanks to Michael Mair, Barry Schwartz, and Spiros Bousbouras for their
> suggestions.
>
> OT: It appears that Google's interface has indeed been fixed, and it's
> actually very impressive now.
>
> Here's my new version:
>
> #include <stdlib.h>
> #include <string.h>
> #include "dynamic_buffer.h"
>
> struct dbuffer_s {
>     size_t size;
>     size_t max;
>     size_t num;
>     void *content;

This is completely different from your earlier version. It is a
pointer to space allocated somewhere else, not adjacent.

> };
>
> void *create_buffer (size_t size, size_t max)
> {
>     struct dbuffer_s *tmp = malloc (sizeof *tmp);
>
>     if (tmp != NULL)
>     {
>         tmp->size = size; /* Size of each item */
>         tmp->max = max; /* Maximum # of items */
>         tmp->num = 0; /* Actual # of items */
>
>         tmp->content = calloc (size * max);

That's wrong. Either malloc (size * max) or calloc (size, max) with
two separate arguments. Since your logic tries to ensure that each new
'item's worth of space is written before it can be read, the extra
work calloc (probably) does is wasted and malloc is better.

Oh, and either way should probably check for overflow:
    if( max > SIZE_MAX / size ){ give up now }
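
Spelled out as a helper, the overflow test might look like the sketch below. The name `mul_fits` is hypothetical, and `(size_t)-1` is used in place of SIZE_MAX so that the fragment does not depend on C99's <stdint.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 and stores a * b in *out if the product fits in a size_t;
 * returns 0 and leaves *out alone otherwise. */
int mul_fits(size_t a, size_t b, size_t *out)
{
    assert(out != NULL);
    if (a != 0 && b > (size_t)-1 / a)
        return 0; /* a * b would wrap around */
    *out = a * b;
    return 1;
}
```

The `a != 0` guard avoids dividing by zero when the item size is zero, which the one-line test above does not cover.
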
>         if (tmp->content == NULL) {
>             free (tmp);
>             tmp = NULL;
>         }
>     }
>
>     return (tmp != NULL) ? tmp->content : NULL;
> }

In nonerror case, this returns the value of the pointer tmp->content,
which is the address of space somewhere completely unrelated to the
address of the space for struct dbuffer_s that tmp points to.

> /* This function will return three possible things:
>  * The new maximum on success
>  * The old maximum on failure (or itm == NULL)

How is the caller supposed to distinguish these? Is it required to
keep track of the current usage? If so, doesn't that reduce the
benefit you intended to provide by encapsulating this functionality?

>  * 0 on buf == NULL
>  */
>
> size_t append_item (void **buf, void *itm)
> {
>     static struct dbuffer_s stmp; /* Used for length calculations. */
>     size_t distance_back = (size_t) &stmp.content - (size_t) &stmp;

You don't need a variable for this; use offsetof() from <stddef.h> and
it's guaranteed to be a compile time constant.
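
For reference, a short sketch of offsetof() doing that job. The struct mirrors the quoted one; the function name `header_of` is hypothetical. Note that this only works when handed the address of the `content` member itself, not the value stored in it.

```c
#include <assert.h>
#include <stddef.h> /* offsetof */

struct dbuffer_s {
    size_t size;
    size_t max;
    size_t num;
    void *content;
};

/* Recover the enclosing struct from a pointer to its content member. */
struct dbuffer_s *header_of(void **contentp)
{
    assert(contentp != NULL);
    return (struct dbuffer_s *)
        ((unsigned char *)contentp - offsetof(struct dbuffer_s, content));
}
```
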
>     struct dbuffer_s *tmp;
>
>     /* Initial error conditions */
>     if (buf == NULL)
>         return 0;

Personally I would make this an assert() or some similar but
project/environment-specific fatal error. If the caller is so screwed
up as to be trying to use an unallocated buffer, it is probably so
screwed up it won't handle an error return usefully.

>     tmp = (struct dbuffer_s *) ((unsigned char *) buf - distance_back);

Assuming tmp is the value returned from create_buffer, this won't work
at all. If you want to do this kind of kludge, and accept the issues
about ensuring alignment which <ObStd> can't be done for all cases
guaranteed portably but </> can be done for most common systems, you
need to allocate _one_ chunk of space big enough for the header plus
padding plus the data, give the caller the address to the data part,
and step backward from that to find the header -- leaving any excess
padding, unusually, _before_ the header.

And in that case you don't need to do it explicitly; assuming buf
actually points to the caller's void* pointer (NOT portably a data*
pointer, see my previous post) just use:
    struct dbuffer_s * hdrp = (struct dbuffer_s *) *buf - 1;
Or you can (and when I did this I preferred to):
    struct dbuffer_s * hdrp = /* (struct dbuffer_s *) */ *buf;
and then use hdrp[-1].field rather than hdrp->field .
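
A compact sketch of the one-chunk layout with the `[-1]` access. The names `dhdr`, `dbuf_create`, `dbuf_destroy` and `DBUF_HDR` are hypothetical. It also assumes that the items need no stricter alignment than `struct dhdr` itself, which is exactly the portability caveat raised above.

```c
#include <assert.h>
#include <stdlib.h>

struct dhdr {
    size_t size; /* size of each item */
    size_t max;  /* capacity, in items */
    size_t num;  /* items in use */
};

/* The header sits immediately before the data the caller sees. */
#define DBUF_HDR(buf) (((struct dhdr *)(buf))[-1])

/* One allocation: header first, then room for max items. */
void *dbuf_create(size_t size, size_t max)
{
    struct dhdr *h;

    assert(size != 0);
    if (max > ((size_t)-1 - sizeof *h) / size)
        return NULL; /* total size would overflow */
    h = malloc(sizeof *h + size * max);
    if (h == NULL)
        return NULL;
    h->size = size;
    h->max = max;
    h->num = 0;
    return h + 1; /* the data starts right after the header */
}

void dbuf_destroy(void *buf)
{
    if (buf != NULL)
        free(&DBUF_HDR(buf)); /* same address malloc returned */
}
```

Usage is then `size_t *p = dbuf_create(sizeof *p, 3);` followed by `DBUF_HDR(p).num` and so on.
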
>     if (itm == NULL)
>         return tmp->size;
>
>     /* Here's where the (attempted) reallocation starts: */
>     if (tmp->num == tmp->max)
>     {
>         size_t max = tmp->max * 2;

Should check for overflow: old tmp->max > SIZE_MAX/2 on the count, but
old tmp->max > SIZE_MAX/size/2 for the resulting allocation. Though on
most systems you will run out of available memory and realloc() will
fail sufficiently before you run out of bits in size_t.

>         void *ptmp = tmp->content; /* tmp->content is *buf. */

This initialization is useless; you immediately overwrite it.

>         ptmp = realloc (tmp->content, max * tmp->size);
>         while (ptmp == NULL)
>         {
>             ptmp = realloc (tmp->content, --max * tmp->size);
>
>             /* If we failed, we failed. */
>             if (max == tmp->max)
>                 break;
>         }

There's no point (re)trying the old tmp->max; see my previous post.

>         /* If we pulled anything off, reflect it in tmp: */
>         if (max > tmp->max)
>             tmp->content = ptmp;
>     }

And update tmp->max, see my previous post.

>     /* Now the actual addition code: */
>     if (tmp->num < tmp->max)
>         memmove (itm, (unsigned char *) tmp->content + tmp->num++,
>                  tmp->size);

That's backwards; the first argument is the destination, which should
be (anychar*)content + (num++) * size (note scaling by size), and the
second argument should be itm, the object you are adding.

And as I previously posted, these should never overlap for a valid
caller, so you only need memcpy() not memmove(), but will screw up if
the caller gives you an element in the old pre-reallocated array.

>     return tmp->size;
> }
- David.Thompson1 at worldnet.att.net
Sep 7 '06 #21

P: n/a
Thank you so much for your comments, Dave. I've decided to simply use
C99 for my code; that way I get SIZE_MAX and I get flexible array
members, which are very useful for my new (hopefully free of
non-trivial mistakes) version. I'm aware that there aren't any popular
C99 compilers, but this appears to compile properly with the latest
version of gcc, and I can always make object files for people with
non-supported compilers. I've also deliberately not included functions
for checking the buffer size, etc: these are trivial additions that
would distract from the much more bug-conducive body.

Here's the code:

#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#include "dynamic_buffer.h" /* This contains nothing but function prototypes */

/* Thanks to Dave Thompson for all his suggestions, fixes, and pointing
 * out the things I overlooked. */

struct dbuffer_s {
    size_t size;
    size_t max;
    size_t num;
    unsigned char content[];
};

void *create_buffer (size_t size, size_t max)
{
    struct dbuffer_s *tmp = NULL;

    /* Check for overflow */
    if (max <= SIZE_MAX / size)
        tmp = malloc (sizeof *tmp + size * max);

    if (tmp != NULL)
    {
        tmp->size = size; /* Size of each item */
        tmp->max = max; /* Maximum # of items */
        tmp->num = 0; /* Actual # of items */
    }

    return (tmp != NULL) ? tmp->content : NULL;
}

void delete_buffer (void *buf)
{
    free ((char *) buf - offsetof (struct dbuffer_s, content));
    return;
}

/* This function will return one of two things:
 * The new (potentially reallocated) buffer on success
 * NULL on failure.
 */

void *append_item (void *buf, void *itm)
{
    struct dbuffer_s *tmp;

    assert (buf != NULL);
    assert (itm != NULL);

    tmp = (struct dbuffer_s *) ((char *) buf
                                - offsetof (struct dbuffer_s, content));

    /* Here's where the (attempted) reallocation starts: */
    if (tmp->num == tmp->max)
    {
        size_t max = tmp->max * 2;
        void *ptmp = realloc (tmp->content, sizeof *tmp + max * tmp->size);

        while (ptmp == NULL)
        {
            /* If we failed, we failed. */
            if (max == tmp->max + 1)
                return NULL;

            ptmp = realloc (tmp->content, sizeof *tmp + --max * tmp->size);
        }

        tmp = ptmp;
        /* Now the actual addition code: */
        memmove ((char *) tmp->content + tmp->num++, itm, tmp->size);
    }

    return tmp->content;
}

Sep 7 '06 #22

P: n/a
"Andrew Poelstra" <jo*********@gmail.com> writes:
> I've decided to simply use C99 for my code; that way I get
> SIZE_MAX and [...]

For what it's worth, in the absence of C99, SIZE_MAX is easy to
define:
    #define SIZE_MAX ((size_t) -1)
(Unfortunately it's not suitable for use in preprocessing
directives, as SIZE_MAX is supposed to be.)
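
In practice the definition is usually wrapped in a guard so that it does not clash with a C99 <stdint.h> when one is present. A small sketch follows; the function `size_max_value` is hypothetical and exists only to show the macro being used at run time, where it works fine.

```c
#include <assert.h>
#include <stddef.h>

/* Fallback for pre-C99 environments; a C99 <stdint.h> defines its own. */
#ifndef SIZE_MAX
#define SIZE_MAX ((size_t) -1)
#endif

/* Fine in ordinary expressions; the fallback cannot be used in #if,
 * because the preprocessor knows nothing about size_t or casts. */
size_t size_max_value(void)
{
    return SIZE_MAX;
}
```
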
--
Ben Pfaff
email: bl*@cs.stanford.edu
web: http://benpfaff.org
Sep 7 '06 #24

P: n/a
en******@yahoo.com schrieb:
> Michael Mair wrote:
>> en******@yahoo.com schrieb:
>>> I'm resisting the temptation to editorialize on the
>>> advisability of using different struct tags in different
>>> translation units to produce interchangeable types. :)
>>
>> *g* It is not necessary -- the question for me primarily
>> was between "no tag" and "unique tag throughout all
>> translation units knowing the tag". As it is, we have
>> to make sure that the generated include dependencies suffice.
>
> It doesn't seem like it would be that hard to make
> the tags be unique, even across the entire program.
> What am I missing?

Background: Several different "sources" generating C code
without much of an interface to communicate things such as
struct types (because struct tags are generated, too, and
may depend on order).
However, with the additional "identical tag" stipulation of
C99, the people who wanted to tinker with the "no tag" trick
are out of luck. Good enough for me.

Sorry for taking longer to get back to you.

Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Sep 12 '06 #25

P: n/a

Andrew Poelstra wrote:
> Thank you so much for your comments, Dave. I've decided to simply use
> C99 for my code; that way I get SIZE_MAX and I get flexible array
> members, which are very useful for my new (hopefully free of
> non-trivial mistakes) version. I'm aware that there aren't any popular
> C99 compilers, but this appears to compile properly with the latest
> version of gcc, and I can always make object files for people with
> non-supported compilers. I've also deliberately not included functions
> for checking the buffer size, etc: these are trivial additions that
> would distract from the much more bug-conducive body.
>
> Here's the code:
>
> #include <stdlib.h>
> #include <string.h>
> #include <assert.h>
> #include <stddef.h>
> #include <stdint.h>
>
> #include "dynamic_buffer.h" /* This contains nothing but function prototypes */
>
> /* Thanks to Dave Thompson for all his suggestions, fixes, and pointing
>  * out the things I overlooked. */
>
> struct dbuffer_s {
>     size_t size;
>     size_t max;
>     size_t num;
>     unsigned char content[];
> };
>
> void *create_buffer (size_t size, size_t max)
> {
>     struct dbuffer_s *tmp = NULL;
>
>     /* Check for overflow */
>     if (max <= SIZE_MAX / size)
>         tmp = malloc (sizeof *tmp + size * max);
>
>     if (tmp != NULL)
>     {
>         tmp->size = size; /* Size of each item */
>         tmp->max = max; /* Maximum # of items */
>         tmp->num = 0; /* Actual # of items */
>     }
>
>     return (tmp != NULL) ? tmp->content : NULL;
> }
>
> void delete_buffer (void *buf)
> {
>     free ((char *) buf - offsetof (struct dbuffer_s, content));
>     return;
> }
>
> /* This function will return one of two things:
>  * The new (potentially reallocated) buffer on success
>  * NULL on failure.
>  */
>
> void *append_item (void *buf, void *itm)
> {
>     struct dbuffer_s *tmp;
>
>     assert (buf != NULL);
>     assert (itm != NULL);
>
>     tmp = (struct dbuffer_s *) ((char *) buf
>                                 - offsetof (struct dbuffer_s, content));
>
>     /* Here's where the (attempted) reallocation starts: */
>     if (tmp->num == tmp->max)
>     {
>         size_t max = tmp->max * 2;
>         void *ptmp = realloc (tmp->content, sizeof *tmp + max * tmp->size);
>
>         while (ptmp == NULL)
>         {
>             /* If we failed, we failed. */
>             if (max == tmp->max + 1)
>                 return NULL;
>
>             ptmp = realloc (tmp->content, sizeof *tmp + --max * tmp->size);
>         }
>
>         tmp = ptmp;
>         /* Now the actual addition code: */
>         memmove ((char *) tmp->content + tmp->num++, itm, tmp->size);

This should be moved down a block (so that it always runs), and changed
to:
    memmove ((char *) tmp->content + tmp->num++ * tmp->size, itm,
             tmp->size);

>     }
>
>     return tmp->content;
> }
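
For readers following along, here is one way the whole thing might look with that correction folded in. This is a sketch, not the code as posted. It makes several assumptions of its own: realloc is handed the header pointer (the address malloc returned) rather than tmp->content, tmp->max is updated after growing, a zero size or zero capacity is rejected, `itm` is taken as `const void *`, and the helper `header_of` is a hypothetical name. The items are also assumed to need no stricter alignment than size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct dbuffer_s {
    size_t size;             /* size of each item */
    size_t max;              /* capacity, in items */
    size_t num;              /* items in use */
    unsigned char content[]; /* C99 flexible array member */
};

static struct dbuffer_s *header_of(void *buf)
{
    return (struct dbuffer_s *)
        ((unsigned char *)buf - offsetof(struct dbuffer_s, content));
}

void *create_buffer(size_t size, size_t max)
{
    struct dbuffer_s *tmp;

    if (size == 0 || max == 0 || max > (SIZE_MAX - sizeof *tmp) / size)
        return NULL;
    tmp = malloc(sizeof *tmp + size * max);
    if (tmp == NULL)
        return NULL;
    tmp->size = size;
    tmp->max = max;
    tmp->num = 0;
    return tmp->content;
}

void delete_buffer(void *buf)
{
    if (buf != NULL)
        free(header_of(buf));
}

/* Returns the (possibly moved) buffer, or NULL with the old one intact. */
void *append_item(void *buf, const void *itm)
{
    struct dbuffer_s *tmp;

    assert(buf != NULL);
    assert(itm != NULL);
    tmp = header_of(buf);

    if (tmp->num == tmp->max) {
        size_t old_max = tmp->max;
        size_t size = tmp->size;
        size_t limit = (SIZE_MAX - sizeof *tmp) / size;
        size_t max = (old_max > limit / 2) ? limit : old_max * 2;
        struct dbuffer_s *ptmp = NULL;

        /* Back off one item at a time, but never retry the old size. */
        while (max > old_max
               && (ptmp = realloc(tmp, sizeof *tmp + max * size)) == NULL)
            --max;
        if (ptmp == NULL)
            return NULL;
        tmp = ptmp;
        tmp->max = max;
    }

    /* Runs on every call; the slot offset is scaled by the item size. */
    memcpy(tmp->content + tmp->num * tmp->size, itm, tmp->size);
    tmp->num++;
    return tmp->content;
}
```

A caller would then write `b = append_item(b, &x);` after keeping a copy of the old `b`, since a NULL return leaves the old buffer valid and still in need of delete_buffer.
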
Sep 16 '06 #26

P: n/a
On 7 Sep 2006 16:24:32 -0700, "Andrew Poelstra"
<jo*********@gmail.com> wrote:
> Thank you so much for your comments, Dave. I've decided to simply use
> C99 for my code; that way I get SIZE_MAX and I get flexible array
> members, which are very useful for my new (hopefully free of
> non-trivial mistakes) version.

Almost; I see you already caught the one real error:
    memmove /* but could be memcpy */
      ( (char*)content + num++ * size, itm, size )

But one minor disagreement:

> I'm aware that there aren't any popular
> C99 compilers, but this appears to compile properly with the latest
> version of gcc, and I can always make object files for people with
> non-supported compilers.

Not necessarily; only if you have a compiler targeted to their
platform, and if that platform has more than one format of object file
(some do) to their object file. For instance, I do some work on
Tandem^WCompaq^WHP NonStop, who have only since about 1995 their own C
compiler but since about 1975 their own object format quite unlike
anything you have seen and AFAICT not supported by GCC/binutils.
More seriously, there are two popular but different object formats for
Windows: COFF used by M$, and ELF used by gcc.

OTOH, your code is easily enough modified to C89 or even earlier,
or, to be frank, just rewritten off the top of one's head in about 10
minutes, that I don't consider this a problem.
- David.Thompson1 at worldnet.att.net
Sep 21 '06 #27
