Expanding buffer - response to "Determine the size of malloc" query

P: n/a
Initial issue: read in an arbitrary-length piece of text.
Perceived issue: handle variable-length data

The code below is a suggestion for implementing a variable length
buffer that could be used to read text or handle arrays of arbitrary
length. I don't have the expertise in C of many folks here so I feel
like I'm offering a small furry animal for sacrifice to a big armour
plated one... but will offer it anyway. Please do suggest improvements
or challenge the premise. It would be great if it could be improved to
become a generally useful piece of code.

Well, here goes. This should be fun. :-?

-

The following utility code is passed a buffer (allocated by the
caller) and maintains it at an appropriate size. The main function
increases the allocation (when necessary) by factors - rather than
fixed amounts - for speed. There is a secondary function to trim a
buffer back to a specific size. An extra byte (one more than is
requested) is always left at the end.

/*
* Expanding buffer
*/

#define EBUF_SIZE_INIT 128
#define EBUF_SIZE_MIN 128
#define EBUF_INCREASE 1.5 /* Factor to increase space by each time */

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int ebuf_full(char **buf, size_t *buf_size, size_t offset) {
size_t new_size;
char *new_buf;

if (*buf_size < offset + 2) { /* NB last pos left empty */
new_size = *buf_size * EBUF_INCREASE + 1;
if (new_size < offset + 2) new_size = offset + 2;
if (new_size < EBUF_SIZE_MIN) new_size = EBUF_SIZE_MIN;
if ((new_buf = realloc(*buf, new_size)) == NULL) {
return 1; /* Failed to realloc buffer */
}
*buf = new_buf;
*buf_size = new_size;
}
return 0; /* Reallocated successfully */
}
int ebuf_trim(char **buf, size_t *buf_size, size_t offset) {
int new_size = offset + 2; /* Includes empty char */
char *new_buf;

if (new_size < EBUF_SIZE_MIN) new_size = EBUF_SIZE_MIN;
if (new_size != *buf_size) {
if ((new_buf = realloc(*buf, new_size)) == NULL) {
return 1; /* Reallocation failed (unlikely) */
}
*buf = new_buf;
*buf_size = new_size;
}
return 0; /* Reallocation succeeded */
}

Jun 27 '08 #1
36 Replies


P: n/a
On 30 May, 14:12, James Harris <james.harri...@googlemail.com> wrote:
> [...]

An example of intended use follows. Note that the routines are coded
to expect the buffer and its current size as parameters. Despite the
error handling, the code is intended to be fast: because the calling
code includes the test "if (offset + 2 > buf1_size)", the function is
only called when the buffer is too small, and the cost of one integer
comparison is small.

int main() {
char *buf1;
size_t buf1_size = EBUF_SIZE_INIT;
size_t offset;

if ((buf1 = malloc(buf1_size)) == NULL) {
fprintf(stderr, "Buffer initial malloc of %d bytes failed\n",
buf1_size);
exit(1);
}

...

offset = <position in buffer to write to>

...

/* Check buf1 is big enough */
if (offset + 2 > buf1_size && ebuf_full(&buf1, &buf1_size, offset))
{
fprintf(stderr, "Buffer overflow - have %d bytes but need %d bytes",
buf1_size, offset + 2);
exit(1);
}
buf1[offset] = 0;

...

free(buf1);
}
Jun 27 '08 #2

P: n/a
On 30 May, 14:12, James Harris <james.harri...@googlemail.com> wrote:
> [...]

Here's another piece of example code that uses the proposed functions,
this time to read an arbitrary-length line. Hopefully, compared with a
custom line-reading function, the code below keeps a far simpler
interface while allowing any necessary options. It should also be fast
in that, again, the function only gets called when more space is
needed. Since the function allocates memory in ever-increasing chunks,
it will not be called on most iterations.

#define ENDCHAR '\n'

FILE *infile = stdin;
char *buffer;
size_t bufsize = 100; /* Initial size only */
size_t offset;
int ch; /* int rather than char, so that EOF can be told apart */

... (allocate buffer)

/* Read to 'endchar' */
for (offset = 0; (ch = getc(infile)) != EOF; ) {
if (offset + 2 > bufsize &&
ebuf_full(&buffer, &bufsize, offset)) {
fprintf(stderr, "Line too long for memory");
exit(1);
}
buffer[offset++] = ch;
if (ch == ENDCHAR) break;
}

... (free buffer)

Notably since we invoke getc() we could easily have more than one
termination character such as

if (ch == '\n' || ch == '\0' || ch == ',')

etc. which is intended to be a big advantage over calling a line
reader function.
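
A minimal, self-contained sketch of that idea follows. It does not call ebuf_full(); the growth step is written inline so the fragment compiles on its own. The name read_until and the choice to pass the terminators as a string are assumptions made for illustration, not part of the code posted above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads from `in` until EOF, or until a character
 * listed in `terms` (or a NUL byte) has been stored. Returns a
 * NUL-terminated malloc'd string that the caller must free, or NULL if
 * an allocation fails. The length is stored in *len_out when len_out
 * is not NULL. */
char *read_until(FILE *in, const char *terms, size_t *len_out)
{
    char *buf = NULL;
    size_t size = 0;
    size_t len = 0;
    int ch;

    for (;;) {
        /* Keep room for the next character plus the final NUL. */
        if (len + 2 > size) {
            size_t new_size = size + size / 2 + 16;
            char *new_buf = realloc(buf, new_size);
            if (new_buf == NULL) {
                free(buf);
                return NULL;
            }
            buf = new_buf;
            size = new_size;
        }
        ch = getc(in);
        if (ch == EOF) {
            break;
        }
        buf[len++] = (char)ch;
        if (ch == '\0' || strchr(terms, ch) != NULL) {
            break;          /* terminator is kept in the result */
        }
    }
    buf[len] = '\0';
    if (len_out != NULL) {
        *len_out = len;
    }
    return buf;
}
```

Calling read_until(stdin, ",\n", &n) would then stop at either a comma or a newline, which is the flexibility being argued for here.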

--
James
Jun 27 '08 #3

P: n/a
James Harris wrote:
> On 30 May, 14:12, James Harris <james.harri...@googlemail.com> wrote:
> [...]
> #include <stdio.h>
> #include <stdlib.h>
> #include <malloc.h>

Only a small, furry animal offering itself up for sacrifice
would use a non-standard header like <malloc.h>. Consider
yourself eaten by the Ravenous Bugblatter Beast.

Also, there seems to be no reason for <stdio.h> in the
buffer-bashing code; it doesn't hurt to #include extraneous
baggage, but it doesn't help either. Everything you need
is in <stdlib.h>.

> int ebuf_full(char **buf, size_t *buf_size, size_t offset) {
> size_t new_size;
> char *new_buf;
>
> if (*buf_size < offset + 2) { /* NB last pos left empty */

This magical `2' appears in quite a few places. Maybe
it deserves a #define of its own?
> new_size = *buf_size * EBUF_INCREASE + 1;
As a small matter of personal preference and prejudice,
I myself would avoid floating-point arithmetic here and do
the calculation in integers. Not a big deal, though.
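
A sketch of what the integer-only calculation might look like, assuming the same "about 1.5 times plus one" growth as the posted code. The helper name ebuf_grow_size is hypothetical, and the overflow check is an addition that the original does not have.

```c
#include <stddef.h>

/* Hypothetical helper: returns old_size + old_size/2 + 1 using only
 * integer arithmetic, or 0 if that would not fit in a size_t.
 * (size_t)-1 is the largest size_t value and is available in C90. */
size_t ebuf_grow_size(size_t old_size)
{
    const size_t max = (size_t)-1;
    size_t half = old_size / 2;

    if (old_size > max - half - 1) {
        return 0;                   /* result would overflow */
    }
    return old_size + half + 1;
}
```

For the posted initial size of 128 this gives 193, the same value the floating-point expression produces.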
> if (new_size < offset + 2) new_size = offset + 2;
> if (new_size < EBUF_SIZE_MIN) new_size = EBUF_SIZE_MIN;
> if ((new_buf = realloc(*buf, new_size)) == NULL) {
> return 1; /* Failed to realloc buffer */
> }
> *buf = new_buf;
> *buf_size = new_size;
> }
> return 0; /* Reallocated successfully */
>
> }
>
> int ebuf_trim(char **buf, size_t *buf_size, size_t offset) {
> int new_size = offset + 2; /* Includes empty char */
> char *new_buf;
>
> if (new_size < EBUF_SIZE_MIN) new_size = EBUF_SIZE_MIN;
> if (new_size != *buf_size) {
> if ((new_buf = realloc(*buf, new_size)) == NULL) {
> return 1; /* Reallocation failed (unlikely) */

There's an interface design decision lurking here: Should
this be considered a "failure," or just an "unsuccessful
attempt to optimize?" Arguments can be made for both points
of view. IMHO you've chosen rightly, because it's possible
that ebuf_trim() could fail in an attempt to *increase* the
size of the buffer, in which case the calling program might
be, er, surprised to discover that the buffer was too small
for the offset.

> }
> *buf = new_buf;
> *buf_size = new_size;
> }
> return 0; /* Reallocation succeeded */
>
> }
>
> An example of intended use follows. Note that the routines are coded
> to expect buffer and current size as parameters. Despite the error
> handling the code is intended to be fast. Including "if (offset + 2 >
> buf1_size)" in the main code the function should only be called if the
> buffer is too small. The cost of one integer comparison is small.
>
> int main() {

`int main(void)' would be very slightly better.

> char *buf1;
> size_t buf1_size = EBUF_SIZE_INIT;
> size_t offset;
>
> if ((buf1 = malloc(buf1_size)) == NULL) {
> fprintf(stderr, "Buffer initial malloc of %d bytes failed\n",
> buf1_size);

What is the type of buf1_size? Answer: size_t. What type
of operand does the "%d" specifier convert? Answer: int. Are
size_t and int the same? Answer: No. What should you do to
fix the mismatch? Answer: Change "%d" to "%g". (No, wait, I
didn't mean that ...)

For fprintf() and so on, <stdio.h> *is* needed.

> exit(1);

ITYM `exit(EXIT_FAILURE);'. Or `return EXIT_FAILURE;'.

> }

Instead of pre-allocating the initial buffer, why not
set buf1=NULL and buf1_size=0 and just let ebuf_full()
take care of everything?

> ...
>
> offset = <position in buffer to write to>
>
> ...
>
> /* Check buf1 is big enough */
> if (offset + 2 > buf1_size && ebuf_full(&buf1, &buf1_size, offset))
> {
> fprintf(stderr, "Buffer overflow - have %d bytes but need %d bytes",
> buf1_size, offset + 2);
> exit(1);
> }
> buf1[offset] = 0;
>
> ...
>
> free(buf1);

... and since main() returns an int value, you should ...?
(C99 introduced a special rule for main() that says falling
off the end is equivalent to returning zero, but IMHO this
should be viewed as a concession to the large amount of sloppy
code already in existence, not as an encouragement to further
sloppiness. Besides, C99 implementations have not exactly
taken the world by storm, and lots of C90 implementations are
still in use.)

> }

It seems to me you understand the basic ideas of how to
use realloc() to grow a buffer (although the fact that you
can reallocate a NULL may have escaped you). There are a
few glitches in the way you've done things, easily fixable.
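
To illustrate the point about reallocating a NULL: the standard guarantees that realloc(NULL, n) behaves like malloc(n), so a buffer can start out empty. The sketch below assumes a hypothetical helper named ebuf_reserve; it is not the posted ebuf_full().

```c
#include <stdlib.h>

/* Hypothetical helper: makes sure *buf can hold at least `needed`
 * bytes. Works from the empty state (*buf == NULL, *size == 0)
 * because realloc(NULL, n) behaves like malloc(n).
 * Returns 0 on success, 1 on failure (the old buffer stays valid). */
int ebuf_reserve(char **buf, size_t *size, size_t needed)
{
    char *new_buf;

    if (needed <= *size) {
        return 0;               /* nothing to do */
    }
    new_buf = realloc(*buf, needed);
    if (new_buf == NULL) {
        return 1;
    }
    *buf = new_buf;
    *size = needed;
    return 0;
}
```

With this shape the caller needs no initial malloc(): declare the pointer as NULL and the size as 0, then call the helper before the first write.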

If you want to package something like this for wider
use as a buffer-managing utility, you might consider putting
the buffer information in a struct and passing a single
struct pointer to the functions. Not only would this make
the interface clearer by reducing the argument count, but
it would also make it easy for you to add further fillips
of functionality later on, just by adding a few elements
to the struct and leaving the calls alone.
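
One possible shape for such a struct-based interface is sketched below. The type name, field names, EBUF_EMPTY macro and function names are all hypothetical; the "+ 2" mirrors the posted code's habit of leaving one spare byte after the requested offset.

```c
#include <stdlib.h>

/* Hypothetical layout; the field names are illustrative only. */
struct ebuf {
    char  *data;    /* start of the allocation, or NULL when empty */
    size_t size;    /* number of bytes currently allocated */
};

#define EBUF_EMPTY { NULL, 0 }

/* Makes sure `offset` is writable with one spare byte after it.
 * Returns 0 on success, 1 if the size would overflow or realloc fails. */
int ebuf_ensure(struct ebuf *b, size_t offset)
{
    size_t needed, new_size;
    char *new_data;

    if (offset > (size_t)-1 - 2) {
        return 1;                       /* offset + 2 would overflow */
    }
    needed = offset + 2;
    if (needed <= b->size) {
        return 0;                       /* already big enough */
    }
    new_size = b->size + b->size / 2;   /* about 1.5x; a wrapped value is
                                           smaller than needed and is
                                           replaced on the next line */
    if (new_size < needed) {
        new_size = needed;
    }
    new_data = realloc(b->data, new_size);
    if (new_data == NULL) {
        return 1;                       /* old buffer is still valid */
    }
    b->data = new_data;
    b->size = new_size;
    return 0;
}

/* Releases the buffer and returns the struct to the empty state. */
void ebuf_free(struct ebuf *b)
{
    free(b->data);
    b->data = NULL;
    b->size = 0;
}
```

A later addition, such as a per-buffer growth factor, would then be a new struct member and would not change any call site.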

Go back to your lair and lick your wounds; I think
they're not life-threatening.

--
Er*********@sun.com
Jun 27 '08 #4

P: n/a

"James Harris" <ja************@googlemail.com> wrote in message
> Initial issue: read in an arbitrary-length piece of text.
> Perceived issue: handle variable-length data
>
> The code below is a suggestion for implementing a variable length
> buffer that could be used to read text or handle arrays of arbitrary
> length. I don't have the expertise in C of many folks here so I feel
> like I'm offering a small furry animal for sacrifice to a big armour
> plated one... but will offer it anyway. Please do suggest improvements
> or challenge the premise. It would be great if it could be improved to
> become a generally useful piece of code.

Firstly, don't worry about the actual code bodies at this stage. Any
reasonably competent C programmer should be able to provide those.

The thing is the interfaces.

The first problem is that if we use char *, the functions will only work on
character arrays. If we use void *s, this problem disappears, but there
might be issues about too many casts to access the actual data.

The second issue is whether to use a structure for the buffer, or, as you
have done, pass in several parameters to represent size and capacity.
There's a nasty C stitch-up if we use void *s with option 2.

ebuf_full(void **buf ...)

char *buffer;
/* this is illegal */
ebuf_full(&buffer)

buffer has to be assigned to a dummy void * first, which makes the function
unusable.
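
One way around the void ** problem, offered only as a sketch, is to copy realloc()'s own convention: take the buffer as a plain void * and return the new pointer, so that the usual implicit conversions to and from void * apply. The name ebuf_grow and its exact contract are assumptions for illustration.

```c
#include <stdlib.h>

/* Hypothetical alternative signature. `needed` is in bytes and must be
 * greater than zero. Returns the (possibly moved) buffer, or NULL on
 * failure, in which case the old buffer is still valid and *size is
 * unchanged. Like realloc(), it accepts buf == NULL with *size == 0. */
void *ebuf_grow(void *buf, size_t *size, size_t needed)
{
    void *new_buf;

    if (needed <= *size) {
        return buf;             /* already big enough */
    }
    new_buf = realloc(buf, needed);
    if (new_buf == NULL) {
        return NULL;
    }
    *size = needed;
    return new_buf;
}
```

The caller assigns the result to a temporary of its own pointer type (int *, char *, and so on) and only overwrites the original pointer once the result is known to be non-NULL. The cost is the same care that realloc() itself demands.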

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
Jun 27 '08 #5

P: n/a
On 30 May, 15:14, Eric Sosman <Eric.Sos...@sun.com> wrote:
> James Harris wrote:
>> [...]
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

Only a small, furry animal offering itself up for sacrifice
would use a non-standard header like <malloc.h>. Consider
yourself eaten by the Ravenous Bugblatter Beast.
Haha - the sacrificial animal of my analogy was the code I was
offering up - rather than me!! But you've raised - and eaten - some
good points. I wasn't aware not to use malloc.h, for example.
Also, there seems to be no reason for <stdio.h> in the
buffer-bashing code; it doesn't hurt to #include extraneous
baggage, but it doesn't help either. Everything you need
is in <stdlib.h>.
OK
int ebuf_full(char **buf, size_t *buf_size, size_t offset) {
size_t new_size;
char *new_buf;
if (*buf_size < offset + 2) { /* NB last pos left empty */

This magical `2' appears in quite a few places. Maybe
it deserves a #define of its own?
Agreed, it's a bit scabby as it stands. The reason for the +2 is that
there's a +1 to change from an offset to a length - e.g. an offset of
7 means a length of 8 - and I wanted to leave one extra byte after the
specified length. I'd rather avoid the clutter of another defined
constant. I'll rewrite to consistently use offsets rather than lengths
and thus avoid the +2.
new_size = *buf_size * EBUF_INCREASE + 1;

As a small matter of personal preference and prejudice,
I myself would avoid floating-point arithmetic here and do
the calculation in integers. Not a big deal, though.
Me too. The reason for including a factor of 1.5 was simply to
demonstrate that we don't need to settle for integer factors.
if (new_size < offset + 2) new_size = offset + 2;
if (new_size < EBUF_SIZE_MIN) new_size = EBUF_SIZE_MIN;
if ((new_buf = realloc(*buf, new_size)) == NULL) {
return 1; /* Failed to realloc buffer */
}
*buf = new_buf;
*buf_size = new_size;
}
return 0; /* Reallocated successfully */
}
int ebuf_trim(char **buf, size_t *buf_size, size_t offset) {
int new_size = offset + 2; /* Includes empty char */
char *new_buf;
if (new_size < EBUF_SIZE_MIN) new_size = EBUF_SIZE_MIN;
if (new_size != *buf_size) {
if ((new_buf = realloc(*buf, new_size)) == NULL) {
return 1; /* Reallocation failed (unlikely) */

There's an interface design decision lurking here: Should
this be considered a "failure," or just an "unsuccessful
attempt to optimize?" Arguments can be made for both points
of view. IMHO you've chosen rightly, because it's possible
that ebuf_trim() could fail in an attempt to *increase* the
size of the buffer, in which case the calling program might
be, er, surprised to discover that the buffer was too small
for the offset.
OK
}
*buf = new_buf;
*buf_size = new_size;
}
return 0; /* Reallocation succeeded */
}
An example of intended use follows. Note that the routines are coded
to expect buffer and current size as parameters. Despite the error
handling the code is intended to be fast. Including "if (offset + 2 >
buf1_size)" in the main code the function should only be called if the
buffer is too small. The cost of one integer comparison is small.
int main() {

`int main(void)' would be very slightly better.
char *buf1;
size_t buf1_size = EBUF_SIZE_INIT;
size_t offset;
if ((buf1 = malloc(buf1_size)) == NULL) {
fprintf(stderr, "Buffer initial malloc of %d bytes failed\n",
buf1_size);

What is the type of buf1_size? Answer: size_t. What type
of operand does the "%d" specifier convert? Answer: int. Are
size_t and int the same? Answer: No. What should you do to
fix the mismatch? Answer: Change "%d" to "%g". (No, wait, I
didn't mean that ...)
I've no idea how to print these values, then. Would they be better as
unsigned ints? I guess this would mean unsigned ints would have to be
wide enough for any memory offset. Not sure if that can be relied
upon.
For fprintf() and so on, <stdio.h> *is* needed.
exit(1);

ITYM `exit(EXIT_FAILURE);'. Or `return EXIT_FAILURE;'.
OK. Was trying to keep the interface light. As such I wanted to reduce
the number of defines. The procedure name is meant to indicate the
meaning of a zero or non-zero return. The function can exist in an if
statement as

if (ebuf_full(....)) handle error

}

Instead of pre-allocating the initial buffer, why not
set buf1=NULL and buf1_size=0 and just let ebuf_full()
take care of everything?
I didn't know this could be done. Will include in the rewrite.

...
offset = <position in buffer to write to>
...
/* Check buf1 is big enough */
if (offset + 2 > buf1_size && ebuf_full(&buf1, &buf1_size, offset))
{
fprintf(stderr, "Buffer overflow - have %d bytes but need %d bytes",
buf1_size, offset + 2);
exit(1);
}
buf1[offset] = 0;
...
free(buf1);

... and since main() returns an int value, you should ...?
(C99 introduced a special rule for main() that says falling
off the end is equivalent to returning zero, but IMHO this
should be viewed as a concession to the large amount of sloppy
code already in existence, not as an encouragement to further
sloppiness. Besides, C99 implementations have not exactly
taken the world by storm, and lots of C90 implementations are
still in use.)
OK. That was a miss on my part.
}

It seems to me you understand the basic ideas of how to
use realloc() to grow a buffer (although the fact that you
can reallocate a NULL may have escaped you). There are a
few glitches in the way you've done things, easily fixable.

If you want to package something like this for wider
use as a buffer-managing utility, you might consider putting
the buffer information in a struct and passing a single
struct pointer to the functions. Not only would this make
the interface clearer by reducing the argument count, but
it would also make it easy for you to add further fillips
of functionality later on, just by adding a few elements
to the struct and leaving the calls alone.
I thought about that but chose against it. Options seem to be

1. Address and size are scalars in the caller
- limits other info that can be stored

2. Struct holding address, size, factor and other parameters
- simplifies calls to ebuf_trim
- requires normal use of pointers to be dereferenced via the struct

3. Struct holding parameters other than the address
- still needs extra parameter to be passed to ebuf_full
- requires ebuf_full to locate parameter block

On balance the first option seemed best. It keeps the system simple
without losing function.

Jul 25 '08 #6

P: n/a
James Harris wrote:
> On 30 May, 15:14, Eric Sosman <Eric.Sos...@sun.com> wrote:
>> James Harris wrote:
>>> [...]
size_t buf1_size = EBUF_SIZE_INIT;
size_t offset;
if ((buf1 = malloc(buf1_size)) == NULL) {
fprintf(stderr, "Buffer initial malloc of %d bytes failed\n",
buf1_size);
What is the type of buf1_size? Answer: size_t. What type
of operand does the "%d" specifier convert? Answer: int. Are
size_t and int the same? Answer: No. What should you do to
fix the mismatch? Answer: Change "%d" to "%g". (No, wait, I
didn't mean that ...)

I've no idea how to print these values, then. Would they be better as
unsigned ints? I guess this would mean unsigned ints would have to be
wide enough for any memory offset. Not sure if that can be relied
upon.
If you can count on a C99 implementation, there's a length
modifier "z" for printing size_t values:

printf ("Size = %zu\n", buf1_size);

If you need to live with the more widely available C90
systems, there's no "z" modifier and you need to convert the
size_t to something printf() knows how to handle:

printf ("Size = %u\n", (unsigned int)buf1_size);

or (safer):

printf ("Size = %lu\n", (unsigned long)buf1_size);

or even (extremely safe, extremely unusual):

printf ("Size = %.0f\n", (double)buf1_size);

These will work as well on C99 as they do on C90.

--
Er*********@sun.com
Jul 25 '08 #7

P: n/a
On 31 May, 09:05, "Malcolm McLean" <regniz...@btinternet.com> wrote:
"James Harris" <james.harri...@googlemail.com> wrote in message
Initial issue: read in an arbitrary-length piece of text.
Perceived issue: handle variable-length data
The code below is a suggestion for implementing a variable length
buffer that could be used to read text or handle arrays of arbitrary
length. I don't have the expertise in C of many folks here so I feel
like I'm offering a small furry animal for sacrifice to a big armour
plated one... but will offer it anyway. Please do suggest improvements
or challenge the premise. It would be great if it could be improved to
become a generally useful piece of code.

Firstly, don't worry about the actual code bodies at this stage. Any
reasonably competent C programmer should be able to provide those.

The thing is the interfaces.

The first problem is that if we use char *, the functions will only work on
character arrays. If we use void *s, this problem disappears, but there
might be issues about too many casts to access the actual data.
AFAIK casts tend to make code less safe and I try to avoid them. Is
there a good solution to this?
The second issue is whether to use a structure for the buffer, or, as you
have done, pass in several parameters to represent size and capacity.
There's a nasty C stitch-up if we use void *s with option 2.

ebuf_full(void **buf ...)

char *buffer;
/* this is illegal */
ebuf_full(&buffer)

buffer has to be assigned to a dummy void * first, which makes the function
unusable.
Not nice!

Having a separate struct would allow other advantages such as having a
per-buffer size increase factor but I think it would need the pointer
to be dereferenced when it is used normally. On balance I think
address and length (or, better, address and offset) is better.
Here's a rewrite where I've improved the code slightly by simplifying
a few bits of it. It now uses offsets rather than lengths and bases
the increase on the requested offset so eliminating some of the
checks. It does require the factor to be greater than or equal to 1.
I'll include the functions and a sample main in one go.

/*
* Test expanding buffer
*/

#define EBUF_INCREASE 1.5 /* Factor (>= 1) for space increase */

#include <stdio.h>
#include <stdlib.h>

int ebuf_full(char **buf, size_t *buf_limit, size_t offset) {
size_t new_limit;
char *new_buf;

if (offset >= *buf_limit) {
new_limit = offset * EBUF_INCREASE;
if ((new_buf = realloc(*buf, new_limit + 1)) == NULL) {
return 1; /* Failed to realloc */
}
*buf = new_buf;
*buf_limit = new_limit;
}
return 0; /* Realloc succeeded */
}

int ebuf_trim(char **buf, size_t *buf_limit, size_t offset) {
char *new_buf;

if ((new_buf = realloc(*buf, offset + 1)) == NULL) {
return 1; /* Realloc failed */
}
*buf = new_buf;
*buf_limit = offset;
return 0; /* Succeeded */
}
int main(void) {
char *buf1 = NULL;
size_t buf1_limit = 0;
size_t offset;

for (offset = 0; offset < 1000; offset += 200) {
fprintf(stderr, "\n---Checking for offset %lu\n", (unsigned long) offset);

if (offset >= buf1_limit && ebuf_full(&buf1, &buf1_limit, offset))
{
fprintf(stderr, "-Ebuf overflow %lu/%lu bytes",
(unsigned long) buf1_limit, (unsigned long) offset);
exit(1);
}
buf1[offset] = 'x';
}

/* size_t values are cast to unsigned long for printing (C90-safe) */
fprintf(stderr, "\n---Trim from %lu to %lu\n",
(unsigned long) buf1_limit, (unsigned long) offset);
if (ebuf_trim(&buf1, &buf1_limit, offset)) {
fprintf(stderr, "-Buffer trim to %lu failure\n", (unsigned long) offset);
exit(1);
}

free(buf1);
return 0;
}

Jul 25 '08 #8

P: n/a
On 25 Jul, 22:33, Eric Sosman <Eric.Sos...@sun.com> wrote:
> [... how to print a size_t ...]

Rather than use size_t would I be better to use a type of unsigned int
or unsigned long in the first place?

--
James
Jul 25 '08 #9

P: n/a
James Harris wrote:
> On 25 Jul, 22:33, Eric Sosman <Eric.Sos...@sun.com> wrote:
>> [... how to print a size_t ...]
>
> Rather than use size_t would I be better to use a type of unsigned int
> or unsigned long in the first place?

If size_t confuses you, then use long unsigned instead.

--
pete
Jul 26 '08 #10

P: n/a
James Harris wrote:
> On 25 Jul, 22:33, Eric Sosman <Eric.Sos...@sun.com> wrote:
>> [... how to print a size_t ...]
>
> Rather than use size_t would I be better to use a type of unsigned int
> or unsigned long in the first place?

I think not: size_t is the type for sizes, so the program
cannot go wrong by using it. If you use unsigned int the program
may err, because valid size_t values might exceed the range of
unsigned int on some machines.

Under C90 rules, size_t must be an unsigned integer type,
and the only "integer types" are the flavors of char, short,
int, and long, with long being the widest. Therefore, in C90
it is guaranteed that converting a size_t to unsigned long
loses no information and preserves the value; it's a good
solution for printing. On the other hand, unsigned long might
be overkill: a 64-bit value, say, where a 32-bit value might
suffice. So the wise course is to calculate with size_t
values and convert to unsigned long only for display purposes.

The situation gets stickier in C99, because the repertoire
of "integer types" is expanded and in fact becomes open-ended:
it is now possible that size_t could be wider than unsigned long --
think of a 64-bit size_t with a 32-bit unsigned long. You could
print by converting to unsigned long long or to uintmax_t, but
since these types only exist in C99 and C99 provides the "z"
length modifier, using "z" for display is surely best.

Summary: Calculate with size_t, except perhaps in situations
where you must economize on storage and *know* that the actual
values will fit in something more restricted. For display
purposes, either convert to unsigned long (C90) or rely on
the "z" modifier (C99).

--
Eric Sosman
es*****@ieee-dot-org.invalid
Jul 26 '08 #11

P: n/a
James Harris wrote:
#define EBUF_INCREASE 1.5 /* Factor (>= 1) for space increase */
In my get_line function, for reading text files,
http://www.mindspring.com/~pfilandr/...ine/get_line.c
I increase the buffer size by only one byte,
each time that the buffer is found to be too small.

Most text files that I've dealt with,
only have line lengths of less than a hundred bytes,
and a hundred calls to realloc in a program
isn't going to add up to any substantial time.

The get_line function is set up so that if you know
that you're going to be dealing
with a file which has significantly long lines,
then you can supply an adequately large original buffer
so that no reallocation will be needed.

--
pete
Jul 26 '08 #12

P: n/a
James Harris wrote:
#define EBUF_SIZE_INIT 128
#define EBUF_SIZE_MIN 128
#define EBUF_INCREASE 1.5 /* Factor to increase space by each time */

#define ENDCHAR '\n'
Macros beginning with E followed by another capital letter are
reserved if <errno.h> is included. Although you don't include it now,
you should not rule out the possibility of future versions doing so.

--
Peter
Jul 26 '08 #13

P: n/a
Eric Sosman wrote:
James Harris wrote:
>On 30 May, 15:14, Eric Sosman <Eric.Sos...@sun.com> wrote:
>>James Harris wrote:
[...]
size_t buf1_size = EBUF_SIZE_INIT;
size_t offset;
if ((buf1 = malloc(buf1_size)) == NULL) {
fprintf(stderr, "Buffer initial malloc of %d bytes failed\n",
buf1_size);
What is the type of buf1_size? Answer: size_t. What type
of operand does the "%d" specifier convert? Answer: int. Are
size_t and int the same? Answer: No. What should you do to
fix the mismatch? Answer: Change "%d" to "%g". (No, wait, I
didn't mean that ...)

I've no idea how to print these values, then. Would they be better as
unsigned ints? I guess this would mean unsigned ints would have to be
wide enough for any memory offset. Not sure if that can be relied
upon.

If you can count on a C99 implementation, there's a length
modifier "z" for printing size_t values:

printf ("Size = %zu\n", buf1_size);

If you need to live with the more widely available C90
systems, there's no "z" modifier and you need to convert the
size_t to something printf() knows how to handle:

printf ("Size = %u\n", (unsigned int)buf1_size);

or (safer):

printf ("Size = %lu\n", (unsigned long)buf1_size);

or even (extremely safe, extremely unusual):

printf ("Size = %.0f\n", (double)buf1_size);

These will work as well on C99 as they do on C90.
Even more safely:

printf("Size = %Lf\n", (long double)buf1_size);

:-)

Jul 26 '08 #14

P: n/a
pete wrote:
James Harris wrote:
>On 25 Jul, 22:33, Eric Sosman <Eric.Sos...@sun.com> wrote:
>>James Harris wrote:
On 30 May, 15:14, Eric Sosman <Eric.Sos...@sun.com> wrote:
James Harris wrote:
>[...]
> size_t buf1_size = EBUF_SIZE_INIT;
> size_t offset;
> if ((buf1 = malloc(buf1_size)) == NULL) {
> fprintf(stderr, "Buffer initial malloc of %d bytes failed\n",
>buf1_size);
What is the type of buf1_size? Answer: size_t. What type
of operand does the "%d" specifier convert? Answer: int. Are
size_t and int the same? Answer: No. What should you do to
fix the mismatch? Answer: Change "%d" to "%g". (No, wait, I
didn't mean that ...)
I've no idea how to print these values, then. Would they be better
as unsigned ints? I guess this would mean unsigned ints would have
to be wide enough for any memory offset. Not sure if that can be
relied upon.
If you can count on a C99 implementation, there's a length
modifier "z" for printing size_t values:

printf ("Size = %zu\n", buf1_size);

If you need to live with the more widely available C90
systems, there's no "z" modifier and you need to convert the
size_t to something printf() knows how to handle:

printf ("Size = %u\n", (unsigned int)buf1_size);

or (safer):

printf ("Size = %lu\n", (unsigned long)buf1_size);

or even (extremely safe, extremely unusual):

printf ("Size = %.0f\n", (double)buf1_size);

These will work as well on C99 as they do on C90.

Rather than use size_t would I be better to use a type of unsigned
int or unsigned long in the first place?

If size_t confuses you, then use long unsigned instead.
This will break on 64-bit Windows, where unsigned long is only 32 bits
wide, once objects exceed 4 GB.

Jul 26 '08 #15

P: n/a
santosh wrote:
Eric Sosman wrote:
>[... printing size_t values in C90 ...]
or even (extremely safe, extremely unusual):

printf ("Size = %.0f\n", (double)buf1_size);

These will work as well on C99 as they do on C90.

Even more safely:

printf("Size = %Lf\n", (long double)buf1_size);

:-)
Next time you have an object larger than
10000000000000000000000000000000000000 bytes,
be sure to let us know. Install an extra
swap disk, too. :-)

--
Eric Sosman
es*****@ieee-dot-org.invalid
Jul 26 '08 #16

P: n/a
On 26 Jul, 01:28, pete <pfil...@mindspring.com> wrote:
James Harris wrote:
#define EBUF_INCREASE 1.5 /* Factor (>= 1) for space increase */

In my get_line function, for reading text files,
http://www.mindspring.com/~pfilandr/...ine/get_line.c
I increase the buffer size by only one byte,
each time that the buffer is found to be too small.
The proposed code is NOT specifically for reading lines. It is
intended to be used any time a variable length buffer is needed. The
buffer contents could be generated in a loop, for example.

If the buffer increase factor is set to 1, ebuf_full will degenerate to
allocating only as much space as is needed each time it is called.
Most text files that I've dealt with,
only have line lengths of less than a hundred bytes,
and a hundred calls to realloc in a program
isn't going to add up to any substantial time.

The get_line function is set up so that if you know
that you're going to be dealing
with a file which has significantly long lines,
then you can supply an adequately large original buffer
so that no reallocation will be needed.
Ebuf_full allows a buffer of arbitrary size to be pre-allocated, if
preferred. Whether pre-allocated or not, increasing the buffer by
factors allows it to scale.
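
To put a rough number on the scaling argument, here is a minimal sketch that only simulates the sizing rule from ebuf_full (new size = old size * factor + 1) and counts the resizes. The function name count_resizes is made up for illustration, nothing is actually allocated, and sizes close to the maximum size_t are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts how many resizes are needed to take a
   buffer from `start` bytes to at least `target` bytes when every
   resize applies new_size = size * factor + 1. No memory is allocated;
   this only models the arithmetic. */
unsigned long count_resizes(size_t start, size_t target, double factor)
{
    unsigned long steps = 0;
    size_t size = start;

    assert(factor >= 1.0);          /* a factor below 1 would shrink the buffer */
    while (size < target) {
        size = (size_t)((double)size * factor + 1);
        steps++;
    }
    return steps;
}
```

Growing from 128 bytes to 1,000,000 bytes, count_resizes(128, 1000000, 1.0) reports 999872 resizes, while a factor of 1.5 needs fewer than 30. How much each realloc call costs depends on the implementation, so this counts calls rather than time.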

--
James
Jul 26 '08 #17

P: n/a
On 26 Jul, 04:22, Peter Nilsson <ai...@acay.com.au> wrote:
James Harris wrote:
#define EBUF_SIZE_INIT 128
#define EBUF_SIZE_MIN 128
#define EBUF_INCREASE 1.5 /* Factor to increase space by each time */
#define ENDCHAR '\n'

Macros beginning with E followed by another capital letter are
reserved if <errno.h> is included. Although you don't include it now,
you should not rule out the possibility of future versions doing so.
OK. Perhaps I should call it xbuf instead so we have

#define XBUF_INCREASE 1.5

int xbuf_full(...

int xbuf_trim(...

--
James
Jul 26 '08 #18

P: n/a
Eric Sosman wrote:
santosh wrote:
>Eric Sosman wrote:
>>[... printing size_t values in C90 ...]
or even (extremely safe, extremely unusual):

printf ("Size = %.0f\n", (double)buf1_size);

These will work as well on C99 as they do on C90.

Even more safely:

printf("Size = %Lf\n", (long double)buf1_size);

:-)

Next time you have an object larger than
10000000000000000000000000000000000000 bytes,
be sure to let us know. Install an extra
swap disk, too. :-)
Floating point types become unsuitable
for the representation of integers,
at the point where two consecutive integers
converted to the floating point type in question,
compare equal.

I don't know how to, at compile time,
calculate the values of the two lowest
consecutive integers where that happens.
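
One possible way to locate that point is sketched below, assuming the usual floating-point model in which a double carries DBL_MANT_DIG digits in base FLT_RADIX and the default round-to-nearest mode is in effect. Under that assumption every integer up to FLT_RADIX raised to the power DBL_MANT_DIG is exact, and that value and the next integer are the first pair to collide. Both macros are compile-time constants from <float.h>, although the power itself is computed with a loop here. The function names are invented for this sketch.

```c
#include <assert.h>
#include <float.h>

/* Returns FLT_RADIX raised to the power DBL_MANT_DIG, the smallest
   positive integer whose successor no longer fits in a double's
   significand. Every multiplication in the loop is exact. */
double first_gap_double(void)
{
    double limit = 1.0;
    int i;

    for (i = 0; i < DBL_MANT_DIG; i++)
        limit *= FLT_RADIX;
    return limit;
}

/* Returns 1 if x and x + 1 compare equal once the sum is stored as a
   double, 0 otherwise. */
int next_integer_is_lost(double x)
{
    volatile double next;   /* volatile store rounds the sum to double */

    assert(x >= 0.0);
    next = x + 1.0;
    return next == x;
}
```

With these, next_integer_is_lost(first_gap_double()) should report a collision, while the same test one power of the radix lower should not. The same idea applies to float and long double through FLT_MANT_DIG and LDBL_MANT_DIG.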

--
pete
Jul 26 '08 #19

P: n/a
santosh wrote:
pete wrote:
>James Harris wrote:
>>On 25 Jul, 22:33, Eric Sosman <Eric.Sos...@sun.com> wrote:
James Harris wrote:
On 30 May, 15:14, Eric Sosman <Eric.Sos...@sun.com> wrote:
>James Harris wrote:
>>[...]
>> size_t buf1_size = EBUF_SIZE_INIT;
>> size_t offset;
>> if ((buf1 = malloc(buf1_size)) == NULL) {
>> fprintf(stderr, "Buffer initial malloc of %d bytes failed\n",
>>buf1_size);
> What is the type of buf1_size? Answer: size_t. What type
>of operand does the "%d" specifier convert? Answer: int. Are
>size_t and int the same? Answer: No. What should you do to
>fix the mismatch? Answer: Change "%d" to "%g". (No, wait, I
>didn't mean that ...)
I've no idea how to print these values, then. Would they be better
as unsigned ints? I guess this would mean unsigned ints would have
to be wide enough for any memory offset. Not sure if that can be
relied upon.
If you can count on a C99 implementation, there's a length
modifier "z" for printing size_t values:

printf ("Size = %zu\n", buf1_size);

If you need to live with the more widely available C90
systems, there's no "z" modifier and you need to convert the
size_t to something printf() knows how to handle:

printf ("Size = %u\n", (unsigned int)buf1_size);

or (safer):

printf ("Size = %lu\n", (unsigned long)buf1_size);

or even (extremely safe, extremely unusual):

printf ("Size = %.0f\n", (double)buf1_size);

These will work as well on C99 as they do on C90.

Rather than use size_t would I be better to use a type of unsigned
int or unsigned long in the first place?
If size_t confuses you, then use long unsigned instead.

This will break on Windows with objects larger than 4 Gb.
OK, then I guess it's better to learn size_t.

--
pete
Jul 26 '08 #20

P: n/a
On 26 Jul, 22:03, pete <pfil...@mindspring.com> wrote:
santosh wrote:
pete wrote:
James Harris wrote:
On 25 Jul, 22:33, Eric Sosman <Eric.Sos...@sun.com> wrote:
James Harris wrote:
On 30 May, 15:14, Eric Sosman <Eric.Sos...@sun.com> wrote:
James Harris wrote:
>[...]
> size_t buf1_size = EBUF_SIZE_INIT;
> size_t offset;
> if ((buf1 = malloc(buf1_size)) == NULL) {
> fprintf(stderr, "Buffer initial malloc of %d bytes failed\n",
>buf1_size);
What is the type of buf1_size? Answer: size_t. What type
of operand does the "%d" specifier convert? Answer: int. Are
size_t and int the same? Answer: No. What should you do to
fix the mismatch? Answer: Change "%d" to "%g". (No, wait, I
didn't mean that ...)
I've no idea how to print these values, then. Would they be better
as unsigned ints? I guess this would mean unsigned ints would have
to be wide enough for any memory offset. Not sure if that can be
relied upon.
If you can count on a C99 implementation, there's a length
modifier "z" for printing size_t values:

printf ("Size = %zu\n", buf1_size);

If you need to live with the more widely available C90
systems, there's no "z" modifier and you need to convert the
size_t to something printf() knows how to handle:

printf ("Size = %u\n", (unsigned int)buf1_size);

or (safer):

printf ("Size = %lu\n", (unsigned long)buf1_size);

or even (extremely safe, extremely unusual):

printf ("Size = %.0f\n", (double)buf1_size);

These will work as well on C99 as they do on C90.

Rather than use size_t would I be better to use a type of unsigned
int or unsigned long in the first place?
If size_t confuses you, then use long unsigned instead.
This will break on Windows with objects larger than 4 Gb.

OK, then I guess it's better to learn size_t.
I'm not sure it's a question of learning the meaning of size_t. The
problem was in printing it with printf prior to C99.

The recommendation seems to be to cast to unsigned long or similar but
surely if size_t is wider than unsigned long it will fail to print
correctly. In the absence of C99's %z component perhaps the best way
is to print it by a function (which hasn't been mentioned so may be
wrong or impossible....).
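
For what it's worth, printing by a function does look possible. The following is only a sketch of the idea: it builds the decimal digits with size_t arithmetic alone, so it does not depend on any printf conversion or on how wide unsigned long is. The names size_to_dec and SIZE_DEC_MAX are invented here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Room for every decimal digit of a size_t plus the terminating '\0'.
   Three bits hold less than one decimal digit, so bits / 3 + 1 digits
   is always enough. */
#define SIZE_DEC_MAX (sizeof(size_t) * CHAR_BIT / 3 + 2)

/* Hypothetical helper: writes the decimal form of value into out, which
   must have room for SIZE_DEC_MAX chars. Returns the number of digits
   written, not counting the terminator. */
size_t size_to_dec(size_t value, char *out)
{
    char tmp[SIZE_DEC_MAX];
    size_t n = 0, i = 0;

    assert(out != NULL);
    do {                        /* digits come out least significant first */
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    while (n > 0)               /* reverse them into the caller's buffer */
        out[i++] = tmp[--n];
    out[i] = '\0';
    return i;
}
```

The resulting string can then be handed to fputs or to printf with a plain "%s".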

--
James
Jul 26 '08 #21

P: n/a
James Harris wrote:
I'm not sure it's a question of learning the meaning of size_t. The
problem was in printing it with printf prior to C99.

The recommendation seems to be to cast to unsigned long or similar but
surely if size_t is wider than unsigned long it will fail to print
correctly. In the absence of C99's %z component perhaps the best way
is to print it by a function (which hasn't been mentioned so may be
wrong or impossible....).
In this context, C89 and C99 are two different languages.

For C89, use long unsigned.
size_t isn't bigger than long unsigned, in C89.

You could probably do something with conditional compilation,
based on whether or not LLONG_MAX was defined in <limits.h>,
to gain portability across C89 and C99 platforms,
but it wouldn't be pretty.

--
pete
Jul 26 '08 #22

P: n/a
Mark L Pappin wrote:
pete <pf*****@mindspring.com> writes:
>size_t isn't bigger than long unsigned, in C89.

I don't see that this is guaranteed -
ISO/IEC 9899: 1990

6.1.2.5 Types
There are four signed integer types,
designated as signed char, short int, int, and long int.

For each of the signed integer types,
there is a corresponding (but different) unsigned integer type
(designated with the keyword unsigned)
that uses the same amount of storage
(including sign information) and has the same alignment requirements.

An enumeration comprises a set of named integer constant values.
Each distinct enumeration constitutes a different enumerated type.

The type char, the signed and unsigned integer types,
and the enumerated types are collectively called integral types.

6.1.3.2 Integer constants
The type of an integer constant
is the first of the corresponding list
in which its value can be represented.
Unsuffixed decimal:
int, long int, unsigned long int;
unsuffixed octal or hexadecimal:
int, unsigned int, long int, unsigned long int;
suffixed by the letter u or U:
unsigned int, unsigned long int;
suffixed by the letter l or L:
long int, unsigned long int;
suffixed by both the letters u or U and l or L:
unsigned long int.

7.1.6 Common definitions <stddef.h>
The following types and macros
are defined in the standard header <stddef.h>.
Some are also defined in other headers,
as noted in their respective subclauses.
The types are
ptrdiff_t
which is the signed integral type
of the result of subtracting two pointers:
size_t
which is the unsigned integral type
of the result of the sizeof operator;

--
pete
Jul 27 '08 #23

P: n/a
James Harris wrote:
pete <pfil...@mindspring.com> wrote:
.... snip ...
>
>The get_line function is set up so that if you know that you're
going to be dealing with a file which has significantly long
lines, then you can supply an adequately large original buffer
so that no reallocation will be needed.

Ebuf_full allows a buffer of arbitrary size to be pre-allocated,
if preferred. Whether pre-allocated or not increasing the buffer
by factors allows it to scale.
Investigate using ggets, written in purely standard C and released
to the public domain. The whole package is available at:

<http://cbfalconer.home.att.net/download/ggets.zip>

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Jul 27 '08 #24

P: n/a
James Harris wrote:

<snip>
I'm not sure it's a question of learning the meaning of size_t. The
problem was in printing it with printf prior to C99.

The recommendation seems to be to cast to unsigned long or similar but
surely if size_t is wider than unsigned long
In a C90 conforming implementation size_t cannot be wider than unsigned
long since the latter is the largest integral type specified by the
Standard, and size_t is defined as an unsigned integral type.

BTW, is it conforming for a C90 implementation to implement size_t as an
unsigned integral type larger than unsigned long? Is there a specific
statement in the Standard that forbids this? Won't the "as if" rule
rescue such an implementation?
it will fail to print correctly.
This is exceedingly unlikely to the point that I don't think you need to
worry about it.
In the absence of C99's %z component perhaps the best way
is to print it by a function (which hasn't been mentioned so may be
wrong or impossible....).
The Standard defined size_t as an unsigned integral type, but the exact
nature of the type is implementation defined. If you know that your
implementation conforms to C99 then the specific format specifier %zu
is the way to go. Otherwise just cast the size_t value to the largest
unsigned integral type that your implementation supports (either
unsigned long or unsigned long long) and print it.

You can set a small macro similar to the ones in inttypes.h for this
purpose, which will expand to the correct (or best) format specifier
for each implementation. Like say:

#if __STDC_VERSION__ >= 199901L
#define PRI_SIZE_T "zu"
#else
#define PRI_SIZE_T "lu"
#endif

You could also test for ULLONG_MAX and change "lu" to "llu", though that
may be overkill.
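
One detail to watch when using such a macro: "lu" expects an unsigned long argument, so the value needs a matching cast in the fallback branch. A minimal sketch of how the two might be paired is below. PRI_SIZE_CAST and describe_size are names made up for this example, and the output buffer is assumed to be large enough for the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define PRI_SIZE_T "zu"
#define PRI_SIZE_CAST(x) ((size_t)(x))          /* C99: pass size_t as is */
#else
#define PRI_SIZE_T "lu"
#define PRI_SIZE_CAST(x) ((unsigned long)(x))   /* C90: widen to match "lu" */
#endif

/* Hypothetical helper: formats a message into out (assumed to hold at
   least 64 chars) and returns the number of characters written. */
int describe_size(char *out, size_t size)
{
    assert(out != NULL);
    return sprintf(out, "Buffer holds %" PRI_SIZE_T " bytes",
                   PRI_SIZE_CAST(size));
}
```

Keeping the format string and the cast in the same #if block means the two cannot drift apart.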

Jul 27 '08 #25

P: n/a
On 27 Jul, 00:33, CBFalconer <cbfalco...@yahoo.com> wrote:
James Harris wrote:
pete <pfil...@mindspring.com> wrote:

... snip ...
The get_line function is set up so that if you know that you're
going to be dealing with a file which has significantly long
lines, then you can supply an adequately large original buffer
so that no reallocation will be needed.
Ebuf_full allows a buffer of arbitrary size to be pre-allocated,
if preferred. Whether pre-allocated or not increasing the buffer
by factors allows it to scale.

Investigate using ggets, written in purely standard C and released
to the public domain. The whole package is available at:

<http://cbfalconer.home.att.net/download/ggets.zip>
Pete added line reading to the discussion. The code I proposed is NOT
specifically for reading lines. It is intended to be used any time a
variable length buffer is needed. The buffer contents could be
generated in a loop, for example.

As I understand it ggets only works when reading lines.

I believe the xbuf functions have other benefits:

- The user maintains control and can always choose whether to allow
the buffer to expand or not based on whatever criteria the programmer
wishes - for example, when the current size of the buffer reaches a
certain limit.
- Related to this, if using the functions to help read from an input
stream the input can be terminated on any number of specific
characters - for example, carriage return, null, line feed, and/or any
control character.
- The xbuf functions will work with a buffer allocated by the caller,
which tends to keep the malloc and free functions in the same code
and, I hope, makes the programmer more aware of the need to free
any buffer space used.
- The caller can manipulate the buffer at any time with realloc (along
with noting the new length) and the xbuf functions will still work on
the same buffer.
- More than one buffer can be grown at the same time - for example, if
reading two interleaved streams or reading one stream and generating
another buffer.
- Control is not relinquished to the called function as it is with
ggets. If reading a long line over a slow link I understand that ggets
will keep control until the line ends. Multi-threading can address
this but is a very heavy-handed approach.

and lastly,

- The names of the functions are intended to make it clear that they
are not part of the standard library. (The name ggets looks too much
like those in standard libraries for my taste - but that is a personal
preference.)

If the xbuf functions fail to do any of the above or can be bettered,
improvements would be welcome.
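
To illustrate a few of the points above (caller-owned buffer, a caller-chosen size limit, and stopping on any of several characters), here is a rough sketch. xbuf_full is adapted from the ebuf_full code posted at the start of the thread, with only the rename applied. xbuf_copy_until, its parameters and the use of a string as the data source are assumptions made to keep the example self-contained; a real caller would more likely pull characters from a stream.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define XBUF_SIZE_MIN 128
#define XBUF_INCREASE 1.5   /* Factor (>= 1) for space increase */

/* Makes sure *buf can be written at offset with one spare byte after it.
   Returns 0 on success, 1 if realloc fails (the old buffer stays valid). */
int xbuf_full(char **buf, size_t *buf_size, size_t offset)
{
    size_t new_size;
    char *new_buf;

    if (*buf_size < offset + 2) {
        new_size = (size_t)(*buf_size * XBUF_INCREASE + 1);
        if (new_size < offset + 2) new_size = offset + 2;
        if (new_size < XBUF_SIZE_MIN) new_size = XBUF_SIZE_MIN;
        if ((new_buf = realloc(*buf, new_size)) == NULL)
            return 1;
        *buf = new_buf;
        *buf_size = new_size;
    }
    return 0;
}

/* Hypothetical caller: copies bytes from src into the growing buffer
   until a character listed in stops (or the end of src) is reached.
   Gives up once `limit` content bytes have been stored. Returns the
   number of bytes stored, or (size_t)-1 on failure. */
size_t xbuf_copy_until(char **buf, size_t *buf_size, const char *src,
                       const char *stops, size_t limit)
{
    size_t offset = 0;

    assert(buf != NULL && buf_size != NULL && src != NULL && stops != NULL);
    while (src[offset] != '\0' && strchr(stops, src[offset]) == NULL) {
        if (offset >= limit)                      /* caller-chosen cap */
            return (size_t)-1;
        if (xbuf_full(buf, buf_size, offset) != 0)
            return (size_t)-1;
        (*buf)[offset] = src[offset];
        offset++;
    }
    if (xbuf_full(buf, buf_size, offset) != 0)
        return (size_t)-1;
    (*buf)[offset] = '\0';                        /* terminate the text */
    return offset;
}
```

The caller starts with a NULL pointer and a size of 0 (or a buffer it has already allocated), and frees the buffer itself when done, so the allocation and the free stay in the same code.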

Jul 27 '08 #26

P: n/a
On Sun, 27 Jul 2008 12:59:49 +0530, santosh wrote:
James Harris wrote:

<snip>
>I'm not sure it's a question of learning the meaning of size_t. The
problem was in printing it with printf prior to C99.

The recommendation seems to be to cast to unsigned long or similar but
surely if size_t is wider than unsigned long

In a C90 conforming implementation size_t cannot be wider than unsigned
long since the latter is the largest integral type specified by the
Standard, and size_t is defined as an unsigned integral type.

BTW, is it conforming for a C90 implementation to implement size_t as an
unsigned integral type larger than unsigned long? Is there a specific
statement in the Standard that forbids this? Won't the "as if" rule
rescue such an implementation?
No, it won't. Since the C90 standard says that size_t must be a typedef
for unsigned char, unsigned short, unsigned int, or unsigned long (the
only unsigned integer types), this is a strictly conforming C90 program:

#include <stddef.h>
#include <limits.h>
#define SIZE_MAX ((size_t) -1)
int main(void) {
return SIZE_MAX != UCHAR_MAX
&& SIZE_MAX != USHRT_MAX
&& SIZE_MAX != UINT_MAX
&& SIZE_MAX != ULONG_MAX;
}

It must return 0. If the implementation makes size_t larger than unsigned
long, the program returns 1. Returning 1 where the standard requires 0 is
not allowed by the as-if rule :-)

There are plenty of more convincing correct C90 programs that would be
broken by such an implementation, but converting size_t to unsigned long
and expecting no change in value is one such example, and you weren't
convinced by that. Could you explain why in a bit more detail?
Jul 27 '08 #27

P: n/a
Harald van Dijk wrote:
On Sun, 27 Jul 2008 12:59:49 +0530, santosh wrote:
>James Harris wrote:

<snip>
>>I'm not sure it's a question of learning the meaning of size_t. The
problem was in printing it with printf prior to C99.

The recommendation seems to be to cast to unsigned long or similar
but surely if size_t is wider than unsigned long

In a C90 conforming implementation size_t cannot be wider than
unsigned long since the latter is the largest integral type specified
by the Standard, and size_t is defined as an unsigned integral type.

BTW, is it conforming for a C90 implementation to implement size_t as
an unsigned integral type larger than unsigned long? Is there a
specific statement in the Standard that forbids this? Won't the "as
if" rule rescue such an implementation?

No, it won't. Since the C90 standard says that size_t must be a
typedef for unsigned char, unsigned short, unsigned int, or unsigned
long (the only unsigned integer types), this is a strictly conforming
C90 program:
Okay. I do not have access to the C90 standard and I seem to have
misunderstood. I was under the impression that size_t was defined
as "an unsigned integer type" in C90, like it is in C99. So C90
strictly restricts size_t to be an alias for one of unsigned
char/short/int/long.
#include <stddef.h>
#include <limits.h>
#define SIZE_MAX ((size_t) -1)
int main(void) {
return SIZE_MAX != UCHAR_MAX
&& SIZE_MAX != USHRT_MAX
&& SIZE_MAX != UINT_MAX
&& SIZE_MAX != ULONG_MAX;
}

It must return 0. If the implementation makes size_t larger than
unsigned long, the program returns 1. Returning 1 where the standard
requires 0 is not allowed by the as-if rule :-)

There are plenty of more convincing correct C90 programs that would be
broken by such an implementation, but converting size_t to unsigned
long and expecting no change in value is one such example, and you
weren't convinced by that. Could you explain why in a bit more detail?
I think my real question was whether C90 requires size_t to be a typedef
for the "fundamental" unsigned integer types that it defines, or
whether it would be conforming for an implementation to define size_t
as /an/ unsigned integer type, but distinct from unsigned
char/short/int/long. But your program above has answered that question.

So I suppose it's impossible to write a fully conforming C90 program
under 64 bit Windows that calls a Windows API function.

Jul 27 '08 #28

P: n/a
On Sun, 27 Jul 2008 15:02:40 +0530, santosh wrote:
Harald van Dijk wrote:
>On Sun, 27 Jul 2008 12:59:49 +0530, santosh wrote:
>>James Harris wrote:

<snip>

I'm not sure it's a question of learning the meaning of size_t. The
problem was in printing it with printf prior to C99.

The recommendation seems to be to cast to unsigned long or similar
but surely if size_t is wider than unsigned long

In a C90 conforming implementation size_t cannot be wider than
unsigned long since the latter is the largest integral type specified
by the Standard, and size_t is defined as an unsigned integral type.

BTW, is it conforming for a C90 implementation to implement size_t as
an unsigned integral type larger than unsigned long? Is there a
specific statement in the Standard that forbids this? Won't the "as
if" rule rescue such an implementation?

No, it won't. Since the C90 standard says that size_t must be a typedef
for unsigned char, unsigned short, unsigned int, or unsigned long (the
only unsigned integer types), this is a strictly conforming C90
program:

Okay. I do not have access to the C90 standard and I seem to have
misunderstood. I was under the impression that size_t was defined as "an
unsigned integer type" in C90, like it is in C99.
It is. However, ...
So C90 strictly
restricts size_t to be an alias for one of unsigned char/short/int/long.
...C90 does not consider any type other than those four an unsigned
integer type. C90's "unsigned integer type" is what C99 calls a "standard
unsigned integer type", and C90 does not recognise what C99 calls
"extended integer types". If the implementation supports some type that
behaves exactly as an integer type would, and is represented exactly the
same way, it still cannot be an integer type according to C90's
definition.
Jul 27 '08 #29

P: n/a
Harald van Dijk wrote:
On Sun, 27 Jul 2008 15:02:40 +0530, santosh wrote:
>Harald van Dijk wrote:
>>On Sun, 27 Jul 2008 12:59:49 +0530, santosh wrote:
James Harris wrote:

<snip>

I'm not sure it's a question of learning the meaning of size_t.
The problem was in printing it with printf prior to C99.
>
The recommendation seems to be to cast to unsigned long or similar
but surely if size_t is wider than unsigned long

In a C90 conforming implementation size_t cannot be wider than
unsigned long since the latter is the largest integral type
specified by the Standard, and size_t is defined as an unsigned
integral type.

BTW, is it conforming for a C90 implementation to implement size_t
as an unsigned integral type larger than unsigned long? Is there a
specific statement in the Standard that forbids this? Won't the "as
if" rule rescue such an implementation?

No, it won't. Since the C90 standard says that size_t must be a
typedef for unsigned char, unsigned short, unsigned int, or unsigned
long (the only unsigned integer types), this is a strictly
conforming C90 program:

Okay. I do not have access to the C90 standard and I seem to have
misunderstood. I was under the impression that size_t was defined as
"an unsigned integer type" in C90, like it is in C99.

It is. However, ...
>So C90 strictly
restricts size_t to be an alias for one of unsigned
char/short/int/long.

...C90 does not consider any type other than those four an unsigned
integer type. C90's "unsigned integer type" is what C99 calls a
"standard unsigned integer type", and C90 does not recognise what C99
calls "extended integer types". If the implementation supports some
type that behaves exactly as an integer type would, and is represented
exactly the same way, it still cannot be an integer type according to
the C90's definition.
Hmm, seems pretty restrictive to me. Glad that that was rectified with
C99. I suppose this same restriction would make ptrdiff_t potentially
unusable with objects greater than LONG_MAX bytes, though of course,
objects of that size aren't guaranteed in the first place? Wouldn't an
unsigned ptrdiff_t have been more suitable than a signed one?

Jul 27 '08 #30

P: n/a
On Sun, 27 Jul 2008 18:09:03 +0530, santosh wrote:
Harald van Dijk wrote:
>On Sun, 27 Jul 2008 15:02:40 +0530, santosh wrote:
>>Okay. I do not have access to the C90 standard and I seem to have
misunderstood. I was under the impression that size_t was defined as
"an unsigned integer type" in C90, like it is in C99.

It is. However, ...
>>So C90 strictly
restricts size_t to be an alias for one of unsigned
char/short/int/long.

...C90 does not consider any type other than those four an unsigned
integer type. C90's "unsigned integer type" is what C99 calls a
"standard unsigned integer type", and C90 does not recognise what C99
calls "extended integer types". If the implementation supports some
type that behaves exactly as an integer type would, and is represented
exactly the same way, it still cannot be an integer type according to
the C90's definition.

Hmm, seems pretty restrictive to me. Glad that that was rectified with
C99. I suppose this same restriction would make ptrdiff_t potentially
unusable with objects greater than LONG_MAX bytes,
Indeed. But there's no guarantee that ptrdiff_t is large enough to hold
the largest difference between two pointers anyway.
though of course,
objects of that size aren't guaranteed in the first place?
True, but (assuming size_t is wide enough) there's nothing stopping you
from passing a large value to malloc, and only using an object of that
size if malloc succeeds.
Wouldn't an
unsigned ptrdiff_t have been more suitable than a signed one?
(p - 1) - p should evaluate to -1. This is not possible if ptrdiff_t were
unsigned.
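
A small sketch of that point, with a made-up function name: stepping back one element and subtracting has to yield a negative ptrdiff_t, which only a signed type can represent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns (p - 1) - p. The pointer p must point at
   an array element other than the first, so that p - 1 is still a valid
   pointer into the same array. */
ptrdiff_t step_back_distance(const int *p)
{
    assert(p != NULL);
    return (p - 1) - p;     /* always -1 for a valid p */
}
```

Called with the address of the second element of an int array, this returns -1. With an unsigned result type the same subtraction would have to wrap around to a very large positive value instead.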
Jul 27 '08 #31

P: n/a
santosh wrote:
Harald van Dijk wrote:
>On Sun, 27 Jul 2008 15:02:40 +0530, santosh wrote:
>>Harald van Dijk wrote:
On Sun, 27 Jul 2008 12:59:49 +0530, santosh wrote:
James Harris wrote:
>
<snip>
>
>I'm not sure it's a question of learning the meaning of size_t.
>The problem was in printing it with printf prior to C99.
>>
>The recommendation seems to be to cast to unsigned long or similar
>but surely if size_t is wider than unsigned long
In a C90 conforming implementation size_t cannot be wider than
unsigned long since the latter is the largest integral type
specified by the Standard, and size_t is defined as an unsigned
integral type.
>
BTW, is it conforming for a C90 implementation to implement size_t
as an unsigned integral type larger than unsigned long? Is there a
specific statement in the Standard that forbids this? Won't the "as
if" rule rescue such an implementation?
No, it won't. Since the C90 standard says that size_t must be a
typedef for unsigned char, unsigned short, unsigned int, or unsigned
long (the only unsigned integer types), this is a strictly
conforming C90 program:
Okay. I do not have access to the C90 standard and I seem to have
misunderstood. I was under the impression that size_t was defined as
"an unsigned integer type" in C90, like it is in C99.
It is. However, ...
>>So C90 strictly
restricts size_t to be an alias for one of unsigned
char/short/int/long.
...C90 does not consider any type other than those four an unsigned
integer type. C90's "unsigned integer type" is what C99 calls a
"standard unsigned integer type", and C90 does not recognise what C99
calls "extended integer types". If the implementation supports some
type that behaves exactly as an integer type would, and is represented
exactly the same way, it still cannot be an integer type according to
the C90's definition.

Hmm, seems pretty restrictive to me. Glad that that was rectified with
C99. I suppose this same restriction would make ptrdiff_t potentially
unusable with objects greater than LONG_MAX bytes, though of course,
objects of that size aren't guaranteed in the first place?
ptrdiff_t doesn't have that problem with programs that
don't exceed any of the guaranteed minimum environmental limits.

N869
7.18.3 Limits of other integer types
[#2]
-- limits of ptrdiff_t
PTRDIFF_MIN -65535
PTRDIFF_MAX +65535
-- limit of size_t
SIZE_MAX 65535

But, the size of an object doesn't have to exceed LONG_MAX bytes
to have that problem. It only has to exceed 65535 bytes.

N869
6.5.6 Additive operators

[#9] When two pointers are subtracted, both shall point to
elements of the same array object, or one past the last
element of the array object; the result is the difference of
the subscripts of the two array elements. The size of the
result is implementation-defined, and its type (a signed
integer type) is ptrdiff_t defined in the <stddef.h> header.
If the result is not representable in an object of that
type, the behavior is undefined.
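
One way to stay clear of that undefined behaviour is to confirm, before subtracting, that the array's element count fits in ptrdiff_t. A minimal sketch, assuming a C99 implementation (PTRDIFF_MAX comes from <stdint.h>, which C90 lacks). The function name safe_index_of is hypothetical, and the caller is still responsible for p pointing into the array or one past its end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores in *out the index of p within the
 * char array base[0..count], where one past the end is allowed.
 * Returns 1 on success, or 0 if count is too large for the
 * subtraction to be guaranteed representable in ptrdiff_t. */
int safe_index_of(const char *base, size_t count, const char *p, size_t *out)
{
    assert(base != NULL && p != NULL && out != NULL);
    if (count > (size_t)PTRDIFF_MAX)
        return 0;               /* p - base might not fit in ptrdiff_t */
    *out = (size_t)(p - base);  /* now 0 <= p - base <= count */
    return 1;
}
```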
Wouldn't an
unsigned ptrdiff_t have been more suitable than a signed one?

The POSIX solution is a signed size_t.
http://bytes.com/forum/thread458286.html

--
pete
Jul 27 '08 #32

P: n/a
In article <g6**********@registered.motzarella.org>, santosh
<sa*********@gmail.com> wrote:
...
I think my real question was whether C90 requires size_t to be a typedef
for the "fundamental" unsigned integer types that it defines, or
whether it would be conforming for an implementation to define size_t
as /an/ unsigned integer type, but distinct from unsigned
char/short/int/long. But your program above has answered that question.

So I suppose it's impossible to write a fully conforming C90 program
under 64 bit Windows that calls a Windows API function.
Does the Windows API prevent a C90 compiler on 64-bit Windows from making
unsigned long 64 bits wide? If long must be 32 bits, could such a compiler
at least make 2^32-1 the maximum object size, so that the 32-bit unsigned
long would be sufficient to hold the size of any object?
Jul 28 '08 #33

P: n/a
blargg wrote:
>
.... snip ...
>
Does the Windows API prevent a C90 compiler on 64-bit Windows from
making unsigned long 64 bits wide? If long must be 32 bits, could
such a compiler at least make 2^32-1 the maximum object size, so
that the 32-bit unsigned long would be sufficient to hold the size
of any object?
No, the minimum size of a long is 32 bits. Emphasize, minimum.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Jul 28 '08 #34

P: n/a
On Mon, 28 Jul 2008 15:19:37 -0400, CBFalconer wrote:
blargg wrote:
>Does the Windows API prevent a C90 compiler on 64-bit Windows from
making unsigned long 64 bits wide?

No, the minimum size of a long is 32 bits. Emphasize, minimum.
The C standard allows long to hold more than 32 bits, but that doesn't
mean the Windows API does. It's similar to how you shouldn't expect an
implementation that makes system(NULL) return 0 to conform to POSIX, even
though any POSIX-conforming implementation is also a conforming C
implementation.
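
Whether the cast to unsigned long is safe on a given implementation can be checked without any C99 features. A minimal sketch (the function name is hypothetical): if long is 32 bits wide while size_t is 64 bits, the scenario raised above for 64-bit Windows, it would return 0.

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 if every size_t value fits in unsigned long, else 0.
 * (size_t)-1 is the largest size_t value; the usual arithmetic
 * conversions preserve both operands' values in this comparison. */
int size_t_fits_in_ulong(void)
{
    return (size_t)-1 <= ULONG_MAX;
}
```

A program that relies on the %lu cast could call this once at startup and report the problem rather than silently truncating sizes.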
Jul 28 '08 #35

P: n/a
blargg wrote:
In article <g6**********@registered.motzarella.org>, santosh
<sa*********@gmail.com> wrote:
>...
I think my real question was whether C90 requires size_t to be a
typedef for the "fundamental" unsigned integer types that it defines,
or whether it would be conforming for an implementation to define
size_t as /an/ unsigned integer type, but distinct from unsigned
char/short/int/long. But your program above has answered that
question.

So I suppose it's impossible to write a fully conforming C90 program
under 64 bit Windows that calls a Windows API function.

Does the Windows API prevent a C90 compiler on 64-bit Windows from
making unsigned long 64 bits wide?
IIRC long is constrained to be 32 bits for code which interfaces with
the Windows API.
If long must be 32 bits, could such
a compiler at least make 2^32-1 the maximum object size, so that the
32-bit unsigned long would be sufficient to hold the size of any
object?
It could do so, but then you are giving up one advantage of using a
larger address-space computer.

Honestly, I don't know much about 64 bit Windows, so I expect that I'm
either wrong in what I said above or that there are system specific
workarounds. Please don't take my word for this but ask in a group like
<news:comp.os.ms-windows.programmer.win32> or a suitable group under
the <news:microsoft.public> category.

Jul 29 '08 #36

P: n/a
pete wrote:
santosh wrote:
>Harald van Dijk wrote:
>>On Sun, 27 Jul 2008 15:02:40 +0530, santosh wrote:
Harald van Dijk wrote:
On Sun, 27 Jul 2008 12:59:49 +0530, santosh wrote:
>James Harris wrote:
>>
><snip>
>>
>>I'm not sure it's a question of learning the meaning of size_t.
>>The problem was in printing it with printf prior to C99.
>>>
>>The recommendation seems to be to cast to unsigned long or
>>similar but surely if size_t is wider than unsigned long
>In a C90 conforming implementation size_t cannot be wider than
>unsigned long since the latter is the largest integral type
>specified by the Standard, and size_t is defined as an unsigned
>integral type.
>>
>BTW, is it conforming for a C90 implementation to implement
>size_t as an unsigned integral type larger than unsigned long? Is
>there a specific statement in the Standard that forbids this?
>Won't the "as if" rule rescue such an implementation?
No, it won't. Since the C90 standard says that size_t must be a
typedef for unsigned char, unsigned short, unsigned int, or
unsigned long (the only unsigned integer types), this is a
strictly conforming C90 program:
Okay. I do not have access to the C90 standard and I seem to have
misunderstood. I was under the impression that size_t was defined
as "an unsigned integer type" in C90, like it is in C99.
It is. However, ...

So C90 strictly
restricts size_t to be an alias for one of unsigned
char/short/int/long.
...C90 does not consider any type other than those four an unsigned
integer type. C90's "unsigned integer type" is what C99 calls a
"standard unsigned integer type", and C90 does not recognise what
C99 calls "extended integer types". If the implementation supports
some type that behaves exactly as an integer type would, and is
represented exactly the same way, it still cannot be an integer type
according to C90's definition.

Hmm, seems pretty restrictive to me. Glad that that was rectified
with C99. I suppose this same restriction would make ptrdiff_t
potentially unusable with objects greater than LONG_MAX bytes, though
of course, objects of that size aren't guaranteed in the first place?

ptrdiff_t doesn't have that problem with programs that
don't exceed any of the guaranteed minimum environmental limits.

N869
7.18.3 Limits of other integer types
[#2]
-- limits of ptrdiff_t
PTRDIFF_MIN -65535
PTRDIFF_MAX +65535
-- limit of size_t
SIZE_MAX 65535

But, the size of an object doesn't have to exceed LONG_MAX bytes
to have that problem. It only has to exceed 65535 bytes.
So it is. Thanks for that correction.

<snip>
The POSIX solution is a signed size_t.
http://bytes.com/forum/thread458286.html
I wonder why ISO C decided not to adopt it? After all, size_t is
specifically meant for counting the bytes of an object, and presumably,
would be suitable for holding pointer offset values too.
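
For context, the POSIX type in question is ssize_t, which lets a function such as read() return either a byte count or -1 for an error. A minimal sketch of how it is typically used, assuming a POSIX system (this is not ISO C); the helper name read_full is hypothetical, and cap is assumed not to exceed SSIZE_MAX.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads from fd until cap bytes have been
 * stored in buf or end of file is reached.
 * Returns the number of bytes read, or -1 on a read error. */
ssize_t read_full(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    assert(buf != NULL);
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n < 0)
            return -1;          /* error reported through the sign */
        if (n == 0)
            break;              /* end of file */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

The price of the signed return type is that only half of the size_t range is available for counts.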

Jul 29 '08 #37
