Connecting Tech Pros Worldwide Help | Site Map

On the creation of an API for containers in C

Remo D.
Guest
 
Posts: n/a
#1: Oct 26 '07
I've read with much interest the threads on how to create data
containers in C.

I started thinking if simpler ADT (other than the proposed List) could
be used as a starting point.

What about a dynamic vector? I mean a dynamic array with no predefined
size. A possible API could be:

typedef struct {
.... /* appropriate fields to manage the vector data */
} *vec_t;

vec_t vecNew (size_t elemsize);
void *vecSet (vec_t v, size_t index, void *elem);
void *vecGet (vec_t v, size_t index);
size_t vecCnt (vec_t v);
vec_t vecFree (vec_t v);


An open question is if vecSet should create a copy of the element or
memory management should be demanded to the programmer. For simplicy I
would prefer the former (and that's also the reason for the elemsize of
vecNew().

vecCnt would return the highest index where an element has been stored
with vecSet();

This simple API could be augmented with:

void vecQsort (vec_t v, int (*cmp)(void *, void *));
void *vecBsearch (vec_t v, void *key, int (*cmp)(void *, void *));

etc.

That could be the case for other ADT. Once vec has been defined it's
easy to see how to use it to implement stacks, for example.

Comments?

Remo.D
Malcolm McLean
Guest
 
Posts: n/a
#2: Oct 26 '07

re: On the creation of an API for containers in C



"Remo D." <rdentatowrote in message
Quote:
typedef struct {
.... /* appropriate fields to manage the vector data */
} *vec_t;
>
vec_t vecNew (size_t elemsize);
void *vecSet (vec_t v, size_t index, void *elem);
void *vecGet (vec_t v, size_t index);
size_t vecCnt (vec_t v);
vec_t vecFree (vec_t v);
>
Comments?
>
The challenge isn't in the implementation - anyone with more than a trivial
amount of programming experience ought to be able to knock up a vector quite
easily. It's in the interface.
You haven't even tried to make the interface consistent with that of Chris
Thomasson. The last thing we need is two conventions for comp.lang.c
standard abstract data types.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

cr88192
Guest
 
Posts: n/a
#3: Oct 26 '07

re: On the creation of an API for containers in C



"Remo D." <rdentatowrote in message
news:47224f8b$0$4790$4fafbaef@reader4.news.tin.it. ..
Quote:
I've read with much interest the threads on how to create data containers
in C.
>
I started thinking if simpler ADT (other than the proposed List) could be
used as a starting point.
>
What about a dynamic vector? I mean a dynamic array with no predefined
size. A possible API could be:
>
<snip>
Quote:
etc.
>
That could be the case for other ADT. Once vec has been defined it's easy
to see how to use it to implement stacks, for example.
>
Comments?
>
if one thinks about it, they will realize that they have these void
pointers, and these dynamic vectors.

"maybe it will be useful to nest these dynamic vectors?"
"maybe it will be useful to have other useful container types?"
"maybe it will be useful to conviniently represent values, such as floats or
integers?"
....

then, one can end up with a typeless container system.

if one goes further:
"what if these here pointers somehow remember what kind of object they are
pointing to?"
....

one can arrive at a dynamic type system (and, of all things, see why this
kind of thing is useful, even in the core of C-land...).

of course, it is not a far reach to realize after having something like
this, a garbage collector would also be very useful...

and so on...


so, you may have stumbled on the first step, or maybe one can go a different
direction?...
(over in C++ land, the people there had the idea of templating everything,
coming up with something IMO far more complicated and far less flexible,
though with a gain in terms of faster code and compiler checkability...).

or something...

Quote:
Remo.D

Tor Rustad
Guest
 
Posts: n/a
#4: Oct 27 '07

re: On the creation of an API for containers in C


Malcolm McLean wrote:

[...]
Quote:
You haven't even tried to make the interface consistent with that of
Chris Thomasson. The last thing we need is two conventions for
comp.lang.c standard abstract data types.
There is no c.l.c standard ADT, if there ever was such a proposal, I
would put in a veto on principle grounds. :)

Need a vector, here it is:

T vector[N];

--
Tor <torust [at] online [dot] no>

"Technical skill is mastery of complexity, while creativity is mastery
of simplicity"
Remo D.
Guest
 
Posts: n/a
#5: Oct 27 '07

re: On the creation of an API for containers in C


Malcolm McLean ha scritto:
Quote:
>
The challenge isn't in the implementation - anyone with more than a
trivial amount of programming experience ought to be able to knock up a
vector quite easily. It's in the interface.
I agree, I was only talking about the API (and the semantic of the
functions in terms of memory management).

Quote:
You haven't even tried to make the interface consistent with that of
Chris Thomasson. The last thing we need is two conventions for
comp.lang.c standard abstract data types.
Yes, I've seen a lot of comments on his proposal leading to names that
are too long, so I tried a different path.

Even if I doubt we'll ever have a "c.l.c standard" I hope the discussion
will be helpful for those wanting/needing to implement ADT in C.

Remo.D

Remo D.
Guest
 
Posts: n/a
#6: Oct 27 '07

re: On the creation of an API for containers in C


cr88192 ha scritto:
Quote:
>
if one thinks about it, they will realize that they have these void
pointers, and these dynamic vectors.
>
"maybe it will be useful to nest these dynamic vectors?"
"maybe it will be useful to have other useful container types?"
"maybe it will be useful to conviniently represent values, such as floats or
integers?"
...
then, one can end up with a typeless container system.
Funny you mentioned it! In fact I right after started thinking about a
"generic type" like:

typedef struct {
union {
struct vecVal *vector
void *pointer; /* a generic pointer */
int32_t num_i32; /* 32 bit integer */
float num_flt; /* a float */
... /* other values one may be interested */
} val;
uint16_t type;
} vecVal;

The "ugliness" of this would be the many conversion functions that will
be needed to properly handle those values in C ...

Anyone can think of a better method to express the concept of "generic
typed value" in C?
Quote:
one can arrive at a dynamic type system (and, of all things, see why this
kind of thing is useful, even in the core of C-land...).
I guess one will have to find a point of balance between features and
usability.
Quote:
of course, it is not a far reach to realize after having something like
this, a garbage collector would also be very useful...
That's something I would like to have :). But then, I fear, it would not
be C any longer. It might be D or some other language...
I once had a look at the Boehm-Demers-Weiser GC but I felt rather
uncomforable with it. Not for the concept of GC, which I really like in
other languages, but for the ineherent machine/compiler/OS dependency it
has.
So, my preference would be for easing memory management but not up to
the point of a full GC.
Quote:
so, you may have stumbled on the first step, or maybe one can go a different
direction?...
Surely I might! What I would value most is a discussion on alternative
directions!

Remo.D



Remo D.
Guest
 
Posts: n/a
#7: Oct 27 '07

re: On the creation of an API for containers in C


Tor Rustad ha scritto:
Quote:
Malcolm McLean wrote:
>
Need a vector, here it is:
T vector[N];
That would be perfect! Being able to write:
int vector[N]; /* N is the initial size? */

vector[2*N] = 7

instead of:
vector = vecNew(sizeof(int));
vecSetVal(vector, 2*N, 7, int)

would be great.

But how to achieve it without using a preprocessor? As someone rightly
pointed out in another mail, the discussion is on the API not on the
implementation but I can't see how to introduce dynamic vectors (that
can grow/shrink during usage) still using the C notation for arrays.

BTW if I remember well, D goes exactly in that direction extending the
array notation to include resizable arrays, but we should stay within
the C boundaries.

Remo.D
cr88192
Guest
 
Posts: n/a
#8: Oct 27 '07

re: On the creation of an API for containers in C



"Remo D." <rdentatowrote in message
news:4722ec84$0$10625$4fafbaef@reader2.news.tin.it ...
Quote:
cr88192 ha scritto:
Quote:
>>
>if one thinks about it, they will realize that they have these void
>pointers, and these dynamic vectors.
>>
>"maybe it will be useful to nest these dynamic vectors?"
>"maybe it will be useful to have other useful container types?"
>"maybe it will be useful to conviniently represent values, such as floats
>or integers?"
>... then, one can end up with a typeless container system.
>
Funny you mentioned it! In fact I right after started thinking about a
"generic type" like:
>
typedef struct {
union {
struct vecVal *vector
void *pointer; /* a generic pointer */
int32_t num_i32; /* 32 bit integer */
float num_flt; /* a float */
... /* other values one may be interested */
} val;
uint16_t type;
} vecVal;
>
The "ugliness" of this would be the many conversion functions that will be
needed to properly handle those values in C ...
>
Anyone can think of a better method to express the concept of "generic
typed value" in C?
>
well, I had usually approached it though the use of "opaqueness".

basically, we have a value, in some opaque form, and use a system of wrap,
unwrap, and check functions.

in my most recent system, this looks like this:
dyt dyint(int v); //wrap integer
int dyintv(dyt p); //unwrap integer
int dyintp(dyt p); //type is an integer?...

where 'dyt' was basically a wrapper for an anonymous void pointer.

one ends up with a potentially large number of utility functions (a natural
cost of this approach), although, if one makes use of the dynamic-type
properties, one can cut down the costs substantially vs what they could
be...

of course, 'type' was built in as a part of the allocator, not as part of
the object as in the example above, so, one could do something like:
byte *a;
a=gctalloc("_bytearray_t", 1024);

where 'a' looks just like it came from something like malloc.

and, later be like:
if(gctypep(a, "_bytearray_t")) //check is a is a byte array
... //something specific to byte arrays

Quote:
Quote:
>one can arrive at a dynamic type system (and, of all things, see why this
>kind of thing is useful, even in the core of C-land...).
>
I guess one will have to find a point of balance between features and
usability.
>
yes.

and some options are better than others.
IMO, it is difficult to find the best possible balance (even as recently
demonstated), where I have ping-ponged to some extent between 'dynamically
typed void pointers' and 'tagged references' (where the value is essentially
an integer, using some of the bits for the type, and the rest for either
referring to objects, or for holding the value itself).

usually, the tradeoff is speed vs usability:
typed void pointers are easier to use, but not as fast nor as capable;
tagged references are more difficult, but are usually somewhat faster and
more capable.

a simple example:
pointers usually require some method of extracting and comparing the type;
with tagged references, it is often a mask and equality comparrison (and is
thus usually handled by a macro):
#define INTP(x) (((x)&3)==1)

likewise, if we are fairly sure of the type, we can usually extract it with
a shift:
#define INTVF(x) (((int)(x))>>2)

or, perform arithmetic directly on the tagged values...


but, this ability comes at high cost, as working with them, in some cases,
can become very painful...

one can end up forced inflexibly into a very rigid type system (both good
and bad in some ways), wheras typed pointers allow one to more informally
build a system that may involve structs or other ad-hoc constructions...

as a result, I have more often ended up using tagged references for
interpreters, and void pointers for general coding (and, in the cases I have
mixed them up, the results are not good, such as horribly painful and
non-reusable code in one case, and a horribly slow and garbage spewing
interpreter in the other...).

Quote:
Quote:
>of course, it is not a far reach to realize after having something like
>this, a garbage collector would also be very useful...
>
That's something I would like to have :). But then, I fear, it would not
be C any longer. It might be D or some other language...
I once had a look at the Boehm-Demers-Weiser GC but I felt rather
uncomforable with it. Not for the concept of GC, which I really like in
other languages, but for the ineherent machine/compiler/OS dependency it
has.
So, my preference would be for easing memory management but not up to the
point of a full GC.
>
well, one gets what they are willing to pay for...
portability is a cost, among others...

even within the realm of GC, there is a lot of room for costs and
tradeoffs...

Quote:
Quote:
>so, you may have stumbled on the first step, or maybe one can go a
>different direction?...
Surely I might! What I would value most is a discussion on alternative
directions!
>
yes.

Quote:
Remo.D
>
>
>

cr88192
Guest
 
Posts: n/a
#9: Oct 27 '07

re: On the creation of an API for containers in C



"Remo D." <rdentatowrote in message
news:4722eeb5$0$10620$4fafbaef@reader2.news.tin.it ...
Quote:
Tor Rustad ha scritto:
Quote:
>Malcolm McLean wrote:
>>
>Need a vector, here it is:
>T vector[N];
>
That would be perfect! Being able to write:
int vector[N]; /* N is the initial size? */
>
vector[2*N] = 7
>
instead of:
vector = vecNew(sizeof(int));
vecSetVal(vector, 2*N, 7, int)
>
would be great.
>
I think he was making use of satire, and infact refferring to good old
arrays, say:
float vector[3];

Quote:
But how to achieve it without using a preprocessor? As someone rightly
pointed out in another mail, the discussion is on the API not on the
implementation but I can't see how to introduce dynamic vectors (that can
grow/shrink during usage) still using the C notation for arrays.
>
BTW if I remember well, D goes exactly in that direction extending the
array notation to include resizable arrays, but we should stay within the
C boundaries.
>
Remo.D

Remo D.
Guest
 
Posts: n/a
#10: Oct 27 '07

re: On the creation of an API for containers in C


cr88192 ha scritto:
Quote:
"Remo D." <rdentatowrote in message
Quote:
Quote:
>>Tor Rustad wrote:
>>Need a vector, here it is:
>>T vector[N];
>That would be perfect!
>
I think he was making use of satire, and infact refferring to good old
arrays, say:
Ops. I always put trust in what the other says assuming their
willingness to contibute, so I was inclined to hope he knew much better
than me, hoping he would explain how to achive that :)


Remo.D


Malcolm McLean
Guest
 
Posts: n/a
#11: Oct 27 '07

re: On the creation of an API for containers in C



"Remo D." <rdentatowrote in message
news:472302b4$0$10622$4fafbaef@reader2.news.tin.it ...
Quote:
cr88192 ha scritto:
Quote:
>"Remo D." <rdentatowrote in message
Quote:
>>>Tor Rustad wrote:
>>>Need a vector, here it is:
>>>T vector[N];
>>That would be perfect!
>>
>I think he was making use of satire, and infact refferring to good old
>arrays, say:
>
Ops. I always put trust in what the other says assuming their willingness
to contibute, so I was inclined to hope he knew much better than me,
hoping he would explain how to achive that :)
>
I'd also expect the vector to be able to tell me that, at present, it has no
data within it. And to expand if I want to add a dimension. And to fail
nicely if the computer runs out of memory.
Can't be done even with C99 vectors, which will take a dimension determined
at runtime. However I've been accused of having a long list of things I
dislike about C. Let's say upfront that there is an efficiency and
simplicity of implementation vs convenience and simplicity of use issue
here.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Tor Rustad
Guest
 
Posts: n/a
#12: Oct 28 '07

re: On the creation of an API for containers in C


Remo D. wrote:
Quote:
cr88192 ha scritto:
Quote:
>"Remo D." <rdentatowrote in message
Quote:
>>>Tor Rustad wrote:
>>>Need a vector, here it is:
>>>T vector[N];
>>That would be perfect!
>>
>I think he was making use of satire, and infact refferring to good old
>arrays, say:
>
Ops. I always put trust in what the other says assuming their
willingness to contibute, so I was inclined to hope he knew much better
than me, hoping he would explain how to achive that :)
FYI, C experts are working on a TR for dynamic string management, as a
alternative for "TR 24731 - Part I: Bounds-checking interfaces".

Methinks, when that draft arrives, it's time to contribute constructively.

--
Tor <bwzcab@wvtqvm.vw | tr i-za-h a-z>

My C page: http://www.pg2.moo.no/C/index.html
-include Win32 build of splint 3.1.2
Roland Pibinger
Guest
 
Posts: n/a
#13: Oct 28 '07

re: On the creation of an API for containers in C


On Fri, 26 Oct 2007 22:35:59 +0200, "Remo D." <rdentatowrote:
Quote:
>I've read with much interest the threads on how to create data
>containers in C.
>
>I started thinking if simpler ADT (other than the proposed List) could
>be used as a starting point.
>
>What about a dynamic vector? I mean a dynamic array with no predefined
>size. A possible API could be:
>
>typedef struct {
.... /* appropriate fields to manage the vector data */
>} *vec_t;
Better use
- information hiding with a forward declaration and
- a handle to (partially) prevent the void* type loophole

struct vec_imp; // forwad declaration only

typedef struct {
struct vec_imp *imp;
} vec_handle;
Quote:
>vec_t vecNew (size_t elemsize);
>void *vecSet (vec_t v, size_t index, void *elem);
>void *vecGet (vec_t v, size_t index);
>size_t vecCnt (vec_t v);
>vec_t vecFree (vec_t v);
>
>An open question is if vecSet should create a copy of the element or
>memory management should be demanded to the programmer.
A generic container can only hold pointers. Take a look at the
array_list implementation in http://oss.metaparadigm.com/json-c/ esp.
how it manages memory with a callback function.
Quote:
>For simplicy I
>would prefer the former (and that's also the reason for the elemsize of
>vecNew().
You would use 'elemsize' to dynamically allocate memory and copy the
passed element?
Quote:
>vecCnt would return the highest index where an element has been stored
>with vecSet();
>
>This simple API could be augmented with:
>
>void vecQsort (vec_t v, int (*cmp)(void *, void *));
>void *vecBsearch (vec_t v, void *key, int (*cmp)(void *, void *));
>
>That could be the case for other ADT. Once vec has been defined it's
>easy to see how to use it to implement stacks, for example.
>
>Comments?
It's worth your while to study the approaches of data structure or
foundation libraries available on the internet. Some of them are large
and used widely, others defunct one-man shows, e.g. GLib, APR, NSPR,
OSSP, Kazlib, cprops, Libc-X, SFL, cbase, libmba, MemSL, ...


--
Roland Pibinger
"The best software is simple, elegant, and full of drama" - Grady Booch
Remo D.
Guest
 
Posts: n/a
#14: Oct 28 '07

re: On the creation of an API for containers in C


Roland Pibinger ha scritto:
RPOn Fri, 26 Oct 2007 22:35:59 +0200, "Remo D." <rdentatowrote:
RPRDAn open question is if vecSet should create a copy of the
element or
RPRDmemory management should be demanded to the programmer.
RP>
RPA generic container can only hold pointers. Take a look at the
RParray_list implementation in http://oss.metaparadigm.com/json-c/ esp.
RPhow it manages memory with a callback function.

What's the rational behind this restriction? If the datastructure could
handle memory management for me I'd be very glad. It could avoid me the
hassle of doing a malloc()/free() for each element. What's the drawback
you see?

RPRDFor simplicy I
RPRDwould prefer the former (and that's also the reason for the
elemsize of
RPRDvecNew().
RP>
RPYou would use 'elemsize' to dynamically allocate memory and copy the
RPpassed element?

That would be implementation dependant. Another approach could be to
implement the resizable vector with an array to be reallocated when the
new size exceeds the current size.
I'm currently using the approach described in "Algorithm Alley: Fast and
Small Resizable Arrays" an article from DDJ:
(http://www.ddj.com/architect/184404698). The original technical report
is "Resizable Arrays in Optimal Time and Space"
(http://citeseer.ist.psu.edu/brodnik99resizable.html)

It doesn't use realloc() (avoiding copying the elements in the array
over and over again) and never wastes more memory than sqrt(n) (whereas
the classsical redoubling strategy can waste memory proportionally to n)
Quote:
It's worth your while to study the approaches of data structure or
foundation libraries available on the internet.
The discussion started around the abstract definition of an API for
generic containers. "If we could have in C something like the C++ STL,
what should the API look like?".

I had a look to some of them but the idea was to have a discussion on
the merits and demerits of an approach vs another.

Do you think any of the cited libraries have a sintax for the API that
would be worthwile to reccomend/adopt?

Remo.D






Roland Pibinger
Guest
 
Posts: n/a
#15: Oct 28 '07

re: On the creation of an API for containers in C


On Sun, 28 Oct 2007 13:32:42 +0100, "Remo D." <rdentatowrote:
Quote:
>Roland Pibinger ha scritto:
Quote:
>If the datastructure could
>handle memory management for me I'd be very glad. It could avoid me the
>hassle of doing a malloc()/free() for each element. What's the drawback
>you see?
I'm trying to imagine how your dynamic vector is meant to be used. I
guess it's something like this:

struct something {
int i;
// ...
};

void foo() {
struct something proto;
vec_t myVec = vecNew (sizeof(struct something));
proto.i = 3;
vecAdd (myVec, &proto);
proto.i = 4;
vecAdd (myVec, &proto);
// ..
vecFree (myVec);
}

A local object is created and copied into the dynamic vector. Dynamic
allocation is done only by the vector. This works unless struct
something contains a pointer to dynamically allocated memory. The
latter case cannot be handled by a 'generic' vector.
Quote:
>I had a look to some of them but the idea was to have a discussion on
>the merits and demerits of an approach vs another.
>
>Do you think any of the cited libraries have a sintax for the API that
>would be worthwile to reccomend/adopt?
You can always learn from the merits and mistakes of others. I would
start with a simple approach like the mentioned array_list and proceed
from there.


--
Roland Pibinger
"The best software is simple, elegant, and full of drama" - Grady Booch
Closed Thread


Similar C / C++ bytes