473,387 Members | 1,579 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

On the creation of an API for containers in C

I've read with much interest the threads on how to create data
containers in C.

I started thinking if simpler ADT (other than the proposed List) could
be used as a starting point.

What about a dynamic vector? I mean a dynamic array with no predefined
size. A possible API could be:

typedef struct {
.... /* appropriate fields to manage the vector data */
} *vec_t;

vec_t vecNew (size_t elemsize);
void *vecSet (vec_t v, size_t index, void *elem);
void *vecGet (vec_t v, size_t index);
size_t vecCnt (vec_t v);
vec_t vecFree (vec_t v);
An open question is if vecSet should create a copy of the element or
memory management should be demanded to the programmer. For simplicy I
would prefer the former (and that's also the reason for the elemsize of
vecNew().

vecCnt would return the highest index where an element has been stored
with vecSet();

This simple API could be augmented with:

void vecQsort (vec_t v, int (*cmp)(void *, void *));
void *vecBsearch (vec_t v, void *key, int (*cmp)(void *, void *));

etc.

That could be the case for other ADT. Once vec has been defined it's
easy to see how to use it to implement stacks, for example.

Comments?

Remo.D
Oct 26 '07 #1
14 1620

"Remo D." <rdentatowrote in message
typedef struct {
.... /* appropriate fields to manage the vector data */
} *vec_t;

vec_t vecNew (size_t elemsize);
void *vecSet (vec_t v, size_t index, void *elem);
void *vecGet (vec_t v, size_t index);
size_t vecCnt (vec_t v);
vec_t vecFree (vec_t v);

Comments?
The challenge isn't in the implementation - anyone with more than a trivial
amount of programming experience ought to be able to knock up a vector quite
easily. It's in the interface.
You haven't even tried to make the interface consistent with that of Chris
Thomasson. The last thing we need is two conventions for comp.lang.c
standard abstract data types.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Oct 26 '07 #2

"Remo D." <rdentatowrote in message
news:47**********************@reader4.news.tin.it. ..
I've read with much interest the threads on how to create data containers
in C.

I started thinking if simpler ADT (other than the proposed List) could be
used as a starting point.

What about a dynamic vector? I mean a dynamic array with no predefined
size. A possible API could be:
<snip>
etc.

That could be the case for other ADT. Once vec has been defined it's easy
to see how to use it to implement stacks, for example.

Comments?
if one thinks about it, they will realize that they have these void
pointers, and these dynamic vectors.

"maybe it will be useful to nest these dynamic vectors?"
"maybe it will be useful to have other useful container types?"
"maybe it will be useful to conviniently represent values, such as floats or
integers?"
....

then, one can end up with a typeless container system.

if one goes further:
"what if these here pointers somehow remember what kind of object they are
pointing to?"
....

one can arrive at a dynamic type system (and, of all things, see why this
kind of thing is useful, even in the core of C-land...).

of course, it is not a far reach to realize after having something like
this, a garbage collector would also be very useful...

and so on...
so, you may have stumbled on the first step, or maybe one can go a different
direction?...
(over in C++ land, the people there had the idea of templating everything,
coming up with something IMO far more complicated and far less flexible,
though with a gain in terms of faster code and compiler checkability...).

or something...

Remo.D

Oct 26 '07 #3
Malcolm McLean wrote:

[...]
You haven't even tried to make the interface consistent with that of
Chris Thomasson. The last thing we need is two conventions for
comp.lang.c standard abstract data types.
There is no c.l.c standard ADT, if there ever was such a proposal, I
would put in a veto on principle grounds. :)

Need a vector, here it is:

T vector[N];

--
Tor <torust [at] online [dot] no>

"Technical skill is mastery of complexity, while creativity is mastery
of simplicity"
Oct 27 '07 #4
Malcolm McLean ha scritto:
>
The challenge isn't in the implementation - anyone with more than a
trivial amount of programming experience ought to be able to knock up a
vector quite easily. It's in the interface.
I agree, I was only talking about the API (and the semantic of the
functions in terms of memory management).

You haven't even tried to make the interface consistent with that of
Chris Thomasson. The last thing we need is two conventions for
comp.lang.c standard abstract data types.
Yes, I've seen a lot of comments on his proposal leading to names that
are too long, so I tried a different path.

Even if I doubt we'll ever have a "c.l.c standard" I hope the discussion
will be helpful for those wanting/needing to implement ADT in C.

Remo.D

Oct 27 '07 #5
cr88192 ha scritto:
>
if one thinks about it, they will realize that they have these void
pointers, and these dynamic vectors.

"maybe it will be useful to nest these dynamic vectors?"
"maybe it will be useful to have other useful container types?"
"maybe it will be useful to conviniently represent values, such as floats or
integers?"
...
then, one can end up with a typeless container system.
Funny you mentioned it! In fact I right after started thinking about a
"generic type" like:

typedef struct {
union {
struct vecVal *vector
void *pointer; /* a generic pointer */
int32_t num_i32; /* 32 bit integer */
float num_flt; /* a float */
... /* other values one may be interested */
} val;
uint16_t type;
} vecVal;

The "ugliness" of this would be the many conversion functions that will
be needed to properly handle those values in C ...

Anyone can think of a better method to express the concept of "generic
typed value" in C?
one can arrive at a dynamic type system (and, of all things, see why this
kind of thing is useful, even in the core of C-land...).
I guess one will have to find a point of balance between features and
usability.
of course, it is not a far reach to realize after having something like
this, a garbage collector would also be very useful...
That's something I would like to have :). But then, I fear, it would not
be C any longer. It might be D or some other language...
I once had a look at the Boehm-Demers-Weiser GC but I felt rather
uncomforable with it. Not for the concept of GC, which I really like in
other languages, but for the ineherent machine/compiler/OS dependency it
has.
So, my preference would be for easing memory management but not up to
the point of a full GC.
so, you may have stumbled on the first step, or maybe one can go a different
direction?...
Surely I might! What I would value most is a discussion on alternative
directions!

Remo.D

Oct 27 '07 #6
Tor Rustad ha scritto:
Malcolm McLean wrote:

Need a vector, here it is:
T vector[N];
That would be perfect! Being able to write:
int vector[N]; /* N is the initial size? */

vector[2*N] = 7

instead of:
vector = vecNew(sizeof(int));
vecSetVal(vector, 2*N, 7, int)

would be great.

But how to achieve it without using a preprocessor? As someone rightly
pointed out in another mail, the discussion is on the API not on the
implementation but I can't see how to introduce dynamic vectors (that
can grow/shrink during usage) still using the C notation for arrays.

BTW if I remember well, D goes exactly in that direction extending the
array notation to include resizable arrays, but we should stay within
the C boundaries.

Remo.D
Oct 27 '07 #7

"Remo D." <rdentatowrote in message
news:47***********************@reader2.news.tin.it ...
cr88192 ha scritto:
>>
if one thinks about it, they will realize that they have these void
pointers, and these dynamic vectors.

"maybe it will be useful to nest these dynamic vectors?"
"maybe it will be useful to have other useful container types?"
"maybe it will be useful to conviniently represent values, such as floats
or integers?"
... then, one can end up with a typeless container system.

Funny you mentioned it! In fact I right after started thinking about a
"generic type" like:

typedef struct {
union {
struct vecVal *vector
void *pointer; /* a generic pointer */
int32_t num_i32; /* 32 bit integer */
float num_flt; /* a float */
... /* other values one may be interested */
} val;
uint16_t type;
} vecVal;

The "ugliness" of this would be the many conversion functions that will be
needed to properly handle those values in C ...

Anyone can think of a better method to express the concept of "generic
typed value" in C?
well, I had usually approached it though the use of "opaqueness".

basically, we have a value, in some opaque form, and use a system of wrap,
unwrap, and check functions.

in my most recent system, this looks like this:
dyt dyint(int v); //wrap integer
int dyintv(dyt p); //unwrap integer
int dyintp(dyt p); //type is an integer?...

where 'dyt' was basically a wrapper for an anonymous void pointer.

one ends up with a potentially large number of utility functions (a natural
cost of this approach), although, if one makes use of the dynamic-type
properties, one can cut down the costs substantially vs what they could
be...

of course, 'type' was built in as a part of the allocator, not as part of
the object as in the example above, so, one could do something like:
byte *a;
a=gctalloc("_bytearray_t", 1024);

where 'a' looks just like it came from something like malloc.

and, later be like:
if(gctypep(a, "_bytearray_t")) //check is a is a byte array
... //something specific to byte arrays

>one can arrive at a dynamic type system (and, of all things, see why this
kind of thing is useful, even in the core of C-land...).

I guess one will have to find a point of balance between features and
usability.
yes.

and some options are better than others.
IMO, it is difficult to find the best possible balance (even as recently
demonstated), where I have ping-ponged to some extent between 'dynamically
typed void pointers' and 'tagged references' (where the value is essentially
an integer, using some of the bits for the type, and the rest for either
referring to objects, or for holding the value itself).

usually, the tradeoff is speed vs usability:
typed void pointers are easier to use, but not as fast nor as capable;
tagged references are more difficult, but are usually somewhat faster and
more capable.

a simple example:
pointers usually require some method of extracting and comparing the type;
with tagged references, it is often a mask and equality comparrison (and is
thus usually handled by a macro):
#define INTP(x) (((x)&3)==1)

likewise, if we are fairly sure of the type, we can usually extract it with
a shift:
#define INTVF(x) (((int)(x))>>2)

or, perform arithmetic directly on the tagged values...
but, this ability comes at high cost, as working with them, in some cases,
can become very painful...

one can end up forced inflexibly into a very rigid type system (both good
and bad in some ways), wheras typed pointers allow one to more informally
build a system that may involve structs or other ad-hoc constructions...

as a result, I have more often ended up using tagged references for
interpreters, and void pointers for general coding (and, in the cases I have
mixed them up, the results are not good, such as horribly painful and
non-reusable code in one case, and a horribly slow and garbage spewing
interpreter in the other...).

>of course, it is not a far reach to realize after having something like
this, a garbage collector would also be very useful...

That's something I would like to have :). But then, I fear, it would not
be C any longer. It might be D or some other language...
I once had a look at the Boehm-Demers-Weiser GC but I felt rather
uncomforable with it. Not for the concept of GC, which I really like in
other languages, but for the ineherent machine/compiler/OS dependency it
has.
So, my preference would be for easing memory management but not up to the
point of a full GC.
well, one gets what they are willing to pay for...
portability is a cost, among others...

even within the realm of GC, there is a lot of room for costs and
tradeoffs...

>so, you may have stumbled on the first step, or maybe one can go a
different direction?...
Surely I might! What I would value most is a discussion on alternative
directions!
yes.

Remo.D

Oct 27 '07 #8

"Remo D." <rdentatowrote in message
news:47***********************@reader2.news.tin.it ...
Tor Rustad ha scritto:
>Malcolm McLean wrote:

Need a vector, here it is:
T vector[N];

That would be perfect! Being able to write:
int vector[N]; /* N is the initial size? */

vector[2*N] = 7

instead of:
vector = vecNew(sizeof(int));
vecSetVal(vector, 2*N, 7, int)

would be great.
I think he was making use of satire, and infact refferring to good old
arrays, say:
float vector[3];

But how to achieve it without using a preprocessor? As someone rightly
pointed out in another mail, the discussion is on the API not on the
implementation but I can't see how to introduce dynamic vectors (that can
grow/shrink during usage) still using the C notation for arrays.

BTW if I remember well, D goes exactly in that direction extending the
array notation to include resizable arrays, but we should stay within the
C boundaries.

Remo.D

Oct 27 '07 #9
cr88192 ha scritto:
"Remo D." <rdentatowrote in message
>>Tor Rustad wrote:
Need a vector, here it is:
T vector[N];
That would be perfect!

I think he was making use of satire, and infact refferring to good old
arrays, say:
Ops. I always put trust in what the other says assuming their
willingness to contibute, so I was inclined to hope he knew much better
than me, hoping he would explain how to achive that :)
Remo.D
Oct 27 '07 #10

"Remo D." <rdentatowrote in message
news:47***********************@reader2.news.tin.it ...
cr88192 ha scritto:
>"Remo D." <rdentatowrote in message
>>>Tor Rustad wrote:
Need a vector, here it is:
T vector[N];
That would be perfect!

I think he was making use of satire, and infact refferring to good old
arrays, say:

Ops. I always put trust in what the other says assuming their willingness
to contibute, so I was inclined to hope he knew much better than me,
hoping he would explain how to achive that :)
I'd also expect the vector to be able to tell me that, at present, it has no
data within it. And to expand if I want to add a dimension. And to fail
nicely if the computer runs out of memory.
Can't be done even with C99 vectors, which will take a dimension determined
at runtime. However I've been accused of having a long list of things I
dislike about C. Let's say upfront that there is an efficiency and
simplicity of implementation vs convenience and simplicity of use issue
here.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Oct 27 '07 #11
Remo D. wrote:
cr88192 ha scritto:
>"Remo D." <rdentatowrote in message
>>>Tor Rustad wrote:
Need a vector, here it is:
T vector[N];
That would be perfect!

I think he was making use of satire, and infact refferring to good old
arrays, say:

Ops. I always put trust in what the other says assuming their
willingness to contibute, so I was inclined to hope he knew much better
than me, hoping he would explain how to achive that :)
FYI, C experts are working on a TR for dynamic string management, as a
alternative for "TR 24731 - Part I: Bounds-checking interfaces".

Methinks, when that draft arrives, it's time to contribute constructively.

--
Tor <bw****@wvtqvm.vw | tr i-za-h a-z>

My C page: http://www.pg2.moo.no/C/index.html
-include Win32 build of splint 3.1.2
Oct 28 '07 #12
On Fri, 26 Oct 2007 22:35:59 +0200, "Remo D." <rdentatowrote:
>I've read with much interest the threads on how to create data
containers in C.

I started thinking if simpler ADT (other than the proposed List) could
be used as a starting point.

What about a dynamic vector? I mean a dynamic array with no predefined
size. A possible API could be:

typedef struct {
.... /* appropriate fields to manage the vector data */
} *vec_t;
Better use
- information hiding with a forward declaration and
- a handle to (partially) prevent the void* type loophole

struct vec_imp; // forwad declaration only

typedef struct {
struct vec_imp *imp;
} vec_handle;
>vec_t vecNew (size_t elemsize);
void *vecSet (vec_t v, size_t index, void *elem);
void *vecGet (vec_t v, size_t index);
size_t vecCnt (vec_t v);
vec_t vecFree (vec_t v);

An open question is if vecSet should create a copy of the element or
memory management should be demanded to the programmer.
A generic container can only hold pointers. Take a look at the
array_list implementation in http://oss.metaparadigm.com/json-c/ esp.
how it manages memory with a callback function.
>For simplicy I
would prefer the former (and that's also the reason for the elemsize of
vecNew().
You would use 'elemsize' to dynamically allocate memory and copy the
passed element?
>vecCnt would return the highest index where an element has been stored
with vecSet();

This simple API could be augmented with:

void vecQsort (vec_t v, int (*cmp)(void *, void *));
void *vecBsearch (vec_t v, void *key, int (*cmp)(void *, void *));

That could be the case for other ADT. Once vec has been defined it's
easy to see how to use it to implement stacks, for example.

Comments?
It's worth your while to study the approaches of data structure or
foundation libraries available on the internet. Some of them are large
and used widely, others defunct one-man shows, e.g. GLib, APR, NSPR,
OSSP, Kazlib, cprops, Libc-X, SFL, cbase, libmba, MemSL, ...
--
Roland Pibinger
"The best software is simple, elegant, and full of drama" - Grady Booch
Oct 28 '07 #13
Roland Pibinger ha scritto:
RPOn Fri, 26 Oct 2007 22:35:59 +0200, "Remo D." <rdentatowrote:
RPRDAn open question is if vecSet should create a copy of the
element or
RPRDmemory management should be demanded to the programmer.
RP>
RPA generic container can only hold pointers. Take a look at the
RParray_list implementation in http://oss.metaparadigm.com/json-c/ esp.
RPhow it manages memory with a callback function.

What's the rational behind this restriction? If the datastructure could
handle memory management for me I'd be very glad. It could avoid me the
hassle of doing a malloc()/free() for each element. What's the drawback
you see?

RPRDFor simplicy I
RPRDwould prefer the former (and that's also the reason for the
elemsize of
RPRDvecNew().
RP>
RPYou would use 'elemsize' to dynamically allocate memory and copy the
RPpassed element?

That would be implementation dependant. Another approach could be to
implement the resizable vector with an array to be reallocated when the
new size exceeds the current size.
I'm currently using the approach described in "Algorithm Alley: Fast and
Small Resizable Arrays" an article from DDJ:
(http://www.ddj.com/architect/184404698). The original technical report
is "Resizable Arrays in Optimal Time and Space"
(http://citeseer.ist.psu.edu/brodnik99resizable.html)

It doesn't use realloc() (avoiding copying the elements in the array
over and over again) and never wastes more memory than sqrt(n) (whereas
the classsical redoubling strategy can waste memory proportionally to n)
It's worth your while to study the approaches of data structure or
foundation libraries available on the internet.
The discussion started around the abstract definition of an API for
generic containers. "If we could have in C something like the C++ STL,
what should the API look like?".

I had a look to some of them but the idea was to have a discussion on
the merits and demerits of an approach vs another.

Do you think any of the cited libraries have a sintax for the API that
would be worthwile to reccomend/adopt?

Remo.D


Oct 28 '07 #14
On Sun, 28 Oct 2007 13:32:42 +0100, "Remo D." <rdentatowrote:
>Roland Pibinger ha scritto:
>If the datastructure could
handle memory management for me I'd be very glad. It could avoid me the
hassle of doing a malloc()/free() for each element. What's the drawback
you see?
I'm trying to imagine how your dynamic vector is meant to be used. I
guess it's something like this:

struct something {
int i;
// ...
};

void foo() {
struct something proto;
vec_t myVec = vecNew (sizeof(struct something));
proto.i = 3;
vecAdd (myVec, &proto);
proto.i = 4;
vecAdd (myVec, &proto);
// ..
vecFree (myVec);
}

A local object is created and copied into the dynamic vector. Dynamic
allocation is done only by the vector. This works unless struct
something contains a pointer to dynamically allocated memory. The
latter case cannot be handled by a 'generic' vector.
>I had a look to some of them but the idea was to have a discussion on
the merits and demerits of an approach vs another.

Do you think any of the cited libraries have a sintax for the API that
would be worthwile to reccomend/adopt?
You can always learn from the merits and mistakes of others. I would
start with a simple approach like the mentioned array_list and proceed
from there.
--
Roland Pibinger
"The best software is simple, elegant, and full of drama" - Grady Booch
Oct 28 '07 #15

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

13
by: Dylan | last post by:
I'd like to compare two containers. They should be considered equivalent if both containers have the same number of elements with the same values, no matter what order the values are in. For...
14
by: phil_gg04 | last post by:
Dear C++ Experts, Over the last couple of months I have been writing my first program using shared memory. It has been something of an "in-at-the-deep-end" experience, to say the least. At...
18
by: Matthias Kaeppler | last post by:
Hi, in my program, I have to sort containers of objects which can be 2000 items big in some cases. Since STL containers are based around copying and since I need to sort these containers quite...
8
by: Gregory | last post by:
I have a question about using STL containers in C++ class public interface. Lets say that I want to return some container from class method or accept class method parameter as some container. For...
2
by: bob | last post by:
Hi, Given: 1) Custom containers that have nothing to do with vector, deque, list, map etc, 2) a custom version of new and delete defined for these containers, customNew and customDelete,...
1
by: Jean-Marc Blaise | last post by:
IBM recommends to place each ts container on a different physical disk. What about the impacts if all containers (DMS) are in the same FS (supported by multiple disks) ? Does DB2 preallocate...
4
by: Michel Esber | last post by:
Hello, DB2 V8 FP12 LUW. My system is under heavy wait-io times. I have read the following article: http://tinyurl.com/p844z and it says that prefetchsize should be the product of extentsize...
15
by: Nindi73 | last post by:
HI If I define the class DoubleMap such that struct DoubleMap : public std::map<std::string, double>{}; Is there any overhead in calling std::map member functions ? Moreover are STL...
11
by: jimxoch | last post by:
Hi list, Most STL containers are storing their data on the heap. (although some std::string implementations are notable exceptions) Of course, using the heap as storage increases flexibility and...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.