473,406 Members | 2,378 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,406 software developers and data experts.

A string collection abstract data type

Abstract:

Continuing the discussion about abstract data types, in this
discussion group, a string collection data type is presented,
patterned after the collection in C# and similar languages (Java).
It stores character strings, and resizes itself to accommodate
new strings when needed.

Interface:
----------

The data structure uses a table of function pointers to provide an
extensible basis for many functions without cluttering the
user's workspace with too many names.

This architecture is not only extensible at the API level, but
it allows "subclassing". If the user is unsatisfied with some
of the functions of the API, he/she can:

o Replace one or more of the function pointers in the table
with a function of his/her own.

o Store somewhere the old function pointer, and replace it with
a function of his own that after doing some work calls the
original function pointer, optionally doing some work after the
original function returns.

This type of extensibility can only be achieved with function pointers.
The basic problem with many of the proposals presented here is that
they present too much names that can conflict with user names.

This hiding of names is useful in another way, since it allows the API
to use completely generic names like "Add" or similar without any
ambiguity.

Since those generic names can be used in *other* abstract data types
that use the same type of structure, the user code can be made truly
general, and it is easy to change from a string collection to a list
without too much trouble.

The objective would be to make a standard way of naming things within
a container, so that user code is shielded from change when passing from
one container to another.

The name of the table of functions member is "lpVtbl" following
an old function naming convention under the windows OS and in
the COM system. It can be changed of course.

The interface is described in the header file <strcollection.h>
---------------------------------------------------------------------
#include <string.h>
// Forward declaration of the string collection type
typedef struct _StringCollection StringCollection;
typedef struct {
// Returns the number of elements stored
int (*GetCount)(StringCollection *SC);

// Is this collection read only?
int (*IsReadOnly)(StringCollection *SC);

// Sets this collection read-only or unsets the read-only flag
int (*SetReadOnly)(StringCollection *SC,int flag);

// Adds one element at the end. Given string is copied
int (*Add)(StringCollection *SC,char *newval);

// Adds a NULL terminated table of strings
int (*AddRange)(StringCollection *SC,char **newvalues);

// Clears all data and frees the memory
int (*Clear)(StringCollection *SC);

//Case sensitive search of a character string in the data
int (*Contains)(StringCollection *SC,char *str);

// Copies all strings into a NULL terminated vector
char **(*CopyTo)(StringCollection *SC);

//Returns the index of the given string or -1 if not found
int (*IndexOf)(StringCollection *SC,char *SearchedString);

// Inserts a string at the position zero.
int (*Insert)(StringCollection *SC,char *);

// Inserts a string at the given position
int (*InsertAt)(StringCollection *SC,int idx,char *newval);

// Returns the string at the given position
char *(*IndexAt)(StringCollection *SC,int idx);

// Removes the given string if found
int (*Remove)(StringCollection *SC,char *);

//Removes the string at the indicated position
int (*RemoveAt)(StringCollection *SC,int idx);

// Frees the memory used by the collection
int (*Finalize)(StringCollection *SC);

// Returns the current capacity of the collection
int (*GetCapacity)(StringCollection *SC);

// Sets the capacity if there are no items in the collection
int (*SetCapacity)(StringCollection *SC,int newCapacity);

// Calls the given function for all strings.
// "Arg" is a user supplied argument (that can be NULL)
// which is passed to the function to call
void (*Apply)(StringCollection *SC,int (*Applyfn)(char *,void *
arg),void *arg);

// Calls the given function for each string and saves
// all results in an integer vector
int *(*Map)(StringCollection *SC,int (*Applyfn)(char *));

// Pushes a string, using the collection as a stack
int (*Push)(StringCollection *SC,char *str);

// Pops the last string off the collection
char * (*Pop)(StringCollection *SC);

// Replaces the character string at the given position
// with a new one
char *(*ReplaceAt)(StringCollection *SC,int idx,char *newval);

} StringCollectionFunctions;

// Definition of the String Collection type
struct _StringCollection {
StringCollectionFunctions *lpVtbl; // The table of functions
size_t count; /* in element size units */
char **contents; /* The contents of the collection */
size_t capacity; /* in element_size units */
unsigned int flags; // Read-only or other flags
};
// This is the only exported function from this module
StringCollection * newStringCollection(int startsize);

------------------------------------------end of stringcollection.h

Implementation
--------------

I haven't had the time to comment this more in depth. I hope that
this will be done in the ensuing discussion.

------------------------------------------stringcollection.c
#include "strcollection.h"
// Forward definitions
static int GetCount( struct _StringCollection *SC);
static int IsReadOnly( struct _StringCollection *SC);
static int SetReadOnly( struct _StringCollection *SC,int newval);
static int Add( struct _StringCollection *SC,char *newval);
static int AddRange( struct _StringCollection *SC,char **newvalues);
static int Clear( struct _StringCollection *SC);
static int Contains( struct _StringCollection *SC,char *str);
static char **CopyTo( struct _StringCollection *SC);
static int IndexOf( struct _StringCollection *SC,char *SearchedString);
static int Insert( struct _StringCollection *SC,char *);
static int InsertAt( struct _StringCollection *SC,int idx,char *newval);
static char *IndexAt( struct _StringCollection *SC,int idx);
static int Remove( struct _StringCollection *SC,char *);
static int RemoveAt( struct _StringCollection *SC,int idx);
static int Finalize( struct _StringCollection *SC);
static int GetCapacity( struct _StringCollection *SC);
static int SetCapacity( struct _StringCollection *SC,int newCapacity);
static void Apply(struct _StringCollection *SC,int(*Applyfn)(char *,void
*),void *arg);
static int *Map(struct _StringCollection *SC,int (*Applyfn)(char *));
static int Push(struct _StringCollection *SC,char *str);
static char *Pop(struct _StringCollection *SC);
static char *ReplaceAt(struct _StringCollection *SC,int idx,char *newval);
static StringCollectionFunctions lpVtableSC = {
GetCount, IsReadOnly, SetReadOnly, Add, AddRange,
Clear, Contains, CopyTo, IndexOf, Insert,
InsertAt, IndexAt, Remove, RemoveAt,
Finalize, GetCapacity, SetCapacity, Apply,
Map, Push, Pop, ReplaceAt,
};

static char *DuplicateString(char *str)
{
char *result;
if (str == NULL)
return NULL;
result = MALLOC(strlen(str)+1);
if (result == NULL)
return NULL;
strcpy(result,str);
return result;
}

#define SC_READONLY 1
#define CHUNKSIZE 20

StringCollection * newStringCollection(int startsize)
{
StringCollection *result = MALLOC(sizeof(StringCollection));
if (result == NULL)
return NULL;
result->count = 0;
if (startsize == 0)
startsize = DEFAULT_START_SIZE;
result->contents = MALLOC(startsize*sizeof(char *));
if (result->contents == NULL) {
FREE(result);
return NULL;
}
memset(result->contents,0,sizeof(char *)*startsize);
result->capacity = startsize;
result->count = 0;
result->flags = 0;
result->lpVtbl = &lpVtableSC;
return result;
}

static int GetCount(struct _StringCollection *SC)
{
return SC->count;
}
static int IsReadOnly(struct _StringCollection *SC)
{
return SC->flags * SC_READONLY ? 1 : 0;
}
static int SetReadOnly(struct _StringCollection *SC,int newval)
{
int oldval = SC->flags * SC_READONLY ? 1 : 0;
if (newval)
SC->flags |= SC_READONLY;
else
SC->flags &= ~SC_READONLY;
return oldval;
}

static int Resize(struct _StringCollection *SC)
{
int newcapacity = SC->capacity + CHUNKSIZE;
char **oldcontents = SC->contents;
SC->contents = MALLOC(newcapacity*sizeof(char *));
if (SC->contents == NULL) {
SC->contents = oldcontents;
return 0;
}
memset(SC->contents,0,sizeof(char *)*newcapacity);
memcpy(SC->contents,oldcontents,SC->count*sizeof(char *));
SC->capacity = newcapacity;
return 1;
}

static int Add(struct _StringCollection *SC,char *newval)
{
if (SC->flags & SC_READONLY)
return -1;
if (SC->count >= SC->capacity) {
if (!Resize(SC))
return 0;
}

if (newval) {
SC->contents[SC->count] = DuplicateString(newval);
if (SC->contents[SC->count] == NULL) {
return 0;
}
}
else
SC->contents[SC->count] = NULL;
SC->count++;
return SC->count;
}

static int AddRange(struct _StringCollection * SC,char **data)
{
int i = 0;
if (SC->flags & SC_READONLY)
return 0;
while (data[i] != NULL) {
int r = Add(SC,data[i]);
if (r <= 0)
return r;
i++;
}
return SC->count;
}

static int Clear(struct _StringCollection * SC)
{
int oldval = SC->count,i;
if (SC->flags & SC_READONLY)
return 0;
for (i=0; i<SC->count;i++) {
FREE(SC->contents[i]);
SC->contents[i] = NULL;
}
SC->count = 0;
return oldval;
}

static int Contains(struct _StringCollection * SC,char *str)
{
int c,i;
if (str == NULL)
return 0;
c = *str;
for (i=0; i<SC->count;i++) {
if (c == SC->contents[i][0] && !strcmp(SC->contents[i],str))
return 1;
}
return 0;
}

static char **CopyTo(struct _StringCollection *SC)
{
char **result = MALLOC((1+SC->count)*sizeof(char *));
int i;
if (result == NULL)
return NULL;
for (i=0; i<SC->count;i++) {
result[i] = DuplicateString(SC->contents[i]);
}
result[i] = NULL;
return result;
}

#ifdef __LCC__
/* The lcc-win compiler allows operator overloading, and allows using
this abstract data type with the more natural [ ] operators */
char * __declspec(naked) operator[](StringCollection *SC, int idx)
{
}
#endif
static int IndexOf(struct _StringCollection *SC,char *str)
{
int i;

for (i=0; i<SC->count;i++) {
if (!strcmp(SC->contents[i],str)) {
return i;
}
}
return -1;
}
static char *IndexAt(struct _StringCollection *SC,int idx)
{
if (idx >=SC->count || idx < 0)
return NULL;
return SC->contents[idx];
}

static int InsertAt(struct _StringCollection *SC,int idx,char *newval)
{
if (SC->flags & SC_READONLY)
return 0;
if (SC->count >= (SC->capacity-1)) {
if (!Resize(SC))
return 0;
}
if (idx < 0 || idx SC->count)
return -1;
if (idx == 0) {
if (SC->count 0)
memmove(SC->contents+1,SC->contents,SC->count*sizeof(char *));
SC->contents[0] = newval;
}
else if (idx == SC->count) {
SC->contents[idx] = newval;
}
else if (idx < SC->count) {

memmove(SC->contents+idx+1,SC->contents+idx,(SC->count-idx+1)*sizeof(char
*));
SC->contents[idx] = newval;
}
SC->count++;
return SC->count;
}

static int Insert(struct _StringCollection *SC,char *newval)
{
if (SC->flags & SC_READONLY)
return 0;
return InsertAt(SC,0,newval);
}

static int RemoveAt(struct _StringCollection *SC,int idx)
{
if (idx >= SC->count || idx < 0)
return -1;
if (SC->count == 0)
return -2;
FREE(SC->contents[idx]);
memmove(SC->contents+idx,SC->contents+idx+1,(SC->count-idx)*sizeof(char
*));
SC->contents[SC->count-1]=NULL;
SC->count--;
return SC->count;
}

static int Remove(struct _StringCollection *SC,char *str)
{
int idx = IndexOf(SC,str);
if (idx < 0)
return idx;
return RemoveAt(SC,idx);
}

static int Push(struct _StringCollection *SC,char *str)
{
char *r;
if (SC->flags&SC_READONLY)
return 0;
if (SC->count >= SC->capacity) {
if (!Resize(SC))
return 0;
}
r = DuplicateString(str);
if (r == NULL)
return 0;
SC->contents[SC->count++] = r;
return SC->count;
}

static char * Pop(struct _StringCollection *SC)
{
char *result;
if ((SC->flags&SC_READONLY) || SC->count == 0)
return NULL;
SC->count--;
result = SC->contents[SC->count];
SC->contents[SC->count] = NULL;
return result;
}

static int Finalize(struct _StringCollection *SC)
{
int result = SC->count,i;
for (i=0; i<SC->count;i++) {
FREE(SC->contents[i]);
}
FREE(SC->contents);
FREE(SC);
return result;
}

static int GetCapacity(struct _StringCollection *SC)
{
return SC->capacity;
}

static int SetCapacity(struct _StringCollection *SC,int newCapacity)
{
if (SC->count != 0)
return 0;
FREE(SC->contents);
SC->contents = MALLOC(newCapacity*sizeof(char *));
memset(SC->contents,0,sizeof(char *)*newCapacity);
SC->capacity = newCapacity;
return 1;
}

static void Apply(struct _StringCollection *SC,int (*Applyfn)(char
*,void *),void *arg)
{
int i;
for (i=0; i<SC->count;i++) {
Applyfn(SC->contents[i],arg);
}
}

static int *Map(struct _StringCollection *SC,int (*Applyfn)(char *))
{
int *result = MALLOC(SC->count*sizeof(int)),i;

if (result == NULL)
return NULL;
for (i=0; i<SC->count;i++) {
result[i] = Applyfn(SC->contents[i]);
}
return result;
}

#ifdef __LCC__
/* The lcc-win compiler allows operator overloading, and allows using
this abstract data type with the more natural [ ]= operators */
char * __declspec(naked) operator[]=(StringCollection *SC,int idx,char
*newval)
{
}

static char *ReplaceAt(StringCollection *SC,int idx,char *newval)
{
if (SC->flags & SC_READONLY)
return NULL;
if (idx < 0 || idx SC->count)
return NULL;
FREE(SC->contents[idx]);
SC->contents[idx] = newval;
return newval;
}
#ifdef TEST
#include <stdio.h>
static void PrintStringCollection(StringCollection *SC)
{
int i;
printf("Count %d, Capacity %d\n",SC->count,SC->capacity);
for (i=0; i<SC->count;i++) {
printf("%s\n",SC->lpVtbl->IndexAt(SC,i));
}
printf("\n");
}
int main(void)
{
StringCollection *SC = newStringCollection(10);
char *p;
SC->lpVtbl->Add(SC,"Martin");
SC->lpVtbl->Insert(SC,"Jakob");
printf("Count should be 2, is %d\n",SC->lpVtbl->GetCount(SC));
PrintStringCollection(SC);
SC->lpVtbl->InsertAt(SC,1,"Position 1");
SC->lpVtbl->InsertAt(SC,2,"Position 2");
PrintStringCollection(SC);
SC->lpVtbl->Remove(SC,"Jakob");
PrintStringCollection(SC);
SC->lpVtbl->Push(SC,"pushed");
PrintStringCollection(SC);
SC->lpVtbl->Pop(SC);
PrintStringCollection(SC);
p = SC->lpVtbl->IndexAt(SC,1);
printf("Item position 1:%s\n",p);
PrintStringCollection(SC);
}
#endif

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Oct 28 '07 #1
3 5061
Charlie Gordon wrote:
"jacob navia" <ja***@nospam.orga écrit dans le message de news:
47***********************@news.orange.fr...

<original post snipped>

Thank you for your contribution Jacob.
The code was flushed left probably because of tabs, I reformated it and
fixed a number of small and not so small issues. See below:

Problems fixed:

added API remarks
fixed missing #endif
fixed indentation and spacing
used typedef instead of struct tag in implementation
added const on unmodified string parameters
#include <stddef.hinstead of <string.h>
fixed some bogus bit tests: SC->flags * SC_READONLY ?
added definition for DEFAULT_START_SIZE, allowed it to be 0
removed redundant initialization of result->count
used safer MALLOC idiom (à la c.l.c)
fixed off by one error in ReplaceAt
fixed memory leak in Resize
fixed crash on NULL strings in Contains and IndexOf
used same semantics and method in IndexOf and Contains and shared code
fixed two off by one errors in InsertAt
simplified InsertAt
removed test for condition never reached in RemoveAt
fixed off by one bug in RemoveAt
make SetCapacity work for a non empty collection
added test for MALLOC failure in SetCapacity
duplicate string in Insert, InsertAt, ReplaceAt; handle NULL and failure
allow ReplaceAt with an index of SC->count
used int instead of size_t for count and size for consistency with API.
improved PrintStringCollection and test main output
added missing return 0 in main
Thanks a lot!

The original version of this code is written for
my compiler system and distributed with the source code
in the lcc-win distribution.

I eliminated the lcc-win specific parts in this code.
When doing this, I may have introduced some of the bugs
above. Specifically the replacing of references with pointers
intoruced the bug replacing the "&" in an AND operation
with a multiplication sign!

When I comment about it then, I will refer to this version.

[snip code]

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Oct 29 '07 #2
On Sun, 28 Oct 2007 20:52:11 +0100, jacob navia wrote:
>Abstract:
Continuing the discussion about abstract data types, in this
discussion group, a string collection data type is presented,
patterned after the collection in C# and similar languages (Java).
It stores character strings, and resizes itself to accommodate
new strings when needed.
A collection for strings certainly can be useful in many cases. I have
two major objection to the current design:

1. The function pointers are unnecessary. I can hardly imagine anyone
wants to override/replace some of the container functions. OTOH, the
function pointers make usage clumsy. One has to write:
SC->lpVtbl->Remove(SC,"Jakob");
instead of just:
Remove(SC,"Jakob");

Actually, I don't see why the vtable is exposed to the user at all.
Even if you insist on function pointers you can implement something
like:

static
int Remove (StringCollection *SC, const char *str) {
return SC->lpVtbl->Remove(SC, str);
}

2. Memory management:
Strings in the container are managed by the container (via
Finalize()). But in some cases the returned strings must be freed by
the caller. A clear and unambiguous strategy would prevent memory
leaks.
--
Roland Pibinger
"The best software is simple, elegant, and full of drama" - Grady Booch
Oct 29 '07 #3
Roland Pibinger wrote:
On Sun, 28 Oct 2007 20:52:11 +0100, jacob navia wrote:
>Abstract:
Continuing the discussion about abstract data types, in this
discussion group, a string collection data type is presented,
patterned after the collection in C# and similar languages (Java).
It stores character strings, and resizes itself to accommodate
new strings when needed.

A collection for strings certainly can be useful in many cases. I have
two major objection to the current design:

1. The function pointers are unnecessary. I can hardly imagine anyone
wants to override/replace some of the container functions. OTOH, the
function pointers make usage clumsy. One has to write:
SC->lpVtbl->Remove(SC,"Jakob");
instead of just:
Remove(SC,"Jakob");
This is certainly possible woth some macros.

However, the key idea of the design is precisely to give several
containers the SAME interface, so that if you decide to
replace a string collection with a list, you can still have

SC->lpVtbl->Remove(SC,"Jacob");

without changing a single line from your user code besides a few
declarations.

An added advantage is that it could very well exist some "Remove"
function in a LOT of user code, so that we would be forced to
use SOME kind of identifier prefix like

CLC_LIST_Remove

or similar anyway.

Those macros are so trivial to write that they could be proposed but
not be a forced part of the interface.
Actually, I don't see why the vtable is exposed to the user at all.
Even if you insist on function pointers you can implement something
like:

static
int Remove (StringCollection *SC, const char *str) {
return SC->lpVtbl->Remove(SC, str);
}

The advantages of exposing the vtable is to give the user the freedom to
change to behavior of the containers as he/she wishes WITHOUT changing
a single line in the user code.

For instance if you want to store only unique strings in the table,
and keep it sorted at all times, you just change the table of
functions and add the new functionality WITHOUT any change
to all your code at all!!

Besides, this gives the user the possibility of changing the behavior
of only ONE instance or changing the behavior of ALL instances
without any limitations with a VERY fast pointer assignment operation!

>
2. Memory management:
Strings in the container are managed by the container (via
Finalize()). But in some cases the returned strings must be freed by
the caller. A clear and unambiguous strategy would prevent memory
leaks.

This is true, but I do not see any easy way out. When I return a string
should I allocate a new copy and free the one in the container?

But maybe there is a better solution. You have a valid point.
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
Oct 29 '07 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
by: Neil Schemenauer | last post by:
The title is perhaps a little too grandiose but it's the best I could think of. The change is really not large. Personally, I would be happy enough if only %s was changed and the built-in was...
9
by: craig | last post by:
Assume that you would like to create a custom collection class that is: 1. Strongly-typed (only holds Customer objects) 2. Read-only (user cannot add Customer objects) 3. Able to be bound to...
11
by: Pavils Jurjans | last post by:
Hello, There's some confusion about the purpose and difference between these handy classes... First, both of them are holding number of key - value pairs, right? Then, I see that there may be...
6
by: Dan V. | last post by:
I would like to create a 2D string list (2D ArrayList ???). I would like to pass in a table or query as a parameter and have both columns transform into a 2D ArrayList. When I sort the one...
5
by: aa7im | last post by:
I am attempting to create a base collection that I can use to derive strongly typed collection. My base collection class will derive from CollectionBase and I want my typed collections to be...
2
by: dcew | last post by:
Here's what I'm trying to understand; how can you store a generic collection in a variable/field? If I have an abstract generic collection class as follows... public abstract class...
1
by: conchur | last post by:
Is there an elegant way to have a class utilise an inherited method to return a Typed Collection of itself? Example: class MyList<T> : List<T> where T : MyRoot { Add... Remove... }
2
by: cody | last post by:
Hi! we are building an xml-export tool to export our business entities from out app. we use reflection to determine the data type of properties. e.g. we have a class Customer with a CustNo...
14
by: cwineman | last post by:
Hello, I'm hoping to do something using Generics, but I'm not sure it's possible. Let's say I want to have a bunch of business objects and a data access class cooresponding to each business...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.