
Best way to allocate a large amount of data

I have a program that requires x strings all of y length. x will be in the range
of 100-10000 whereas the strings will all be < 200 each.

This does not need to be grown once it has been created.

Should I allocate x strings of y length or should I allocate a single string x *
y long? Which would be more efficient and / or portable?

Thank you.
Nov 14 '05 #1


Peter Hickman wrote:
> I have a program that requires x strings all of y length. x will be in
> the range of 100-10000 whereas the strings will all be < 200 each.
>
> This does not need to be grown once it has been created.
>
> Should I allocate x strings of y length or should I allocate a single
> string x * y long? Which would be more efficient and / or portable?

Allocate x strings of y length. Something like this (untested):

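#include <stdlib.h>

/* Allocate x strings of y bytes each; returns NULL if any allocation fails. */
char **alloc_strings(size_t x, size_t y)
{
    char **mystrings = malloc(x * sizeof *mystrings);
    if (mystrings != NULL) {
        size_t i;
        for (i = 0; i < x; i++) {
            if ((mystrings[i] = malloc(y)) == NULL) {
                /* Undo the allocations made so far, then give up. */
                while (i--)
                    free(mystrings[i]);
                free(mystrings);
                return NULL;
            }
        }
    }
    return mystrings;
}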

Efficiency depends on what you plan to do with all the strings. ;-)

> Thank you.


Bjørn
Nov 14 '05 #2


"Peter Hickman" <pe***@semantico.com> wrote in message
news:41***********************@news.easynet.co.uk. ..
> I have a program that requires x strings all of y length. x will be in the range of 100-10000 whereas the strings will all be < 200 each.
>
> This does not need to be grown once it has been created.
>
> Should I allocate x strings of y length or should I allocate a single string x * y long? Which would be more efficient and / or portable?


You're best off storing them in a string array.

Allocating a single string of x*y length is a bad idea. One problem is
that you'll need the null terminator or a sentinel character to separate the
strings. This would force you to rewrite many of the string functions that
you need, and add housekeeping data (which is not good if you're concerned
about memory space).

Of course, you could have one string that is 200 characters long and x-1 strings
that are 1 character long. But I'm assuming the variation in lengths isn't
that big, and that you don't know any of the data in advance.
Nov 14 '05 #3

On Tue, 30 Nov 2004 12:54:23 +0000, Peter Hickman wrote:
> I have a program that requires x strings all of y length. x will be in
> the range of 100-10000 whereas the strings will all be < 200 each.
>
> This does not need to be grown once it has been created.
>
> Should I allocate x strings of y length or should I allocate a single
> string x * y long? Which would be more efficient and / or portable?
>
> Thank you.


The simplest approach would be to allocate memory for each string
individually. 10000 strings isn't a HUGE number (depending on your
environment) and overheads of separate allocation may not be significant.

Allocating one large memory block for all of the strings is likely to be
more efficient in terms of speed and space. You have to write the code to
suballocate from that block but that isn't very tricky. You can't
realloc() for individual strings and you can only free() everything in one
go, which is very simple if that is what you need.
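
The suballocation described above can be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the names str_pool, pool_init, pool_take and pool_free are made up for the example, and it assumes the pool only ever holds char data, so no alignment handling is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout: str_pool, pool_init, pool_take, pool_free. */
struct str_pool {
    char  *base;   /* start of the single malloc'd block */
    size_t size;   /* total bytes in the block */
    size_t used;   /* bytes handed out so far */
};

/* Reserve one block of `size` bytes; returns 0 on success, -1 on failure. */
int pool_init(struct str_pool *p, size_t size)
{
    p->base = malloc(size);
    p->size = (p->base != NULL) ? size : 0;
    p->used = 0;
    return (p->base != NULL) ? 0 : -1;
}

/* Hand out the next `n` bytes, or NULL if the pool is exhausted. */
char *pool_take(struct str_pool *p, size_t n)
{
    char *s;

    assert(p->used <= p->size);
    if (n > p->size - p->used)
        return NULL;
    s = p->base + p->used;
    p->used += n;
    return s;
}

/* Individual strings cannot be freed; the whole pool goes at once. */
void pool_free(struct str_pool *p)
{
    free(p->base);
    p->base = NULL;
    p->size = p->used = 0;
}
```

For x strings of y bytes each, you would call pool_init(&p, x * y) once and pool_take(&p, y) x times. The trade-off is the one already mentioned: there is no per-string realloc() or free(), only pool_free() for everything.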

On the portability side there are implementations that can allocate lots
of little objects but not one big object of the same total size. For
example, some 16-bit implementations limit the size of any one object to
below 64K but permit the total for all allocations to exceed that. However
a couple of megabytes isn't a particularly large allocation these days and
it is reasonable not to worry about that unless you have a particular
reason to do so.
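
If the size of a single x * y allocation is a concern, one cheap precaution is to check that the product even fits in a size_t before calling malloc(). A minimal sketch, where alloc_block is a hypothetical helper name and y is assumed to include room for the terminator:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate x * y bytes, refusing if the product
 * would not fit in a size_t. */
char *alloc_block(size_t x, size_t y)
{
    assert(y > 0);                 /* each string needs at least its terminator */
    if (x == 0 || x > SIZE_MAX / y)
        return NULL;               /* nothing to allocate, or x * y would wrap */
    return malloc(x * y);
}
```

This only guards against arithmetic wrap-around. An implementation whose per-object limit is lower than SIZE_MAX will simply make malloc() return NULL, which the caller has to check for anyway.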

Lawrence
Nov 14 '05 #4

Sorry I'm being a bit sloppy with my wording here. The length of a string is
likely to be less than 200 characters but all strings will be the same length,
whatever that length is.

What worries me is that allocating a large number of small strings may eat up
resources out of proportion to the data they hold. So it would seem that a large
block of memory would be a good idea but I don't know if allocating a single
large chunk of data has problems of its own.

Until the 64K limit of some older systems was mentioned I had clean forgotten
about it.
Nov 14 '05 #5

On Wed, 01 Dec 2004 11:28:27 +0000, Peter Hickman wrote:
> Sorry I'm being a bit sloppy with my wording here. The length of a string is
> likely to be less than 200 characters but all strings will be the same length,
> whatever that length is.
>
> What worries me is that allocating a large number of small strings may eat up
> resources out of proportion to the data they hold. So it would seem that a large
> block of memory would be a good idea but I don't know if allocating a single
> large chunk of data has problems of its own.
>
> Until the 64K limit of some older systems was mentioned I had clean forgotten
> about it.


If you create for yourself, say, an array of pointers to char which you
set up and use to access the strings, it becomes almost immaterial which
method you use to allocate (and free) them, because the access method will
be consistent. If you implement one allocation method and don't like it
you can alter the allocation code later on without affecting the code that
accesses the string data.
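
To make that concrete, here is a rough sketch of one way to hide the allocation method behind an array of pointers to char. Everything named here (str_table, table_init_block, table_init_separate, table_get, table_free) is invented for the illustration, and it assumes x and y are non-zero and small enough that x * y does not overflow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: callers only ever go through str[i]. */
struct str_table {
    char **str;     /* str[i] points at string i */
    char  *block;   /* backing block, or NULL if each string was malloc'd separately */
    size_t count;
};

/* Backend 1: one malloc for all the string data. */
int table_init_block(struct str_table *t, size_t x, size_t y)
{
    size_t i;

    assert(x > 0 && y > 0);
    t->count = x;
    t->str = malloc(x * sizeof *t->str);
    t->block = malloc(x * y);
    if (t->str == NULL || t->block == NULL) {
        free(t->str);
        free(t->block);
        return -1;
    }
    for (i = 0; i < x; i++)
        t->str[i] = t->block + i * y;
    return 0;
}

/* Backend 2: one malloc per string. */
int table_init_separate(struct str_table *t, size_t x, size_t y)
{
    size_t i;

    assert(x > 0 && y > 0);
    t->count = x;
    t->block = NULL;
    t->str = malloc(x * sizeof *t->str);
    if (t->str == NULL)
        return -1;
    for (i = 0; i < x; i++) {
        if ((t->str[i] = malloc(y)) == NULL) {
            while (i--)
                free(t->str[i]);
            free(t->str);
            return -1;
        }
    }
    return 0;
}

/* Access is identical whichever backend built the table. */
char *table_get(const struct str_table *t, size_t i)
{
    assert(i < t->count);
    return t->str[i];
}

/* Release the table according to how it was built. */
void table_free(struct str_table *t)
{
    size_t i;

    if (t->block != NULL)
        free(t->block);
    else
        for (i = 0; i < t->count; i++)
            free(t->str[i]);
    free(t->str);
}
```

Code that reads or writes strings calls table_get() (or indexes str directly) and never needs to know which init function was used, so switching backends later is a one-line change.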

Lawrence
Nov 14 '05 #6

In <41***********************@news.easynet.co.uk> Peter Hickman <pe***@semantico.com> writes:
> Sorry I'm being a bit sloppy with my wording here. The length of a string is
> likely to be less than 200 characters but all strings will be the same length,
> whatever that length is.
>
> What worries me is that allocating a large number of small strings may eat up
> resources out of proportion to the data they hold. So it would seem that a large
> block of memory would be a good idea but I don't know if allocating a single
> large chunk of data has problems of its own.

Far fewer problems than any other approach. If the number of strings is known at
compile time, just use:

static char mystrings[X][Y];

If it's not, use a pointer to an array of Y characters:

char (*mystrings)[Y] = malloc(X * sizeof *mystrings);

In either case, you access mystrings using the same syntax: mystrings[i]
refers to a whole string, while mystrings[i][j] refers to a character from a
string.
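
A small sketch of the pointer-to-array variant with the error check put back in. The typedef row_t and the functions alloc_rows and set_row are hypothetical names introduced for the example, and Y = 200 is just an assumed value taken from the "< 200" figure in the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define Y 200   /* assumed fixed string size, including the terminator */

typedef char row_t[Y];   /* hypothetical name for "array of Y characters" */

/* Allocate x rows of Y chars in one block; the result indexes like a
 * 2D array. Returns NULL on failure; release with a single free(). */
row_t *alloc_rows(size_t x)
{
    row_t *rows = malloc(x * sizeof *rows);
    return rows;
}

/* Copy src into row i; src must fit in Y bytes with its terminator. */
void set_row(row_t *rows, size_t i, const char *src)
{
    assert(strlen(src) < Y);
    strcpy(rows[i], src);
}
```

Because the row size is part of the type, rows[i] steps by exactly Y bytes and sizeof rows[0] is Y, with no extra pointer table to set up or free.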

If Y is not a compile-time constant, either, you also need to allocate
a dope vector, which is one more malloc call, but the syntax is still
the same, except that sizeof *mystrings will no longer be equal to Y
(except by pure accident):

size_t i;
char **mystrings = malloc(X * sizeof *mystrings);  /* X pointers, one per string */
mystrings[0] = malloc(X * Y);                      /* one block for all the data */
for (i = 1; i < X; i++) mystrings[i] = mystrings[i - 1] + Y;

If you ever deallocate the strings, call free(mystrings[0]) first and then
free(mystrings).
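
For reference, the dope vector approach with error checking added might look something like this. The function names alloc_dope and free_dope are invented for the sketch, and it assumes x and y are non-zero and that x * y does not overflow.

```c
#include <assert.h>
#include <stdlib.h>

/* Build a dope vector of x pointers into one block of x * y chars.
 * Returns NULL if either allocation fails. */
char **alloc_dope(size_t x, size_t y)
{
    char **v;
    size_t i;

    assert(x > 0 && y > 0);
    v = malloc(x * sizeof *v);
    if (v == NULL)
        return NULL;
    v[0] = malloc(x * y);
    if (v[0] == NULL) {
        free(v);
        return NULL;
    }
    for (i = 1; i < x; i++)
        v[i] = v[i - 1] + y;
    return v;
}

/* Free the data block first, then the vector that points into it. */
void free_dope(char **v)
{
    if (v != NULL) {
        free(v[0]);
        free(v);
    }
}
```

The order in free_dope matters: once v itself has been freed, v[0] can no longer be read to find the data block.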

Note, however, that although the syntax to access mystrings is the same,
this method is slower than the first two, because each access needs one
more pointer dereference. If this is going to be an issue, you may
want to simply allocate X * Y bytes and do the index arithmetic on your
own (usually using function-like macros).
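
The index arithmetic could be wrapped up along these lines. STR_AT, CHAR_AT and flat_set are hypothetical names chosen for the example; the only assumption is that string i starts at byte offset i * y of a flat block of x * y bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical macros: string i starts at offset i * y in the flat block. */
#define STR_AT(block, y, i)      ((block) + (size_t)(i) * (y))
#define CHAR_AT(block, y, i, j)  ((block)[(size_t)(i) * (y) + (j)])

/* Store src as string i; src must fit in y bytes with its terminator. */
void flat_set(char *block, size_t y, size_t i, const char *src)
{
    assert(strlen(src) < y);
    strcpy(STR_AT(block, y, i), src);
}
```

With these, STR_AT(block, y, i) plays the role of mystrings[i] and CHAR_AT(block, y, i, j) the role of mystrings[i][j], at the cost of passing y around explicitly.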

Error checking deliberately omitted, BTW.

> Until the 64K limit of some older systems was mentioned I had clean forgotten
> about it.


It was a bogus argument: even those older systems provided ways to
allocate objects as large as the available memory, some of them available
to standard C code (e.g. the huge memory model of the MS-DOS C
implementations).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union
Nov 14 '05 #7
