Best way to allocate a large amount of data

I have a program that requires x strings all of y length. x will be in the range
of 100-10000 whereas the strings will all be < 200 each.

This does not need to be grown once it has been created.

Should I allocate x strings of y length or should I allocate a single string x *
y long? Which would be more efficient and / or portable?

Thank you.
Nov 14 '05 #1
Peter Hickman wrote:
I have a program that requires x strings all of y length. x will be in
the range of 100-10000 whereas the strings will all be < 200 each.

This does not need to be grown once it has been created.

Should I allocate x strings of y length or should I allocate a single
string x * y long? Which would be more efficient and / or portable?
Allocate x strings of y length. Something like this (untested):

#include <stdlib.h>

char** alloc_strings(size_t x, size_t y)
{
    char** mystrings = malloc(sizeof *mystrings * x);
    if(mystrings != NULL) {
        size_t i;
        for(i = 0; i < x; i++) {
            if( (mystrings[i] = malloc(y)) == NULL) {
                /* Allocation failed: free what we already have and stop. */
                while(i--)
                    free(mystrings[i]);
                free(mystrings);
                return NULL;
            }
        }
    }
    return mystrings;
}

Efficiency depends on what you plan to do with all the strings. ;-)

Thank you.


Bjørn
Nov 14 '05 #2

"Peter Hickman" <pe***@semantico.com> wrote in message
news:41***********************@news.easynet.co.uk. ..
I have a program that requires x strings all of y length. x will be in the range of 100-10000 whereas the strings will all be < 200 each.

This does not need to be grown once it has been created.

Should I allocate x strings of y length or should I allocate a single string x * y long? Which would be more efficient and / or portable?


You're best off storing them in a string array.

Allocating a single string with x*y length is a bad idea. One problem is
that you'll need the null terminator or a sentinel character to separate the
strings. This would force you to rewrite many of the string functions that
you need, and add housekeeping data (which is not good if you're concerned
about memory space).

Of course, you could have one string that is 200 characters and x-1 strings
that are 1 character long. But, I'm assuming the deviation of lengths isn't
that big, and you don't know any of the data in advance.
Nov 14 '05 #3
On Tue, 30 Nov 2004 12:54:23 +0000, Peter Hickman wrote:
I have a program that requires x strings all of y length. x will be in
the range
of 100-10000 whereas the strings will all be < 200 each.

This does not need to be grown once it has been created.

Should I allocate x strings of y length or should I allocate a single
string x * y long? Which would be more efficient and / or portable?

Thank you.


The simplest approach would be to allocate memory for each string
individually. 10000 strings isn't a HUGE number (depending on your
environment) and overheads of separate allocation may not be significant.

Allocating one large memory block for all of the strings is likely to be
more efficient in terms of speed and space. You have to write the code to
suballocate from that block but that isn't very tricky. You can't
realloc() for individual strings and you can only free() everything in one
go, which is very simple if that is what you need.
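
As a rough illustration of suballocating from one block, here is a minimal sketch. It assumes every string gets a fixed-size slot (as the original poster later confirms), and the names string_pool, pool_init, pool_get, pool_set and pool_free are hypothetical, not from any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pool: one block holding `count` slots of `slot_size` bytes. */
struct string_pool {
    char  *block;      /* single allocation of count * slot_size bytes */
    size_t count;      /* number of slots */
    size_t slot_size;  /* bytes per slot, including the terminating '\0' */
};

/* Returns 0 on success, -1 on a bad request or allocation failure. */
static int pool_init(struct string_pool *p, size_t count, size_t slot_size)
{
    if (count == 0 || slot_size == 0 || count > (size_t)-1 / slot_size)
        return -1;                        /* empty or count * slot_size overflows */
    p->block = calloc(count, slot_size);  /* zero-filled: every slot starts as "" */
    if (p->block == NULL)
        return -1;
    p->count = count;
    p->slot_size = slot_size;
    return 0;
}

/* Suballocation is just index arithmetic into the block. */
static char *pool_get(const struct string_pool *p, size_t i)
{
    assert(i < p->count);
    return p->block + i * p->slot_size;
}

/* Copies s into slot i; returns -1 if i is out of range or s does not fit. */
static int pool_set(struct string_pool *p, size_t i, const char *s)
{
    size_t n = strlen(s);
    if (i >= p->count || n >= p->slot_size)
        return -1;
    memcpy(pool_get(p, i), s, n + 1);
    return 0;
}

/* Everything is released in one go; individual slots cannot be freed. */
static void pool_free(struct string_pool *p)
{
    free(p->block);
    p->block = NULL;
    p->count = p->slot_size = 0;
}
```

With this layout there is one malloc-family call and one free() regardless of how many strings there are, which is where the speed and space savings would come from.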

On the portability side there are implementations that can allocate lots
of little objects but not one big object of the same total size. For
example some 16 bit implementations limit the size of any one object to
below 64K but permit the total for all allocations to exceed that. However
a couple of megabytes isn't a particularly large allocation these days and
it is reasonable not to worry about that unless you have a particular
reason to do so.

Lawrence
Nov 14 '05 #4
Sorry I'm being a bit sloppy with my wording here. The length of a string is
likely to be less than 200 characters but all strings will be the same length,
whatever that length is.

What worries me is that allocating a large number of small strings may eat up
resources out of proportion to the data they hold. So it would seem that a large
block of memory would be a good idea but I don't know if allocating a single
large chunk of data has problems of its own.

Until the 64K limit of some older systems was mentioned I had clean forgotten
about it.
Nov 14 '05 #5
On Wed, 01 Dec 2004 11:28:27 +0000, Peter Hickman wrote:
Sorry I'm being a bit sloppy with my wording here. The length of a string is
likely to be less than 200 characters but all strings will be the same length,
whatever that length is.

What worries me is that allocating a large number of small strings may eat up
resources out of proportion to the data they hold. So it would seem that a large
block of memory would be a good idea but I don't know if allocating a single
large chunk of data has problems of its own.

Until the 64K limit of some older systems was mentioned I had clean forgotten
about it.


If you create for yourself, say, an array of pointers to char which you
set up and use to access the strings, it becomes almost immaterial which
method you use to allocate (and free) them, because the access method will
be consistent. If you implement one allocation method and don't like it
you can alter the allocation code later on without affecting the code that
accesses the string data.

Lawrence
Nov 14 '05 #6
In <41***********************@news.easynet.co.uk> Peter Hickman <pe***@semantico.com> writes:
Sorry I'm being a bit sloppy with my wording here. The length of a string is
likely to be less than 200 characters but all strings will be the same length,
whatever that length is.

What worries me is that allocating a large number of small strings may eat up
resources out of proportion to the data they hold. So it would seem that a large
block of memory would be a good idea but I don't know if allocating a single
large chunk of data has problems of its own.
Much less than any other approach. If the number of strings is known at
compile time, just use:

static char mystrings[X][Y];

If it's not, use a pointer to an array of Y characters:

char (*mystrings)[Y] = malloc(X * sizeof *mystrings);

In either case, you access mystrings using the same syntax: mystrings[i]
refers to a whole string, while mystrings[i][j] refers to a character from a
string.

If Y is not a compile-time constant, either, you also need to allocate
a dope vector, which is one more malloc call, but the syntax is still
the same, except that sizeof *mystrings will no longer be equal to Y
(except by pure accident):

size_t i;
char **mystrings = malloc(X * sizeof *mystrings);   /* X pointers, one per string */
mystrings[0] = malloc(X * Y);                        /* one block for all the data */
for (i = 1; i < X; i++) mystrings[i] = mystrings[i - 1] + Y;

If you ever deallocate the strings, call free(mystrings[0]) first and then
free(mystrings).
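
For reference, the dope-vector approach with the omitted error checking put back might look like the sketch below. The function names alloc_strings_block and free_strings_block are hypothetical, and using calloc to zero-fill the data block is a choice made here, not something stated above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: x strings of y bytes each, using two allocations. */
static char **alloc_strings_block(size_t x, size_t y)
{
    char **strings;
    size_t i;

    /* Reject empty requests and sizes whose products would overflow size_t. */
    if (x == 0 || y == 0 || x > (size_t)-1 / y || x > (size_t)-1 / sizeof *strings)
        return NULL;

    strings = malloc(x * sizeof *strings);  /* dope vector: x pointers */
    if (strings == NULL)
        return NULL;

    strings[0] = calloc(x, y);              /* one zero-filled block for all data */
    if (strings[0] == NULL) {
        free(strings);
        return NULL;
    }

    /* Point each entry at its y-byte slice of the block. */
    for (i = 1; i < x; i++)
        strings[i] = strings[i - 1] + y;

    assert(strings[x - 1] == strings[0] + (x - 1) * y);
    return strings;
}

/* Release in the order described above: data block first, then the vector. */
static void free_strings_block(char **strings)
{
    if (strings != NULL) {
        free(strings[0]);
        free(strings);
    }
}
```

Callers still write strings[i] and strings[i][j], so the access code does not change if the allocation strategy is swapped later.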

Note, however, that, although the syntax to access mystrings is the same,
this method is slower than the first two, because each access needs one
more pointer dereferencing. If this is going to be an issue, you may
want to simply allocate X * Y bytes and do the index arithmetic on your
own (usually using function-like macros).
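
A minimal sketch of that last option, assuming a single block of X * Y bytes and hypothetical macro and function names (STR, CHR, alloc_flat, store):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Address of string i in a flat block where each string occupies y bytes. */
#define STR(base, y, i)     ((base) + (size_t)(i) * (y))
/* Character j of string i; usable as an lvalue. */
#define CHR(base, y, i, j)  (STR(base, y, i)[(j)])

/* One zero-filled allocation of x * y bytes; calloc reports failure
   (including overflow of x * y) by returning NULL. */
static char *alloc_flat(size_t x, size_t y)
{
    return calloc(x, y);
}

/* Copy s into slot i; the caller must ensure s fits in y bytes. */
static void store(char *base, size_t y, size_t i, const char *s)
{
    size_t n = strlen(s);
    assert(n < y);
    memcpy(STR(base, y, i), s, n + 1);
}
```

Each access is then one multiply and one add on a single pointer, with no second pointer load, and the whole thing is released with a single free(base).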

Error checking deliberately omitted, BTW.
Until the 64K limit of some older systems was mentioned I had clean forgotten
about it.


It was a bogus argument: even those older systems provided ways to
allocate objects as large as the available memory, some of them available
to standard C code (e.g. the huge memory model of the MSDOS C
implementations).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union
Nov 14 '05 #7
