473,386 Members | 1,835 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

Storing 15 million items in an array

Hi there,

I'm trying to create a data structure to store at least 13-15 million items that I need constant time access to.

Memory is a real issue, so I tried implementing the structure as a simple array (int a[10000000]), but I've found out, since arrays use the stack, that I overflow it around 100k items.

I don't know how to use the heap and still allow constant time access to my data members, and implement it in assembly to cut memory usage.

Do any of you guys have any ideas that could help?

Thank you.

Best regards,
Thomas J.
Jun 18 '09 #1
8 13579
donbock
2,426 Expert 2GB
Find the size of each item.
Multiply that size by the number of items to compute the amount of storage needed.
Does that amount seem reasonable for the amount of RAM in your computer?

If int is 16-bits on your machine, then you need 20MB of RAM for the array.
If int is 32-bits on your machine, then you need 40MB of RAM for the array.

Try declaring a static array:
Expand|Select|Wrap|Line Numbers
  1. static int a[10000000];
The 'static' keyword keeps the array off the stack.

If you don't have enough RAM for the array, then you need to come up with a new scheme -- you can't keep the entire array in memory.
Jun 18 '09 #2
Thanks so much Don! Highly appreciate it!

Best regards,
Tom
Jun 18 '09 #3
JosAH
11,448 Expert 8TB
@donbock
You can index 32K-1 elements in an array then which makes a bit less than 32K*2 bytes == 64K bytes.

kind regards,

Jos
Jun 18 '09 #4
weaknessforcats
9,208 Expert Mod 8TB
Using the heap just means you have to create your array at run-time.

You don't say whether you are using C++ or not.

If you are using C++, then use a vector. A vector is required to implement an array. The vector manages its array on the heap. You can create the vector of the correct size. The vector itself can be on the stack.

Expand|Select|Wrap|Line Numbers
  1. vector<int> a(10000000);
If you are using C, then allocate your array on the heap:

Expand|Select|Wrap|Line Numbers
  1. int* a = malloc(10000000 * sizeof(int));
In both cases a[10] is the 11th element.

At the end of the program the vector will clean itself up whereas the array allocated by malloc will need to be freed:

Expand|Select|Wrap|Line Numbers
  1. free(a);
Jun 18 '09 #5
donbock
2,426 Expert 2GB
@JosAH
Jos makes a good point. You should use a 'long' (rather than an 'int') variable for the array index.
Jun 18 '09 #6
JosAH
11,448 Expert 8TB
@donbock
Your implementation may not allow you to do that; the type of an index has to fit in a ptrdiff_t and it is defined as an ordinary int in most implementations.

kind regars,

Jos
Jun 18 '09 #7
donbock
2,426 Expert 2GB
@JosAH
My habit when being careful (that is, when it isn't obvious that the array is small), is to use type size_t for array index variables. These two types are related as follows:
Expand|Select|Wrap|Line Numbers
  1. typedef ptrdiff_t signed T;
  2. typedef size_t unsigned T;
where T is required [I think] to be the same integer type for both.

This is a difference that doesn't make much difference. If ptrdiff_t is too small to count up to one million then so is size_t. On the other hand, all kinds of things would go wrong if a size_t variable couldn't hold sizeof the array. For instance, you probably couldn't malloc the array.
Jun 18 '09 #8
Banfa
9,065 Expert Mod 8TB
@thomasjbake
Just because you want constant time access doesn't mean that you need to use a single array to hold your data.

For instance imagine 1000 arrays each holding 1000 items. No array will break the size limits even on a 16 bit machine. I imagine this implemented as an array of 1000 pointers each malloc'd an array of 1000 items (ints).

My access function takes a long ix and if my array of 1000 pointers is called aDataArray returns

aDataArray[ix/1000][ix%1000]

That is a constant time look-up. It is however a slower look-up than occurs if you use a single array, it's got all that computation to start with.

My point is that by saying "constant time access" you don't really say very much. You certainly say nothing of how fast the access needs to be. An array of 1000000 items where it takes 10 seconds to access each item is still constant time access.

So when you said "constant time access" did you also intend to include some sort of statement of speed of access too?
Jun 18 '09 #9

Sign in to post your reply or Sign up for a free account.

Similar topics

6
by: supercomputer | last post by:
I am using this function to parse data I have stored in an array. This is what the array looks like: , , , , , , , , , , , , , , , , , , , , , , , ] This is the code to parse the array:
15
by: Tor Erik Sønvisen | last post by:
Hi I need a time and space efficient way of storing up to 6 million bits. Time efficency is more important then space efficency as I'm going to do searches through the bit-set. regards tores
22
by: Wynand Winterbach | last post by:
I think every C programmer can relate to the frustrations that malloc allocated arrays bring. In particular, I've always found the fact that the size of an array must be stored separately to be a...
3
by: Brad | last post by:
I am storing an array which contains about a dozen chracter items to a Session variable. Later, I need to use this array so I am doing the following: Dim eventTypes As String() =...
2
by: gh | last post by:
Hi, I have a string variable which contains n number of comma delimited elements and I would like to store each element into an array but I could not figure how to do it. for example,...
5
by: rbragg | last post by:
All of my other form data is stored correctly in the db except for my checkbox data. This column in my table is empty. I have this checkbox group on my form: <input name="cbItems"...
1
by: Miesha.James | last post by:
Hello, I'm trying to rewrite visual c++ code into visual c++ .NET code and I've run across a problem with storing objects into a list. Here;s an example of the code I have: ref struct...
3
by: jeet232 | last post by:
I'm writing a search engine, i'll need to store giga bytes of data, and i'm dividing it in multiple machines and multiple structures; now, my question is that which is the best implementation for...
6
by: Carl Banks | last post by:
I was wondering if anyone had any advice on this. This is not to study graph theory; I'm using the graph to represent a problem domain. The graphs could be arbitrarily large, and could easily...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.