Storing 15 million items in an array

Hi there,

I'm trying to create a data structure to store at least 13-15 million items that I need constant time access to.

Memory is a real issue, so I tried implementing the structure as a simple array (int a[10000000]), but I've found out, since arrays use the stack, that I overflow it around 100k items.

I don't know how to use the heap and still allow constant time access to my data members, and implement it in assembly to cut memory usage.

Do any of you guys have any ideas that could help?

Thank you.

Best regards,
Thomas J.

Jun 18 '09 #1

Subscribe Post Reply

13579

donbock

2,426

Expert 2GB

Find the size of each item.
Multiply that size by the number of items to compute the amount of storage needed.
Does that amount seem reasonable for the amount of RAM in your computer?

If int is 16-bits on your machine, then you need 20MB of RAM for the array.
If int is 32-bits on your machine, then you need 40MB of RAM for the array.

Try declaring a static array:

Expand|Select|Wrap|Line Numbers

static int a[10000000];

The 'static' keyword keeps the array off the stack.

If you don't have enough RAM for the array, then you need to come up with a new scheme -- you can't keep the entire array in memory.

Jun 18 '09 #2

thomasjbake

Thanks so much Don! Highly appreciate it!

Best regards,
Tom

Jun 18 '09 #3

JosAH

11,448

Expert 8TB

@donbock
You can index 32K-1 elements in an array then which makes a bit less than 32K*2 bytes == 64K bytes.

kind regards,

Jos

Jun 18 '09 #4

weaknessforcats

9,208

Expert Mod 8TB

Using the heap just means you have to create your array at run-time.

You don't say whether you are using C++ or not.

If you are using C++, then use a vector. A vector is required to implement an array. The vector manages its array on the heap. You can create the vector of the correct size. The vector itself can be on the stack.

Expand|Select|Wrap|Line Numbers

vector<int> a(10000000);

If you are using C, then allocate your array on the heap:

Expand|Select|Wrap|Line Numbers

int* a = malloc(10000000 * sizeof(int));

In both cases a[10] is the 11th element.

At the end of the program the vector will clean itself up whereas the array allocated by malloc will need to be freed:

Expand|Select|Wrap|Line Numbers

free(a);

Jun 18 '09 #5

donbock

2,426

Expert 2GB

@JosAH
Jos makes a good point. You should use a 'long' (rather than an 'int') variable for the array index.

Jun 18 '09 #6

JosAH

11,448

Expert 8TB

@donbock
Your implementation may not allow you to do that; the type of an index has to fit in a ptrdiff_t and it is defined as an ordinary int in most implementations.

kind regars,

Jos

Jun 18 '09 #7

donbock

2,426

Expert 2GB

@JosAH
My habit when being careful (that is, when it isn't obvious that the array is small), is to use type size_t for array index variables. These two types are related as follows:

Expand|Select|Wrap|Line Numbers

 typedef ptrdiff_t signed T;

typedef size_t unsigned T;

where T is required [I think] to be the same integer type for both.

This is a difference that doesn't make much difference. If ptrdiff_t is too small to count up to one million then so is size_t. On the other hand, all kinds of things would go wrong if a size_t variable couldn't hold sizeof the array. For instance, you probably couldn't malloc the array.

Jun 18 '09 #8

Banfa

9,065

Expert Mod 8TB

@thomasjbake
Just because you want constant time access doesn't mean that you need to use a single array to hold your data.

For instance imagine 1000 arrays each holding 1000 items. No array will break the size limits even on a 16 bit machine. I imagine this implemented as an array of 1000 pointers each malloc'd an array of 1000 items (ints).

My access function takes a long ix and if my array of 1000 pointers is called aDataArray returns

aDataArray[ix/1000][ix%1000]

That is a constant time look-up. It is however a slower look-up than occurs if you use a single array, it's got all that computation to start with.

My point is that by saying "constant time access" you don't really say very much. You certainly say nothing of how fast the access needs to be. An array of 1000000 items where it takes 10 seconds to access each item is still constant time access.

So when you said "constant time access" did you also intend to include some sort of statement of speed of access too?

Jun 18 '09 #9

by: supercomputer | last post by:

I am using this function to parse data I have stored in an array. This is what the array looks like: , , , , , , , , , , , , , , , , , , , , , , , ] This is the code to parse the array:

Python

Most efficient way of storing 1024*1024 bits

by: Tor Erik Sønvisen | last post by:

Hi I need a time and space efficient way of storing up to 6 million bits. Time efficency is more important then space efficency as I'm going to do searches through the bit-set. regards tores

Python

Storing the size of an array in the structure itself

by: Wynand Winterbach | last post by:

I think every C programmer can relate to the frustrations that malloc allocated arrays bring. In particular, I've always found the fact that the size of an array must be stored separately to be a...

C / C++

Storing an array to a Session

by: Brad | last post by:

I am storing an array which contains about a dozen chracter items to a Session variable. Later, I need to use this array so I am doing the following: Dim eventTypes As String() =...

ASP.NET

storing delimited string in an array

by: gh | last post by:

Hi, I have a string variable which contains n number of comma delimited elements and I would like to store each element into an array but I could not figure how to do it. for example,...

C / C++

storing checkbox data using implode

by: rbragg | last post by:

All of my other form data is stored correctly in the db except for my checkbox data. This column in my table is empty. I have this checkbox group on my form: <input name="cbItems"...

PHP

Storing objects into a generic list.

by: Miesha.James | last post by:

Hello, I'm trying to rewrite visual c++ code into visual c++ .NET code and I've run across a problem with storing objects into a list. Here;s an example of the code I have: ref struct...

.NET Framework

Storing GBs of data

by: jeet232 | last post by:

I'm writing a search engine, i'll need to store giga bytes of data, and i'm dividing it in multiple machines and multiple structures; now, my question is that which is the best implementation for...

C / C++

Database specialized in storing directed graphs?

by: Carl Banks | last post by:

I was wondering if anyone had any advice on this. This is not to study graph theory; I'm using the graph to represent a problem domain. The graphs could be arbitrarily large, and could easily...

Python

Easy Steps to Fix "Canon Printer Won't Connect to WiFi Network"

by: taylorcarr | last post by:

A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...

General

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

Storing 15 million items in an array

Similar topics