473,322 Members | 1,522 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,322 software developers and data experts.

need an efficient data structure


i need to store a lot of integers(say 10,000) in a datastructure.
The constraints are:
1. It should be fast.
2. It should be orderded or sorted.
3.Insterting and deleting should be fast.
4. The no. of elements can change at runtime and hence it should be dynamic.
Right now i am using SortedVector , but it is very slow.
So, can u pls suggest a suitable Datastructure.
Jul 22 '05 #1
10 2594

"Raghavendra Mahuli" <ra************@in.bosch.com> wrote in message
news:c9**********@ns1.fe.internet.bosch.com...

i need to store a lot of integers(say 10,000) in a datastructure.
The constraints are:
1. It should be fast.
2. It should be orderded or sorted.
3.Insterting and deleting should be fast.
4. The no. of elements can change at runtime and hence it should be dynamic. Right now i am using SortedVector , but it is very slow.
So, can u pls suggest a suitable Datastructure.


std::set or std::multiset if you want duplicates.

john
Jul 22 '05 #2
On Thu, 27 May 2004 09:30:01 +0100, "John Harrison"
<jo*************@hotmail.com> wrote:

"Raghavendra Mahuli" <ra************@in.bosch.com> wrote in message
news:c9**********@ns1.fe.internet.bosch.com...

i need to store a lot of integers(say 10,000) in a datastructure.
The constraints are:
1. It should be fast.
2. It should be orderded or sorted.
3.Insterting and deleting should be fast.
4. The no. of elements can change at runtime and hence it should be

dynamic.
Right now i am using SortedVector , but it is very slow.
So, can u pls suggest a suitable Datastructure.


std::set or std::multiset if you want duplicates.


And make sure you have a fast memory allocator! A pool allocator would
help a lot here, if your std::allocator isn't already optimized in
that way.

Tom
--
C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
Jul 22 '05 #3
Raghavendra Mahuli wrote:
i need to store a lot of integers(say 10,000) in a datastructure.
The constraints are:
1. It should be fast.
There will be trade-offs. Which operations should be fastest?
2. It should be orderded or sorted.
Must that property be maintained at all times, or do you plan to do a
bunch of insertions or deletions between reads? What I mean is, if you
will insert 100 elements in a row, does the collection really need to
re-sort itself after each insertion?
3.Insterting and deleting should be fast.
Inserting and deleting where? If you need to maintain the ordering
after absolutely every insertion, a std::set or std::multiset may be
best, as JohnH suggested. If you only need it to be sorted once all
elements have been inserted, a std::vector may be more efficient by a
constant factor in both time and space.
4. The no. of elements can change at runtime and hence it should be dynamic.


The standard library has dynamically growable collections for almost all
your containment needs. :)
Jul 22 '05 #4
"Raghavendra Mahuli" <ra************@in.bosch.com> wrote in message
i need to store a lot of integers(say 10,000) in a datastructure.
The constraints are:
1. It should be fast.
2. It should be orderded or sorted.
3.Insterting and deleting should be fast.
4. The no. of elements can change at runtime and hence it should be dynamic. Right now i am using SortedVector , but it is very slow.
So, can u pls suggest a suitable Datastructure.


In addition to the fine points raised by everyone else, will the integers be
in a range (say 0 to 65535), and can each number occur more than once?
Jul 22 '05 #5
I'll clarify my situation further.....
1. Having a fast structure, what i mean is insertion of elements anywhere
(and deletion anywhere) should not take much time....Also access time, i.e.
time required to read a particular element should also be fast....I know it
is a tough requirement (But thats why i posted the question to this group).
2. Also each integer can occur more than once but not more than twice.....

If no standard structure satisfies these constraints, can u suggest a new
idea that would satisfy the needs.....
Jul 22 '05 #6

"Jeff Schwab" <je******@comcast.net> wrote in message
news:Tr********************@comcast.com...
Raghavendra Mahuli wrote:
i need to store a lot of integers(say 10,000) in a datastructure.
The constraints are:
1. It should be fast.


There will be trade-offs. Which operations should be fastest?

--------- Insertion , deletion & searching should be fast
2. It should be orderded or sorted.


Must that property be maintained at all times, or do you plan to do a
bunch of insertions or deletions between reads? What I mean is, if you
will insert 100 elements in a row, does the collection really need to
re-sort itself after each insertion?


------ At a time, i'll be inserting only 2 elemnts in a row.....it should
reorder itself immediately.
3.Insterting and deleting should be fast.


Inserting and deleting where? If you need to maintain the ordering
after absolutely every insertion, a std::set or std::multiset may be
best, as JohnH suggested. If you only need it to be sorted once all
elements have been inserted, a std::vector may be more efficient by a
constant factor in both time and space.

--------I need to maintain ordering after every 2 insertions...
Jul 22 '05 #7
Raghavendra Mahuli wrote:
"Jeff Schwab" <je******@comcast.net> wrote in message
news:Tr********************@comcast.com...
Raghavendra Mahuli wrote:
i need to store a lot of integers(say 10,000) in a datastructure.
The constraints are:
1. It should be fast.
There will be trade-offs. Which operations should be fastest?


--------- Insertion , deletion & searching should be fast


Is O(ln2 n) good enough for all of those operations?
2. It should be orderded or sorted.


Must that property be maintained at all times, or do you plan to do a
bunch of insertions or deletions between reads? What I mean is, if you
will insert 100 elements in a row, does the collection really need to
re-sort itself after each insertion?

------ At a time, i'll be inserting only 2 elemnts in a row.....it should
reorder itself immediately.


Sounds like std::multiset might be your best bet.
3.Insterting and deleting should be fast.


Inserting and deleting where? If you need to maintain the ordering
after absolutely every insertion, a std::set or std::multiset may be
best, as JohnH suggested. If you only need it to be sorted once all
elements have been inserted, a std::vector may be more efficient by a
constant factor in both time and space.


--------I need to maintain ordering after every 2 insertions...


Is the number of possible values bounded? If so, how many possible
values of elements are there? 10,000? 2^32? I don't mean how many
elements you need to store in your container, but how many different
values might exist for the type of your integer; for example, if you're
using 16-bit integers, there are 2^16=65536 possible values.
Jul 22 '05 #8
> Is O(ln2 n) good enough for all of those operations?
--------I think that is sufficient

Is the number of possible values bounded? If so, how many possible
values of elements are there? 10,000? 2^32? I don't mean how many
elements you need to store in your container, but how many different
values might exist for the type of your integer; for example, if you're
using 16-bit integers, there are 2^16=65536 possible values.


-----i am storing memory addresses....So all values in the range are
possible.....but they'll generally be multiples of 4...........And
maximum
possible value is 2^32.

Jul 22 '05 #9
"Raghavendra Mahuli" <ra************@in.bosch.com> wrote in message
news:c96ls6$ju6
-----i am storing memory addresses....So all values in the range are
possible.....but they'll generally be multiples of 4...........And
maximum
possible value is 2^32.


Try the map idea suggested by John, along with the allocator suggested by
tom_usenet.

A direct address table, hinted at in my post, is out of the question since
the range of values is so big (basically you have an array whose size equals
to the number of different possible elements).

You can use a hybrid approach, which could very well be a hashtable.
Something like: all of the numbers in [0, 2^32] hash to one of 32 different
values with equal probability, which still means many numbers could hash to
the same value, and so you store the different numbers in a map (with custom
pool allocator) data structure. See if your implementation has a
hash_table.
Jul 22 '05 #10
Raghavendra Mahuli wrote:
Is O(ln2 n) good enough for all of those operations?


--------I think that is sufficient
Is the number of possible values bounded? If so, how many possible
values of elements are there? 10,000? 2^32? I don't mean how many
elements you need to store in your container, but how many different
values might exist for the type of your integer; for example, if you're
using 16-bit integers, there are 2^16=65536 possible values.

-----i am storing memory addresses....So all values in the range are
possible.....but they'll generally be multiples of 4...........And
maximum
possible value is 2^32.


For the values that aren't multiples of 4, stick them in a std::set.
The access time for them will be logarithmic with the cardinality of the
set, but since that cardinality is small, the performance should be good.

This leaves 2^(32-2) = 2^30 values to store; you could use a sequence of
bits, initialized to zero, to represent whether each value is in the
collection. Since you allow up to two of each element, you'll probably
want an extra bit per element to show whether it has been inserted more
than once. This means you need 2^31 bits. I would recommend using two
std::vector<bool>: one to represent whether each value had been
inserted at least once, and another to represent whether each value had
been inserted more than once. You'll need at least 256 MB of RAM, plus
space for the std::set. The machines where I work typically have
multiple GB of RAM, so this is probably the solution I would choose if
performance were critical.
Jul 22 '05 #11

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

39
by: Scotter | last post by:
Okay I think my title line was worded misleadingly. So here goes again. I've got quite 20 identical MDB files running on an IIS5 server. From time to time I need to go into various tables and add...
16
by: laclac01 | last post by:
I have developed my own copy function for coping my own dynamic memory structure. It works, but I feel its not too efficient. There must be a quicker way to copy the data. In some of the...
3
by: Sigmathaar | last post by:
Hi, I'm need some advice about lists and vectors. I'm doing a program who needs to have sequential access of a non ordered unit of objects whose size decreases almost each time the sequence is...
22
by: Curious | last post by:
Hi, I am searching for a data structure that stores key-value pairs in it. This data structure is to hold large amounts of key-value pairs, and so needs to be efficient both in insertion and...
19
by: ern | last post by:
I need a FIFO queue of size 20, to keep track of a running average. Once the queue is full with 20 values, I want to take the sum/20 = AVERAGE. Then when a new value comes in, I want: (Old Sum...
1
by: pedagani | last post by:
Dear comp.lang.c++, I'm interested in knowing the general techniques used to handle large binary files (>10GB) efficiently such as tweaking with filebuf , etc. Reading chunk by chunk seems to be...
0
by: Ken Varn | last post by:
I have a managed C++ assembly in which I need to interact with some 'C' APIs that take fixed size 'C' data blocks. I need to wrap these data blocks into a managed object. It seems like a lot of...
15
by: Macca | last post by:
Hi, My app needs to potentially store a large number of custom objects and be able to iterate through them quickly. I was wondering which data structure would be the most efficient to do this,a...
21
by: py_genetic | last post by:
Hello, I'm importing large text files of data using csv. I would like to add some more auto sensing abilities. I'm considing sampling the data file and doing some fuzzy logic scoring on the...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.