473,386 Members | 1,810 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

Data Structures Question...

Hi friends,
I am trying to choose the best possible data structure for the probelm
I am going to describe now.

I have lets say tens of thousands of numbers in file1 and tens of
thousands of numbers in another file2. file1 & file2 contents (only
numbers) can be entirely different.

Now the program should be able to read the files and give me a
difference file. Ofcourse this is a very easy implementation if I go
with primitive programming using "ifs" and "whiles".

This is what I need to do. First I need to rearrange the data in file1
in vary compact format so that my program uses as little RAM as
possible. SETS are one way of doing that.

For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
file1 may contain several thousand numbers) then I can rearrange them
like these sets

{1-4} // Numbers 1 to 4
{9-12} // Numbers 9 to 12
{6-7} // Numbers 6 to 7
{22} // Single number 22.

And similraly I need to rearrange contents of file2 in this format.

Now my question is the best possible method for storage of this
datastructure.
Linked Lists, Tree or others or simply use STLs? Also I will be happy
if somebody can give me an idea about the best way of doing
comparision process
between sets of file1 and file2 and produce the difference set in a
number format in a result file "file3".

Thanks,
Sai
Jul 22 '05 #1
6 2588
Saikrishna wrote:
Hi friends,
I am trying to choose the best possible data structure for the probelm
I am going to describe now.

I have lets say tens of thousands of numbers in file1 and tens of
thousands of numbers in another file2. file1 & file2 contents (only
numbers) can be entirely different.

Now the program should be able to read the files and give me a
difference file. Ofcourse this is a very easy implementation if I go
with primitive programming using "ifs" and "whiles".

This is what I need to do. First I need to rearrange the data in file1
in vary compact format so that my program uses as little RAM as
possible. SETS are one way of doing that.

For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
file1 may contain several thousand numbers) then I can rearrange them
like these sets

{1-4} // Numbers 1 to 4
{9-12} // Numbers 9 to 12
{6-7} // Numbers 6 to 7
{22} // Single number 22.

And similraly I need to rearrange contents of file2 in this format.

Now my question is the best possible method for storage of this
datastructure.
Linked Lists, Tree or others or simply use STLs? Also I will be happy
if somebody can give me an idea about the best way of doing
comparision process
between sets of file1 and file2 and produce the difference set in a
number format in a result file "file3".

Thanks,
Sai


Your "sets" are also termed "runs" in a typical Merge Sort
algorithm.

This is possibly an algorithm issue, not a language issue.
My suggestion is to sort both files into a third, then
remove duplicates. Perhaps the folks in news:comp.programming
can offer better advice. Follow-ups set.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jul 22 '05 #2
Saikrishna wrote:
Hi friends,
I am trying to choose the best possible data structure for the probelm
I am going to describe now.

I have lets say tens of thousands of numbers in file1 and tens of
thousands of numbers in another file2. file1 & file2 contents (only
numbers) can be entirely different.

Now the program should be able to read the files and give me a
difference file. Ofcourse this is a very easy implementation if I go
with primitive programming using "ifs" and "whiles".

This is what I need to do. First I need to rearrange the data in file1
in vary compact format so that my program uses as little RAM as
possible. SETS are one way of doing that.

For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
file1 may contain several thousand numbers) then I can rearrange them
like these sets

{1-4} // Numbers 1 to 4
{9-12} // Numbers 9 to 12
{6-7} // Numbers 6 to 7
{22} // Single number 22.

And similraly I need to rearrange contents of file2 in this format.

Now my question is the best possible method for storage of this
datastructure.
Linked Lists, Tree or others or simply use STLs? Also I will be happy
if somebody can give me an idea about the best way of doing
comparision process
between sets of file1 and file2 and produce the difference set in a
number format in a result file "file3".

Thanks,
Sai


Your "sets" are also termed "runs" in a typical Merge Sort
algorithm.

This is possibly an algorithm issue, not a language issue.
My suggestion is to sort both files into a third, then
remove duplicates. Perhaps the folks in news:comp.programming
can offer better advice. Follow-ups set.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Jul 22 '05 #3
There are three DIFFerent methods to attack the problem.

The first method is to look at the implementation of the diff utility
for Linux/Unix.
The second is to look at the implemenation of the sequnce-alignment from
biological-computer-science. There they determine the best aligments of
the gene sequences and it may have smth to do with your problem.
The laziest method is to ask in some programming / algorithmics group,
since algorithms/data structures are

O F F - T O P I C

here. That means that people here are not always qualified enough to
reply. (No offence meant.)

--
Best regards,
Alex.

PS. To email me, remove "loeschedies" from the email address given.
Jul 22 '05 #4
There are three DIFFerent methods to attack the problem.

The first method is to look at the implementation of the diff utility
for Linux/Unix.
The second is to look at the implemenation of the sequnce-alignment from
biological-computer-science. There they determine the best aligments of
the gene sequences and it may have smth to do with your problem.
The laziest method is to ask in some programming / algorithmics group,
since algorithms/data structures are

O F F - T O P I C

here. That means that people here are not always qualified enough to
reply. (No offence meant.)

--
Best regards,
Alex.

PS. To email me, remove "loeschedies" from the email address given.
Jul 22 '05 #5


Thomas Matthews schrieb:
Saikrishna wrote:
I am trying to choose the best possible data structure for the probelm
I am going to describe now. For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
file1 may contain several thousand numbers) then I can rearrange them
like these sets

{1-4} // Numbers 1 to 4
{9-12} // Numbers 9 to 12
{6-7} // Numbers 6 to 7
{22} // Single number 22.

And similraly I need to rearrange contents of file2 in this format.

Now my question is the best possible method for storage of this
datastructure.
Linked Lists, Tree or others or simply use STLs?


Have a look at the data structure called "discrete interval encoding
tree" or "diet"; I think that's what you're after.

See e.g. http://www.nist.gov/dads/HTML/discretintrv.html

Michael
--
Feel the stare of my burning hamster and stop smoking!
Jul 22 '05 #6


Thomas Matthews schrieb:
Saikrishna wrote:
I am trying to choose the best possible data structure for the probelm
I am going to describe now. For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
file1 may contain several thousand numbers) then I can rearrange them
like these sets

{1-4} // Numbers 1 to 4
{9-12} // Numbers 9 to 12
{6-7} // Numbers 6 to 7
{22} // Single number 22.

And similraly I need to rearrange contents of file2 in this format.

Now my question is the best possible method for storage of this
datastructure.
Linked Lists, Tree or others or simply use STLs?


Have a look at the data structure called "discrete interval encoding
tree" or "diet"; I think that's what you're after.

See e.g. http://www.nist.gov/dads/HTML/discretintrv.html

Michael
--
Feel the stare of my burning hamster and stop smoking!
Jul 22 '05 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
by: Amit | last post by:
Hello, Can any of you recommend a really good book on data structures and more so, if it relates to STL data structures, and how they are used to build far more complex data structures. ...
4
by: Thomas Paul Diffenbach | last post by:
Can anyone point me to an open source library of /statically allocated/ data structures? I'm writing some code that would benefit from trees, preferably self balancing, but on an embedded system...
11
by: theshowmecanuck | last post by:
As a matter of academic interest only, is there a way to programmatically list the 'c' data types? I am not looking for detail, just if it is possible, and what function could be used to...
10
by: Bart Goeman | last post by:
Hi, I have a question about how to put redundant information in data structures, initialized at compile time. This is often necessary for performance reasons and can't be done at run time (data...
14
by: SD | last post by:
I am thinking about writing a text editor in C for unix sometime soon. I am just doing this to learn more about C. I want to write something like ed.c, a simple line editor. What types of data...
5
by: Shwetabh | last post by:
Hi everyone. My question is, why are data structures implemented only with struct data type? Why not union when it is more efficient as compared with structures? Thanks in advance
13
by: Leszek Taratuta | last post by:
Hello, I have several drop-down lists on my ASP.NET page. I need to keep data sources of these lists in Session State. What would be the most effective method to serialize this kind of data...
6
by: James | last post by:
I am using vb.net and need to keep in memory a large data structure, so I am looking for the best option. And after several test I am pretty confused. So I will be grateful if anyone can help me. ...
11
by: efrat | last post by:
Hello, I'm planning to use Python in order to teach a DSA (data structures and algorithms) course in an academic institute. If you could help out with the following questions, I'd sure...
29
by: Mik0b0 | last post by:
Hallo to everyone. This fall I am going to start data structures as a part of C language course. The problem is I could not find any satisfying tutorial about structures in C. There are plenty of...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.