Hi friends,
I am trying to choose the best possible data structure for the probelm
I am going to describe now.
I have lets say tens of thousands of numbers in file1 and tens of
thousands of numbers in another file2. file1 & file2 contents (only
numbers) can be entirely different.
Now the program should be able to read the files and give me a
difference file. Ofcourse this is a very easy implementation if I go
with primitive programming using "ifs" and "whiles".
This is what I need to do. First I need to rearrange the data in file1
in vary compact format so that my program uses as little RAM as
possible. SETS are one way of doing that.
For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
file1 may contain several thousand numbers) then I can rearrange them
like these sets
{1-4} // Numbers 1 to 4
{9-12} // Numbers 9 to 12
{6-7} // Numbers 6 to 7
{22} // Single number 22.
And similraly I need to rearrange contents of file2 in this format.
Now my question is the best possible method for storage of this
datastructure.
Linked Lists, Tree or others or simply use STLs? Also I will be happy
if somebody can give me an idea about the best way of doing
comparision process
between sets of file1 and file2 and produce the difference set in a
number format in a result file "file3".
Thanks,
Sai 6 2588
Saikrishna wrote: Hi friends, I am trying to choose the best possible data structure for the probelm I am going to describe now.
I have lets say tens of thousands of numbers in file1 and tens of thousands of numbers in another file2. file1 & file2 contents (only numbers) can be entirely different.
Now the program should be able to read the files and give me a difference file. Ofcourse this is a very easy implementation if I go with primitive programming using "ifs" and "whiles".
This is what I need to do. First I need to rearrange the data in file1 in vary compact format so that my program uses as little RAM as possible. SETS are one way of doing that.
For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality file1 may contain several thousand numbers) then I can rearrange them like these sets
{1-4} // Numbers 1 to 4 {9-12} // Numbers 9 to 12 {6-7} // Numbers 6 to 7 {22} // Single number 22.
And similraly I need to rearrange contents of file2 in this format.
Now my question is the best possible method for storage of this datastructure. Linked Lists, Tree or others or simply use STLs? Also I will be happy if somebody can give me an idea about the best way of doing comparision process between sets of file1 and file2 and produce the difference set in a number format in a result file "file3".
Thanks, Sai
Your "sets" are also termed "runs" in a typical Merge Sort
algorithm.
This is possibly an algorithm issue, not a language issue.
My suggestion is to sort both files into a third, then
remove duplicates. Perhaps the folks in news:comp.programming
can offer better advice. Follow-ups set.
--
Thomas Matthews
C++ newsgroup welcome message: http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq: http://www.raos.demon.uk/acllc-c++/faq.html
Other sites: http://www.josuttis.com -- C++ STL Library book
Saikrishna wrote: Hi friends, I am trying to choose the best possible data structure for the probelm I am going to describe now.
I have lets say tens of thousands of numbers in file1 and tens of thousands of numbers in another file2. file1 & file2 contents (only numbers) can be entirely different.
Now the program should be able to read the files and give me a difference file. Ofcourse this is a very easy implementation if I go with primitive programming using "ifs" and "whiles".
This is what I need to do. First I need to rearrange the data in file1 in vary compact format so that my program uses as little RAM as possible. SETS are one way of doing that.
For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality file1 may contain several thousand numbers) then I can rearrange them like these sets
{1-4} // Numbers 1 to 4 {9-12} // Numbers 9 to 12 {6-7} // Numbers 6 to 7 {22} // Single number 22.
And similraly I need to rearrange contents of file2 in this format.
Now my question is the best possible method for storage of this datastructure. Linked Lists, Tree or others or simply use STLs? Also I will be happy if somebody can give me an idea about the best way of doing comparision process between sets of file1 and file2 and produce the difference set in a number format in a result file "file3".
Thanks, Sai
Your "sets" are also termed "runs" in a typical Merge Sort
algorithm.
This is possibly an algorithm issue, not a language issue.
My suggestion is to sort both files into a third, then
remove duplicates. Perhaps the folks in news:comp.programming
can offer better advice. Follow-ups set.
--
Thomas Matthews
C++ newsgroup welcome message: http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq: http://www.raos.demon.uk/acllc-c++/faq.html
Other sites: http://www.josuttis.com -- C++ STL Library book
There are three DIFFerent methods to attack the problem.
The first method is to look at the implementation of the diff utility
for Linux/Unix.
The second is to look at the implemenation of the sequnce-alignment from
biological-computer-science. There they determine the best aligments of
the gene sequences and it may have smth to do with your problem.
The laziest method is to ask in some programming / algorithmics group,
since algorithms/data structures are
O F F - T O P I C
here. That means that people here are not always qualified enough to
reply. (No offence meant.)
--
Best regards,
Alex.
PS. To email me, remove "loeschedies" from the email address given.
There are three DIFFerent methods to attack the problem.
The first method is to look at the implementation of the diff utility
for Linux/Unix.
The second is to look at the implemenation of the sequnce-alignment from
biological-computer-science. There they determine the best aligments of
the gene sequences and it may have smth to do with your problem.
The laziest method is to ask in some programming / algorithmics group,
since algorithms/data structures are
O F F - T O P I C
here. That means that people here are not always qualified enough to
reply. (No offence meant.)
--
Best regards,
Alex.
PS. To email me, remove "loeschedies" from the email address given.
Thomas Matthews schrieb: Saikrishna wrote: I am trying to choose the best possible data structure for the probelm I am going to describe now.
For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality file1 may contain several thousand numbers) then I can rearrange them like these sets
{1-4} // Numbers 1 to 4 {9-12} // Numbers 9 to 12 {6-7} // Numbers 6 to 7 {22} // Single number 22.
And similraly I need to rearrange contents of file2 in this format.
Now my question is the best possible method for storage of this datastructure. Linked Lists, Tree or others or simply use STLs?
Have a look at the data structure called "discrete interval encoding
tree" or "diet"; I think that's what you're after.
See e.g. http://www.nist.gov/dads/HTML/discretintrv.html
Michael
--
Feel the stare of my burning hamster and stop smoking!
Thomas Matthews schrieb: Saikrishna wrote: I am trying to choose the best possible data structure for the probelm I am going to describe now.
For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality file1 may contain several thousand numbers) then I can rearrange them like these sets
{1-4} // Numbers 1 to 4 {9-12} // Numbers 9 to 12 {6-7} // Numbers 6 to 7 {22} // Single number 22.
And similraly I need to rearrange contents of file2 in this format.
Now my question is the best possible method for storage of this datastructure. Linked Lists, Tree or others or simply use STLs?
Have a look at the data structure called "discrete interval encoding
tree" or "diet"; I think that's what you're after.
See e.g. http://www.nist.gov/dads/HTML/discretintrv.html
Michael
--
Feel the stare of my burning hamster and stop smoking! This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: Amit |
last post by:
Hello,
Can any of you recommend a really good book on data structures and more so,
if it relates to STL data structures, and how they are used to build far
more complex data structures.
...
|
by: Thomas Paul Diffenbach |
last post by:
Can anyone point me to an open source library of /statically
allocated/ data structures?
I'm writing some code that would benefit from trees, preferably self
balancing, but on an embedded system...
|
by: theshowmecanuck |
last post by:
As a matter of academic interest only, is there a way to programmatically
list the 'c' data types? I am not looking for detail, just if it is
possible, and what function could be used to...
|
by: Bart Goeman |
last post by:
Hi,
I have a question about how to put redundant information in data
structures, initialized at compile time. This is often necessary
for performance reasons and can't be done at run time (data...
|
by: SD |
last post by:
I am thinking about writing a text editor in C for unix sometime soon.
I am just doing this to learn more about C. I want to write something
like ed.c, a simple line editor. What types of data...
|
by: Shwetabh |
last post by:
Hi everyone.
My question is, why are data structures implemented only with struct
data type?
Why not union when it is more efficient as compared with structures?
Thanks in advance
|
by: Leszek Taratuta |
last post by:
Hello,
I have several drop-down lists on my ASP.NET page. I need to keep data
sources of these lists in Session State.
What would be the most effective method to serialize this kind of data...
|
by: James |
last post by:
I am using vb.net and need to keep in memory a large data structure, so I am
looking for the best option. And after several test I am pretty confused. So
I will be grateful if anyone can help me.
...
|
by: efrat |
last post by:
Hello,
I'm planning to use Python in order to teach a DSA (data structures
and algorithms) course in an academic institute. If you could help out
with the following questions, I'd sure...
|
by: Mik0b0 |
last post by:
Hallo to everyone.
This fall I am going to start data structures as a part of C language
course. The problem is I could not find any satisfying tutorial about
structures in C. There are plenty of...
|
by: taylorcarr |
last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
|
by: Charles Arthur |
last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
|
by: aa123db |
last post by:
Variable and constants
Use var or let for variables and const fror constants.
Var foo ='bar';
Let foo ='bar';const baz ='bar';
Functions
function $name$ ($parameters$) {
}
...
|
by: ryjfgjl |
last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
|
by: emmanuelkatto |
last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud.
Please let me know.
Thanks!
Emmanuel
|
by: nemocccc |
last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
|
by: Sonnysonu |
last post by:
This is the data of csv file
1 2 3
1 2 3
1 2 3
1 2 3
2 3
2 3
3
the lengths should be different i have to store the data by column-wise with in the specific length.
suppose the i have to...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
| |