Comparing two text files

Hello,

I am looking for a fast and efficient way to compare two text files and
create a third one.

E.g.

Input file 1:
Number 1
Number 2
Number 3
Number 4

Input file 2:
Number 2
Number 3
Number 5

In case 1 (show lines that are the same), the output should be:
Number 2
Number 3

In case 2 (Show different lines):
Number 1
Number 4
Number 5

This should work regardless of the order of the lines!

Any suggestions?

Thx Holger
Nov 17 '05 #1
Why not invoke a free 3rd party tool such as Winmerge?

Brett

Nov 17 '05 #2
You could read the contents of one file into a hashtable, using each line of
the file as the key. Then read through the other file and, for each line,
check whether it exists as a key in the hashtable. If it does, you have a
match; if not, there is no match. You can then put the lines into a
DoesMatch and a DoesNotMatch collection accordingly.

Looking up values in a hashtable is very quick. To use less memory, you
need not read all of the contents into the hashtable at once; you could
page the data, but then the process would take longer and you would have to
do multiple scans.
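
A minimal sketch of the lookup described above, written in Python rather than .NET, with a dict standing in for the hashtable. The function name compare_lines and the sample lists are made up for illustration. One detail is an assumption on my part: lines of the first file that are never found in the second are also reported as different, since the example in the question expects them in case 2.

```python
def compare_lines(lines_a, lines_b):
    """Return (does_match, does_not_match) for two sequences of lines."""
    # Key every line of the first input; the value records whether the
    # second input also contained that line.
    found_in_b = dict.fromkeys(lines_a, False)
    only_in_b = {}  # dict used as an insertion-ordered set

    for line in lines_b:
        if line in found_in_b:      # hash lookup, independent of line order
            found_in_b[line] = True
        else:
            only_in_b[line] = None

    does_match = [line for line, hit in found_in_b.items() if hit]
    only_in_a = [line for line, hit in found_in_b.items() if not hit]
    return does_match, only_in_a + list(only_in_b)


file_1 = ["Number 1", "Number 2", "Number 3", "Number 4"]
file_2 = ["Number 2", "Number 3", "Number 5"]
same, different = compare_lines(file_1, file_2)
print(same)       # ['Number 2', 'Number 3']
print(different)  # ['Number 1', 'Number 4', 'Number 5']
```

For real files, the two lists could be built by reading each file line by line and stripping the trailing newline before passing them in.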

Hope that helps
Mark R Dawson

Nov 17 '05 #4
Hi,

Even better, just use a counter as the value in the hashtable and read
both files. When done, the lines with a value of 2 appear in both files, and
those with a value of 1 appear in one file only.

By playing with the values you could also tell which lines appeared in the
first file and not in the second, or vice versa.
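
A rough sketch of this idea in Python, under one assumption of mine: instead of a plain counter, each file sets its own bit in the stored value (1 for the first file, 2 for the second), so a value of 3 means "in both". This is one way of "playing with the values", and it also avoids miscounting a line that is repeated within a single file. The names tag_lines and split_by_origin are hypothetical.

```python
IN_FIRST = 1   # bit set when a line occurs in the first file
IN_SECOND = 2  # bit set when a line occurs in the second file


def tag_lines(lines_a, lines_b):
    """Map each distinct line to a bit mask of the files it occurs in."""
    flags = {}
    for line in lines_a:
        flags[line] = flags.get(line, 0) | IN_FIRST
    for line in lines_b:
        flags[line] = flags.get(line, 0) | IN_SECOND
    return flags


def split_by_origin(flags):
    """Return (in_both, only_first, only_second) from the tagged lines."""
    in_both = [l for l, f in flags.items() if f == IN_FIRST | IN_SECOND]
    only_first = [l for l, f in flags.items() if f == IN_FIRST]
    only_second = [l for l, f in flags.items() if f == IN_SECOND]
    return in_both, only_first, only_second


flags = tag_lines(
    ["Number 1", "Number 2", "Number 3", "Number 4"],
    ["Number 2", "Number 3", "Number 5"],
)
in_both, only_first, only_second = split_by_origin(flags)
print(in_both)      # ['Number 2', 'Number 3']
print(only_first)   # ['Number 1', 'Number 4']
print(only_second)  # ['Number 5']
```

Case 2 from the question is then simply only_first followed by only_second.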

cheers,

--
Ignacio Machin,
ignacio.machin AT dot.state.fl.us
Florida Department Of Transportation

Nov 17 '05 #6
