parse unix-style difference reporting

Hi all,

I want to diff two files or two versions of one file, and parse the output
to get a summary of how many lines were replaced, added, or deleted between
the two files.

As known from diff/cleardiff, the output has a style like:
15a16, 15,17d3, 18c19,21 etc.

Does anyone know how to parse this output to generate a summary?

Thanks in advance,
Liang
Jul 19 '05 #1
4 Replies
In article <bs**********@avnika.corp.mot.com>,
"Liang" <le*********@hotmail.com> wrote:
> Hi all,
>
> I want to diff two files or two versions of one file, and parse the output
> to get a summary of how many lines were replaced, added, or deleted between
> the two files.
>
> As known from diff/cleardiff, the output has a style like:
> 15a16, 15,17d3, 18c19,21 etc.
>
> Does anyone know how to parse this output to generate a summary?


You can use "diff -c" and count the number of "+", "-", and "!" lines.
Or use the "comm" command and count the number of lines.
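
A minimal sketch of the counting approach in Python, assuming "diff -c" (context format) output, in which added lines start with "+ ", removed lines with "- ", and changed lines with "! ". The function name count_context_markers and the result keys are made up for this illustration. Changed lines appear once in each half of a hunk, so the sketch tracks which half it is reading and counts the two sides separately.

```python
def count_context_markers(diff_lines):
    """Count '+', '-', and '!' lines in context-format (diff -c) output."""
    counts = {"added": 0, "removed": 0, "changed_old": 0, "changed_new": 0}
    in_new = False  # True while reading the "--- N,M ----" half of a hunk
    for line in diff_lines:
        if line.startswith("*** ") and line.rstrip().endswith("****"):
            in_new = False  # start of the old-file half of a hunk
        elif line.startswith("--- ") and line.rstrip().endswith("----"):
            in_new = True   # start of the new-file half of a hunk
        elif line.startswith("+ "):
            counts["added"] += 1
        elif line.startswith("- "):
            counts["removed"] += 1
        elif line.startswith("! "):
            counts["changed_new" if in_new else "changed_old"] += 1
    return counts
```

File header lines such as "*** old.txt" and "--- new.txt" are skipped because they match none of the two-character prefixes and do not end in "****" or "----".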

--
Barry Margolin, ba****@alum.mit.edu
Arlington, MA
Jul 19 '05 #2
Liang wrote:
> I want to diff two files or two versions of one file, and parse the output
> to get a summary of how many lines were replaced, added, or deleted between
> the two files.
>
> As known from diff/cleardiff, the output has a style like:
> 15a16, 15,17d3, 18c19,21 etc.
>
> Does anyone know how to parse this output to generate a summary?


It isn't very hard to work it out, is it?

Each item conceptually has four numbers and an operation code:

N1,N2 op N3,N4

When there is just one number on one side of the operation, the values
N1 and N2, or N3 and N4, are the same.

Inserts are easy: there's always a single number on the LHS, and the
number of lines inserted is N4-N3+1.

Similarly, deletes are easy: there's always a single number on the RHS
of the operator, and the number of lines deleted is N2-N1+1.

Number of lines replaced has two parts to the value - the number of
lines removed and the number replacing the removed lines. Depending
on your viewpoint, you can either choose to count the two values
separately (number removed NR = N2-N1+1, number inserted NI =
N4-N3+1), or you can be cleverer about the calculation and decide that
when NR > NI, then you have NI changed lines and NR-NI deleted lines,
and that when NR < NI, you have NR changed lines and NI-NR inserted
lines. When NR = NI, you have NR (or NI) changed lines, of course.
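
The arithmetic above can be sketched in Python. This is only an illustration under stated assumptions: the input is normal-format diff output (command lines like "15a16", "15,17d3", "18c19,21"), and the names CMD and summarize are invented for this example. For change commands it uses the "cleverer" accounting, splitting the hunk into changed lines plus leftover inserted or deleted lines.

```python
import re

# Matches normal-format diff command lines: N1[,N2] op N3[,N4]
CMD = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

def summarize(diff_lines):
    """Return counts of inserted, deleted, and changed lines."""
    totals = {"inserted": 0, "deleted": 0, "changed": 0}
    for line in diff_lines:
        m = CMD.match(line.rstrip())
        if not m:
            continue  # skip "<", ">", "---" and any other non-command lines
        n1, n2, op, n3, n4 = m.groups()
        n1, n3 = int(n1), int(n3)
        n2 = int(n2) if n2 else n1  # single number: N2 equals N1
        n4 = int(n4) if n4 else n3  # single number: N4 equals N3
        removed = n2 - n1 + 1       # NR
        added = n4 - n3 + 1         # NI
        if op == "a":
            totals["inserted"] += added
        elif op == "d":
            totals["deleted"] += removed
        else:  # op == "c"
            common = min(removed, added)
            totals["changed"] += common
            totals["deleted"] += removed - common
            totals["inserted"] += added - common
    return totals

print(summarize(["15a16", "15,17d3", "18c19,21"]))
# prints {'inserted': 3, 'deleted': 3, 'changed': 1}
```

To count removed and inserted lines separately for change commands instead, add `removed` to the deleted total and `added` to the inserted total in the final branch.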

That took me five minutes to think and type - how long would it have
taken you to do it? (And cross-posted too?)

--
Jonathan Leffler #include <disclaimer.h>
Email: jl******@earthlink.net, jl******@us.ibm.com
Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/

Jul 19 '05 #3
In comp.software.config-mgmt Liang <le*********@hotmail.com> wrote:
> Hi all, I want to diff two files or two versions of one file, and parse the output
> to get a summary of how many lines were replaced, added, or deleted between
> the two files.


http://invisible-island.net/diffstat/

--
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net
Jul 19 '05 #4
> You can use "diff -c" and count the number of "+", "-", and "!" lines.
> Or use the "comm" command and count the number of lines.
Marvellous! This is the simplest solution.

Happy new year!
--
Barry Margolin, ba****@alum.mit.edu
Arlington, MA

Jul 19 '05 #5
