
diff multi-line whitespace to verify Beautifier output

Hi,

We're reformatting a lot of our project code using the excellent
uncrustify beautifier.

However, to gain confidence that it really is only changing whitespace
(forget { } issues for just now), we were hoping to do a diff - a
textual comparison of the files, ignoring whitespace.

However, most diffs we've tried can't handle multi-line whitespace, so
the following two
prototypes are deemed to be different:

void doStuff( int a, float b);

void doStuff(int a,
float b);

Has anyone found a way to do a diff like this, that handles multi-line
whitespace?

Shug

Feb 12 '07 #1
Shug wrote:
> We're reformatting a lot of our project code using the excellent
> uncrustify beautifier.
>
> However, to gain confidence that it really is only changing whitespace
> (forget { } issues for just now), we were hoping to do a diff - a
> textual comparison of the files, ignoring whitespace.
>
> However, most diffs we've tried can't handle multi-line whitespace, so
> the following two prototypes are deemed to be different:
>
> void doStuff( int a, float b);
>
> void doStuff(int a,
> float b);
>
> Has anyone found a way to do a diff like this, that handles multi-line
> whitespace?

I would actually do it differently: tokenize both sources. If the sequence
of tokens is the same, you have the same source (now, don't ask me where
you can find C++ tokenizers, I don't know, GIYF). The other way is to
convert both of those into a third type of formatting (which should
give you exactly the same output) and compare them. If the formatters
make mistakes, they're likely to make them independently.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Feb 12 '07 #2
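
For anyone wanting to try the tokenize-and-compare idea from the post above, here is a rough sketch in Python (chosen for brevity; nothing in the thread prescribes a language). The regex is a deliberately crude stand-in for a real C++ lexer: every punctuation character becomes its own token, so it cannot tell `>>` from `> >`, and it knows nothing about raw string literals or the preprocessor. The names TOKEN_RE, tokens and same_tokens are made up for illustration.

```python
import re

# Crude approximation of C++ lexing; NOT a conforming tokenizer.
TOKEN_RE = re.compile(r'''
    //[^\n]*                  # line comment
  | /\*.*?\*/                 # block comment
  | "(?:\\.|[^"\\\n])*"       # string literal (whitespace inside is kept)
  | '(?:\\.|[^'\\\n])*'       # character literal
  | [A-Za-z_]\w*              # identifier or keyword
  | \d[\w.]*                  # number (very loose)
  | \S                        # any other single non-space character
''', re.VERBOSE | re.DOTALL)


def tokens(source):
    """Return the list of tokens in source, with whitespace discarded."""
    return TOKEN_RE.findall(source)


def same_tokens(before, after):
    """True if both texts produce the same token sequence, in order."""
    return tokens(before) == tokens(after)
```

Unlike stripping whitespace blindly, this still treats `int a;` and `inta;` as different, and it notices whitespace changes inside string literals. For real confidence a proper C++ lexer would be the safer choice.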
On Feb 12, 3:24 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
> Shug wrote:
>> We're reformatting a lot of our project code using the excellent
>> uncrustify beautifier.
>>
>> However, to gain confidence that it really is only changing whitespace
>> (forget { } issues for just now), we were hoping to do a diff - a
>> textual comparison of the files, ignoring whitespace.
>>
>> However, most diffs we've tried can't handle multi-line whitespace, so
>> the following two prototypes are deemed to be different:
>>
>> void doStuff( int a, float b);
>>
>> void doStuff(int a,
>> float b);
>>
>> Has anyone found a way to do a diff like this, that handles multi-line
>> whitespace?
>
> I would actually do it differently: tokenize both sources. If the set
> of tokens is the same, you have the same source (now, don't ask me where
> you can find C++ tokenizers, I don't know, GIYF). The other way is to
> convert both of those into the third type of formatting (which should
> give you the exactly same output) and compare them. If the formatter
> make mistakes, it's likely to make them independently.

A simple third format would be one where every run of whitespace is
replaced by a single newline, which will give a format that is easy to
compare (and I think it will still be valid C++ :-)

--
Erik Wikström

Feb 12 '07 #3
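
The "one word per line" format suggested above can be sketched in a few lines of Python using the standard difflib module. This is only an illustration under that reading of the suggestion; the function names are hypothetical.

```python
import difflib


def one_word_per_line(source):
    """Split on any run of whitespace; each item stands for one output line."""
    return source.split()


def whitespace_diff(before, after):
    """Return unified-diff lines between the two normalized texts."""
    return list(difflib.unified_diff(
        one_word_per_line(before),
        one_word_per_line(after),
        fromfile="before",
        tofile="after",
        lineterm="",
    ))
```

This reports nothing when a declaration is merely re-wrapped across lines. It does still report a difference for the two prototypes in the original question, because `doStuff(` followed by `int` and `doStuff(int` are different words: the format only normalizes the kind of whitespace, not whitespace that the beautifier adds or removes between tokens.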
Erik Wikström wrote:
> On Feb 12, 3:24 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
>> Shug wrote:
>>> We're reformatting a lot of our project code using the excellent
>>> uncrustify beautifier.
>>>
>>> However, to gain confidence that it really is only changing
>>> whitespace (forget { } issues for just now), we were hoping to do a
>>> diff - a textual comparison of the files, ignoring whitespace.
>>>
>>> However, most diffs we've tried can't handle multi-line whitespace,
>>> so the following two prototypes are deemed to be different:
>>>
>>> void doStuff( int a, float b);
>>>
>>> void doStuff(int a,
>>> float b);
>>>
>>> Has anyone found a way to do a diff like this, that handles
>>> multi-line whitespace?
>>
>> I would actually do it differently: tokenize both sources. If the
>> set of tokens is the same, you have the same source (now, don't ask
>> me where you can find C++ tokenizers, I don't know, GIYF). The
>> other way is to convert both of those into the third type of
>> formatting (which should give you the exactly same output) and
>> compare them. If the formatter make mistakes, it's likely to make
>> them independently.
>
> A simple third format would be one where every whitespace is replaced
> by a newline, which will give a format that is easy to compare (and I
> think it will still be valid C++ :-)

It wouldn't be valid C++ without some continuation characters (\) in
macro definitions. And broken-up include directives aren't going to
work either. :-)

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Feb 12 '07 #4
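
If the normalized output should stay closer to valid C++, one option is to leave preprocessor lines alone and split only the rest. The Python sketch below assumes a directive is any line whose first non-blank character is `#`, plus any lines joined to it by a trailing backslash; the function name is hypothetical, and for a pure comparison this extra care is not strictly needed.

```python
def split_preserving_directives(source):
    """One word per item, except that preprocessor directives stay whole."""
    result = []
    lines = iter(source.splitlines())
    for line in lines:
        if line.lstrip().startswith("#"):
            directive = line
            # Pull in continuation lines while the directive ends with "\".
            while directive.endswith("\\"):
                following = next(lines, None)
                if following is None:
                    break
                directive += "\n" + following
            result.append(directive)
        else:
            result.extend(line.split())
    return result
```

Note that directives are kept verbatim, so whitespace changes inside a macro body would still show up as differences.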
On 12 Feb, 13:15, "Shug" <enoesqu...@yahoo.co.uk> wrote:
> Hi,
>
> We're reformatting a lot of our project code using the excellent
> uncrustify beautifier.
>
> However, to gain confidence that it really is only changing whitespace
> (forget { } issues for just now), we were hoping to do a diff - a
> textual comparison of the files, ignoring whitespace.
>
> However, most diffs we've tried can't handle multi-line whitespace, so
> the following two prototypes are deemed to be different:
>
> void doStuff( int a, float b);
>
> void doStuff(int a,
> float b);
>
> Has anyone found a way to do a diff like this, that handles multi-line
> whitespace?
>
> Shug

Thanks for your contributions, guys.

In the end, we've managed to find another satisfactory solution.

After reformatting the source code, we run both the before and after
source files through tr:

tr -d '\r\n' < file1.cpp > temp1.txt
tr -d '\r\n' < file2.cpp > temp2.txt

then do a diff on the tr'd files

C:\cygwin\bin\diff -bBw temp1.txt temp2.txt

This is all using a cygwin installation on Windows XP.

This does exactly what we need.

Thanks again.

Shug

Feb 13 '07 #5
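
For readers without cygwin, the net effect of the tr and diff -w steps above (joining all lines, then ignoring all whitespace) can be approximated in Python. This is a minimal sketch, not a replacement for diff: it only answers "same or not" and does not show where a difference is. The function names are illustrative.

```python
def strip_all_whitespace(text):
    """Remove every whitespace character, including CR and LF."""
    return "".join(text.split())


def same_ignoring_whitespace(before, after):
    """True if the two texts are identical once all whitespace is removed."""
    return strip_all_whitespace(before) == strip_all_whitespace(after)
```

One caveat applies to this sketch and to any comparison that ignores all whitespace: it cannot tell `int a;` from `inta;`, and it will not notice whitespace changes inside string literals. That is acceptable if the beautifier is trusted not to join tokens, but a token-based comparison is stricter.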
