diff multi-line whitespace to verify Beautifier output

Hi,

We're reformatting a lot of our project code using the excellent
uncrustify beautifier.

However, to gain confidence that it really is only changing whitespace
(forget { } issues for just now), we were hoping to do a diff - a
textual comparison of the files, ignoring whitespace.

However, most diffs we've tried can't handle multi-line whitespace, so
the following two
prototypes are deemed to be different:

void doStuff( int a, float b);

void doStuff(int a,
float b);

Has anyone found a way to do a diff like this, that handles multi-line
whitespace?

Shug

Feb 12 '07 #1
Shug wrote:
> Has anyone found a way to do a diff like this, that handles multi-line
> whitespace?

I would actually do it differently: tokenize both sources. If the sequence
of tokens is the same, you have the same source (now, don't ask me where
you can find C++ tokenizers, I don't know, GIYF). The other way is to
convert both of those into a third type of formatting (which should
give you exactly the same output) and compare them. If the formatters
make mistakes, they're likely to make them independently.
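
A minimal sketch of the token-comparison idea, written in Python for brevity. The regular expression is only a rough, assumed approximation of C++ lexing (no preprocessor handling, raw strings or digraphs), and the names `tokenize` and `same_tokens` are made up for illustration:

```python
import re

# Rough C/C++-style token pattern (an assumption for illustration, not a
# full lexer). Alternatives are tried left to right, so longer operators
# are listed before their shorter prefixes.
TOKEN_RE = re.compile(r"""
    //[^\n]*                      # line comment
  | /\*.*?\*/                     # block comment
  | "(?:\\.|[^"\\\n])*"           # string literal
  | '(?:\\.|[^'\\\n])*'           # character literal
  | [A-Za-z_]\w*                  # identifier or keyword
  | \d[\w.]*                      # number (very loose)
  | ->\*|<<=|>>=|\.\.\.           # three-character operators
  | ::|->|\+\+|--|<<|>>|<=|>=|==|!=|&&|\|\||[-+*/%&|^]=
  | \S                            # any other single non-space character
""", re.VERBOSE | re.DOTALL)


def tokenize(source):
    """Return the list of tokens in source, with whitespace discarded."""
    return TOKEN_RE.findall(source)


def same_tokens(before, after):
    """True if both sources produce the same token sequence."""
    return tokenize(before) == tokenize(after)


if __name__ == "__main__":
    one_line = "void doStuff( int a, float b);"
    two_lines = "void doStuff(int a,\nfloat b);"
    print(same_tokens(one_line, two_lines))
```

Because whitespace inside string literals is part of the token, a change there is still reported, while `int x` and `intx` stay distinct. Comments are kept as tokens here, so a beautifier that rewraps comments would show up as a difference.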

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Feb 12 '07 #2
On Feb 12, 3:24 pm, "Victor Bazarov" <v.Abaza...@comAcast.net> wrote:
> I would actually do it differently: tokenize both sources. If the sequence
> of tokens is the same, you have the same source (now, don't ask me where
> you can find C++ tokenizers, I don't know, GIYF). The other way is to
> convert both of those into a third type of formatting (which should
> give you exactly the same output) and compare them. If the formatters
> make mistakes, they're likely to make them independently.

A simple third format would be one where every whitespace is replaced
by a newline, which will give a format that is easy to compare (and I
think it will still be valid C++ :-)
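
A small sketch of that "one word per line" format, in Python with the standard `difflib` module. The function names are hypothetical. Note that this only normalises where the whitespace falls: a space added or removed between two tokens, such as `doStuff( int` versus `doStuff(int`, still shows up as a difference.

```python
import difflib


def one_word_per_line(source):
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, and drops empty strings.
    return source.split()


def word_diff(before, after):
    """Unified diff of the two sources after putting each word on its own line."""
    return list(difflib.unified_diff(
        one_word_per_line(before), one_word_per_line(after),
        fromfile="before", tofile="after", lineterm=""))


if __name__ == "__main__":
    before = "void doStuff(int a, float b);"
    after = "void doStuff(int a,\n    float b);"
    # An empty list means only line breaks and indentation changed.
    print(word_diff(before, after))
```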

--
Erik Wikström

Feb 12 '07 #3
Erik Wikström wrote:
> A simple third format would be one where every whitespace is replaced
> by a newline, which will give a format that is easy to compare (and I
> think it will still be valid C++ :-)

It wouldn't be valid C++ without some continuation characters (\) in
macro definitions. And broken-up include directives aren't going to
work either. :-)

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Feb 12 '07 #4
On 12 Feb, 13:15, "Shug" <enoesqu...@yahoo.co.uk> wrote:
> Has anyone found a way to do a diff like this, that handles multi-line
> whitespace?

Thanks for your contributions, guys.

In the end, we've managed to find another satisfactory solution.

After reformatting the source code, we run both the before and after
source files through tr:

tr -d '\r\n' < file1.cpp > temp1.txt
tr -d '\r\n' < file2.cpp > temp2.txt

then do a diff on the tr'd files:

C:\cygwin\bin\diff -bBw temp1.txt temp2.txt

This is all using a cygwin installation on Windows XP.

This does exactly what we need.
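
For anyone without cygwin to hand, roughly the same check can be sketched in Python. This is an approximation of the tr + diff -w combination rather than an exact equivalent, and the function names and file paths are made up for illustration:

```python
def strip_whitespace(text):
    # Drop every whitespace character, including line breaks, which is
    # about what joining the lines with tr and then running diff -w does.
    return "".join(text.split())


def same_ignoring_whitespace(before, after):
    """True if the two sources differ only in whitespace."""
    return strip_whitespace(before) == strip_whitespace(after)


def files_match(path_before, path_after):
    """Compare two files on disk, ignoring all whitespace."""
    with open(path_before) as f_before, open(path_after) as f_after:
        return same_ignoring_whitespace(f_before.read(), f_after.read())


if __name__ == "__main__":
    one_line = "void doStuff( int a, float b);"
    two_lines = "void doStuff(int a,\nfloat b);"
    print(same_ignoring_whitespace(one_line, two_lines))
```

Like diff -w, this treats `int x` and `intx` as identical, so it confirms that nothing except whitespace changed but not that the whitespace changes were harmless.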

Thanks again.

Shug

Feb 13 '07 #5
