Tools for refactoring header files

Does anyone know of any tools for refactoring header files?

We're using a third party codebase at work, and pretty much every file
includes a 50Mb precompiled header file. I'm looking for a tool that will
let us figure out which header files are actually needed by each .cpp, and
allow us to break this up so that we're not including the world in each one.

Ideally, the same tool would also recognize where #includes can be replaced
with forward declarations, and even better, it'd automate the updates to the
code files.

Is there such a tool? Or am I SOL until some bright spark writes one?

Thanks,
Si
Apr 18 '06 #1
Simon Cooke wrote:
Does anyone know of any tools for refactoring header files?


I can recall rumors that Lakos pointed out one in his book /Large Scale C++
Software Design/, and I can recall rumors it is not supported. Your Google
is as good as mine.

I would bet folks don't need one because, when they have "a third party
codebase at work," where "pretty much every file includes a 50Mb precompiled
header file," they tend to throw a technique called Pimpl at it:

http://www.gotw.ca/publications/mill04.htm
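
For readers who have not met the idiom, here is a minimal sketch of Pimpl, assuming a C++11 compiler. The class Widget, its name() member, and the file names in the comments are hypothetical and do not come from the OP's codebase; the point is only that the header exposes an opaque pointer, so heavy includes can move into the .cpp file.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- widget.h (hypothetical): the only part clients would include ---
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                             // defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;

    const std::string& name() const;

private:
    struct Impl;                           // forward declaration only
    std::unique_ptr<Impl> impl_;           // clients never see Impl's members
};

// --- widget.cpp (hypothetical): heavy headers would be included here ---
struct Widget::Impl {
    std::string name;
};

Widget::Widget(std::string name) : impl_(new Impl{std::move(name)}) {}
Widget::~Widget() = default;
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;

const std::string& Widget::name() const {
    assert(impl_ && "a moved-from Widget must not be used");
    return impl_->name;
}
```

Changing Impl's members then forces a recompile of the one .cpp file only, not of every client of the header.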

Get the above-mentioned book, and get /C++ Coding Standards/ by Sutter &
Alexandrescu.

Then hunt down whoever wrote this code and smack them with those books. C++
works because clean logical designs enable clean physical designs, and I
would bet your physical design is also questionable. Read /Working
Effectively with Legacy Code/ by Mike Feathers to get ahead of that problem.

Someone else might indeed know of a tool. I'm only posting because your post
went stale, and you didn't declare that Pimpl was your first line of attack.
It usually is. It's good for your situation because it shows how to clean
nearly everything out of a .h file without changing anything's logical
design. Long term, improving the logical design

--
Phlip
http://www.greencheese.org/ZeekLand <-- NOT a blog!!!
Apr 18 '06 #2

Phlip wrote:
Simon Cooke wrote:
Does anyone know of any tools for refactoring header files?
I would bet folks don't need one because, when they have "a third party
codebase at work," where "pretty much every file includes a 50Mb precompiled
header file," they tend to throw a technique called Pimpl at it:

http://www.gotw.ca/publications/mill04.htm


Pimpl doesn't help the OP, who already knows the header needs
refactoring.

The only way to fix this problem is hours and hours of cut and paste
operations followed by compiling, followed by tracing where the errors
are coming from. Monolithic headers are just way too complex to wrap
your head around so you _have_ to break down and use the compiler as an
error tracing tool. It doesn't make a very good one and will spit out
difficult to interpret errors but it is the best you got in this case.

Took me over two solid straight days to do ours and there is still a
lot I never touched and let be until I need to pull it apart.

Good luck.

Apr 18 '06 #3
On Tue, 18 Apr 2006 11:13:49 -0700, "Simon Cooke"
<sc************@surr-nospam-eal.com> wrote:
Does anyone know of any tools for refactoring header files?


C++ lacks good refactoring tools, mostly due to the complicated
syntax. See also e.g.
http://www.artima.com/weblogs/viewpost.jsp?thread=11070 .

Best wishes,
Roland Pibinger
Apr 18 '06 #4
Noah Roberts wrote:
http://www.gotw.ca/publications/mill04.htm
Pimpl doesn't help the OP, who already knows the header needs
refactoring.


Correct. They didn't say "my partial build time is too long", they said they
needed the kind of fix that made me think they started with that problem.
The only way to fix this problem is hours and hours of cut and paste
operations followed by compiling, followed by tracing where the errors
are coming from. Monolithic headers are just way too complex to wrap
your head around so you _have_ to break down and use the compiler as an
error tracing tool. It doesn't make a very good one and will spit out
difficult to interpret errors but it is the best you got in this case.


Can't they do pimpl first, then get a little breathing room before doing
that?

Anecdote: I know a codebase where every high-level class has an Impl suffix,
and it inherits an abstract base class with an Inf suffix. Client classes
are expected to use only the Inf, following the Dependency Inversion
Principle, and Lakos-style short header files.

Except a few Inf classes inherit Impl classes. Ouch!

--
Phlip
http://www.greencheese.org/ZeekLand <-- NOT a blog!!!
Apr 18 '06 #5

"Noah Roberts" <ro**********@gmail.com> wrote in message
news:11**********************@e56g2000cwe.googlegroups.com...

Phlip wrote:
Simon Cooke wrote:
> Does anyone know of any tools for refactoring header files?
I would bet folks don't need one because, when they have "a third party
codebase at work," where "pretty much every file includes a 50Mb
precompiled
header file," they tend to throw a technique called Pimpl at it:

http://www.gotw.ca/publications/mill04.htm


Pimpl doesn't help the OP, who already knows the header needs
refactoring.

The only way to fix this problem is hours and hours of cut and paste
operations followed by compiling, followed by tracing where the errors
are coming from. Monolithic headers are just way too complex to wrap
your head around so you _have_ to break down and use the compiler as an
error tracing tool. It doesn't make a very good one and will spit out
difficult to interpret errors but it is the best you got in this case.


Yeah - that's why I was hoping for a tool to assist with this - because
we're dealing with a codebase with millions of lines of code.
Took me over two solid straight days to do ours and there is still a
lot I never touched and let be until I need to pull it apart.

Good luck.


Thanks - I'll need it :)

Si
Apr 18 '06 #6
Roland Pibinger wrote:
C++ lacks good refactoring tools, mostly due to the complicated
syntax. See also e.g.
http://www.artima.com/weblogs/viewpost.jsp?thread=11070 .


The word "refactoring" (according to the ISO Refactoring Standard) means
changing the code while passing all its unit tests. The odds of unit tests
here are very low; that's why I recommended the WELC book.

However, the OP asked about "refactoring" header files, which is a different
beast than general C++ refactoring. It typically requires static analysis
and reverse engineering of the header's dependency graph, followed by manual
changes. The analysis can be "fuzzy", whereas automated refactoring of
source must be so "sharp" that behavior absolutely never changes after a
refactor. C++ makes _that_ so hard that we might as well rely on manual
refactoring.
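
As a rough illustration of the "fuzzy" analysis step, the sketch below pulls the target of every #include directive out of a piece of source text. It is an assumption-laden toy, not one of the tools mentioned in this thread: the function name list_includes is made up, it works line by line with a regular expression, and it ignores comments, line continuations, and conditional compilation.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Returns the target of each #include directive in `text`, in order.
// Hypothetical helper: a textual scan, not a real preprocessor.
std::vector<std::string> list_includes(const std::string& text) {
    // Optional whitespace, '#', optional whitespace, "include",
    // then a name wrapped in <...> or "...".
    static const std::regex directive(
        R"re(^\s*#\s*include\s*[<"]([^>"]+)[>"])re");

    std::vector<std::string> found;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::smatch match;
        if (std::regex_search(line, match, directive)) {
            assert(match.size() == 2);     // whole match plus one capture
            found.push_back(match[1].str());
        }
    }
    return found;
}
```

Running this over every file and recording file -> included names gives the edges of the include graph, which is about as far as a purely textual scan can go.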

At a shot, I would run Doxygen on the code, take a vacation, come back, and
look at its Graphviz output. IIRC this output shows the header file graphs,
with circles and arrows. I would look for long sequences whose bases can be
Pimpl-ed out of other long sequences.

This search...

http://www.google.com/search?q=c%2B%...ependency+tool

...says for its second hit, "This tool scans c++ header files and source
files for #includes that ... It then generates the header and source files
for the entire dependency tree for ..."

So, as usual for modern engineering, it may all come down to the right
Google search expression ;-)

--
Phlip
http://www.greencheese.org/ZeekLand <-- NOT a blog!!!
Apr 18 '06 #7

Phlip wrote:
The only way to fix this problem is hours and hours of cut and paste
operations followed by compiling, followed by tracing where the errors
are coming from. Monolithic headers are just way too complex to wrap
your head around so you _have_ to break down and use the compiler as an
error tracing tool. It doesn't make a very good one and will spit out
difficult to interpret errors but it is the best you got in this case.


Can't they do pimpl first, then get a little breathing room before doing
that?


Pimpl isn't always needed or desired. Better to start with other
refactors first. Get the class declarations and stuff separated for
one...

The way I approached the problem was to pull out classes into their own
headers and include that header where the declaration used to be. Get
the thing to compile by including whatever header is needed in that new
header to get things to compile. Then create a blank source file that
includes your new header and try to build an object...this will
probably reveal more things you depend on that didn't show up when
building it in line with others. Then look for ways to get rid of
headers through "class X;" directives and moving inline functions into
the source files. Then move on to the next class.
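
The "class X;" step can be sketched as follows. Everything here is hypothetical (Car, Engine, and the file names in the comments), and the three "files" are shown in one listing for brevity. The idea is that a header which only stores a pointer or reference to a type needs the type's name, not its definition, once the inline function body that used the type has moved to the source file.

```cpp
#include <cassert>

// --- car.h (hypothetical): no #include "engine.h" needed here ---
class Engine;                         // forward declaration replaces the include

class Car {
public:
    explicit Car(Engine* engine) : engine_(engine) {}
    int horsepower() const;           // body moved out of the header
private:
    Engine* engine_;                  // a pointer works with an incomplete type
};

// --- engine.h (hypothetical) ---
class Engine {
public:
    explicit Engine(int hp) : hp_(hp) {}
    int horsepower() const { return hp_; }
private:
    int hp_;
};

// --- car.cpp (hypothetical): includes engine.h because it calls into Engine ---
int Car::horsepower() const {
    assert(engine_ != nullptr);
    return engine_->horsepower();
}
```

A header cannot use a forward declaration if it holds the type by value, derives from it, or calls its members inline; those cases still need the full include.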

Any time anything depends on the main file look for the reason and pull
it out into its own. The first class I tried ended up going down a
bunch of lines of dependencies that had to be pulled out and weeded
through. It was the worst. Took hours of sweat and frustration before
I was even able to compile the first time again but then the rest fell
into place much easier. Anything you depend on is going to be above
your declaration so starting at the top might be a smart move...I
started by trying to pull out the class I needed.

Then start getting rid of the includes in the main header by including
headers in the appropriate source files....find these by removing the
include and looking for what no longer compiles for whatever "reason"
the compiler spits up.

Get rid of the main header...

THEN start looking for ways to lower dependencies amongst the various
header files through actual code refactoring if need be. Until this
point nothing has actually been changed in the design or code at all
except for moving it around in files.

Apr 18 '06 #8

Roland Pibinger wrote:
On Tue, 18 Apr 2006 11:13:49 -0700, "Simon Cooke"
<sc************@surr-nospam-eal.com> wrote:
Does anyone know of any tools for refactoring header files?


C++ lacks good refactoring tools, mostly due to the complicated
syntax. See also e.g.
http://www.artima.com/weblogs/viewpost.jsp?thread=11070 .


I have found Ref++ (for VS) to be fairly helpful. It offers the basic
refactors...rename, encapsulate, extract func, change sig, introduce
var, move up/down, extract super. It works most of the time. It can
be confused by the preproc (or the occasional phase of the moon/sun
spot error) but usually even figures that stuff out...unfortunately it
can decide to only apply a refactor in some places because of the
preproc so make sure all builds work after (we have several defines
based on product branches...) :p Reasonably priced too...

There appears to be one for xemacs that is much more pricy...haven't
tested it.

Apr 18 '06 #9
Noah Roberts wrote:
Pimpl isn't always needed or desired.


I forwarded this heresy to the moderated mailing list. ;-)

--
Phlip
http://www.greencheese.org/ZeekLand <-- NOT a blog!!!
Apr 18 '06 #10
So. C++ is the most difficult to parse language on Earth. Known fact.
However, it seems that there are quite a few projects that are already
doing it.

Here is a company that is selling their refactoring tool:
http://xref-tech.com/xrefactory/main.html

Also, you might want to check this page:
http://www.nobugs.org/developer/parsingcpp/. It's done by a guy who
"wanted" to build a C++ parser, but gave up eventually.

Hope this helped.

Apr 18 '06 #11
st************@gmail.com wrote:
So. C++ is the most difficult to parse language on Earth. Known fact.
However, it seems that there are quite a few projects that are already
doing it.


Are they refactoring the #include graph of header files?

--
Phlip
http://www.greencheese.org/ZeekLand <-- NOT a blog!!!
Apr 18 '06 #12

"Roland Pibinger" <rp*****@yahoo.com> wrote in message
news:44**************@news.utanet.at...
On Tue, 18 Apr 2006 11:13:49 -0700, "Simon Cooke"
<sc************@surr-nospam-eal.com> wrote:
Does anyone know of any tools for refactoring header files?


C++ lacks good refactoring tools, mostly due to the complicated
syntax.


It isn't the syntax, although the syntax defeats most standard parsing
engines (YACC, etc.). The solution there is relatively straightforward.
The hard part is the static semantics: figuring out what the syntax
says, and what every symbol means. Once you're past that,
you still have to deal with format and comment capture,
multiple dialects, preprocessor directives, and finally get around to
providing tools that can actually transform
the code without breaking it.

The DMS Software Reengineering Toolkit provides a C++ front
end with all the above capability, as well as transformation machinery
to transform and reproduce compilable text with the original comments
and indentation.
See http://www.semanticdesigns.com/Produ...pFrontEnd.html

This would make a good foundation for an interactive
refactoring tool. (Why isn't it one? Well, it took us awhile to teach
DMS about C++...)

You can read about massive transforms applied to C++ code by DMS in the
paper,
Re-engineering C++ Component Models Via Automatic Program Transformation,
at http://www.semanticdesigns.com/Company/Publications/

--
Ira Baxter, CTO
www.semanticdesigns.com
Apr 18 '06 #13

"Simon Cooke" <sc************@surr-nospam-eal.com>
wrote in message news:ua**************@TK2MSFTNGP02.phx.gbl...
Does anyone know of any tools for refactoring header files?

We're using a third party codebase at work, and pretty much every file
includes a 50Mb precompiled header file. I'm looking for a tool that will
let us figure out which header files are actually needed by each .cpp, and
allow us to break this up so that we're not including the world in each one.
Ideally, the same tool would also recognize where #includes can be replaced with forward declarations, and even better, it'd automate the updates to the code files.

Is there such a tool? Or am I SOL until some bright spark writes one?


You're pretty much SOL at this point.

However, we could probably write one for you, based on our C++
program transformation tools.
See http://www.semanticdesigns.com/Produ...pFrontEnd.html
There's nothing else like it on the planet :-}

How much is it costing your organization to live with the current problem?
--
Ira Baxter, CTO
www.semanticdesigns.com
Apr 18 '06 #14
We have recently applied our Jolt award winning dependency matrix based
approach to generic c/c++. It takes the output of Doxygen and creates a
matrix which you can then transform and partition to obtain Lakos style
levelization. You could use it to help you deal with this problem in
the following two different ways:

1. Filter out all inter-file dependencies except for "include" to see
which files are included by which other files.

2. Keep all dependencies but filter out the "include" dependency to see
which files really depend on each other.
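
To make the comparison between those two views concrete, here is a small sketch that is not Lattix LDM and makes no claim about how that product works. It assumes you already have, per file, the set of headers it includes and the set of headers it really uses (for example, extracted from some analysis output); the type names and the function unused_includes are invented for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <set>
#include <string>

using HeaderSet = std::set<std::string>;
using DependencyMap = std::map<std::string, HeaderSet>;   // file -> headers

// For each file, report the headers it includes but never actually uses.
// Files with nothing to report are left out of the result.
DependencyMap unused_includes(const DependencyMap& includes,
                              const DependencyMap& uses) {
    static const HeaderSet none;      // stands in for files with no recorded uses
    DependencyMap result;
    for (const auto& entry : includes) {
        const std::string& file = entry.first;
        const HeaderSet& included = entry.second;
        const auto it = uses.find(file);
        const HeaderSet& used = (it != uses.end()) ? it->second : none;

        HeaderSet extra;              // included minus used
        std::set_difference(included.begin(), included.end(),
                            used.begin(), used.end(),
                            std::inserter(extra, extra.end()));
        assert(extra.size() <= included.size());
        if (!extra.empty()) {
            result[file] = extra;
        }
    }
    return result;
}
```

The reverse difference (used minus included) would list headers a file relies on only through some other include, which is what tends to break when a monolithic header is split.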

The goal of our approach (Lattix LDM) is architecture discovery and
control using inter-module dependencies. We have applied this approach
fairly successfully to Java and Microsoft C/C++ (where we use bsc
files).

The Doxygen based approach is currently in beta. If you are interested
please send us email (info-AT-lattix.com) and we will be glad to
provide you with more information and make a download available.

Neeraj Sangal
Lattix, Inc.
http://www.lattix.com

Apr 19 '06 #15
Neeraj wrote:
It takes the output of Doxygen and creates a
matrix which you can then transform and partition to obtain Lakos style
levelization.


Diiiing!

--
Phlip
http://www.greencheese.org/ZeekLand <-- NOT a blog!!!
Apr 19 '06 #16
