a standard set of refactoring steps for C and C++

While working on a large C++ project with another consultant, I started trying to codify some iterative steps for what we have been doing.

I recently rediscovered something instinct told me 7 years ago when I was working on a C++ product called Hailstorm. I was retrofitting unit tests into that code base (pairing with Brian Siepert), and reasoned that the lowest level components would be the best place to start. My reasoning at the time was that unit testing at a higher level would be suboptimal if the lower levels had bugs. On a side note, we found a real bug with every other unit test that we wrote (mostly in poorly written copy constructors), most of which prevented the product from finding exploitable bugs. I had hit a wall with the bugs I was finding at the 70% code coverage I was getting from my WinRunner tests, and decided to take the testing down a level of granularity. It was *extremely* effective, as far as I was concerned.

Without initially remembering that experience, I reasoned it out again this time because of the difficulty we are having with testing. Some of the dependent objects are not only difficult to create/mock, but also have nasty static initialization in them (or via their transitive dependencies). This has led me to the same conclusion, but at the macro level in this particular instance:

Getting the dependent objects in other components clean becomes a near-necessity for getting the targeted code clean and tested. We have spent so much time trying to refactor toward testability only to hit a brick wall due to said dependencies.

As such, here is the way we are working now:

Start with the leaf nodes of the object dependency graph
remove unused privates, unused parameters, rebuild the system
remove (now) unused #include and forward declarations, rebuild the system
remove (now) unused link dependencies, rebuild the system
fix any transitive dependencies (modules that were indirectly depending on this module's dependencies) by adding those dependencies where they are directly used
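
As a rough illustration of that cleanup pass, here is what a leaf class might look like afterwards. `PriceFormatter`, `Logger` and the removed members are all names invented for this sketch; the comments mark what was deleted along the way.

```cpp
#include <cassert>
#include <string>
// Removed: #include <map> and the forward declaration `class Logger;`.
// Both were only needed by the unused members deleted below.

class PriceFormatter {
public:
    // Removed: an unused `int locale` parameter that the body never read.
    std::string format(int cents) const {
        std::string centsPart = std::to_string(cents % 100);
        if (centsPart.size() < 2) centsPart = "0" + centsPart;
        return std::to_string(cents / 100) + "." + centsPart;
    }
    // Removed: private fields `Logger* logger_` and
    // `std::map<int, std::string> cache_`, which were never used.
};

// With the extra dependencies gone, the class is trivial to construct in a test.
void testFormat() {
    assert(PriceFormatter().format(1234) == "12.34");
    assert(PriceFormatter().format(5) == "0.05");
}
```

Rebuilding after each removal is what tells you whether something else was quietly leaning on the include you just deleted.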

are there any fields only used in one or two methods?
remove the field and parameterize the method where it is used, update callers and rebuild
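
A minimal sketch of this step, assuming a hypothetical `Invoice` class whose `taxRate_` field was only ever read by one method:

```cpp
#include <cassert>

// Before (sketch): `taxRate_` was set in the constructor but read only by total().
//
//   class Invoice {
//       double taxRate_;
//   public:
//       explicit Invoice(double taxRate) : taxRate_(taxRate) {}
//       double total(double net) const { return net * (1.0 + taxRate_); }
//   };

// After: the field is gone and the one method that used it takes a parameter.
class Invoice {
public:
    double total(double net, double taxRate) const {
        return net * (1.0 + taxRate);
    }
};

void testTotal() {
    Invoice invoice;  // nothing to configure before the test can run
    assert(invoice.total(100.0, 0.0) == 100.0);
    assert(invoice.total(200.0, 0.5) == 300.0);
}
```

The callers now have to supply the rate themselves, which is the "update callers and rebuild" part of the step.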

are there any stateless (no instance field access) methods?
make them static (this communicates that they don't have cohesion with the class, making them easier to transplant elsewhere later)
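
For example, in this made-up `Thermostat` class one method reads a field and one does not; only the second is a candidate for `static`:

```cpp
#include <cassert>

class Thermostat {
public:
    explicit Thermostat(double targetCelsius) : targetCelsius_(targetCelsius) {}

    // Reads instance state, so it stays an ordinary member function.
    bool shouldHeat(double currentCelsius) const {
        return currentCelsius < targetCelsius_;
    }

    // Touches no fields: static, so callers and tests need no Thermostat
    // instance, and the function is easy to move elsewhere later.
    static double toFahrenheit(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }

private:
    double targetCelsius_;
};

void testToFahrenheit() {
    assert(Thermostat::toFahrenheit(100.0) == 212.0);
    assert(Thermostat::toFahrenheit(0.0) == 32.0);
}
```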

does the function accept an object or struct as a parameter, but only use one or two pieces of data from the object/struct?

foo(a)
{
  return a.getB() + a.getC();
}

replace the heavier single parameter with the one or two lighter pieces that are actually used

foo(b, c)
{
  return b + c;
}

replace includes and link dependencies as appropriate
are these new parameter types defined in the same header as their container object?
extract the new parameter types into their own header file -- only include (and link) what is actually being used
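
Here is one way the header split might look, shown as a single block with comments marking where the file boundaries would be. `Money`, `Account`, `availableFunds` and the header names are all hypothetical.

```cpp
#include <cassert>

// --- would live in its own small header, say money.h ---
struct Money {
    long cents;
};

// --- would live in account.h, which includes money.h ---
class Account {
public:
    Account(Money balance, Money overdraftLimit)
        : balance_(balance), overdraftLimit_(overdraftLimit) {}
    Money balance() const { return balance_; }
    Money overdraftLimit() const { return overdraftLimit_; }
private:
    Money balance_;
    Money overdraftLimit_;
};

// --- would live in funds.h, which now needs only money.h ---
// Before: long availableFunds(const Account& account), which pulled in account.h.
long availableFunds(Money balance, Money overdraftLimit) {
    return balance.cents + overdraftLimit.cents;
}

void testAvailableFunds() {
    Money balance = {1500};
    Money limit = {500};
    assert(availableFunds(balance, limit) == 2000);  // no Account required
}
```

Code that still holds an `Account` just passes `account.balance()` and `account.overdraftLimit()` at the call site.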

does the function accept an object as a parameter but use more than a few pieces of data from the object?
if the class is large or difficult to construct, can another class/struct be extracted/sprouted that contains the fields in question?
if the type is a concrete class, can an existing base class/interface of the parameter's type be used instead?
can an interface (pure virtual class in C++), or lighter-weight base (possibly abstract) class be extracted and then used instead?
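
A small sketch of the interface option, using invented names. `TimeSource` stands for the extracted pure virtual class, and `FixedTime` is a test double that replaces whatever heavy concrete class the function used to demand.

```cpp
#include <cassert>
#include <string>

// Extracted interface: only the operation the function under test needs.
class TimeSource {
public:
    virtual ~TimeSource() {}
    virtual int hourOfDay() const = 0;
};

// Function under test. It previously took the concrete, hard-to-build class;
// the real class would now derive from TimeSource.
std::string greeting(const TimeSource& source) {
    return source.hourOfDay() < 12 ? "Good morning" : "Good afternoon";
}

// Lightweight test double, trivial to construct.
class FixedTime : public TimeSource {
public:
    explicit FixedTime(int hour) : hour_(hour) {}
    int hourOfDay() const override { return hour_; }
private:
    int hour_;
};

void testGreeting() {
    assert(greeting(FixedTime(9)) == "Good morning");
    assert(greeting(FixedTime(15)) == "Good afternoon");
}
```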

are the public methods that don't call other methods only used in one place outside of their containing class?
pull them into the class they are being called from, if the calling class is easily constructed for unit testing
if the parameters to the pulled-up method are present as fields in its new home class, remove parameters (and static keyword, if present) and use the fields instead
if the behaviour represented by the method can easily be unit tested via the class' other public methods, the pulled-up method can probably be private at this point
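
To make that concrete, here is a hypothetical before and after. `StringUtil::padLeft` had exactly one caller, `ReportTitle`, so it moves there, loses its parameters and `static`, and ends up private.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Before (sketch): a leaf utility class whose only caller was ReportTitle.
//
//   class StringUtil {
//   public:
//       static std::string padLeft(const std::string& s, std::size_t width);
//   };

class ReportTitle {
public:
    ReportTitle(const std::string& text, std::size_t width)
        : text_(text), width_(width) {}

    std::string render() const { return "[" + padded() + "]"; }

private:
    // Formerly StringUtil::padLeft(s, width). Both arguments were already
    // fields here, so the parameters were dropped. render() exercises it,
    // so it does not need to be public.
    std::string padded() const {
        if (text_.size() >= width_) return text_;
        return std::string(width_ - text_.size(), ' ') + text_;
    }

    std::string text_;
    std::size_t width_;
};

void testRender() {
    assert(ReportTitle("Q1", 4).render() == "[  Q1]");
    assert(ReportTitle("Quarterly", 4).render() == "[Quarterly]");
}
```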

does the leaf node being studied contain any stateless methods that take one of your other classes?

LeafClass::foo(Bar bar, int a)
{
   return bar.frobnikate(a);
}

can it just be put into the class of the parameter?

Bar::foo(int a)
{
   return frobnikate(a);
}

if the leaf node objects still exist, unit test them (CppUnitLite isn't a bad place to start for a framework, and the approaches from my C# Unit Testing book apply almost 100% to C/C++)
make sure the code coverage of that behaviour in the leaf nodes is > 90%
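
As a sketch of what the leaf-level tests can look like, here is a made-up `RingCounter` with one small test per behaviour. It uses plain `assert` rather than any framework's macros so that it stands alone; the same tests would translate directly into whichever framework you pick.

```cpp
#include <cassert>

// Hypothetical leaf class: a counter that wraps around at a fixed limit.
class RingCounter {
public:
    explicit RingCounter(int limit) : limit_(limit), value_(0) {}
    int value() const { return value_; }
    void increment() {
        ++value_;
        if (value_ == limit_) value_ = 0;  // wrap-around branch
    }
private:
    int limit_;
    int value_;
};

// One test per behaviour; together they exercise every branch of the class.
void testStartsAtZero() {
    assert(RingCounter(3).value() == 0);
}

void testIncrementsBelowLimit() {
    RingCounter counter(3);
    counter.increment();
    assert(counter.value() == 1);
}

void testWrapsAtLimit() {
    RingCounter counter(2);
    counter.increment();
    counter.increment();
    assert(counter.value() == 0);
}

void runRingCounterTests() {
    testStartsAtZero();
    testIncrementsBelowLimit();
    testWrapsAtLimit();
}
```

Leaf classes this small make the coverage target cheap to hit, which is part of the argument for starting at the bottom of the graph.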

move up one level in the object dependency graph and start over again

Note that this could be done with C (or any other language) as well, replacing class with struct and file, etc.

phew. I'm going to do some more in-depth applications of these practices on open source and try to turn them into some small case studies for potential magazine articles and/or talks.


YaK:: WebLog #535 Topic : 2008-03-06 06.47.15 matt : a standard set of refactoring steps for C and C++
