Thursday, August 12, 2010

How to do proper Code Maintenance

It's safe to say that most developers prefer greenfield development. Wikipedia defines greenfield as "a project that lacks any constraints imposed by prior work." It gives us an opportunity to utilize all of the best practices we strive for: unit testing, code reviews, loose coupling, mockable design and the like.

But in the real world, we often have to maintain existing code. That maintenance includes not only bug fixes, but adding new features too. The older the codebase, the more likely it is that there will be other applications that rely on specific functionality -- especially in the case of class libraries. It may be tempting to jump into older code and start ripping apart the "bad stuff," but we must be careful not to break existing functionality. In this article, I'll review some techniques that will allow you to move old code forward, with little to no impact on existing codebases that rely on the old code. I'll also show you how you can improve the old code to make it more testable.

But in the real world, we often have to maintain existing code. That maintenance includes not only bug fixes, but adding new features too. The older the codebase, the more likely it is that there will be other applications that rely on specific functionality -- especially in the case of class libraries. It may be tempting to jump into older code and start ripping apart the "bad stuff," but we must be careful not to break existing functionality. In this article, I'll review some techniques that will allow you to move old code forward, with little to no impact on existing codebases that rely on the old code. I'll also show you how you can improve the old code to make it more testable.

When you create a library for other developers or even just yourself, you're defining an API. The interface to this library is all of the methods and signatures you've defined. Changing these may allow you, as the library developer, to clean up some things, but it's going to add pain points to all of the consumers of your class. A simple change to a method signature could result in a much better library design with better testability, but could cause 10 or 15 other applications to fail to compile until they're updated to support the new signature. If you have a public API that hundreds or even thousands of developers may be using, you've put them in a tough spot. Not only do their applications fail to compile with the new library, they're probably considering the future of your library and whether they want to continue using it after the pain you've caused them.

Maintaining backward compatibility can be a challenge sometimes, but it leads to stability in the way other developers and other code interact with your library.

Overloading
The easiest way to modify existing code with zero impact is to utilize method overloading. Overloading allows you to add additional parameters to an existing call, but you can maintain the old signatures so as not to break existing clients.

Here's a totally contrived example of some utility that grabs the first 10 files in some directory:

public string[] ReadFiles(string directory)
{
return Directory.GetFiles(directory).Take(10).
ToArray();
}

At some point in the future, you may want to use this function, but you need 15 or maybe even 20 files read in. The quick solution is to simply require the number of files to be passed in as a parameter:

public string[] ReadFiles(string directory, int maxFiles)
{
return Directory.GetFiles(directory).
Take(maxFiles).ToArray();
}


But now any existing code compiled against this new library will break. We need to include an overload that will keep the old code working the same way (returning 10 files):

public string[] ReadFiles(string directory)
{
return ReadFiles(directory, 10);
}


Using overloading is probably the most common way that you can add or enhance functionality while still maintaining backward compatibility.

Remove UI Dependencies
One of the key things I do before modifying any existing code is to make sure there's a unit test for it. This ensures that I don't break the expected functionality of the code when I make my change. If you're like me, you usually find that older code has limited or no unit tests. You should always take advantage of maintenance time to create some unit tests for older code.

From time to time, I've come across library code that makes use of a UI element. It may just be a simple Message Box call or even popping up a Save File dialog, but those elements make writing automated unit tests quite difficult -- if not impossible. When I run into these situations, I like to use a combination of interfaces and method overloading to remove the UI dependency. This makes unit testing much easier and the code becomes more flexible.

Here's an example: Let's say you have a utility class that has a method to close the file you were writing to. However, the file is only opened and written to on an as-needed basis. For this reason, the code that executes to close the file first checks a dirty flag and will only prompt the user to save the file if the dirty flag is true:

public void CloseFile()
{
bool saveFile = true;
if( IsDirty )
{
saveFile = MessageBox.Show("Save Changes?", "Save",
MessageBoxButtons.YesNo) == DialogResult.Yes;
}

if( saveFile )
{
// code to save the file
}
}

The MessageBox in the middle of this code makes it impossible to test this in an automated fashion, using traditional tools like NUnit or MSTest, for instance. You must eliminate the dependency on the MessageBox. First, define an interface to handle the ISaveFilePrompt:

public interface ISaveFilePrompt
{
bool ShouldSaveFile();
}

Next, change the CloseFile method to use this interface instead of the MessageBox:

public void CloseFile(ISaveFilePrompt saveFilePrompt)
{
bool saveFile = true;
if (IsDirty)
{
saveFile = saveFilePrompt.ShouldSaveFile();
}

if (saveFile)
{
// code to save the file
}
}

You now have a method that can easily be unit tested in an automated setting. You can mock out the ISaveFilePrompt object to return whatever you want for the ShouldSaveFile method. But in the process, you've broken existing code that relied on this code to display a MessageBox. Because the point of this article is to try to avoid breaking existing code, you still have some work to do. To complete this refactoring, you need to create an implementation of ISaveFilePrompt that uses a MessageBox:

public class MessageBoxPrompt : ISaveFilePrompt
{
public bool ShouldSaveFile()
{
return MessageBox.Show("Save Changes?", "Save",
MessageBoxButtons.YesNo) == DialogResult.Yes;
}
}

Finally, we'll create a parameter-less version of CloseFile that simply defers execution to our new version and uses the MessageBoxPrompt:

public void CloseFile()
{
CloseFile(new MessageBoxPrompt());
}

We've refactored this code so it can now be tested in an automated fashion while maintaining full backward compatibility and functionality with existing clients. We also have the added benefit of having a CloseFile method that can now be used in a batch mode or other UI-less environment (like a Windows Service).

Eliminate Static Methods
When it comes to unit testing and mocking, static methods pose quite a challenge. A lot of mocking frameworks (like Rhino Mocks and NMock) rely on intercepting method calls via dynamic proxies. This works great for interfaces and virtual class methods, but hits a brick wall with static methods -- in other words, static methods can't be mocked with mocking frameworks that rely on dynamic proxies. While there are some frameworks that use the .NET Profiler API (like Typemock) and can therefore intercept calls anywhere (even static methods), it is generally considered code smell to create static methods that do anything but simple, procedural code.

But there's a lot of code out there that utilizes static methods. Heck, I've even written a lot of it. Maintenance time is the perfect time to clean up those static methods!

I've most often used static methods for configuration files. I'd create a small class with a few read/write properties that would be used for saving and loading user-defined application settings. I'd often add a static "Load" method where I would provide the filename to load up the class from disk:

[Serializable]
public class MyAppSettings
{
public int BatchSize { get; set; }
public string JobName { get; set; }

public static MyAppSettings Load(string filename)
{
// deserialize object from the file "filename"
}
}


And somewhere in my main code when I needed to load the settings, I'd simply call the Load method:

MyAppSettings settings = MyAppSettings.Load("settings.xml");The problem with this approach is that it relies on a physical file on disk -- that's strong coupling. And because this is a static method, I can't mock out the call to MyAppSettings.Load using my favorite mocking tool (Rhino Mocks).

When it comes time to write a unit test for the following code, I hit a road block:

public void ProcessBatch()
{
MyAppSettings settings = MyAppSettings.
Load("settings.xml");

for(int batchNumber = 0 ; batchNumber < settings.
BatchSize ; batchNumber++)
{
// process batch
}
}Just as was discussed in the earlier section on removing UI dependencies, you'll use an interface to handle the loading of the application settings:

public interface IAppSettingsLoader
{
MyAppSettings LoadSettings();
}

Now update the method to use the IAppSettingsLoader:

public void ProcessBatch(IAppSettingsLoader
appSettingsLoader)
{
MyAppSettings settings = appSettingsLoader
.LoadSettings();

for(int batchNumber = 0 ; batchNumber < settings.
BatchSize ; batchNumber++)
{
// process batch
}
}

And create an IAppSettingsLoader that loads the settings from disk so that calls to "ProcessBatch()" by existing clients will continue to work:

public class FileSettingsLoader : IAppSettingsLoader
{
private readonly string filename;

public FileSettingsLoader(string filename)
{
this.filename = filename;
}

public MyAppSettings LoadSettings()
{
// deserialize object from the file "filename"
}
}

And the ProcessBatch overload that matches the existing signature is updated to use this new class:

public void ProcessBatch()
{
ProcessBatch(new FileSettingsLoader("settings.xml"));
}

The last thing you want to do is discourage new code from using the static Load method.

The Microsoft .NET Framework has a built-in attribute called "Obsolete." The obsolete attribute can be applied to just about anything -- classes, methods, enums, interfaces, delegates and so on. In this case, we'll apply the obsolete atribute to the static Load method:

[Obsolete("Use FileSettingsLoader to load MyAppSettings from disk")]
public static MyAppSettings Load(string filename)
{
// deserialize object from the file "filename"
}

Now, any code that uses the static Load method will get a warning at compile time that it should instead be using the FileSettingsLoader. You can keep this attribute as is for a couple of versions, and then add another parameter to the attribute that will actually treat this as an error and prevent client code from compiling:

[Obsolete("Use FileSettingsLoader to load MyAppSettings from disk", true)]
public static MyAppSettings Load(string filename)
{
// deserialize object from the file "filename"
}

The Obsolete attribute is a nice way to give consumers a "heads up" that a method shouldn't be used anymore. Once you get to the point where you want to force people to stop using a method (or class), set the "error" parameter to true and they will be forced to change!

Support Old Data Formats
At times, it's not code you must change, but the data your application produces. You should always make an effort to support old data formats as long as it's feasible. The easiest way to accommodate legacy data formats is to provide a data convertor that will read data in the old format and convert it to the new format. I've done this several times with XML serialization. Imagine the following class:

The Obsolete attribute is a nice way to give consumers a "heads up" that a method shouldn't be used anymore. Once you get to the point where you want to force people to stop using a method (or class), set the "error" parameter to true and they will be forced to change!

Support Old Data Formats
At times, it's not code you must change, but the data your application produces. You should always make an effort to support old data formats as long as it's feasible. The easiest way to accommodate legacy data formats is to provide a data convertor that will read data in the old format and convert it to the new format. I've done this several times with XML serialization. Imagine the following class:



public class ConfigData
{
public string DirectoryName { get; set; }
public int FileCount { get; set; }
public string[] IgnoredExtensions { get; set; }
}

Here's an instance of this ConfigData class when serialized as XML:


C:\Data
20

.exe
.bat



Suppose you need to change the DirectoryName property to an array of strings instead of a single string? This is a breaking change from a data format standpoint. Here's the new class definition:

public class ConfigData
{
public string[] DirectoryNames { get; set; }
public int FileCount { get; set; }
public string[] IgnoredExtensions { get; set; }
}An XMLSerializer for the new ConfigData type would not be able to deserialize the old data into the new structure. However, by writing a simple data convertor that will automatically convert the old format into the new class structure, you can continue to support the old data:

public class LegacyConfigDataReader
{
public ConfigData ReadFile(string filename)
{
XDocument document;
using (var fs = new FileStream(filename, FileMode.
Open, FileAccess.Read))
{
using (var sr = new StreamReader(fs))
{
document = XDocument.Load(sr);
}
}

return new ConfigData
{
DirectoryNames = new[]
{document.Descendants("DirectoryName").
ElementAt(0).Value},
FileCount = Int32.Parse(
document.Descendants("FileCount")
.ElementAt(0).Value),
IgnoredExtensions =
document.Descendants("IgnoredExtensions").
ElementAt(0).Descendants()
.Select(v => v.Value).ToArray(),
};
}
}

It is assumed that once the new data is serialized to disk, it will have a new file extension or some other way for the application to know the format has been updated. Whenever the application needs to read the old format, the LegacyConfigDataReader can be used. This is an example where legacy code may have to be changed depending on how you implemented your original deserialization code.

And, obviously, this technique will only work with a data format you can easily parse (like XML or some other text-based format) or a custom binary format that you've created yourself. If you had used .NET's built-in binary serialization, it would be just about impossible to support older class layouts.

In this article, we've seen a few methods that allow you to modify existing code with little or no impact on existing code. At the same time, we've also enhanced some of that code to make it more testable.

Maintenance time shouldn't be viewed as a walk through a minefield, but instead as an opportunity to solidify and enhance your existing code.

[original source: http://visualstudiomagazine.com/Articles/2010/05/01/Make-Good-Use-of-Code-Maintenance.aspx?Page=3]

No comments: