Summary
This article describes a method for software rewrites when the original source code is unavailable, unstable, of questionable quality or otherwise the product of a beautiful mind.
Background
One part of what I do is software salvage.
I’ve had the joy of bringing life back to abandoned codebases, extending them with new features, modernizing services and translating software from obsolete programming languages, operating systems and frameworks.
What has led to the need for this?
In the past couple of decades, many companies outsourced their application development to detached entities, either within walking distance or anywhere across the globe. Instead of signing a maintenance or SAP contract, they got software custom-designed and built for them at reduced cost.
Results also varied. Budgets ballooned, feelings were hurt, final deliverables were received together with the final invoice, and many bitter promises were made. Sometimes that also meant cutting off communications with previous developers.
Why not start from scratch?
Starting from scratch might be an acceptable solution. However, in many cases this has led to even more work and a less than desirable outcome.
Secret sauce
Some companies’ secret sauce is in their proprietary software stack. It may have started as a bright idea that worked on paper, then got converted into automation when the business was computerized. Both the person with the idea and the people who wrote the software have retired into adjacent villas in the Bahamas.
Hidden logic
Even when a process is straightforward, there may be hidden nuances in the logic or automation that get missed when rewriting from scratch, only to be discovered months later when someone exports a yearly report.
Training burden
This only applies if the system to be replaced is user-facing, but there’s a significant cost involved in retraining users on new software. This is sometimes met with almost Stockholm-syndrome-level hostility.
Writing’s easy, reading is hard
This is perhaps the biggest reason developers opt to rewrite software from scratch: it’s simply easier to write new software than to understand code someone else wrote. This applies even when the original developer has followed good practices, documented their code where necessary, and sat down with the new developer for a day to explain the old codebase.
The process
Since software salvage isn’t exactly a well-defined field in the industry, this is a rough outline of how I approach it. If you are working in the same field and are willing to share your thoughts and feelings, please write them in the comments below. You’ve probably noticed too that it’s not easy to find literature on the subject.
Define
The salvage process starts with first defining the big-picture functionality, then iterating to finer detail until everything’s covered. But how is the existing functionality established?
1. Documentation
Having sufficient documentation for the original application and its features covers a lot of ground. This can be any documentation, whether it’s written for the development team (the original requirements document) or for the end user (user manual, online help, etc.).
Reading through a user manual will help you get into the mindset of an existing user, to see what they’re expecting to happen.
2. Running the existing application
If you can find a way to make the old application run, even if it takes some effort and asking around for obsolete hardware, this is often the most transparent way to familiarize yourself with how the application should function. Also, DOSBox, VirtualBox, UTM, and old iOS and Android emulators are your friends. It’s often even possible to get them connected to the Internet.
3. Tutorials and promotional material
If there are recorded video tutorials to see the system in action or other training material, they can provide a good reference on the functionality.
4. People
Even if the original developer can’t be reached, it may be best to just ask the current owner of the software about any details you may have missed. They might even point you to a trusted user. The company may have commissioned the software without being the one using it every day. In such cases a user may be in a better position to answer your questions.
Why aren’t people the first step? So that you cover as much ground as possible before taking up the client’s precious time. They may also have a different perception of how things work.
5. Disassembly
This is not always an option, since some software EULAs specifically prohibit reverse engineering, but such prohibitions are more often the exception than the rule.
Many C/C++/ObjC programs can be easily disassembled with software such as Hopper, IDA or Capstone. The achieved level of readability depends heavily on how aggressively the original compile flags were set to optimize the output code.
This method is especially useful when trying to observe some nuance in operation or mapping out a proprietary binary file format used by the application.
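When mapping out a binary file format, it can help to write down each guess as a small parser and check it against real files. The sketch below assumes a purely hypothetical layout (a 4-byte magic `SALV`, a little-endian 16-bit version and a 32-bit record count); none of these fields come from a real format, and the hex dump helper is just one way to eyeball the regions you haven’t decoded yet.

```python
import struct

# Hypothetical header layout, for illustration only:
#   4 bytes  magic         (b"SALV")
#   uint16   version       (little-endian)
#   uint32   record_count  (little-endian)
HEADER = struct.Struct("<4sHI")


def parse_header(data: bytes) -> dict:
    """Decode the assumed header and fail loudly if the guess is wrong."""
    if len(data) < HEADER.size:
        raise ValueError("file is shorter than the assumed header")
    magic, version, record_count = HEADER.unpack_from(data, 0)
    if magic != b"SALV":
        raise ValueError(f"unexpected magic bytes: {magic!r}")
    return {"version": version, "record_count": record_count}


def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset / hex / ASCII columns for inspecting unknown regions."""
    pad = width * 3
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}} {text_part}")
    return "\n".join(lines)


# Build a sample file in memory so the sketch runs without any real data.
sample = HEADER.pack(b"SALV", 3, 120) + b"payload bytes follow"
header = parse_header(sample)
print(header)
print(hex_dump(sample))
```

Each time a guess about the layout turns out to be wrong, the parser raises instead of silently producing nonsense, which keeps the map of the format honest.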
Map and revise
For each part of the codebase, start by understanding and defining the existing control flow. Use a tool to build a call dependency graph. Map out object inheritance and data model relationships.
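As one illustration of a call dependency graph, the sketch below uses Python’s standard `ast` module to record which top-level functions call which names. It assumes the legacy code happens to be Python and only follows direct calls by plain name; other languages need their own tooling, and the sample source is made up for the example.

```python
import ast
from collections import defaultdict


def build_call_graph(source: str) -> dict:
    """Map each top-level function name to the sorted plain names it calls directly."""
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            graph[node.name]  # register the function even if it calls nothing
            for child in ast.walk(node):
                # Only direct calls such as foo(); method calls like obj.foo() are skipped.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    graph[node.name].add(child.func.id)
    return {name: sorted(callees) for name, callees in graph.items()}


# Made-up legacy module, used only to exercise the sketch.
legacy_source = '''
def load(path):
    return parse(read(path))

def read(path):
    return open(path).read()

def parse(text):
    return text.split(",")
'''

graph = build_call_graph(legacy_source)
for caller, callees in graph.items():
    print(caller, "->", callees)
```

Even a crude graph like this shows which functions are entry points and which are leaves, which is a reasonable order in which to read them.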
Black boxes
Turn any dubious blocks of code into intentional black boxes. Try to define what should go in and what should come out.
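One way to pin down a black box is to record what the old code returns for a set of inputs and treat those pairs as the contract for the rewrite. In this sketch `legacy_discount` and `new_discount` are hypothetical stand-ins, not code from any real system; the point is only the record-then-compare pattern.

```python
def legacy_discount(quantity: int) -> float:
    """Stand-in for an opaque legacy routine whose internals we choose not to trust."""
    if quantity > 100:
        return 0.15
    if quantity >= 10:
        return 0.05
    return 0.0


def record_behaviour(func, inputs):
    """Capture (input, output) pairs from the old code to serve as a reference."""
    return [(value, func(value)) for value in inputs]


def find_mismatches(func, recorded):
    """Return (input, expected, actual) wherever func disagrees with the recording."""
    mismatches = []
    for value, expected in recorded:
        actual = func(value)
        if actual != expected:
            mismatches.append((value, expected, actual))
    return mismatches


def new_discount(quantity: int) -> float:
    """Hypothetical rewrite that gets the upper boundary wrong."""
    if quantity >= 100:
        return 0.15
    return 0.05 if quantity >= 10 else 0.0


# Probe the boundaries, since that is where hidden logic tends to live.
reference = record_behaviour(legacy_discount, [0, 9, 10, 99, 100, 101])
mismatches = find_mismatches(new_discount, reference)
print(mismatches)
```

The recording says nothing about why the old code behaves the way it does, only what goes in and what comes out, which is exactly what a black box needs.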
Coverage
You can apply the concept of code coverage to a new implementation of old code – a printout and highlighter pen are your friends. Print out the old codebase (try to stay within readable font sizes) and highlight the left edge of any blocks you feel you’ve implemented. This way you can easily go back and figure out any one-off small things that might otherwise get lost in translation.
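If paper isn’t practical, the same bookkeeping can be kept in a few lines of code. This is only a sketch of the idea: the line ranges below are invented, and `uncovered_ranges` is a hypothetical helper that reports which parts of an old file have not been marked as reimplemented yet.

```python
def uncovered_ranges(total_lines: int, covered: list) -> list:
    """Return inclusive 1-based line ranges of the old file not yet marked as done."""
    gaps = []
    next_line = 1  # first line not yet accounted for
    for start, end in sorted(covered):
        if start > next_line:
            gaps.append((next_line, start - 1))
        next_line = max(next_line, end + 1)
    if next_line <= total_lines:
        gaps.append((next_line, total_lines))
    return gaps


# Invented example: a 250-line legacy file with three "highlighted" blocks,
# two of which overlap.
highlighted = [(1, 40), (35, 80), (120, 200)]
remaining = uncovered_ranges(250, highlighted)
print(remaining)
```

The gaps are the places to go back and look for one-off details before calling the rewrite complete.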
Overcompensate
When you decide to completely redo any part of software, make sure you follow good practices, document the method and make any future efforts to improve the codebase easier. Otherwise the next person may find themselves in the same spot you’re in.
Compare
Whether you did a partial or complete rewrite, map out the new implementation and compare it to the old one. This may help you visualize key differences, including functionality that still needs to be added back in but wasn’t obvious in the original.
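If both maps are available as data, the comparison can be partly mechanical. The sketch below assumes each implementation has been summarized as a dictionary from function name to the names it calls; the maps and all function names in them are hypothetical.

```python
def compare_maps(old: dict, new: dict) -> dict:
    """Summarize which functions disappeared, appeared, or changed their callees."""
    old_names, new_names = set(old), set(new)
    changed = sorted(
        name for name in old_names & new_names
        if set(old[name]) != set(new[name])
    )
    return {
        "missing_in_new": sorted(old_names - new_names),
        "only_in_new": sorted(new_names - old_names),
        "changed_callees": changed,
    }


# Hypothetical maps of the old and new implementations.
old_map = {
    "export_report": ["load_orders", "apply_rounding"],
    "load_orders": [],
    "apply_rounding": [],
}
new_map = {
    "export_report": ["load_orders"],
    "load_orders": [],
    "to_csv": [],
}

diff = compare_maps(old_map, new_map)
print(diff)
```

A name that shows up under `missing_in_new` is not necessarily a bug, since the rewrite may have folded it into something else, but each one deserves a deliberate decision.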
Iterate
As with any software development, reserve time to iterate on the end result with your client. Include this in your project estimate. Overconfidence isn’t professional, especially when writing software. Everyone makes mistakes and many of them aren’t caught by the compiler.