There are many languages, tools, and design methodologies in the software community that are aimed at the creation of new software. But a lot of valuable software is the product of evolution, reuse, and reengineering. Some software is too expensive to “throw away and start over”. A skilled software team will have an arsenal of techniques at their disposal for adapting, evolving, and refactoring existing code and designs.
The workshop participants discussed a wide range of technologies and experiences. The most important practices and techniques were distilled from the participants’ discussion and divided into four major categories:
Reengineer using a small team, where everyone knows the roles of
Any collaborative work, especially reengineering and evolution work, is easier when the team works smoothly together and each team member knows whom to ask about questions of both the domain and the technology.
Use a small, smart team for exploration; have a ready and willing team for development; have supportive management
This has been a common formula for successful reengineering work.
The team should have some expertise in the legacy system
Although it isn’t absolutely necessary for everyone on the team to have expertise in the legacy system, it is extremely useful to have even one part-time team member who was involved in the design decisions for the existing software. This person can resolve a lot of questions and help preserve the best parts of the legacy design.
Focus the smart people on critical areas
The assignment of staff to different areas of a large reengineering project should not be arbitrary. The best results are achieved when the sharpest team members take charge of the critical parts of the system.
Understand and leverage people’s area(s) of expertise
Some ongoing investment is often needed to grow that expertise.
Management should know what motivates your people, and stay focused on providing that motivation
There are many pitfalls here. If a company or organization gives the greatest financial rewards and the highest prestige to the developers working on new green-field projects, the “maintenance developers” will feel very unmotivated to do a good job.
Make the business case obvious
Whatever you decide to do in reengineering and evolution, you will get good management support if you can help managers understand the business value of the work.
Ask if there is an alternative
Sometimes there are alternatives that need to be considered. This should be part of your business process.
Measure business value of the legacy system
Measuring business value prevents unnecessary work when the legacy system no longer has adequate business value. If the legacy system has limited business value, that may constrain your choices in how to do the reengineering.
Make sure the team has the expertise needed
Another story from the workshop: A large project (Boeing 777) decided on using the Ada language, but they had one computer in their architecture that didn’t have an Ada compiler. They set up a small team (with inadequate compiler experience) to develop an Ada compiler as a “side project”. The schedule languished, no real progress was made, and finally they had to bring in a team of outside experts to work on the Ada compiler in order to meet the schedules.
Retrospectives are sometimes called “post-mortems”. They are quite popular among the agile development and XP community. The best source of information is the book Project Retrospectives by Norm Kerth. Retrospectives often bring up serious issues that concern the usability of the system and its flexibility.
Incremental application of technologies
Don’t try to apply too many technologies at once in a reengineering effort.
Correctly estimate a team’s abilities
Don’t assign a novice team to do work that is beyond them.
Create and maintain accurate models
This might use techniques such as “round trip engineering” if the tools support it.
Use source code control for different lifecycle needs
Good source code control is especially important if development and testing are overlapping. You might want to continue making changes to the software while an earlier cycle is being tested.
Methodology support for maintenance and enhancement
Understand and document the conventions and patterns in the development process
Perform legacy data migration planning
In addition to transforming code, some significant changes may need to occur in existing data files and databases. It is good to start planning this early, especially if the legacy system has a lot of “crufty” data that needs to be restructured in the reengineering process.
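One way to start this planning early is a dry run that tries to convert every legacy record and reports the ones that cannot be converted cleanly. The sketch below is only an illustration under stated assumptions: the field names (`CUSTNAME`, `OPENDT`), the date formats, and the sample rows are hypothetical and do not come from any system discussed at the workshop.

```python
from datetime import datetime

# Hypothetical date formats found in the legacy data.
LEGACY_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def parse_legacy_date(text):
    """Try each known legacy format in turn; fail loudly on anything else."""
    for fmt in LEGACY_DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def migrate_record(row):
    """Convert one legacy row to the (hypothetical) new schema."""
    name = row["CUSTNAME"].strip()
    if not name:
        raise ValueError("empty customer name")
    return {
        "customer_name": name,
        "opened_on": parse_legacy_date(row["OPENDT"]).isoformat(),
    }


def dry_run(rows):
    """Return (migrated, rejected) without touching any real data store."""
    migrated, rejected = [], []
    for index, row in enumerate(rows):
        try:
            migrated.append(migrate_record(row))
        except (KeyError, ValueError) as exc:
            rejected.append((index, str(exc)))
    return migrated, rejected


# Hypothetical sample of "crufty" legacy rows.
legacy_rows = [
    {"CUSTNAME": "  Acme Corp ", "OPENDT": "1999-03-04"},
    {"CUSTNAME": "Widget Co", "OPENDT": "03/04/1999"},
    {"CUSTNAME": "   ", "OPENDT": "19990304"},
    {"CUSTNAME": "Gadget Inc", "OPENDT": "sometime in 1998"},
]

migrated, rejected = dry_run(legacy_rows)
```

The rejected list gives the team an early estimate of how much manual cleanup the data will need before cutover.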
This is a practice that is underused by many software professionals and underrated by most managers. Code reading skills are critical to effective software evolution – it is really important to be able to capture and maintain the basic design intent of existing software during the evolution process.
They created a set of transformations using the tool, and then did careful testing on snapshots of the Smalltalk application’s image to make sure that the refactorings were correct. After they were sure things would work, they were able to apply the complete large refactoring in a single weekend afternoon.
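The general pattern in this story (rehearse a set of transformations on a snapshot, check the result, and only then apply them to the real system) can be sketched as follows. This is not the Smalltalk tooling the participants used; the "image" here is a plain list of dictionaries, and every function and field name is a hypothetical chosen for illustration.

```python
import copy


def rename_field(old, new):
    """Build a transformation that renames one field in every object."""
    def transform(objects):
        for obj in objects:
            if old in obj:
                obj[new] = obj.pop(old)
    return transform


def rehearse_then_apply(live, transformations, checks):
    """Apply transformations to a snapshot first; touch `live` only if all checks pass.

    Returns the names of the failed checks (an empty list means success).
    """
    snapshot = copy.deepcopy(live)
    for transform in transformations:
        transform(snapshot)
    failures = [check.__name__ for check in checks if not check(snapshot)]
    if failures:
        return failures  # the live data has not been modified
    for transform in transformations:
        transform(live)
    return []


def no_old_names(objects):
    return all("cust_nm" not in obj for obj in objects)


def every_object_named(objects):
    return all("customer_name" in obj for obj in objects)


live_image = [{"cust_nm": "Acme"}, {"cust_nm": "Widget"}]
failures = rehearse_then_apply(
    live_image,
    [rename_field("cust_nm", "customer_name")],
    [no_old_names, every_object_named],
)
```

Because the checks run against a deep copy, a failed rehearsal leaves the live data exactly as it was.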
Developers can use models as part of the thinking process about how to reengineer their code. It is a good way to keep organized – using both a use-case model to give an overall description of the system behavior that should stay unchanged, and a class model to lay out the details of the transformation.
Code analysis tool with a trained user
Several workshop participants reported significant success in using code analysis tools as part of the reengineering process. They can save a lot of work. But one cautionary note: make sure that there is at least one trained user of the tool on the team.
Languages/architectures for extreme change
Richard Gabriel told the story of the original Yahoo Store software, which was an application written in Common Lisp. It provided two capabilities. First, it could generate and display an on-the-web questionnaire for a potential on-line store owner to fill out to define their store (products, prices, pictures, charging options, and so on). Second, it was a multi-threaded web server that actually served the requests from customers.
The architecture was definitely designed to support change. Every once in a while, a store would crash because of a bug in the software. But this would only crash the thread, not all of the other stores. And the developer could patch and reinstall the code without taking the system down.
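The two properties described here, a failure confined to one store's thread and a fix installed while the system keeps running, can be illustrated with a small sketch. It is written in Python rather than Common Lisp and is in no way the Yahoo Store implementation; the store names and handler functions are hypothetical.

```python
import threading


def serve_store(store_name, handler, results):
    """Run one store's handler in its own thread and record the outcome."""
    try:
        results[store_name] = ("ok", handler(store_name))
    except Exception as exc:
        # A bug in one store ends only this thread's work.
        results[store_name] = ("crashed", repr(exc))


def healthy_handler(store_name):
    return f"served {store_name}"


def buggy_handler(store_name):
    raise ValueError(f"bad price table in {store_name}")


def run_stores(handlers):
    """Serve every store once, one thread per store."""
    results = {}
    threads = [
        threading.Thread(target=serve_store, args=(name, handler, results))
        for name, handler in handlers.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Hypothetical stores; "toys" has a bug.
handlers = {"books": healthy_handler, "toys": buggy_handler, "music": healthy_handler}
before_patch = run_stores(handlers)

# "Patch and reinstall" without restarting the process: swap in a fixed handler.
handlers["toys"] = healthy_handler
after_patch = run_stores(handlers)
```

In this sketch the handler table plays the role of the running code: replacing one entry changes the behavior of the next request while the process stays up.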
When a developer works on the evolution of a piece of legacy code, any information about the design intent is useful. The developer will normally try to make changes that preserve the original design intent, because such changes are less likely to introduce new bugs through unforeseen interactions. Design history might be obtained by talking with people, reading documents, and extracting information from the code and the change management system.
Tools are useful for automating the process of transforming the legacy data at the point that the newly-reengineered system is ready to cut over.
The difference between an economic choice and an emotional choice is perfectly valid for animals. But emotional attachment to a piece of legacy software can be quite irrational.
How long do you keep milking the dying cow?
The question is: what are the economics of continuing to evolve a software system that is no longer economical to maintain and transform? At some point, management needs to make a decision without being clouded by emotion.
For this reason, the sections below give different lists of issues and strategies for open source development. These sections use the same basic titles as in section 1.
Open source team process
Developing open source software on a legacy base can, in general, use a larger team than conventional development. There are still important rules about team structure that an open team must follow.
Expertise in legacy system in core architecture team
Expertise in the legacy system is still needed, just like in conventional development.
Focus of experts on critical areas
It is still necessary to have experts to call on for the most critical parts of the software. This is actually an easier thing to manage in open source development – because the experts tend to be drawn to the critical areas where they have their specialized knowledge and skill.
Emergent understanding of people’s expertise
The expertise of team members in an open source project is not always known early in development, but the team’s knowledge of the skills of its members will increase over time. If someone is a “loser” it will be apparent relatively quickly.
Motivation (a wide spectrum of motivational drivers)
There are just as many different motivators for open source development as for conventional development – including monetary motivations.
Open source projects don’t do a lot of conventional modeling. Most of their “design model” is actually in the source code – plus the comments, README files, HOWTOs, and the O’Reilly books.
Conventions (design conventions, coding rules, and so on) are just as important in open source development as in conventional development.
Source code control
Just about every open source project uses source code control as a central part of the development environment – usually CVS.
Open source project participants do a lot of code reading – probably as much code reading as the rest of the development world should do.
Other sources of design history include code review results and the change history information in block comments at the beginning of each module.
Low-tech power tools
Open source project members generally don’t use fancy software development tools. Most software is written either in C or in newer-generation interpreted scripting languages, in order to maximize the software’s portability. The developers rarely use an integrated development environment (IDE) – most of them use text editors like emacs for writing code.
Black-box (software modules that a team can use, but they can’t see the internal structure)
Domain knowledge (techniques for capturing and evolving)
Flex point (a “planned variation”) and Levers (one way to change things at a flex point – a flex point may have many different levers)
Reverse engineering
Model
Readability of code (code reading skills)
Component (something that can be plug-and-play)
Living requirements (requirements that are updated and adapted throughout the software’s lifecycle)
Savant (a person who is an intuitive problem solver – this is also the original meaning of the word “hacker”)
Separation of concerns
Tribal knowledge (and the tribe’s blind spots)
Working system (very important in agile development)
Literate programming (techniques to write down design decisions – also see Donald Knuth’s book of the same title)
Methodology and tool support
Managing customer expectations (helping them understand which changes are easy and which are hard)
Domain architecture
Continuous integration
Nuancing
Missing documentation (underlying knowledge)
Rewriting / transforming
Continuous redesign
Abundance (a characteristic of most open source projects)
Matching (business models to specs)
Legacy data
Discovery
Requirements languages
Teams (especially in the context of refactoring)
Resistance to change
Organizational requirements
“Implementors rule” (the concept that whatever the architecture documents and design models say, the architecture and design information in the code is the most important thing for future evolution)
Differences in languages
Anticipating changes in technology
Liability
Pattern mining
Conventions / style riffs