There are many languages, tools, and design methodologies in the software community that are aimed at the creation of new software. But a lot of valuable software is the product of evolution, reuse, and reengineering. Some software is too expensive to “throw away and start over”. A skilled software team will have an arsenal of techniques at their disposal for adapting, evolving, and refactoring existing code and designs.
The workshop participants discussed a wide range of technologies and experiences. The most important practices and techniques were distilled from the participants’ discussion and divided into four major categories:
Reengineer using a small team, where everyone knows the roles of the others
Any collaborative work, especially reengineering and evolution work,
is made easier by a team that works smoothly together, in which each
member knows whom to ask about questions of both the domain and the
technology.
Use a small, smart team for exploration; have a ready and willing team
for development; and have supportive management
This has been a common formula for successful reengineering work.
The team should have some expertise in the legacy system
Although it isn’t absolutely necessary for everyone on the team to
have expertise in the legacy system, it is extremely useful to have even
one part-time team member who was involved in the design decisions for
the existing software. This person can resolve a lot of questions
and help preserve the best parts of the legacy design.
Focus the smart people on critical areas
The assignment of staff to different areas of a large reengineering
project should not be arbitrary. The best results are achieved when
the sharpest team members take charge of the critical parts of the system.
Understand and leverage people’s area(s) of expertise
Some ongoing investment is often needed to grow that expertise.
Management should know what motivates the team, and stay focused
on providing that motivation
There are many pitfalls here. If a company or organization gives
the greatest financial rewards and the highest prestige to the developers
working on new green-field projects, the “maintenance developers” will
have little motivation to do a good job.
Make the business case obvious
Whatever you decide to do in reengineering and evolution, you will
get good management support if you can help managers understand the business
value of the work.
Ask if there is an alternative
Sometimes there are alternatives to reengineering that need to be considered.
Asking this question should be part of your business process.
Measure business value of the legacy system
This prevents unnecessary work when the legacy system no
longer has adequate business value. If the legacy system has limited
business value, that may constrain your choices in how to do the reengineering.
Make sure the team has the expertise needed
Another story from the workshop: A large project (Boeing 777)
decided to use the Ada language, but one computer in their architecture
had no Ada compiler. They set up a small team (with
inadequate compiler experience) to develop an Ada compiler as a “side project”.
The schedule languished, no real progress was made, and finally they had
to bring in a team of outside experts to work on the Ada compiler in order
to meet the schedule.
Use retrospectives
Retrospectives are sometimes called “post-mortems”. They are
quite popular in the agile development and XP communities. The best
source of information is the book Project Retrospectives by Norm Kerth.
Retrospectives often bring up serious issues that concern the usability
of the system and its flexibility.
Incremental application of technologies
Don’t try to apply too many technologies at once in a reengineering
effort.
Correctly estimate a team’s abilities
Don’t assign a novice team to do work that is beyond them.
Create and maintain accurate models
This might use techniques such as “round trip engineering” if the tools
support it.
Use source code control for different lifecycle needs
Good source code control is especially important if development and
testing overlap. You might want to continue making changes
to the software while an earlier cycle is being tested.
Methodology support for maintenance and enhancement
Understand and document the conventions and patterns in the development process
Perform legacy data migration planning
In addition to transforming code, some significant changes may need
to occur in existing data files and databases. It is good to start
planning this early, especially if the legacy system has a lot of “crufty”
data that needs to be restructured in the reengineering process.
Code reading
This is a practice that is underused by many software professionals
and underrated by most managers. Code reading skills are critical
to effective software evolution – it is really important to be able to
capture and maintain the basic design intent of existing software during
the evolution process.
One team created a set of transformations using a refactoring tool, and then tested them carefully on snapshots of the Smalltalk application’s image to make sure that the refactorings were correct. Once they were sure things would work, they were able to apply the complete large refactoring in a single weekend afternoon.
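The idea in this story, checking a transformation against a snapshot before applying it for real, can be illustrated with a small behavior-preservation check. This is only a sketch in Python rather than Smalltalk; the functions `legacy_total` and `refactored_total` and the `SNAPSHOT` data are hypothetical stand-ins, not anything from the workshop.

```python
# Hypothetical example: confirm that a refactored function agrees with the
# legacy one on a recorded snapshot of inputs before switching over.

def legacy_total(items):
    """Legacy implementation: accumulates price * quantity in a loop."""
    total = 0
    for item in items:
        total = total + item["price"] * item["qty"]
    return total


def refactored_total(items):
    """Refactored implementation that is supposed to preserve behavior."""
    return sum(item["price"] * item["qty"] for item in items)


def find_mismatches(old, new, snapshot):
    """Return the snapshot inputs on which the two versions disagree."""
    return [case for case in snapshot if old(case) != new(case)]


# The "snapshot" here is just a list of recorded inputs (assumed data).
SNAPSHOT = [
    [],
    [{"price": 5, "qty": 2}],
    [{"price": 3, "qty": 1}, {"price": 7, "qty": 4}],
]

if __name__ == "__main__":
    mismatches = find_mismatches(legacy_total, refactored_total, SNAPSHOT)
    print("safe to apply" if not mismatches else f"mismatches: {mismatches}")
```

An empty mismatch list does not prove the refactoring correct; it only shows agreement on the recorded cases, which is why the quality of the snapshot matters.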
Modeling
Developers can use models as part of the thinking process about how
to reengineer their code. It is a good way to keep organized – using
both a use-case model to give an overall description of the system behavior
that should stay unchanged, and a class model to lay out the details of
the transformation.
Code analysis tool with a trained user
Several workshop participants reported significant success in using
code analysis tools as part of the reengineering process. They can
save a lot of work. But one cautionary note: make sure that
there is at least one trained user of the tool on the team.
Languages/architectures for extreme change
Richard Gabriel told the story of the original Yahoo Store software,
which was an application written in Common Lisp. It provided two
capabilities. First, it could generate and display an on-the-web
questionnaire for a potential on-line store owner to fill out to define
their store (products, prices, pictures, charging options, and so on).
Second, it was a multi-threaded web server that actually served the requests
from customers.
The architecture was definitely designed to support change. Every once in a while, a store would crash because of a bug in the software. But this would only crash the thread, not all of the other stores. And the developer could patch and reinstall the code without taking the system down.
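The isolation property described here, where a failure in one store stays inside its own thread, can be sketched as follows. This is an illustration in Python, not the Common Lisp design of the original system; the names `serve_store`, `run_stores`, `healthy_store`, and `buggy_store` are hypothetical, and the sketch does not attempt to show live patching.

```python
import threading


def serve_store(name, handler, results, lock):
    """Run one store's handler and contain any failure to this thread."""
    try:
        outcome = handler()
    except Exception as exc:  # a bug in one store must not reach the others
        outcome = f"crashed: {exc}"
    with lock:
        results[name] = outcome


def healthy_store():
    return "ok"


def buggy_store():
    raise ValueError("bad price table")


def run_stores(handlers):
    """Serve every store on its own thread and collect the outcomes."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=serve_store, args=(name, handler, results, lock))
        for name, handler in handlers.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_stores({"a": healthy_store, "b": buggy_store, "c": healthy_store}))
```

Stores `a` and `c` finish normally even though `b` raises, which is the behavior the story attributes to the per-thread design.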
Design history
When a developer works on the evolution of a piece of legacy code,
any information about the design intent is useful. The developer
will normally try to make changes that preserve the original design intent
(because such changes are less likely to introduce new bugs through unforeseen
interactions). Design history might be obtained by talking with people,
reading documents, and extracting information from the code and from a change
management system.
Decrufting data
Tools are useful for automating the process of transforming the legacy
data at the point that the newly-reengineered system is ready to cut over.
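A cleanup step of this kind might look like the sketch below. It is a minimal illustration under stated assumptions: the legacy data is taken to be CSV text with hypothetical `name` and `price` columns, and the cleaning rules (collapsing whitespace, stripping currency symbols, setting aside rows that cannot be repaired) are invented for the example.

```python
import csv
import io
from decimal import Decimal, InvalidOperation


def clean_record(row):
    """Normalize one legacy row; return None if it cannot be salvaged."""
    name = " ".join((row.get("name") or "").split())  # collapse stray whitespace
    raw_price = (row.get("price") or "").replace("$", "").replace(",", "").strip()
    if not name or not raw_price:
        return None
    try:
        # int() truncates any fraction of a cent left after scaling.
        price_cents = int(Decimal(raw_price) * 100)
    except (InvalidOperation, ValueError, OverflowError):
        return None
    return {"name": name, "price_cents": price_cents}


def migrate(legacy_csv_text):
    """Split legacy rows into cleaned records and rejects kept for review."""
    cleaned, rejected = [], []
    for row in csv.DictReader(io.StringIO(legacy_csv_text)):
        record = clean_record(row)
        if record is None:
            rejected.append(dict(row))
        else:
            cleaned.append(record)
    return cleaned, rejected


# Assumed sample of "crufty" legacy data.
LEGACY = 'name,price\n  Widget   A ,"$1,234.50"\nGadget,n/a\n,5.00\n'

if __name__ == "__main__":
    good, bad = migrate(LEGACY)
    print(good)
    print(bad)
```

Keeping the rejected rows instead of silently dropping them lets someone with domain knowledge decide what to do with them before the cutover.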
Choosing between an economic choice and an emotional choice is perfectly valid for animals. But emotional attachment to a piece of legacy software can be quite irrational.
How long do you keep milking the dying cow?
The question is: what are the economics of continuing to evolve a
software system that is no longer economical to maintain and transform?
At some point, management needs to make a decision without being
clouded by emotion.
For this reason, the sections below give different lists of issues and
strategies for open source development. These sections use the same
basic titles as in section 1.
Open source team process
Developing open source software on a legacy base can generally use
a larger team than conventional development. Even so, some
important principles about the structure of an open team must still
be followed.
Expertise in legacy system in core architecture team
Expertise in the legacy system is still needed, just like in conventional
development.
Focus of experts on critical areas
It is still necessary to have experts to call on for the most critical
parts of the software. This is actually an easier thing to manage
in open source development – because the experts tend to be drawn to the
critical areas where they have their specialized knowledge and skill.
Emergent understanding of people’s expertise
The expertise of team members in an open source project is not always
known early in development, but the team’s knowledge of the skills of its
members will increase over time. If someone is a “loser” it will
be apparent relatively quickly.
Motivation (a wide spectrum of motivational drivers)
There are just as many different motivators for open source development
as for conventional development – including monetary motivations.
Modeling
Open source projects don’t do a lot of conventional modeling.
Most of their “design model” is actually in the source code – plus the
comments, README files, HOWTOs, and the O’Reilly books.
Conventions
Conventions (design conventions, coding rules, and so on) are just
as important in open source development as in conventional development.
Source code control
Just about every open source project uses source code control as a
central part of the development environment – usually CVS.
Code reading
Open source project participants do a lot of code reading – probably
as much code reading as the rest of the development world should do.
Other sources of design history include code review results and the change history information in block comments at the beginning of each module.
Low-tech power tools
Open source project members generally don’t use fancy software development
tools. Most software is written either in C or in newer-generation
interpreted scripting languages, in order to maximize the software’s portability.
The developers rarely use an integrated development environment (IDE);
most of them use text editors like Emacs for writing code.
Black-box (software modules that a team can use, but they can’t see the internal structure)
Domain knowledge (techniques for capturing and evolving)
Flex point (a “planned variation”) and Levers (one way to change things at a flex point – a flex point may have many different levers)
Reverse engineering
Model
Readability of code (code reading skills)
Component (something that can be plug-and-play)
Living requirements (requirements that are updated and adapted throughout the software’s lifecycle)
Savant (a person who is an intuitive problem solver – this is also the original meaning of the word “hacker”)
Separation of concerns
Tribal knowledge (and the tribe’s blind spots)
Working system (very important in agile development)
Literate programming (techniques to write down design decisions – also see Donald Knuth’s book of the same title)
Methodology and tool support
Managing customer expectations (helping them understand which changes are easy and which are hard)
Domain architecture
Continuous integration
Nuancing
Missing documentation (underlying knowledge)
Rewriting / transforming
Continuous redesign
Abundance (a characteristic of most open source projects)
Matching (business models to specs)
Legacy data
Discovery
Requirements languages
Teams (especially in the context of refactoring)
Resistance to change
Organizational requirements
“Implementors rule” (the concept that whatever the architecture documents and design models say, the architecture and design information in the code is the most important thing for future evolution)
Differences in languages
Anticipating changes in technology
Liability
Pattern mining
Conventions / style riffs