Anbork.info presents
Pain-Reduction Techniques for Software Development
With an hour left before the deadline, tensions run high
amongst partners Jim and Matt on Coding Team Alpha. Working separately across
campus, they are communicating via gChat regarding a bug that neither of them
seems to be able to find. Jim suggests that the FOR loop may be going past the
allotted array indices into segfault territory. A quick change of the
parameters, compile, run, and … Success! No errors. Now for the test cases.
Pass, Pass, Pass … Fail. “What happened?” asks Matt. “The last test case
failed,” types Jim. “Let me take a look at it. Go ahead and commit your code,”
says Matt. Jim commits the code to their mutual repository and Matt updates his
version. While merging the changes, he notices the conflict. His code detects the ‘to’ keyword and has
the regex backend working properly. The repository version does not. He compiles and
runs his version. “Mine just passed all the tests,” types Matt. “I added the ‘to’
parser while you were working on that loop.” With all test cases passing, the
final version is committed and tagged. Both members continue on with their
nights, the deadline met.
This is just one example of a common problem that
programmers working in groups face. But an error with version control software
is hopefully the least of your worries. Bad documentation, inconsistent
development techniques, and how to avoid that dang Subversion error above are a
few of the topics I will discuss in this article.
Each major section has a “Reducing the Pain”
area, underlined and italicized, that contains exactly what you should do to
make your life easier. Here is an example:
Reducing the Pain on the “How to Read This Paper” Section
Ø Remember to read this so you know how to reduce the pain.
Or in an abbreviated format:
RTP – How to Read This Paper
Ø Remember to read this so you know how to reduce the pain.
The information given in this article is not guaranteed to
actually help you. Because people work and learn differently, you may hate
one tool and love another. Either way, the tools I list are my
personal preferences for minimizing pain in software development.
“Interpreting bad writing is like interpreting the bible, crazy
people will start killing babies” - Oscar Wilde
Comments have been explained to me in a few different ways
over the years. They can serve several different functions within your code, and
each is as valid as the next. The first is commenting for the
purpose of reviewing your code. With this method, the comments are
treated as pseudocode, and reading them gives you a literal
understanding of the program. Planning future code can also take
place in this style. Next is commenting to summarize or explain intent. This style is
generally preferred over the previous one, as long as you
are following good coding standards. A
good explanation of intent commenting comes from McConnell:
“Good comments don't repeat the code or explain it. They clarify its intent. Comments
should explain, at a higher level of abstraction than the code, what you're
trying to do.” (McConnell, Code Complete)
Explain why you did something rather than telling an
observer what you did. An example given in our class was that of a financial
software program written in an ancient language, COBOL. A past programmer did
not comment on why vast amounts of money were being sent to certain people, and in
actuality, those people should not have received it. A third style of
commenting is algorithmic description. In this method, one explains technical
design choices, for example, why quicksort was chosen instead of merge sort.
And finally, one can use comments for debugging purposes. An example would be commenting
out debug flags and methods.
Dividing a project among team members allows the team to
finish sooner. In my recent assignment, my partner and I never
formally discussed this, which led to a disorganized experience. My mentality was
to sit down and work on it as much as I could and
hope we would get done early. Work assignments were never distributed,
the workload ended up unevenly split, and we turned
in an incomplete project. Despite the negatives, there were benefits to
working in a group. When I was out of town, my partner was still able
to work on the assignment, and vice versa. Another benefit was the use of
version control, which removed any guesswork from making sure each of us had the most
up-to-date codebase. Wrangling email threads, swapping flash drives, or the
always hilarious “emailing the code to yourself” are all methods that should not
be used.
Because I will talk about version control in
depth later, here I will only cover what you need to know to understand what a
commit message is. When you make changes to a server-hosted copy of a project
and its code, the server asks you to describe those changes. The
commit message is that text. It tells
everyone else what you did and why you did it.
· One-line summary of the main change
· Why you made every other change
When I was first learning how to write commit messages, they were "less than
stellar," as my project partner put it. I would record what changes I made
but could not remember everything I had changed in that revision. I
hoped the diffs of the files (see the section on Subversion diff) would explain
what I forgot. The problem, though, was not my partner understanding the changes
themselves, but why I had made them. He then gave me a format for commit
messages.
The first line is a summary of the main changes, or a
title. The following lines explain why the changes were made.
With this format I eventually got the hang of committing verbosely. The moral
of the story is that no matter how small the change, someone at some time is going
to have to try to understand it. It might even be you, coming back to a previous
iteration and trying to figure out when you made a certain change.
Reducing the Pain – Writing, Documentation, and Groups
Ø Comment to help others, not yourself.
Ø Yes, work in groups. Divide the project early.
Ø Commit messages – summary, followed by why.
/* GNU style (layout illustration; x, y and the called functions are placeholders) */
static char *
concat (char *s1, char *s2)
{
  while (x == y)
    {
      something ();
      somethingelse ();
    }
  finalthing ();
}
Physical Pain in Software Development
Coding requires time, usually a lot of it. Coding also takes place
in front of a computer. You might be sitting, you might be standing, but you
are probably staring at a screen and using a keyboard. How the code looks
on the screen can affect not only how quickly one understands it, but also how
strained the eyes get over time.
For a recent project, my partner and I decided to switch to
a common coding style for all of our code. We chose GNU style for two main
reasons: increased readability due to the eye ergonomics of its block statements, and
the two-space indentation. Like the Allman and Whitesmiths styles, GNU style puts
braces on a line by themselves, indented by two spaces, except when opening a
function definition, where they are not indented. In either case, the contained
code is indented by two spaces from the braces. It is mandated by the GNU
Coding Standards and is used by nearly all maintainers of GNU project software.
All that time spent in front of a computer and keyboard
can also have an effect on one’s wrists. The Dvorak keyboard layout was
designed to help prevent repetitive strain injury and to reduce instances of
carpal tunnel syndrome. The benefits, though, are not limited to ergonomics.
Because the home row contains the most used letters in English,
around 70% of keystrokes land on it, which can make you a faster typist. Dvorak
is also said to need around 63% of the finger motion that QWERTY does, which adds
to its ergonomics. It is important to note that the letter distribution
is tuned to the English language, and international users may not see a
benefit.
RTP – Physical Pain
Ø Use GNU style.
Ø Use the Dvorak keyboard.
Waterfall SDLC
The traditional waterfall model of software design
contains a number of distinct phases.
1. Requirements Specification – Defining the problem and the expected outcome.
2. Design – This is where a general plan is created for building the implementation.
3. Implementation – The coding is done here.
4. Testing and Debugging – Errors are located and removed.
5. Delivery to the customer and maintenance of the project.
This software development life cycle (SDLC)
can be compared to an assembly line in which each phase of the project is
separated. However, one of the main drawbacks of this method is the extreme
difficulty of returning to a previous stage. For example, if a requirement changes
in step 4, one must go back to the design phase and then repeat the
implementation phase. If one is sure of every single requirement and
every single outcome of a simple project, then this can be an efficient method
to use.
Iterative SDLC
A counterpart to the waterfall method is an
incremental development scheme. In this system, a series of
mini-waterfalls is completed, with new features added
during each one. This method can also do some “Big Design Up
Front” and then proceed to truncated mini-waterfalls: the initial software
concept, requirements analysis, and architecture design are defined via
waterfall.
An obvious strength of this method is that one can easily
change the project requirements. At the completion of a feature addition, one
can go back and reevaluate what is needed. If large changes are needed, time is
saved because they were discovered sooner.
Defensive Programming is not so much a design choice, but a
design necessity. Writing functional code involves writing code that uses valid
input (if any) to achieve some function. Unfortunately, input is not always
valid. The purpose of writing defensive code is to ensure that the program or
method does not do something bad (i.e., crash or, worse, quietly return a bad
value). There are many ways to do this. For example, before using an object
reference, you should make sure the object is valid (i.e., not null). Before
accessing an array, you should validate that the array index is within range
(one of the more common causes of memory access errors). When using a case
statement, a default block should be used in case the input value doesn't match
any of the expected values.
While not difficult, writing defensive code is a tedious
task. On a recent project we found that defensive coding added 5% to our
coding time (i.e., for every 40 hours we spent writing functional code we spent
an additional two hours writing defensive code). For some types of projects,
such as small projects, temporary applications, and projects where you are
positive your data is clean (e.g., when you have a small set of long-time users
who know what values to enter), it might not make sense to invest in defensive
programming. But for projects where data quality is uncertain or the
consequences of failure are high, it makes sense to invest in defensive
programming.
When writing code, an overall thought to keep in one’s head
is KISS: “Keep It Simple, Stupid.” Simplicity and clarity should be paramount
because who knows who will look at your code next, let alone need to understand
it. You cannot even rely on yourself to understand your own code after a period
of time. Again, this ties into being able to write good comments. In a utopian
programming world, one would write code so understandable that “you could read it to your grandmother and she would
understand.” But that is obviously not the case. Therefore, including
necessary and helpful comments is important to make sure your code is
understood.
RTP – Design Choices
Ø When faced with a choice between waterfall and iterative, choose iterative.
Ø Defend your code against evil input!
Ø Remember your grandmother.
· Who? – Created by CollabNet
· What? – Software versioning and revision control system
· Where? – Now an Apache project
· When? – October 20, 2000
· Why? – An effort to write an open-source version-control system which operated much like CVS but which fixed the bugs and supplied some features missing in CVS
We were required on a recent project to use a version control system
called Subversion (SVN). It is mainly a command-line tool with which users
update and commit changes to a shared code base. However, don’t let the command
line steer you away. I did not use it once during the entire duration of our
project. Instead, I mainly used an integrated development environment with a Subversion plugin,
which eliminated most of the problems users run into.
Because of this, I highly recommend using Subversion in your next group
project.
· svn checkout – Check out a working copy from a repository.
This is the first thing one does when working with
a repository of code. You check out a copy for making changes. Now
how do you put those changes back?
· svn commit – Send changes from your working copy to the repository.
It is important to note a common problem users face when
committing new files to the repository. Using the command line, one must first
do an “svn add” of the new file. Only then can you commit it. In my IDE,
however, this command was not needed, as it was done automatically.
SVN's best feature is the ability to work on a shared code
base. This code area is called the Trunk. The Trunk is where the full copy of
your code lives at all times. When one commits code to SVN, they are adding their
changes to the Trunk.
· svn update – Update your working copy of the code from the repository.
Updating will bring the latest version of the code from the
repository to your machine. From that point the developers using the repository
can make changes to the updated versions and then commit those changes to
create a new, updated version of the code.
These two features are great but lead to one issue. If two
team members are working on the same file, make changes to that file, and then
both try to commit, they will hit a snag because their copies are out of sync. The
problem can be resolved with what is called a merge.
· svn merge – Apply the differences between two sources to a working copy path.
A merge is when a developer looks at the changes to his/her
local version of a file and the changes made to a Trunk file that have not yet been
brought into the local version, and fixes the synchronization issue by amending
the code to include both updates or just one of them. There are three possible
outcomes of this process.
Ø Override and Update: In this case, the local version of the file is discarded and the Trunk code overwrites everything.
Ø Override and Commit: This is similar to Override and Update, but the Trunk code is overwritten with the local version.
Ø Merge Changes: In this solution, both changes are merged into the local version, and the developer is then able to commit the file and overwrite the Trunk file.
Now you might ask, "How am I supposed to know what the differences
between the files are?" Well, SVN has a command for that too.
· svn diff – Display the differences between two paths (files).
During the process of committing, updating, and merging, the
code base can change quite drastically. If a problem arises, backing up
your data is key to continuing a project without major losses. SVN
provides a way to back up your code via Tags.
A Tag is a snapshot of your code at a given time.
It is similar to the Trunk, but it is assumed a Tag will not change in the
future. This is useful if a situation arises where it is necessary to
revert your code base to a prior state. In larger environments, Tags are
created with each build: version 1.0, and so on. Luckily, we have not yet
had to revert to a tag in our project.
Just in case you need to offshoot your code for
some reason, say to introduce a new developer to your code when
you don't want them working in the Trunk, SVN has something
called Branches. They are separate Trunk-style copies used for work on
the project. Branches can be created from existing code bases, including
the Trunk and Tags. Even better, you can merge between a Branch and the
Trunk to bring in changes from the temporary project that need to be
incorporated into the code base.
Unit testing is an important part of designing and
developing code. Here are five reasons why.
1. Unit testing allows you to test your code all the time, in an easy way: instant gratification.
2. If one starts unit testing early in the process of writing code, it leads to a better design. Your method names and classes will be based on the tests you write, which can help you organize your code.
3. Unit testing allows one to make changes to the code easily later down the line. A good baseline of tests lets you refactor your code with confidence, because the tests tell you whether it still works.
4. Since unit tests actually "test your code," you can "try to break it." This will give you an understanding of your code you would not have without testing.
5. And the most important (for your boss): money. Fixing a bug early is far cheaper than fixing it very late in the development process.
CxxTest is a very nice way to write and run unit tests
in C++. A .cpp test runner is generated by a Perl script from the test
suites you write, and then you compile and run it. You therefore do not have to
declare your tests in your regular code and can just attach the framework to
do your testing. It even has a plugin for Eclipse that lets you easily write
tests and run them through a GUI.
Some of the features (as taken from its own description):
· Doesn't require runtime type information.
· Doesn't require member template functions.
· Doesn't require exception handling.
· Doesn't require any external libraries (including memory management, file/console I/O, graphics libraries).
· Is distributed entirely as a set of header files.
In my experience, CxxTest was unruly at times because a failing test did not
print a useful error message. The run just broke,
and I had to manually go in and add a “cout << error message;” to the end
of the test.
Testing is done by writing test cases, mainly with a macro
called TS_ASSERT(). It works much like the standard assert() macro. You
just put whatever expression you want to test inside it, say,
TS_ASSERT( 1 + 1 > 1 ); or
TS_ASSERT_EQUALS( square(2), 4 );
Makefiles can be seen as a kind of expert system, that is,
a computer system that emulates the decision-making ability of a
human expert. The general idea is that make supports minimal rebuilds. For
example, you tell it what parts of your program depend on what other parts.
When you update some part of the program, it only rebuilds the parts that
depend on that. While you could do this with a shell script, it would be a lot
more work (explicitly checking the last-modified dates on all the files, etc.) The
only obvious alternative with a shell script is to rebuild everything every
time. For tiny projects this is a perfectly reasonable approach, but for a big
project a complete rebuild could easily take an hour or more -- using make, you
might easily accomplish the same thing in a minute or two.
In my experience, makefiles are complicated and have a steep
learning curve. They also depend strictly on formatting specifics: command
lines in a rule, for instance, must begin with a tab character rather than spaces. I used them in a
recent project in a different manner, though. My IDE (see the next section)
automatically generated them, and I just modified the makefiles for my own
uses. Even with the negative aspects, makefiles create a repeatable process, so
one does not have to reinvent the wheel every time they
make a change to their code. Any time a process can be repeated, it saves both
time and money.
As I hinted earlier, I have been using Eclipse ever since
my second computer science class. But what are the real benefits of IDEs with
integrated build tools and debugging? All the things I have previously listed
in this paper show up in this amazing IDE. That alone is the benefit: singular
integration. Most of the features listed below also exist in other IDEs, but these
are my reasons for using Eclipse.
· The concept of a project workspace is introduced in an IDE. A workspace is just what it sounds like: a place to keep all of your relevant documents. The IDE knows they are all connected, so operations can be performed across all the files.
· Eclipse has automatic formatting and indentation correction, so I just click a button or press a keyboard shortcut to format all of the code in the project in GNU style.
· It allows you to right-click on a variable and rename it in multiple locations at once. This saves you vast amounts of time; you do not have to go searching for all instances of your hidden “getter/value” method/variable in separate files and folders. Anything in the project workspace is searched and automatically corrected.
· Depending on the language you are working in, Eclipse will generate classes, methods, and general code for you if you supply the interface/requirements definition.
· Eclipse has seamless SVN integration.
· Eclipse has CxxTest integration.
IDEs in general have autocompletion.
Your IDE knows the possible choices at the point where you are typing
and will suggest which methods to use.
Debugging
Debugging is a methodical process of finding and reducing
the number of bugs, or defects, in a computer program or a piece of electronic
hardware, thus making it behave as expected. Debuggers usually use a concept
called breakpoints to stop at certain lines of code so that one can examine
the program there. A typical debugging process begins with reproducing the original problem
and trying to simplify it until a certain part of the code is identified as the
likely cause. Debugging can be performed via a trace or
live, while inspecting program state. Print (or trace) debugging is the act
of watching (live or recorded) trace statements, or print statements, that
indicate the flow of execution of a process. This is sometimes called printf
debugging, due to the use of the printf function in C.
In Eclipse, the entire process is contained within a
debugging view. You are able to view current variable states and follow
function calls up and down the stack. Every step the program takes can be
viewed in order. To set a breakpoint, you just double-click on the line of
interest. The only problem I ran into with debugging in Eclipse is that it searches your
workspace $PATH for your source files and your system $PATH for standard
library source files. If it cannot find them, it does not display the
source code while one tries to step through the program.
Reducing the Pain – Software Development Tools
Ø Subversion
  o If using the command line and you have a new file, always “add” before you “commit”.
  o Use an IDE with Subversion integration. Recommendation – Eclipse.
Ø Testing
  o When using CxxTest, remember to make sure your makefile includes the library in the correct top-level directory.
Ø Makefiles
  o Don’t deal with them. Have your IDE do all your compilation and run-testing.
Ø Debugging
  o Again, steer away from archaic tools such as GDB and stick with your IDE.