Sun, 28 Nov 2004

Documentation has a Cost and a Value

What is the purpose of documents if not to communicate? Someone might wish to make a few personal notes and the cost is minimal, while the value is high, to that person. When that document starts to be a resource for team planning or communication then it needs to be shared in some way. That's when the cost of creating and maintaining a new non-code document needs to be considered. When it is referred to by many people over time and can change with some frequency, the cost is very high. Cost is generally proportional to the rate of change of the content. When that content is dependent on some code that can also change, the cost is even higher. That cost can be reduced if the document can be regenerated from the code, directly and on demand.

Determining the value of a non-code document is difficult, but over time the most-referred-to documents for a project can be discovered. The Technical Memo is an excellent starting point for documents. The short and single-subject format is low-cost and easy to maintain. Beware of creating documents because of an anticipated future need.

Sometimes the only place a change or requirement is written down in a sufficiently rigorous and meaningful way is in the working code. Any effort spent trying to keep non-code documents in sync is potentially wasted, or at least misdirected. A non-code document that doesn't express the meaning as accurately as the code is a potential source of conflict. In many cases, the only place where being correct really matters is the code. All other versions and descriptions of the function are really subservient to what the code says and how it behaves.

Mon, 22 Nov 2004

Build vs. Buy, Part 5

Last in a series examining certain arguments for buying commercial software and minimally customizing it versus custom-developed software.

Customer Focus

If there is a single primary theme to the buy vs. build decision, it must be the users' needs as the priority driver. Every automation project will, to be truly successful, entail some realization by the customer of outmoded and wasteful practices that can be changed or eliminated. There is always the danger of "paving the cart path" to look out for. The issue to consider very carefully in a buy vs. build decision is whether implementing the purchased application will cause excessive and wasteful disruption of existing business processes. Some flexibility on the part of staff and management is desirable, in order to accommodate solutions which actually improve the process. In a healthy custom software development process, there is ongoing collaboration between business users and software developers to discover what is possible and desirable, and to make the transition gradually. With purchased software, the customer gives up substantial leverage in its negotiating position in exchange for, it hopes, lower costs. But when implementing commercial software entails a radical overhaul of the business process before the benefits can be realized, that cost must be accounted for.

Review

The rule of thumb is "buy for parity, build for advantage". Whether or not there is a competitive edge to be gained through custom software development, there is only competitive equality with purchased COTS.

Fri, 19 Nov 2004

Build vs. Buy, Part 4

Continuing a series examining certain arguments for buying commercial software and minimally customizing it versus custom-developed software.

Customization

All but the simplest COTS (Commercial, Off-The-Shelf) software will need customization. While it is impossible to accurately assess how much, it is possible to determine what portion of in-house resources goes towards needed customizations versus integration with other systems. If the integration is complex and problematic, it can take time and expertise away from providing business-critical customization. If the users go without their needs addressed because everyone with any expertise is struggling to make the software talk to the rest of the systems, where is the value in that?

Programming Interfaces

A critical aspect of vendor application customizability is a powerful, well-documented, flexible and stable set of APIs that allow the customer to modify the behavior of the software without introducing incompatibilities and dependencies that cause problems in later upgrades. The question to ask is, are they the right interfaces? Are they sufficient to allow the right customizations without being overwhelmingly complex? Do they make the common, simple case easy, and still make the hard things possible? With custom-developed software, the interfaces are, by design, the ones needed to address the business requirements. If the APIs stop meeting the evolving needs of the application to address user requirements, in-house developers can refactor them to be more flexible and useful. For purchased applications, these kinds of changes must go through the vendor's change process, and a single customer's needs are not the highest priority except in unusual circumstances.

Thu, 18 Nov 2004

Build Vs. Buy, Part 3

A continuation of a series examining certain arguments for buying commercial software and minimally customizing it versus custom-developed software.

Polish and Quality

Typically, vendor-supplied software includes install and configuration wizards and GUIs that are more extensive and more polished than what is developed for in-house software. This is driven by the need for customers to be able to implement the system without in-depth expertise. While there is certainly value in those kinds of features, are they worth the additional cost? They are primarily an example of the kind of additional development done that is not directly attributable to customer business needs. If those features are of value to the business, why doesn't the in-house development process reflect that value? If the purchased software is perceived as better-tested and more robust than in-house developed software, the cost of those aspects is surely included somewhere in the purchase and support price. Again, if these aspects are valuable enough to enter into the purchase decision, why doesn't the in-house development reflect this value?

Wed, 17 Nov 2004

Build Vs. Buy, Part 2.

Continuation of a series examining certain arguments for buying commercial software and minimally customizing it versus custom-developed software.

Upgrade Treadmill

In the memo, the author says that a system used by the business is based on a commercial application and claims 90% of it is standard and 10% is customized. There's no detail to justify that number, but suppose the software does 90% of what the customer needs. Left unmentioned in the memo is how much of the application is useful to the customer. In other words, what additional features and complexity are the business paying for with the application suite but not using? It's unlikely that 100% of the features of a commercial application are used by any one customer -- but can the business say it is using even 50% of them? Is that cost effective?

While the vendor's support may help keep the business current with technology, how much are the vendor updates just a way to keep the upgrade treadmill running? How often does the vendor introduce an update where none of the new or changed functionality is useful to the business? What is the cost of being on the vendor's development schedule, and how disruptive are upgrades? Worst of all, what happens when the vendor eliminates a feature that is key to the business?

Feature Creep

In the memo, the author asserts that there is a significant cost savings to purchasing vendor-supported applications, because the costs of development are spread across all the customers. That's a bold assertion, but looking closer, to what extent is that cost savings eroded by the additional cost of features which are developed but not used in the business? Is the total cost of the software, including those elements useful to the business and those that are not, really less than a solution tailored to the business?

Tue, 16 Nov 2004

Build vs. Buy -- A response

Background

Recently I received a copy of a memo forwarded by a CIO with an expressed preference for moving towards buying applications over custom building them. While I can't reprint the memo here, I do wish to cover some of the points it made and respond to them from an alternate perspective. The memo summarizes one case study within the company of what is portrayed as a successful acquisition and implementation. The application is based on a vendor product and customized in-house to a limited extent. The case study asserts the benefits of the buy decision over three areas -- the technology, the cost, and quality. The memo was written by an employee who was involved in the acquisition and implementation of the software, and mostly describes the experience in positive terms. Certain cautions are mentioned, but not explored in depth.

Technology
An application purchased from a vendor will remain up-to-date with current technology, as long as the vendor maintains it, but the upgrade cycle may not fit well with the business needs, and a failed vendor can result in "abandon-ware".
Cost
A vendor can amortize the cost of software development over all customers, but the variety of customer needs can lead to higher overall development costs, compared to an application custom-developed for a single business.
Quality
A product viable in the open market is perceived to have a minimum level of quality and to conform to commercial conventions at an affordable price. Organizations that claim to value these things may in fact simply be accepting vendor promises. If these businesses have in-house development shops that are not held to the same minimum standards, it's a strong indicator that the value is not as great as it appears.

Over the next few articles, I'll examine in a bit more detail the specifics of these arguments for build vs. buy.

Mon, 15 Nov 2004

Continuous Integration, Misapplied

Putting together a continuous-integration-style automated build and deploy halfway through the project is far more effort than starting off with one. At some point in the project, it may even be that the effort outweighs the gain.

The automated build, run at least daily (more often is better), is ideally the release point of the project. Whatever the process produces is considered the most recent version of the software, and the last good build is the current best version. At the beginning of the project, the result is extremely minimal, but it is working and can be considered the product. A later introduction of continuous integration misses this benefit. Instead of starting with a simple working build and nurturing, growing, and evolving it with additional build tasks and deployment steps, the effort begins with a potentially massive and intricate set of components to thrash out. Early in the project, the codebase is small, there are not many pieces to build, and the changes are confined to a smaller range. Later, the changes are widespread and come fast and furious. At the beginning of the project, getting all the code to successfully build and run is easy, and the team can begin to get used to the idea that this is an ideal state.

Having a working, if incomplete, product seems to encourage efforts to keep it working. Developers soon discover that checking in small, specific, changes as often as possible, say on a per-task basis, is the best way to ensure the build doesn't break. Team members not already familiar with small check-ins after every task and seeing the build as the heartbeat of the project will not gain from a mid-project introduction of continuous integration.

Later in a project that does not have an automated build of any kind, it's not unlikely that the code in the repository is impossible to build without local tweaks. Each team member may be able to get it running on his or her workstation by ignoring broken bits or setting various options, so the fact that the entire system is broken may not occur to anyone. Whatever product might have been handed off to testers or for demonstration purposes came together as the result of an individual or team effort on the workstation of whoever seemed to be most able to make it work.

This "but it builds on my machine" mentality hinders acceptance of the automated build as the "gold standard" against which progress is measured. The team's rhythm has been set, and the team has become accustomed to another build process. In addition to directing team effort towards the build process, there is the task of weaning and migrating the team away from the habits already in place.

Without the will and desire from all members of the team, the task is not only of questionable value, it is also not going to be appreciated.

Sat, 13 Nov 2004

Firefox Optimized

As a programmer, one of the things I love about being able to get the source to my favorite tools is that I can learn from it and build it to my specifications. Recently I got the source to Firefox 1.0 and played around with optimizing the build for my desktop machine, an AMD Athlon 1GHz. Not that the generic Firefox builds from the Mozilla foundation were slow, but I wanted to see what was possible. The hardest part was finding the right gcc options that would work and generate acceptable code. My first few attempts failed in the middle of the build because various intermediate build products, like xpidl, would crash when trying to process some step, because they had been built with bad options.

Using gcc 3.3.5, I ended up with the following optimization flags: -march=athlon-tbird -mmmx -m3dnow -O3 -funroll-loops -fomit-frame-pointer

I tried various fastmath options but in general I couldn't get a good build. I don't know how much math a browser really needs to do, anyway, so I didn't spend too much time on it.

I would not assert that the resulting binary is spectacularly faster, but there are places where it seems snappier. If you want to try yourself, in addition to the build instructions from Mozilla, here are the settings and scripts I used. I'd be interested to hear of your experiences, and in particular if you have any tips for getting the fastmath options to work.

GCC Details

The specific gcc I used, as reported by gcc -v

Configured with: ../src/configure -v
--enable-languages=c,c++,java,f77,pascal,objc,ada,treelang
--prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info
--with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared
--with-system-zlib --enable-nls --without-included-gettext
--enable-__cxa_atexit --enable-clocale=gnu --enable-debug
--enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.5 (Debian 1:3.3.5-2)

My .mozconfig

The complete set of build options for Firefox.

. $topsrcdir/browser/config/mozconfig

export MOZILLA_OFFICIAL=1
export BUILD_OFFICIAL=1

mk_add_options MOZILLA_OFFICIAL=1
mk_add_options BUILD_OFFICIAL=1

ac_add_options --enable-strip
ac_add_options --enable-strip-libs

# Optimization configurations
ac_add_options --enable-optimize="-march=athlon-tbird -mmmx -m3dnow -O3 -funroll-loops -fomit-frame-pointer"
ac_add_options --disable-logging

ac_add_options --disable-tests
ac_add_options --disable-debug

ac_add_options --enable-default-toolkit=gtk2
ac_add_options --enable-xft
#ac_add_options --enable-freetype2

mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/firefox-bin

GCC CPU Options script

Here's the script I found that helped me find the options for my CPU.

#!/bin/bash

# Author: pixelbeat

#This script is Linux specific
#It should work on any gcc >= 2.95 at least

#these apply to any arch (just here for reference)
unsafe_math_opts="-ffast-math -fno-math-errno -funsafe-math-optimizations -fno-trapping-math"

gcc_version=`gcc -dumpversion | sed 's/\([0-9]\{1,\}\.[0-9]\{1,\}\)\.*\([0-9]\{1,\}\)\{0,1\}/\1\2/'`

IFS=":"
while read name value; do
    unset IFS
    name=`echo $name`
    value=`echo $value`
    IFS=":"
    if [ "$name" == "vendor_id" ]; then
        vendor_id="$value"
    elif [ "$name" == "cpu family" ]; then
        cpu_family="$value"
    elif [ "$name" == "model" ]; then
        cpu_model="$value"
    elif [ "$name" == "flags" ]; then
        flags="$value"
    fi
done < /proc/cpuinfo
unset IFS

if [ "$vendor_id" == "AuthenticAMD" ]; then
    if [ "$cpu_family" == "4" ]; then
        _CFLAGS="$_CFLAGS -march=i486"
    elif [ "$cpu_family" == "5" ]; then
        if [ "$cpu_model" -lt "4" ]; then
            _CFLAGS="$_CFLAGS -march=pentium"
        elif [ "$cpu_model" == "6" ] || [ "$cpu_model" == "7" ]; then
            _CFLAGS="$_CFLAGS -march=k6"
        elif [ "$cpu_model" == "8" ] || [ "$cpu_model" == "12" ]; then
            if expr $gcc_version '>=' 3.1 >/dev/null; then
                _CFLAGS="$_CFLAGS -march=k6-2"
            else
                _CFLAGS="$_CFLAGS -march=k6"
            fi
        elif [ "$cpu_model" == "9" ] || [ "$cpu_model" == "13" ]; then
            if expr $gcc_version '>=' 3.1 >/dev/null; then
                _CFLAGS="$_CFLAGS -march=k6-3"
            else
                _CFLAGS="$_CFLAGS -march=k6"
            fi
        fi
    elif [ "$cpu_family" == "6" ]; then
        if [ "$cpu_model" -le "3" ]; then
            if expr $gcc_version '>=' 3.0 >/dev/null; then
                _CFLAGS="$_CFLAGS -march=athlon"
            else
                _CFLAGS="$_CFLAGS -march=k6"
            fi
        elif [ "$cpu_model" == "4" ]; then
            if expr $gcc_version '>=' 3.1 >/dev/null; then
                _CFLAGS="$_CFLAGS -march=athlon-tbird"
            elif expr $gcc_version '>=' 3.0 >/dev/null; then
                _CFLAGS="$_CFLAGS -march=athlon"
            else
                _CFLAGS="$_CFLAGS -march=k6"
            fi
        elif [ "$cpu_model" -ge "6" ]; then #athlon-{4,xp,mp}
            if expr $gcc_version '>=' 3.1 >/dev/null; then
                _CFLAGS="$_CFLAGS -march=athlon-xp"
            elif expr $gcc_version '>=' 3.0 >/dev/null; then
                _CFLAGS="$_CFLAGS -march=athlon"
            else
                _CFLAGS="$_CFLAGS -march=k6"
            fi
        fi
    fi
else #everything else "GenuineIntel"
    if [ "$cpu_family" == "3" ]; then
        _CFLAGS="$_CFLAGS -march=i386"
    elif [ "$cpu_family" == "4" ]; then
        _CFLAGS="$_CFLAGS -march=i486"
    elif [ "$cpu_family" == "5" ] && expr $gcc_version '<' 3.1 >/dev/null; then
        _CFLAGS="$_CFLAGS -march=pentium"
    elif [ "$cpu_family" -ge "6" ] && expr $gcc_version '<' 3.1 >/dev/null; then
        _CFLAGS="$_CFLAGS -march=pentiumpro"
    elif [ "$cpu_family" == "5" ]; then
        if [ "$cpu_model" != "4" ]; then
            _CFLAGS="$_CFLAGS -march=pentium"
        else
            _CFLAGS="$_CFLAGS -march=pentium-mmx" #No overlap with other vendors
        fi
    elif [ "$cpu_family" == "6" ]; then
        if echo "$flags" | grep -vq cmov; then #gcc incorrectly assumes i686 always has cmov
            _CFLAGS="$_CFLAGS -march=pentium -mcpu=pentiumpro" #VIA CPUs exhibit this
        else
            if [ "$cpu_model" == "0" ] || [ "$cpu_model" == "1" ]; then
                _CFLAGS="$_CFLAGS -march=pentiumpro"
            elif [ "$cpu_model" -ge "3" ] && [ "$cpu_model" -le "6" ]; then #4=TM5600 at least 
                _CFLAGS="$_CFLAGS -march=pentium2"
            elif [ "$cpu_model" -ge "7" ] && [ "$cpu_model" -le "11" ]; then #9 invalid
                _CFLAGS="$_CFLAGS -march=pentium3"
            fi
        fi
    elif [ "$cpu_family" == "15" ]; then
        _CFLAGS="$_CFLAGS -march=pentium4"
    fi
fi

if expr $gcc_version '>=' 3.1 >/dev/null; then
    if echo "$flags" | grep -q sse2; then
        _CFLAGS="$_CFLAGS -mfpmath=sse -msse2"
    elif echo "$flags" | grep -q sse; then
        _CFLAGS="$_CFLAGS -mfpmath=sse -msse"
    fi
    if echo "$flags" | grep -q mmx; then
        _CFLAGS="$_CFLAGS -mmmx"
    fi
    if echo "$flags" | grep -q 3dnow; then
        _CFLAGS="$_CFLAGS -m3dnow"
    fi
fi

echo "$_CFLAGS"

Thu, 02 Sep 2004

Unit Tests Manage Complexity

Unit testing is more than a design tool and more than a practice that enhances quality; it is a tool for managing complexity.

A proper, well-written unit test isolates a particular piece of behavior, and the code that implements it, from all the unnecessary surrounding context of the running system. The unit test focuses on an aspect of the code in a way that frees the programmer from the need to build and maintain a complex mental picture of the running system. A programmer doesn't have to "hold it all in her head". Where before a programmer might have had to slowly build up a detailed mental model of the executing logic in order to understand the complex behavior sufficiently to (manually) evaluate and compare the results of a test run with the expected behavior, a good unit test narrows the scope of what the programmer must comprehend to something so simple that evaluating it can be done quickly and repeatedly, as often as necessary, automatically.

This narrow and simple focus makes it easier to achieve and recover "flow", and less costly to lose it because of interruption. It becomes so easy to maintain flow that it is possible for two programmers to work together in a combined state of flow.

The unit defined by a test represents a locally self-contained element of simple behavior. If something is too hard to test, that is a strong indication that it is not simple enough. The programmer needs to remove excess behavior from the context until the code being tested represents a unified conceptual whole.

In my early days as a programmer, the usual process consisted of

  1. thinking a while about the problem until I had a notion about how an implementation might look,
  2. writing a bunch of code that seemed like a correct solution at a high level,
  3. fixing syntax errors until the code compiled,
  4. fixing bugs until the code produced the correct output, and
  5. dropping down to a lower level and repeating.

I called this "top-down design".

As the code grew and became more detailed, starting up the system and running through the steps to verify the latest changes meant keeping more and more context in my head. This context included things like what had just been done, what this run was supposed to have working, and what might break. It was a lot of work to get into flow, it was not easy to maintain, and it was easy for distractions to break flow.

Test-driven development breaks the work down into tiny chunks of independent function that don't require a complex mental context of logic to be in the programmer's head.

Before TDD: A programmer had to keep a complex logical context in a mental model. With TDD: The programmer only keeps a simple context of making the next test pass.

Before TDD: Build and run the entire system, manually check the behavior, set debug breakpoints, and try to remember what was being worked on. With TDD: Build the piece being worked on, run the one test that is failing, have the behavior checked automatically, understand why the test fails. Do something simple to make the test pass. Repeat.
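
As a rough illustration of how little context one such test needs, here is a minimal sketch in plain Java. The LineItem class and the hand-rolled check helper are hypothetical names invented for this example, and no test framework is assumed; in practice something like JUnit would supply the assertions and the runner.

```java
// Hypothetical unit under test: one small, self-contained piece of behavior.
final class LineItem {
    private final int unitPriceCents;
    private final int quantity;

    LineItem(int unitPriceCents, int quantity) {
        this.unitPriceCents = unitPriceCents;
        this.quantity = quantity;
    }

    int totalCents() {
        return unitPriceCents * quantity;
    }
}

// Hand-rolled tests with no framework, so the sketch stays self-contained.
final class LineItemTest {
    // Fails loudly so the behavior is checked automatically, not by eye.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    static void totalIsPriceTimesQuantity() {
        check(new LineItem(250, 3).totalCents() == 750, "3 x 250 should be 750");
    }

    static void zeroQuantityCostsNothing() {
        check(new LineItem(250, 0).totalCents() == 0, "0 x 250 should be 0");
    }

    public static void main(String[] args) {
        totalIsPriceTimesQuantity();
        zeroQuantityCostsNothing();
        System.out.println("all tests passed");
    }
}
```

Each test method names one behavior and needs nothing beyond the object it constructs, so the only context to hold in mind is the test that currently fails.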

Tue, 31 Aug 2004

Am I a bad software engineer?

To some, I'm sure this makes me a terrible software engineer. I know how to draw UML diagrams but I can never get all the details of the design right in them ahead of time. Whenever I go to code there's always some aspect of the environment, language, or specification that I forgot to take into account, didn't know about, or that wasn't clear. By the time I'm done creating the working code the original diagram is at best schematic, and sometimes just flat wrong.

Some people seem to want to try to get it all thought out up front, but I'm not good enough, so for me software development has to take into account the things that emerge during an actual coding session. I try to write unit tests first, to guide my thinking and focus my attention. Making the tests pass means taking into account, one at a time, those details that were too numerous to collapse into a design diagram.

The forethought necessary to draw some diagrams is a good way to think about how to start, but it tends to get thrown out as soon as I start getting code actually written. The working code in the end stands for itself, not some preconceived notion of it, if that preconception turns out to be inaccurate. As one thoughtful friend of mine wrote, to force the code to do so is "like deciding you want a pianist before having a child".

Sun, 29 Aug 2004

Objects for "Newbies"

Simple coding tips for junior to intermediate programmers new to objects.

Consider this exercise -- take a substantial piece of good procedural code and turn every function into a class. Not necessarily with the goal to produce an ideal implementation, but to explore the nature of objects.
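
One step of that exercise might look like the following sketch. The wordCount function and the WordCount class are hypothetical examples made up for illustration, not code from any particular project; the point is only to show what changes when a function's argument becomes an object's state.

```java
// Procedural starting point: a hypothetical free-standing function.
final class Procedural {
    static int wordCount(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}

// The same behavior as a class: the argument becomes state held by the
// object, and the result becomes a query on that object.
final class WordCount {
    private final String text;

    WordCount(String text) {
        this.text = text.trim();
    }

    int value() {
        return text.isEmpty() ? 0 : text.split("\\s+").length;
    }

    // Once the behavior is an object, related questions have a natural home.
    boolean isEmpty() {
        return value() == 0;
    }
}

final class WordCountDemo {
    public static void main(String[] args) {
        String sample = "turn every function into a class";
        System.out.println(Procedural.wordCount(sample));
        System.out.println(new WordCount(sample).value());
    }
}
```

Whether the class version is an improvement is exactly what the exercise is meant to explore; for a function this small it probably is not.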

Sun, 01 Aug 2004

RubyCocoa TestRunner on RubyForge

Open source, ruby, and unit testing

I shipped the code for my RubyCocoa testrunner off to RubyForge this weekend. CRTestRunner is a graphical TestRunner for Ruby's Test::Unit. It provides an interface similar to jUnit's for running ruby unit tests. Visit the project home, download, try it out. It requires RubyCocoa and ruby 1.8. Of course it only runs on OS X.

Many thanks to Chad Fowler for his encouraging feedback, and to Rich Kilmer of RubyForge for excellent advice on dealing with a failed require on test load.

Tue, 20 Jul 2004

When DBAs Go Bad

Sometimes the motives of DBAs are inscrutable, and it's hard to understand what the good intention could be that drives a bad solution. Martin Fowler has written on the gap between database-oriented software developers and in-memory application software developers, and on the resulting conflicts, which some have termed DBAs Gone Bad.

A recent situation I'm aware of appears to demonstrate a specific case of the conflict and problems.

The application involves tracking sums of money, in the form of allocations which have conditions attached specifying what they may be used for. These allocations are of course tracked and persisted. On the other side, money goes out as it is authorized to specific spending instances in accordance with the limitations set for the allocations. Those authorizations are also tracked and persisted. At any given time the allocations, which may or may not be aggregated into larger sums, have a certain balance remaining for their specified purpose.

In theory, it is possible to determine how much money is available for a specific kind of authorization by taking all the allocations in effect and subtracting all the previously handled authorizations over the necessary span of time. No profiling has been done, but it is known there are many allocations and authorizations, with various complex rules, so it appears this calculation is computationally expensive. It also just doesn't make sense to derive these numbers every time they are needed.

The developers want to store these sums back in the database as they are calculated, so that the next subtraction needs only to examine the remaining available amount rather than rerun all the calculations to determine the available balances. The developers cannot convince the database expert to create any sort of rows, tables, or anything to allow these available sums to be stored. The developer in charge of database "stuff" insists that because the available sums can be derived at any time from the allocation and authorization data, the application must do this calculation every time. In the last conversation there was talk of putting these sums in some sort of scratch table area, not part of the domain schema for the business, but it was not well received.

At this point, it may be that the best available solution, given the resistance from the database programmer, is for the application to keep the available balances but not ever persist them, meaning that if there is a restart or some other loss of runtime state, then there must be a facility in the code to rerun the calculations forward from the raw data.
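
A minimal sketch of that in-memory approach might look like the following. It assumes a single spending purpose and plain sums in cents; the AllocationLedger class and all of its method names are hypothetical, and the real allocation rules are certainly more complex than one running total.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical ledger for one spending purpose (an assumption for brevity).
final class AllocationLedger {
    // Raw data: in the real system these would be the persisted records.
    private final List<Long> allocationsCents = new ArrayList<>();
    private final List<Long> authorizationsCents = new ArrayList<>();

    // Derived application state: never persisted, null until first needed.
    private Long cachedAvailableCents;

    void allocate(long cents) {
        allocationsCents.add(cents);
        if (cachedAvailableCents != null) {
            cachedAvailableCents = cachedAvailableCents + cents;
        }
    }

    void authorize(long cents) {
        long available = availableCents();
        if (cents > available) {
            throw new IllegalArgumentException("only " + available + " cents available");
        }
        authorizationsCents.add(cents);
        cachedAvailableCents = available - cents;
    }

    // Lazy: the expensive derivation runs only when the cache is missing.
    long availableCents() {
        if (cachedAvailableCents == null) {
            cachedAvailableCents = recalculateFromRawData();
        }
        return cachedAvailableCents;
    }

    // The facility to rerun the calculation forward from the raw data.
    long recalculateFromRawData() {
        long total = 0;
        for (long cents : allocationsCents) {
            total += cents;
        }
        for (long cents : authorizationsCents) {
            total -= cents;
        }
        return total;
    }

    // Simulates a restart: the derived balance is lost, the raw data is not.
    void loseRuntimeState() {
        cachedAvailableCents = null;
    }
}

final class AllocationLedgerDemo {
    public static void main(String[] args) {
        AllocationLedger ledger = new AllocationLedger();
        ledger.allocate(100_000);
        ledger.authorize(30_000);
        System.out.println(ledger.availableCents());
        ledger.loseRuntimeState();
        System.out.println(ledger.availableCents());
    }
}
```

The cached balance and the recalculated one must always agree, which is itself a property worth covering with a unit test if this design is adopted.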

This dispute may be over a difference between domain state and application state, in an interpretation of the user requirements. The argument is that if nothing in the user requirements needs the available sums persisted, then they don't go in the database. Several questions seem to need investigation. How does the application maintain state over restarts and other sorts of changes? Will maintaining all that in memory result in unacceptable overhead? How resource-intensive is the recalculation really, and can it be done lazily with some intermediate results cached? Is the code for deriving and maintaining state unnecessary complexity?

Fri, 04 Jun 2004

Great Readings In Software Development

A short list of recommended reading for software development professionals.

Thu, 03 Jun 2004

Coding Pet Peeves

Eight little things I like to nitpick about in code. Mostly applicable to Java, but some are language-neutral.

  1. Don't return null, return the NullObject. Checking for null in a returned value requires code that has nothing to do with the problem domain, and rarely does an API document what it means when a null is returned. My biggest beef is with methods that declare a collection return type but return null. Just return the empty collection, and if it matters to the caller, check the size(). Often it's completely reasonable for code that does something with each element in a collection to do it zero times if the collection is empty.
  2. Only use Java interfaces to define types, not as namespaces for globals. I've seen it suggested many times in questionable guides to create an interface as a dumping ground for public static values. Just Don't Do It. If the values have any meaning in the code, then they can be associated with the class that most closely operates with them. If the "class" is application configuration, then for Pete's sake create a configuration class with the ability to do things like pick up options from a configuration file, environment variables, or command-line arguments.
  3. Override toString for all value types. It's simple enough to generate a readable representation of the properties of an object, and it's tremendously useful in monitoring, logging, and debugging. Far better than the default of the class name and a meaningless VM address.
  4. Sort import statements, starting with java.*, then javax.*, then other 3rd party, then com.mydomain.*. Within a group, alphabetize.
  5. Import down to the class name, unless you have more than 3 imports in a package, then you may import .*. Both this and the previous one are made much simpler by tools like Eclipse that automatically organize imports. Both are just the sort of neatness that any skilled professional should expect.
  6. Vwls r mprtnt. Can't make heads or tails of that? Try this: "Vowels are important". Code with class, method, and variable names that abbreviate out all the vowels in identifiers quickly becomes an indecipherable mess. I see this most frequently in code that is closely associated with a relational database schema, because DBAs seem to love dropping all the vowels. Avoid the temptation to aggressively abbreviate as well -- an example being a proposal to abbreviate every word longer than 6 letters to 4 or less.
  7. Never name a class Manager, Handler, or Data. Don't use the nouns "Object", "Manager", "Handler", or "Data" in class names. These words say nothing about the responsibility of the class, leading maintenance programmers to lump all kinds of irrelevant code into the class. Think instead about what the class is actually supposed to do and name it after that.
  8. Abbreviations use mixed case in identifiers except where an overriding convention exists, e.g. UmlIpParser is much more readable than UMLIPParser. For most of my recent employers, things like the company's initials, the project's code name, or the like, are examples of overriding convention. That's OK because generally all the developers involved know the convention.
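
Guidelines 2 and 3 above can be sketched together. This is only an illustration: the class name AppConfig, the APP_HOST and APP_PORT keys, and the default values are all invented for the example and do not come from any real project.

```java
import java.util.Map;

// Hypothetical configuration class, used in place of a constants-only interface.
final class AppConfig {
    private final String host;
    private final int port;

    AppConfig(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Builds a configuration from a map of settings, falling back to
    // (assumed) defaults when a key is absent.
    static AppConfig fromSettings(Map<String, String> settings) {
        String host = settings.getOrDefault("APP_HOST", "localhost");
        int port = Integer.parseInt(settings.getOrDefault("APP_PORT", "8080"));
        return new AppConfig(host, port);
    }

    String host() { return host; }

    int port() { return port; }

    // Guideline 3: a readable representation for logging and debugging.
    @Override
    public String toString() {
        return "AppConfig[host=" + host + ", port=" + port + "]";
    }
}
```

Passing System.getenv() to fromSettings would pick the options up from environment variables; a map loaded from a configuration file or parsed from command-line arguments would work the same way.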

    Sun, 25 Jan 2004

    Software MFA Week 1 Status Report

    As a first step in my project, Brian asked me to find examples of software that are related to my work and suggest them for annotation. I went through the software listed at http://testingfaqs.org/ and found the following:

    Brian says one of these projects was a failed work, but that it would be useful to discuss why it failed in the annotation. He didn't say which one though -- it's my task to determine that.

    Also, I spent some time looking for a candidate for software to instrument using the concepts I will be exploring in my work. For what I have in mind, a good candidate would be some kind of server-side process that is intended to run continuously for long periods and be typically deployed on a system that is difficult or impossible to place under the control of the usual desktop development tools like IDEs, debuggers, etc. Since I'm using Ruby for this project, I think MiniRubyWiki will fit the bill. It runs stand-alone (no web server required), is small, and already includes a page called the WikiNerveCenter. If you have any other suggestions, I'd be happy to hear them. Another potential candidate, if the wiki doesn't work out, is Cerise, "a Ruby web/application server following the same general pattern as J2EE application servers". Finally, if Dave Thomas agrees, I could instrument the code for the website management package he wrote.

    Wed, 14 Jan 2004

    Refactoring Malapropism

    Martin Fowler writes in his blog, under RefactoringMalapropism, that [the term] "refactoring is often used when it's not appropriate."

    Surprisingly soon after the introduction of the practice, there was widespread and gratuitous misuse of the term at a former employer. I was one of the first users of the term and Fowler's book there, and I began to discuss and demonstrate the techniques to my fellow programmers.

    For some reason, a few managers got the idea that doing refactoring was extra work. This kind of work, they believed, reduced productivity and was not useful as part of maintenance programming.

    This mistaken understanding spread widely, at least in the dysfunctional management groups. At one point a project manager misused it in a directive outlining the goals for the maintenance of one system. Among other things, this directive said that anything beyond the specific changes requested was a "refactoring effort" and therefore out of scope.

    Sigh.

    After that, "refactoring" started to be synonymous with any change.

    Tue, 13 Jan 2004

    Tools for Testable Software Development

    Software needs testability built in. Automated testing is of critical importance in effective software development. Testing through the user interface is problematic at best; successfully automating this kind of testing is full of difficulties. Code can have testability hooks built in, to help find errors using automated methods. This project explores some ways to design and implement testability in any software in a broadly applicable way. A primary goal is to explore a test support library that can be incorporated into any application.

    Background

    In the context of discussing how to manage and administer deployed production software, an organization needs to have some "levers and knobs" to turn. In other words, "Put a power meter and steering wheel on every Web service". It should be possible to ask, "Who's using the service? How often are they requesting it? How many requests succeed? How many fail?". In order to do this, Schadler lists three needs [Schadler03]:

    In testing software, the development organization needs similar knobs, levers, and meters on the software to ensure that it functions as desired.

    The Test-lead Sink

    We take as metaphors two ideas from the hardware world. First, build in test leads or test points in a manner similar to those in chips and circuit boards. Second, provide a sink into which test results can be sent. The sink would be something like a monostate object -- one which can receive and act on inputs but which does not itself change state. To get different behaviors, multiple sinks can be attached to or listen to the testing result outputs.

    The input to the sink would look something like this:

     MonitoredEvent
       key
       value or typeAdaptedValue
    
    Sinks would be testable-event observers, i.e. they would have a method something like
     void postMonitoredEvent(MonitoredEvent evt) {
        ...
     }
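
The MonitoredEvent input and the postMonitoredEvent method outlined above could be fleshed out roughly as follows. This is a minimal sketch under assumptions: the MonitoredEventSink interface, the TestLead class, and its attach and post methods are hypothetical names chosen for illustration, and the value is held as a plain Object rather than a type-adapted value.

```java
import java.util.ArrayList;
import java.util.List;

// The event described above: a key plus a value.
final class MonitoredEvent {
    final String key;
    final Object value;

    MonitoredEvent(String key, Object value) {
        this.key = key;
        this.value = value;
    }
}

// A sink is an observer of monitored events.
interface MonitoredEventSink {
    void postMonitoredEvent(MonitoredEvent evt);
}

// Hypothetical test lead: fans each event out to every attached sink,
// in the order the sinks were attached.
final class TestLead {
    private final List<MonitoredEventSink> sinks = new ArrayList<>();

    void attach(MonitoredEventSink sink) {
        sinks.add(sink);
    }

    void post(String key, Object value) {
        MonitoredEvent evt = new MonitoredEvent(key, value);
        for (MonitoredEventSink sink : sinks) {
            sink.postMonitoredEvent(evt);
        }
    }
}
```

Because MonitoredEventSink has a single method, a sink can be attached as a lambda, for example lead.attach(evt -> System.err.println(evt.key + "=" + evt.value)), and several sinks with different behaviors can listen to the same test lead.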
    
    Things to send to test lead sink: Many of the events will be recorded in an accumulator, and min/max extremes may be recorded as well. For example, as an important dynamic collection changes, we may want to know the number of elements and the high/low water marks.
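
The accumulator idea might look something like the sketch below. The class name WaterMarkAccumulator and its methods are assumptions made for illustration; it tracks how many observations were recorded plus the lowest and highest values seen.

```java
// Hypothetical accumulator for one monitored numeric value,
// such as the size of an important collection.
final class WaterMarkAccumulator {
    private long count;
    private long low = Long.MAX_VALUE;
    private long high = Long.MIN_VALUE;

    // Records one observation and updates the extremes.
    synchronized void record(long value) {
        count++;
        low = Math.min(low, value);
        high = Math.max(high, value);
    }

    synchronized long count() {
        return count;
    }

    synchronized long lowWaterMark() {
        requireObservations();
        return low;
    }

    synchronized long highWaterMark() {
        requireObservations();
        return high;
    }

    // The water marks are meaningless until something has been recorded.
    private void requireObservations() {
        if (count == 0) {
            throw new IllegalStateException("no values recorded yet");
        }
    }
}
```

Each time the monitored collection changes, the code under test would call record with the current size; a display can then read the count and the two water marks at any point.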

    Types of monitored values

    This would be a dynamic and real-time display of the sort of information that can be generated by log analysis.

    The test lead sink should be switchable at run-time. The whole facility can be turned on and off, and it should have "quiet", "normal" and "verbose" modes.
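
One way the run-time switch could work is sketched here. Everything in it is an assumption for illustration: the Verbosity enum, the SwitchableSink class, and the rule that each event carries the minimum mode at which it should be reported.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical verbosity levels, ordered from least to most output.
enum Verbosity { QUIET, NORMAL, VERBOSE }

// Hypothetical sink whose on/off state and mode can be changed while running.
final class SwitchableSink {
    private volatile boolean enabled = true;
    private volatile Verbosity mode = Verbosity.NORMAL;
    private final List<String> received = new CopyOnWriteArrayList<>();

    void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    void setMode(Verbosity mode) {
        this.mode = mode;
    }

    // Accepts the event only if the facility is on and the event's level
    // does not exceed the current mode. Returns whether it was accepted.
    boolean post(Verbosity level, String key, Object value) {
        if (!enabled || level.compareTo(mode) > 0) {
            return false;
        }
        received.add(key + "=" + value);
        return true;
    }

    List<String> received() {
        return received;
    }
}
```

With this rule, an event posted at VERBOSE level is dropped while the sink is in "normal" mode and accepted once the mode is raised, and nothing at all is accepted while the facility is turned off.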

    Generating a reasonable, understandable, clear, and simple display of the collected test information is important and potentially difficult.

    References

    [Pettichord02] Pettichord, Bret. "Design for Testability"

    [Robinson03] Robinson, Harry. "Predicting the Future of Testing"

    [Schadler03] Schadler, Ted (principal analyst, Forrester Research). "Commentary: Ten tips for killer Web services"

    Commentary

    Testers will want to be able to poke state changes into the objects under test. One way to enable this would be to have a way to build mock objects and push them in. We'd like to be able to dynamically swap them in for testing purposes. There is a certain overlap between building in this kind of testability and building in monitoring for production operations. Having something that people can see serves two purposes and lowers the barrier for doing either of them.

    Mon, 12 Jan 2004

    Software MFA Application

    This is the application that I wrote for the Software Masters of Fine Arts program trial run.
    "By this it appears how necessary it is for any man that aspires to true knowledge, to examine the definitions of former authors; and either to correct them, where they are negligently set down, or to make them himself. For the errors of definitions multiply themselves according as the reckoning proceeds, and lead men into absurdities, which at last they see, but cannot avoid, without reckoning anew from the beginning." -- Hobbes "Leviathan"

    Keeping in mind the idea that just because something hasn't been done before doesn't mean it isn't good to do, I have been an early adopter of software tools and technologies, including TCP/IP when Novell was the dominant player, object-orientation and the Unified Modeling Language in the early nineties, the Java platform at version 1.1, before Swing, and currently OS X and Cocoa.

    An interest in developing new skills, keeping current and ahead of developments in the field, and a willingness to take responsibility for what I do enabled me to rise to the level of senior technologist. In late 1998, I stepped away from my comfortable expertise in the Perl language to pursue a more junior opportunity that allowed me to develop my skills in the Java platform. More recently, my focus has been outside of specific tools as I've defined experience for myself in terms of practices and attitudes. Programming is a creative problem-solving practice with elements of craft and mentor/apprentice relationships. Open source and free software represent an expression of the desire to design satisfying solutions to problems.

    I'm interested in mentoring junior programmers to help them develop a professional attitude, care about what they do, have a lifelong love of learning, and take responsibility for their work. These kinds of attitudes are what any good software developer would have, ensuring I will be helping colleagues I would enjoy having work with me.

    As a lead developer I chose, as one of my highest priorities, to encourage greater adoption of unit testing and the JUnit tools specifically. The experience was disappointing in some ways. It was apparent that only the self-motivated really were going to adopt a tool on their own initiative, but that the majority were expecting to be spoon-fed the tool, the practices, and the minutiae of usage. Exploration and a learning attitude were rare.

    In some ways it appeared that many were expecting and would only respond to a management direction. Perhaps there was a sense that individually, adopting a new tool or practice would not be effective; that there needed to be top-down control for a direction to work, and a bottom-up advocacy of unit testing didn't fit the model of where good things come from.

    This idea, that unless something is centrally managed and directed it can't possibly be effective, exists in the commercial software development world at a high level. In an interview at a Gartner symposium in October of 2003, Microsoft's Steve Ballmer seemed to suggest open source can't work because nobody's in charge. "Should there be a reason to believe that code that comes from, how do I say this, a variety of people unknown around the world somehow will be of higher quality than people who get paid to do it professionally? There's no reason to believe it will be higher quality. I'm not going to claim it necessarily will be worse quality. But why should code that may get written randomly by some hacker in China and contributed to some Open Source project, why is its pedigree by definition somehow better than the pedigree of something that is written in a controlled fashion? I don't buy that."

    Back at the office, with unit testing, the adoption rate continued to be low. Even with presentations, technical white papers, and one-on-one working sessions, it became apparent that JUnit and developer-centered unit testing continued to be viewed as somewhat maverick. There wasn't a vendor-branded product, no centralized corporate office leading the adoption of the practice of unit testing. Could the bottom-up approach have led to the low interest? What assumptions about formalized, centrally managed and mandated practices might the developers have had?

    In exploring the ways that simple self-organizing software development practices can lead to excellent results, it is necessary to understand the view that top-down process is the only possible route to success. What role does a fear of taking personal responsibility have? What forces work against a developer taking the initiative to adopt tools and practices that have demonstrably positive effects on the development outcome when those tools and practices are not validated by the management hierarchy?

    With senior experience both as a consulting developer, helping to solve specific problems, and as an "organization man", working as full-time staff, it is exciting to be able to put off both and see where programming as a craft could go for me.
