Joel on Software

Michael Lawley’s Miscellaneous-B weblog directed me to an interesting article by Joel Spolsky about the history of Hungarian Notation, a variable and procedure naming convention invented by Charles Simonyi of Microsoft. I’ve never liked Hungarian Notation, but after reading Joel’s article and Simonyi’s paper, I realised that what I thought was Hungarian Notation is actually just a poor imitation of the real thing.

Original Hungarian uses a prefix tag on a variable to describe the variable’s domain type rather than its representation type (as Michael so eloquently puts it). The domain type comes from the vocabulary of the Universe of Discourse. For example, if you’re a programmer working for NASA or one of its contractors, you might write programs that guide satellites, robots and suchlike, so one of the quantities you’d deal with is distance, which can be measured in metres or feet. The representation type is the programming language type in which the values are stored (in the NASA example, you’d store both metres and feet as IEEE floating point values or something like that). However, it would be an error to assign the value of a variable whose domain type is feet to a variable whose domain type is metres. In fact, this could lead to catastrophe, as the software engineers who worked on the Mars Climate Orbiter would know.

The commonplace Hungarian notation prefixes variable names with a mnemonic identifying the representation or storage type of the variable. Other than letting a programmer see the representation type of a variable immediately, this modern Hungarian notation has little point, since in strongly typed languages the compiler will detect assignments between incompatible representation types. However, the compiler cannot detect errors to do with the semantics of the Universe of Discourse, which is where Simonyi’s original Hungarian notation comes into play.
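To make the distinction between domain type and representation type concrete, here is a tiny sketch in Python. The variable names and values are my own inventions for illustration. Python does no compile-time checking at all, but a static checker that sees both variables as float would accept the faulty assignment just the same.

```python
# Hypothetical values and names, for illustration only.
distance_in_feet = 250.0    # domain type: feet,   representation type: float
distance_in_metres = 76.2   # domain type: metres, representation type: float

# Both variables share one representation type...
same_representation = type(distance_in_feet) is type(distance_in_metres)

# ...so nothing objects when a feet value lands in a metres variable.
# The statement runs without complaint, but the stored number is still in feet.
distance_in_metres = distance_in_feet
```

The representation types match, so the language has no grounds to complain. Only the names hint that anything is wrong.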

Original Hungarian notation, which was known within Microsoft as Apps Hungarian, makes it harder for these kinds of screw-ups to occur. For instance, you might have a variable ftPhase1Orbit that holds the distance from the satellite to the first target orbit, measured in feet. (A function is named the same way, except that the first letter of the identifier is capitalised.) Here, ft is the tag and Phase1Orbit is the qualifier. Other variable names may be prefixed with the tag m or mt to signify that they store distance in metres. A variable prefixed with ft should never appear in an expression whose result is assigned to a variable starting with mt unless it is wrapped in a function called MtFromFt (metres from feet).

For example, if you see the statement dmtOrbits = ftPhase2Orbit - ftPhase1Orbit (the distance in metres between the two orbits is the difference between the distances to the two orbits relative to the craft), you can see immediately that something is wrong, because the domain type on the left (mt) does not match the domain type on the right (ft). If instead you wrote something like deltaOrbits = phase2Orbit - phase1Orbit, there would be no way to see that you’d made a mistake, and the compiler gives you no help with this kind of error. In Apps Hungarian, a function name starts with a tag signifying the domain type that it returns, which is why we have MtFromFt rather than FtToMt: the mt of the function name sits right next to the mt of the variable being assigned. Of course, in NASA’s case, they ought to have been dealing in metric quantities all through their code, so this example is purely for illustrative purposes.

All tags must be clearly documented (in code comments or elsewhere) so that the entire development team, current and future, knows the meaning of each one. (See Anthony’s recent informative article on the various code commenting styles he’s tried over the years.)
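The orbit example can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the distances are made-up numbers, dmtOrbitsWrong and dftOrbits are names I introduced for the sketch, and the identifiers deliberately follow the Apps Hungarian tags rather than normal Python naming style.

```python
# Illustrative sketch only: the distances below are invented.
# Tags: ft = feet, mt = metres, d = difference between two quantities.

METRES_PER_FOOT = 0.3048  # exact, by the international definition of the foot


def MtFromFt(ft):
    """Return the distance in metres for a distance given in feet."""
    return ft * METRES_PER_FOOT


ftPhase1Orbit = 1_000_000.0  # hypothetical distance to the first orbit, in feet
ftPhase2Orbit = 1_500_000.0  # hypothetical distance to the second orbit, in feet

# Wrong: mt on the left, ft on the right. It runs, but the tags disagree,
# so the mistake is visible to anyone reading the line.
dmtOrbitsWrong = ftPhase2Orbit - ftPhase1Orbit

# Right: feet only cross into metres through MtFromFt, and the tags line up
# on both sides of every assignment.
dftOrbits = ftPhase2Orbit - ftPhase1Orbit
dmtOrbits = MtFromFt(dftOrbits)
```

Neither version raises an error when run. The whole benefit is that the wrong line looks wrong on the page.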

I’m not sure the article did quite enough to persuade me to use Apps Hungarian in my own programming. I probably should begin to use it, because my programming practices are a little sloppy, and anything that instils some discipline in my coding habits can only be a good thing at this point. Either way, I thought the article was really interesting, so thanks to Michael for blogging it. I’ve put this article here so that if I do convert to Hungarian notation and need to justify it to somebody in my own words, I’ll have had some practice already. :-)

After reading Joel’s post on Hungarian Notation, I did a bit more looking around his web site. I found his 12 Steps to Better Code, a list of 12 questions, each of which you answer yes or no. The more yeses there are, the better your score. What I like about the test is its simplicity. It is easy for a team of developers to evaluate where they stand on each question, and if the answer to any question is no, it is usually clear what the team needs to do in order to improve. While the test is very simple, each criterion seems very important to me.

Last week I pointed my boss to Joel’s 12-point list, and now he wants to use it as a rating system for our own software development processes. So far, we rate about three yeses and four half-yeses. With minimal effort, those half-yeses could be converted to definite yeses. For instance, we currently do partially automated weekly builds, but with a little extra work we could do fully automated daily builds. There are very good reasons for building daily rather than waiting until the end of the week to build and test the current code snapshot. For one thing, bugs are found and resolved sooner.

I think I’ll read Joel more often from now on.