Devious Fish
Music daemons & more

Coding Reflections

Or, stuff I’ve learned from working on pianod (and other software projects).

Philosophy

Planning

A book without an outline is likely to end up incoherent; a building without a blueprint is going to have imperfections. Software is no different.

Software requires planning. It doesn’t need fancy 4-color glossies and video presentations; pianod2 has a crayon block diagram on artist’s newsprint. Despite not being fancy, it did its job: it helped me sketch out how the components would work together before writing code. So when the code was written, it did what was required, and didn’t subsequently need to be changed to accommodate unexpected needs. Effort wasn’t wasted, and the code wasn’t subjected to edit fatigue.

Similarly, it is productive to sketch out details of planned changes before jumping into coding. A spate of audio enhancements on pianod required updating build configuration scripts, where new flexibility led to the question, “What should this do? Given the choice, do I include more, or less?” Temporarily abandoning the code to write an abstract, I reflected on what the build was supposed to achieve, which naturally led to the desired behavior and a set of requirements suitable for coding. Had I rushed the coding, the build would have made something, but that something may not have been the best fit for end-users.

Environment

Writing software requires the right environment: a peaceful work area, with minimal interruptions. At any time, a programmer should not have more than one task in each phase of work (planning, coding, testing, etc.). There is a reason for this: software has been around long enough that all the easy software has already been written. Anything left requires a great deal of concentration, knowledge and understanding of what is being worked on. Multitasking and being spread across too many projects leaves the programmer scatterbrained, and all work suffers.

Furthermore, interruptions are terrible for performance. Have you ever walked in on a programmer, busily coding, oblivious to your presence? Or perhaps she acknowledged you, but refused anything beyond that until she finished what she was working on. Either way, it’s clear: we are in another world. Our brains are full of what we’re doing.

In my case, when I’m working on a piece of code, that module is in my thoughts. There’s a set of instructions I’m trying to express in code, and I have a list of things I have to worry about (What if that file I’m opening isn’t found? What if I ask for memory, and there is no more? Is this secure against erroneous or malicious input? Is this approach efficient?) On the sidelines I have thoughts about the software components it will interact with, the design (that crayon drawing, or something similar), and other pieces of code that need to be written after this.

Being interrupted breaks the flow. Worse, shifting gears to work on something else brings that something else’s ideas into the foreground, supplanting the details of what I had been working on. When I return to the original project, it will take some time for all the details and considerations to come back to me. And in the meantime, my productivity is down and the quality of my code is compromised.

Minimizing Worries

Worries represent competing thoughts. Since we have a finite capacity for thought, worries reduce the capacity available for making software; eliminating worries, then, allows a programmer to perform better. “Worries” includes:

  • My healthcare, and whether/how I will survive if I am injured or become ill (physically or mentally, possibly as a function of overwork).
  • Interpersonal drama
  • Corporate drama: reorganizations, layoffs, changes to compensation or in health plans
  • Unethical business practices and/or product quality/safety issues

“Simmering”: Contemplation and reflection

“Code not a line before its time.” – me

Without preparation, the first approach that comes to mind is used. If it is short-sighted, it’ll have to be fixed after the fact. If the path taken is a blind alley, then the time and effort invested will be for naught.

Contemplation before coding reminds me of making bread. You assemble the ingredients, mix them up, then set it aside and wait. “I thought you were making bread?” Yes, the dough is rising. “No, it’s just sitting there on the counter with a towel over it.” Yes. “Don’t you have to do something to it?” Keep it warm, but not too warm. “Why not just cook it now?” No! That would produce disastrous results.

We understand that despite the appearance of idleness, there’s exciting microbiological activity raising the bread. There’s something akin to this in many creative endeavors, but without yeast explaining its need, its importance is often denied. Like dough, perhaps a closer look will yield understanding.

Contemplation before coding is effective. Most people hate household chores because they are an inconvenience, a waste of their time. I don’t see them that way: they are an essential part of spending time away from the machines, shifting into a different mindset—important because it’s not mentally healthy to spend too much time in cyberspace. Shoveling snow affords me exercise; cooking yields good, healthy food; ironing, dishes and laundry (I use a clothesline) provide a chance to think.

Sometimes, I think about my personal to-do list, my relationships or my aspirations. But sometimes I think about my work, and since I’m not in front of a screen I think differently. Sometimes, I think more broadly, a view of the big picture: how will the modules relate? If the data is structured in one way, one algorithm applies; organizing another way means a different algorithm. Which fits better? Which is more extensible for possible future needs? Sometimes my thoughts are very specific: when I write code for this, I will need to consider that or I’ll have problems with yet another thing. Sometimes I reflect on work I’ve done, and catch problems somewhat after-the-fact: did I check such-and-such a condition?

Contemplation offers foresight to choose an optimal approach and avoid bugs before they are written, while reflection gives hindsight to catch others before they get very far. And because bugs become harder (and more expensive) to fix the more entrenched they are in the code, it’s a good investment.

Rest

All the aforementioned chores are also critical time away from the stresses of coding, time when the brain’s tachometer drops from the red line into the safe zone. It is understood that an athlete cannot perform indefinitely; we need to understand that limits similarly apply to thought. But there’s more to it than that: a hiker or touring cyclist can go all day with a heavy load, but isn’t fast. A racing cyclist goes quickly for long periods, but without weight and as part of the peloton (which reduces individual wind resistance). A sprinter may run 100m much faster than a distance runner, but the distance runner can go 26.2 miles, a feat the sprinter is incapable of. For physical tasks, we understand there are trade-offs between how we do things and how much we can do. And there is always some limit to human endurance.

I posit the same for mental activity. Some people, posed with a problem, pick the first thing that comes to mind and run with it; if they are right, they achieve results quickly. On the flip-side are those who study the problem and consider the implications; when problems are complicated, this is likely to achieve a correct result more quickly, because the rush-to-solve types often iterate a few times before getting it right.

But all of these people can do only a limited amount before their thoughts become addled and they need a rest. Rest is a chance for the brain’s engine to cool down, whilst sleep washes the mind’s windows before the view is obscured by days or weeks of dead bugs. And getting away from it all (a holiday or vacation) is like changing the oil. Run a car too long without changing the oil, and it breaks down; run a person too long without a vacation, and they break down too. Only people take much longer and are more expensive to fix.

For the record, there is also a third class, those that don’t do much. Given a problem, they first bemoan it, then make a fuss about how complicated it is, and finally tackle it at a snail’s pace. Perhaps they don’t need rest, because they’re never “red lining.” Because these individuals sometimes spend more time explaining how difficult their work is (instead of getting work done), managers may erroneously respect this type, thinking they are accomplishing the impossible.

Other Passions

While daily chores and rest afford time away from software, they don’t explore alternate passions. I enjoy cooking and it is relaxing busywork to wash dishes, but neither is engaging or stimulating.

Yet I do have interests that are engaging and stimulating. These passions vary over time: out of college it was D&D and gaming. For a while it was helping run a club and exploring my sexuality. That was supplanted by pianod, which was joined by a passion for aerial silk. All through this I maintained an interest in backpacking and long-distance cycling; other interests have come and gone too.

The American expectations for software engineers—at least 40 hours each week, realistically 50 or 60, with only two weeks of vacation—are inadequate to allow exploration of other passions. This is truly unfortunate. One can push off other passions for a time, but not indefinitely. In the long term, one questions what it’s all about. Why am I making software and bothering to earn money if I can’t do the things I want to try?

I believe passion is important in creating software. While pay is nice, my real desire is the joy of being immersed in solving puzzles via application of knowledge, skill and talent. Income is a nice perk, and incentive inasmuch as it funds other interests. Yet, done too long without a rest, software is tiring. Done to the exclusion of all other passions, it ruins the fun and destroys the passion to do it—“burnout”.

Earlier I compared a vacation to changing the brain’s engine oil. Exploring other passions is what creates fresh lubricant. If I take time off to read, relax, and attend to things in my life, I return unburdened. But if I’ve done those things, then been off on an adventure, my desire to make software is renewed. While away, my longing for the stimulation of a software challenge builds anew, and I return reinvigorated about software.

As much as I love making software, doing it to the exclusion of all else is no fun at all. If my heart is to stay in it, I need to explore other interests too. A work/life balance must be struck, and while income is important, it is no substitute for life.

Rushing may not produce results quickly

…or…

Seeming fast is not necessarily actually fast

Scripting languages such as PHP, Ruby, JavaScript, and Perl give a feeling of speed. A programmer can write some small amount of code and run it—and it does something! There’s a pleasant feeling of progress when scripting; things seem to happen quickly.

By comparison, writing code in compiled languages takes longer. Compilers complain when you said you would return a value, but you didn’t. They kvetch when you use an undeclared variable or method, because that’s an error (usually a typo). Comparing incompatible things? No, you can’t do that. Compilers are fussy about passing correct variable types to functions, and whine when your parameters are wrong. Some languages allow you to provide multiple methods with different parameters, but passing arbitrary parameters, something most scripting languages allow, is impossible or awkward in compiled languages.
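As a small illustration of these complaints, here is a sketch in C++. The function names are made up for this example and come from no real project. The commented-out calls are ones a compiler would reject outright, where a scripting language would typically wait until run time to object, if it objected at all.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers, invented for illustration only.
// Two overloads: the compiler picks one based on the argument type.
std::string describe(int track_number) {
    return "track " + std::to_string(track_number);
}

std::string describe(const std::string &title) {
    return "title " + title;
}

// Declared to return int, so every path should return one.
int clamp_volume(int requested) {
    if (requested < 0) return 0;
    if (requested > 100) return 100;
    return requested;  // Leaving this out typically draws a warning;
                       // flowing off the end is undefined behavior.
}

void demo_compiler_checks() {
    assert(describe(7) == "track 7");
    assert(describe(std::string("Intro")) == "title Intro");
    assert(clamp_volume(250) == 100);

    // Each of these is rejected at compile time:
    // describe(7, 8);            // no overload takes two arguments
    // clamp_volume("loud");      // a string literal is not an int
    // int v = clamp_volum(50);   // undeclared identifier (a typo)
}
```

Uncommenting any one of the last three lines should stop the build, which is the point: the mistake never reaches a running program.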

Compilers remind me of copy-editors for cookbooks. Download recipes from the ’net and you might get half-way through only to find out you need some special ingredient, or vague or unclear directions about how to perform a step; you’re obligated to throw away your attempt or take a best guess and see what happens. Maybe it’ll be edible, maybe not. But with the book, the copy-editor makes sure the recipes are formatted consistently, with all the ingredients listed up front, all the measurements specified including units, instructions clear and specific.

Like the copy-editor, the compiler offers proofreading. Compilers are pedantic, and all that futzing with complaints saves time debugging. Sure, in scripting languages you can try things out early and often, and debug as you go. But writing anything of any complexity in a scripting language requires a lot of debugging. Passing the wrong variable, making a typo, getting parameter order wrong or leaving one off: all these disappear in compiled languages, leaving only logic errors to focus on.

Compilers pay off even more when revising existing functions. Add a parameter? Your code won’t compile until you fix all the uses of that function—whereas in a scripting language, something will happen (usually a runtime error or misbehavior).
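A minimal sketch of that situation, using a hypothetical Mixer type invented for this example: suppose set_volume() originally took only a level, and a later revision added a channel parameter.

```cpp
#include <cassert>

// Hypothetical types, invented for illustration only.
enum class Channel { Left, Right };

struct Mixer {
    int left = 0;
    int right = 0;

    // Revised signature: callers must now say which channel they mean.
    void set_volume(Channel channel, int level) {
        if (channel == Channel::Left) left = level;
        else right = level;
    }
};

void demo_revision() {
    Mixer mixer;
    // mixer.set_volume(40);   // Old-style call: no longer compiles,
    //                         // so it cannot be overlooked.
    mixer.set_volume(Channel::Left, 40);
    mixer.set_volume(Channel::Right, 60);
    assert(mixer.left == 40 && mixer.right == 60);
}
```

Every stale call site shows up as a build error, so the list of compiler complaints doubles as a to-do list for the revision.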

I don’t mean to dismiss scripted languages entirely; I love shell scripts, and JavaScript is a fun little language despite its hazards. And Python’s local-by-default variables, named parameters, and parameter count checking make it considerably more palatable than, say, Perl.

But for all their complaints, I love compilers for all the troubles they give me. Because I am always amazed, after I address a dozen or two compiler objections, how often my code works flawlessly, and how quickly debugging goes other times. I attribute that to compiler pickiness.

Methodology

Revise, revise, revise

The secret to writing well is to revise, revise, revise. The goal of a first draft is to put nascent ideas to paper, with only some regard for how to best express the ideas. In subsequent revisions, we hone ideas and refine words to express ourselves more concisely, more precisely. We clarify ambiguities, address contorted grammar, and become succinct.

Software is no different; unfortunately, non-programmers rarely understand this. To outsiders, if it works, it’s done. But this is insane: when else does a first draft get published? Authors move through several drafts before publication, and the work is copy-edited prior to printing. Proof copies are printed and corrections made before mass production.

Well-written software undergoes the same rigorous editing. This doesn’t involve throwing away and starting again, but it does mean allowing programmers to:

  • Revise or rewrite incoherent sections of code, whether original or a result of edit fatigue.
  • Spend time cleaning up (copy-editing) code that’s sloppy
  • Refactor algorithms that were poorly designed. For example, I once encountered a page and a half of code equivalent to:
    bool is15interval(int i) {
        return ((i % 100) % 15) == 0;
    }

Edit fatigue

Imagine writing a book on the geopolitical situation in the Middle East. Each year, you update it to reflect the latest in the situation by adding to it, and avoiding revisions to previous content. Perhaps this approach would work for a few years, but over time the book would become confusing and erroneous. The original text might describe the kingdom of Elbonia and current rulership of that country, the 2012 update would describe the military overthrow and current rulership of that same country, and the 2014 update the revolution that overthrew the dictatorship and installed a democracy.

If you bought a current edition of this book, you would expect it to be accurate to its publication date. Yet, depending on how much or which sections you read, you end up with different beliefs of the current situation in Elbonia.

This is the problem with patchwork updates; I have dubbed this “edit fatigue” because it reminds me of metal fatigue. Individually, each patch may be workable, but over the long haul changes need to be made holistically to stave off confusion, contradictions and errors.

Like the book needs its outline reworked, pieces of code need to be redesigned before repeated editing to “add one more thing” leaves it brittle and buggy. Because the “one more thing” was an add-on, fixes to make it work correctly become their own “one more thing”, and soon the whole becomes incoherent, and refactoring or rewriting the section (taking into account all the additions and changes) is the necessary strategy.

The longer rewrites are delayed, the more burdensome the eventual task will seem. Unfortunately, as projected effort to rewrite grows, “one more thing” seems like a better option. But revisions often introduce new bugs, and bug fixes don’t “stay fixed” as each correction made introduces a new issue. However, that time is attributed to debugging, not engineering, and so the costs are perceived as a cost of business. Were debugging costs tallied according to code in need of rewrite, the cost of the holistic approach would not seem so bad. Unfortunately, as is often the case, hidden or aggregate costs are ignored.

Software by engineering vs. evolution

Engineering is “the action of working artfully to bring something about.” [OS X dictionary] Artfully is an important part of this; engineering isn’t simply building haphazardly until something works.

While natural evolution is potent enough to produce the diversity of life on Earth, it is not reliable: colorblindness, hemophilia, Down syndrome, sickle-cell, and numerous other genetic disorders come with evolution. Evolution can do amazing things, but it comes with side-effects.

Because the patchwork approach to software repair resembles evolution, it produces results reminiscent of evolution:

  • Problems that won’t “stay fixed”. A change or fix doesn’t work as completely as expected, and requires repeated patches. Sickle-cell provides resistance to malaria, but misshapes blood cells that then jam in capillaries. If this were software, we would “fix” the capillaries, which might create another new problem, which could then be fixed. All these downstream problems are because the original problem, malaria, was fixed with a hack, and those endowed with this hack have shorter lives.
  • Vestigial code. I was working on ASP e-commerce websites circa 2012, and one customer had a page that consistently took 25–30 seconds to load. I found code that used the time doing something mysterious, and after some research, determined it was related to an earlier database architecture. The code was no longer necessary, but nobody had removed it after the database change. I deleted it, and load times declined to less than a second. Much like our appendix, it no longer served a purpose, and only made trouble.

Unit Testing

I cannot say enough for automated unit testing and herd testing. pianod got automated tests late in its life, but they were there from the beginning with pianod2, though perhaps not as extensive as ideal. Nevertheless, as code undergoes refactoring, unit tests repeatedly catch subtle problems that are missed, allowing them to be corrected before release. Without automation, most of these would make it into the wild and cause frustration for users and a perception of a dodgy product.
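For readers who have not written one, a unit test can be very small. The sketch below tests the is15interval() function from the refactoring example earlier in this article, using plain assert(). The test function name is made up, and a real project would more likely use a test framework; this is not how pianod's tests are written.

```cpp
#include <cassert>

// The function under test, as given earlier in this article.
bool is15interval(int i) {
    return ((i % 100) % 15) == 0;
}

// A hypothetical test: each assert documents one expected behavior.
// Note that assert() is compiled out when NDEBUG is defined.
void test_is15interval() {
    assert(is15interval(0));       // 0 % 15 == 0
    assert(is15interval(15));
    assert(is15interval(1245));    // 1245 % 100 == 45, and 45 % 15 == 0
    assert(is15interval(1200));    // 1200 % 100 == 0
    assert(!is15interval(1207));   // 7 % 15 == 7
    assert(!is15interval(1250));   // 50 % 15 == 5
}
```

Run automatically after every build, a handful of checks like these will flag a refactoring that accidentally changes behavior.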

pianod2 added herd testing early on. A single shell script builds a release and distributes it to a heterogeneous herd of machines, where each one compiles and runs the product through the unit tests. Inadequacies of a compiler, platform-specific issues, execution sequence issues in expressions, and mistakes that don’t show up on the development platform but do on others are a few of the problems herd testing catches.

Herd testing also addresses configuration varieties. For example, one of the herd uses libav, two ffmpeg, and one AVFoundation.
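To make “execution sequence issues in expressions” concrete, here is a sketch of one well-known case in C++: the order in which function arguments are evaluated is unspecified, so code that depends on it can behave differently from one compiler to the next. The FieldReader type and function names are hypothetical, invented for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical reader that hands out one value per call.
struct FieldReader {
    int next_value = 1;
    int next() { return next_value++; }
};

std::string join(int first, int second) {
    return std::to_string(first) + "," + std::to_string(second);
}

// Fragile: C++ does not specify which argument is evaluated first,
// so this may yield "1,2" with one compiler and "2,1" with another.
std::string read_pair_fragile(FieldReader &reader) {
    return join(reader.next(), reader.next());
}

// Portable: separate statements force the order.
std::string read_pair_portable(FieldReader &reader) {
    int first = reader.next();
    int second = reader.next();
    return join(first, second);
}
```

A unit test that expects "1,2" from the fragile version could pass on the development machine and fail elsewhere in the herd, which is exactly the kind of mistake that building on several platforms exposes.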

The Perils of Professional Programming

or,

The Sucky Side of Software Jobs

Most companies I’ve worked for have violated the above rules. I think the best environment I worked in was Nortel, but unfortunately during the dot-com boom/bust, the corporate leadership busied themselves capturing as much new business as possible, even when that business didn’t make any sense.

To fix, or not to fix?

The most common problem is edit fatigue. “There isn’t time to address that properly now, so just make it work.” There are other pressing issues, or the risk of a larger fix is too high, or the budget is tight and there is only making do. If there isn’t time now, when will there be time? Unfortunately, a good time to revisit these issues never appears, partly for the same excuses, but also because the effort of a proper solution grows with each short-term fix.

The flipside is that being on a team restricts changing things. On a solo project, you can correct or redesign things as you go. But once others are involved, radical changes are limited by the potential to affect others. On my own I might rename a function call or change parameter order to make things clearer; with others, I risk their wrath if I change something they are familiar with, or frustrate them by breaking things they are working on.

The cumulative effect of lots of small patchwork fixes and lack of refactoring is incoherent code.

Prototyped vs. Working vs. Done

One of the problems with software is different definitions of what it means to be complete.

When building a house, walls are often framed on the ground or the floor. Once a wall is squared up, a 2-by-4 is nailed into place diagonally to keep it that way. Eventually the resulting wall is lifted up and nailed into place; there needs to be something else (perhaps another wall at a right angle to the first) to help it stand in place. But when that’s happened, the 2-by-4 is usually left on, yet nobody calls the house done.

One of the first goals when building a house is to get the exterior walls and roof on to protect the interior from the elements. It’s possible for windows to be put in and shingles and siding on before the interior gets much attention. If we saw this, we might think, “Wow, it’s nearly done.” But if you went inside and saw walls without drywall, plywood floors, and floor stand lighting powered by extension cords, you would know better.

A few weeks later perhaps the drywall is up on one side of the walls, the electrical boxes and runs are in place. When the electrical inspector performs the rough inspection, does he think the walls are done? After all, they prevent you seeing from one room to the next—the wall is working. But no, he understands the walls are not done; the studs are still visible from the one side, allowing access for the electrical work.

The house is done when the electrical is installed, the outlets and switches wired; the diagonal 2-by-4 taken off and drywall up, mudded, and painted (on both sides of the wall); the tile, carpet or flooring over the plywood decking; the doors and ceilings hung and trimmed. If you’re building it yourself, you might move in while still working on it, but you understand it’s not done.

All these different states parallel software. When building pianod2, for example, my first audio player was a tone generator that played sine waves: the software equivalent of the diagonal 2-by-4, in place so I could build and test command interfaces, audio outputs and run loops without having to deal with the complications of reading MP3s yet. It allowed me to put up the framing without having to hang the drywall right away.
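To give a sense of how little code such scaffolding needs, here is a rough sketch of a sine tone generator in C++. It is not pianod2’s actual player; the function name, the 16-bit mono sample format and the half-scale amplitude are all assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fill a buffer with 16-bit mono samples of a sine tone.
// Assumes sample_rate and seconds are positive.
std::vector<std::int16_t> make_tone(double frequency_hz, int sample_rate,
                                    double seconds) {
    const double two_pi = 2.0 * std::acos(-1.0);
    const auto count = static_cast<std::size_t>(sample_rate * seconds);
    std::vector<std::int16_t> samples(count);
    for (std::size_t n = 0; n < count; ++n) {
        const double phase =
            two_pi * frequency_hz * static_cast<double>(n) / sample_rate;
        // Scale to about half of full range to leave headroom.
        samples[n] =
            static_cast<std::int16_t>(std::lround(16383.0 * std::sin(phase)));
    }
    return samples;
}
```

For instance, make_tone(440.0, 44100, 1.0) yields one second of concert A. A stand-in like this has no files to open and no decoder to fail, so the rest of the plumbing can be exercised on its own.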

Also in pianod2, there was a point where it played music, but could not crossfade yet. But if you only looked at the surface, would it seem to be complete? It played music—must be done, right?

But more than that, “complete” has a lot of details that are below the surface. In the house example, there is a difference between livable and complete, and we understand the difference (though someone from the third world living in a hut might believe it’s done). But since outsiders see software as mysterious wizardry, perception of “complete” is tied to “it works.” To an experienced eye, the code is not done if it’s not commented well, or comments say things like, “TODO: Check the return code on this,” or “TODO: What if this call fails?” These are software for, “This is temporarily rigged up, and needs work.” It parallels a crappy light fixture that’s been rigged onto some electrical wires dangling from the ceiling, put there so the drywall guy could see what he’s doing until the permanent fixture arrives. It might turn on and off at the flip of a switch, but the house is not complete that way.

But the finishing details are critical. They’re what takes software from “working” (but possibly fragile, quirky and unreliable) to “complete” (stable, well-behaved, reliable).
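A small sketch of the difference, in C++, using two hypothetical functions that read the first line of a file. The first is “working”; the second resolves the TODO by checking each step that can fail. The names and the bool-plus-output-parameter style are choices made for this example only.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// "Working": fine as long as nothing goes wrong.
std::string first_line_unchecked(const std::string &path) {
    std::ifstream in(path);
    std::string line;
    std::getline(in, line);   // TODO: What if this call fails?
    return line;              // A missing file and an empty first line
                              // look identical to the caller.
}

// "Complete": every step that can fail is checked and reported.
// Returns true and fills `line` on success; returns false otherwise.
bool first_line_checked(const std::string &path, std::string &line) {
    std::ifstream in(path);
    if (!in.is_open()) return false;             // Missing or unreadable.
    if (!std::getline(in, line)) return false;   // Empty file or read error.
    return true;
}
```

Both versions behave the same when the file is present and well-formed, which is why the difference is invisible from the outside until something goes wrong.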

The thing about software is it’s often undergoing continual additions. Imagine building addition after addition onto a house. If each one gets completed properly, it’s not so bad. But imagine if it’s only completed on the exterior: inside, there are extension cords everywhere, lots of walls framed up with no drywall, or maybe just on one side near windows so it looks done if you peek in from outside. But additions are going on so fast, there’s no chance to finish details properly. How long could this go on before the extension cords would incur too much load or line loss, or start failing from being walked over too long? Meanwhile, the workers trip over cables and fight with tripped circuit breakers because nothing’s complete, and they’re building on a temporary infrastructure. It’s demoralizing, dangerous, produces a bad result and cannot be sustained.

This is what it’s like working on software projects where management doesn’t understand or respect the difference between “working” and “complete.”

Agile vs. Rigid Development

The previous section describes problems that occur with both agile and rigid development methodologies. Agile avoids deeper fixes because that’s not fast and flexible—it only has to work, right? Rigid doesn’t fix it, because that’s not on the schedule.

And as far as refactoring, that gets skipped with agile because it’s about small and fast changes, and there’s never a chance to look at the big picture; rigid, because the big picture was fixed some point in the past, and things shalt not be changed now.

I don’t mean to bash either methodology, and my points certainly aren’t true of all projects in either scheme. My point is that both can be (and are) used to justify unproductive behaviors; in the right hands, any ideology can be used to justify any viewpoint.

I think the optimal value is somewhere in the middle. Some planning is necessary to accommodate future enhancements, but too much planning and a project becomes mired. There needs to be some flexibility in case plans don’t work out, but that doesn’t equate to building haphazardly.

As humans we commit to ideologies we like (often completely rejecting others), and our blind faith not only leads us down bad alleys, but leaves us unwilling to pursue other avenues. We are better off when we understand there are limits and keep an open mind to different approaches.

Tool of the Month

Managers (often new ones to a position) try to improve performance by introducing new and better tools. Often these replace existing tools that do similar jobs; ostensibly the new tool will be more efficient once everybody knows how to use it. If there is sufficient gain to be made, perhaps this is justified; however this is rarely the case.

First off, consider why the current tool is not used efficiently. Usually it’s because people don’t know how to use it. Nobody (or perhaps only a key few) was trained when it was introduced, and knowledge is limited; everyone knows the minimum needed to do their jobs, and that’s all. The existing tool might be used more effectively if employees knew how to use it properly, but:

  • There’s no budget to send people to training
  • Learning to use the tools does not produce work-product, so workers don’t spend time educating themselves
  • Neither do team managers press their charges to build skills.

Now, reflecting on the replacement tool, what will be the options?

  • Everyone can be trained in the proper use of the new tool, but training junkets for all become expensive.
  • Workers could be expected to read the manual and familiarize themselves, but it’s still not part of their work-product.
  • Managers similarly have no motivation to give their underlings time to study up on things that are not a part of the team’s responsibility.

The new product has the same problem as the old: people will learn only the minimum necessary to do their jobs. Thus, the perceived gains that could be there, won’t be there in practice.

Additionally, change is expensive. Knowledge of the existing product may be limited, but it exists. For the first few months of the new system, employees will be muddling their way through things, less productive with the new software than with the old while they take time to figure it all out. If, after a while, the new product does provide a small gain in productivity, it must be sustained for enough duration to overcome the period of reduced productivity following the change.

Though I initially directed this concern toward managers, it’s not only them: programmers do the same thing, wanting to jump on the bandwagon of some great new library, tool, language, instant messenger or technology.

While there is the occasional game-changer that justifies switching tools, it is generally unproductive to keep up with the latest, greatest fad tool because of the learning curve inherent in switching, in addition to any data wrangling when transitioning between tools. (And if data is not consolidated when switching tools, then employees are obligated to search around for information, which is a long-term loss of productivity.)

Upgrading existing tools is helpful, because it doesn’t require relearning from ground zero, and there is less data-wrangling to be done when upgrading as opposed to changing software.

Management of the Month

This problem is the same as the tool of the month problem, except searching for the perfect management arrangement.

Lack of Longevity

Statistics on the industry show that only half of programmers last 6 years. Only ⅓ of us make it 15 years.

Why does that matter? Because at 6 years, we’re only just getting a handle on programming. Sure, you can grab a How-To book on any language that will walk you through some small projects, and in a few days or weeks you can be cookbooking your way up to larger things.

But what the neophytes write and what the experienced write aren’t the same. Neophytes don’t understand the implications of different data structures or algorithms for access or memory utilization. They don’t know the libraries extensively, thus reinventing or doing things the hard way. They make more errors, leading to more debugging, which takes longer for lack of experience. “Easy fixes” come with unconsidered side-effects. And the code they write is haphazardly styled and hard to read.

Second graders aren’t novelists. Neither are fourth or sixth graders. High-schoolers only turn into writers if they are motivated to practice and develop the craft, and even then only with time. Writing well is more than taking a gander at the Little Brown Handbook, then giving it a go.

Unfortunately, this isn’t the attitude with programming. Non-programmers ask, “Can you write code that goes? If yes, you’re hired.” Then, once hired, they just want code that “goes”; there’s no assessment of quality. Because they are not coders, “it goes” is the only assessment they can make.

When I read code—both my own and other people’s—there’s the objective and the subjective side. Objectively, does it do what it’s supposed to? Subjectively, is it coherently organized? Does it completely handle special cases or errors that might happen? Is the approach effective and straightforward, or is it a Rube Goldberg contraption?

As one who reads code, I can differentiate failings. A sysadmin friend has clever approaches, but his code is unreliable because he never checks completion statuses. An electrical engineer whose code I inherited was reliable, coherently organized, but suffered badly from a cut-and-paste approach; I used his design but shifted to an object-oriented approach, eliminating 80% of the code and reducing memory footprint of the resulting executable while increasing maintainability. Although his style differs, I respect promylop’s libpiano because the code is coherent, clean, reliable. I assess my own code, too: pianod2 is a better design than the original; the worst of the ugliness has been concentrated into the Pandora source/libpiano interface. The User and Users classes are pretty rudimentary; the recently-redesigned filter code shows my C++ skills are maturing.

The EnumeratedArray class stands out as an awesome solution to a problem that many have had. I developed it because, when I went looking for a way to index arrays with a scoped enumeration (enum class), it was clear to me that the solutions I saw were bad ones, a conclusion a cookbooking newbie would probably not have reached. Like a boss, a neophyte would judge a solution by whether it enabled their code to work, not by its elegance or type safety.
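To make the problem concrete, here is a minimal sketch of the general idea. It is not pianod’s actual EnumeratedArray; the names EnumArray, Channel and the trailing Count member are assumptions made up for illustration. The wrapper accepts only values of the enum class as indexes, so a stray integer is rejected at compile time.

```cpp
#include <array>
#include <cstddef>

// Hypothetical enumeration; the trailing Count member supplies the array size.
enum class Channel { Left, Right, Center, Count };

// Illustrative wrapper: an array that can only be indexed by Enum values.
template <typename Enum, typename Value,
          std::size_t Size = static_cast<std::size_t> (Enum::Count)>
class EnumArray {
    std::array<Value, Size> items {};   // value-initialized (zeros for int)
public:
    Value &operator[] (Enum index) {
        return items [static_cast<std::size_t> (index)];
    }
    const Value &operator[] (Enum index) const {
        return items [static_cast<std::size_t> (index)];
    }
    std::size_t size () const { return Size; }
};

// Example use: per-channel volume levels.
int total_volume () {
    EnumArray<Channel, int> volume;
    volume [Channel::Left] = 3;
    volume [Channel::Right] = 4;
    // volume [1] = 5;   // would not compile: an int is not a Channel
    return volume [Channel::Left] + volume [Channel::Right]
         + volume [Channel::Center];
}
```

The casts live in one place, inside the wrapper, instead of being scattered through every piece of code that touches the array.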

If we want well-written software, we need to stop burning our programmers out just when they’re getting good.

Coding

Know the Language & Libraries

In the old days of C, libraries were small. Yet even there, some people didn’t know the libraries and would reinvent the wheel. The expansive libraries available today haven’t helped this problem, although automatic documenting systems like Doxygen are improving the state of things.

I acknowledge the problem that when picking up a new language, wrangling the syntax can be plenty to deal with. Writing savvy, elegant statements is impossible when you only know pidgin. When that’s the case, grinding out unrefined code helps gain fluency. That can be balanced with Internet searches for ideas; there’s often someone who’s wanted to do the exact same thing, and cook-booking can help build style and introduce apropos pieces of library. StackOverflow.com is great for this.

After attaining comfort writing code that works, one is still not a language expert. With basic fluency and a memory scaffold to fit information on, (re)reading a good book on the language cover-to-cover fills in language and library details. While studying each topic, recall related, clumsily-written pieces of code, then go back and revise them. This is where one advances from “adequate” to “excellent” code, learning nuances and subtleties of different expressions, better able to express code concisely and precisely.

Use Better Languages

One of my problems is that, because I’m good at shell scripting, a lot of problems look like candidates for shell scripts. Perhaps pragmatism justifies this for some short-term needs, but often featuritis leads small things to grow into monsters; ergo, my Internet-enabled software-controlled sump-pump. Shell is not an ideal language for parsing METARs (automated weather data), nor is it decent for writing complicated heuristics about precipitation levels and when to turn on the gutter’s ice melter, run the sump pump, or launch/quit BOINC. Yet shell it is, because in September 2004 it seemed like an easy way to ask Wunderground if it was raining and respond in some way.

Classic pianod was written in C because pianobar, its ancestor, was written in C. As I am implementing pianod2 in C++, I feel déjà vu of switching from Pascal to Object Pascal in 1993; I can express more action in less time, with less code, more easily. The C++ STL provides all sorts of handy data structures and algorithms so I don’t have to write and debug them. It’s beautiful to declare a vector<something> and just stick things in it, without having to carefully allocate or reallocate blocks of memory, making sure my references are kept correct so I don’t leak, lose, or abuse memory.
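As a small illustration of that convenience, here is a sketch with made-up station names (the function sorted_stations is hypothetical, not part of pianod). The vector grows on its own, the library does the sorting, and the memory is released when the vector goes out of scope.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical example: build and sort a list of station names.
// No malloc, realloc or free appears anywhere.
std::vector<std::string> sorted_stations () {
    std::vector<std::string> stations;
    stations.push_back ("Jazz");
    stations.push_back ("Blues");
    stations.push_back ("Classical");
    std::sort (stations.begin (), stations.end ());
    return stations;
}
```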

Choose wisely when selecting a language at the start of a project. It is usually a decision you’re stuck with for a long time.

Consider that although learning a new language may be a hassle, if that language improves on the existing one, it will make your project’s code easier to maintain for years to come, and it will be a skill available for the next project too.

Language enhancements

It is interesting how the language mores change over time, and the choices we have in alternative implementations. Using the STL you could write:

    copy_if(service.begin(), service.end(), back_inserter(connections),
        [user] (PianodConnection *conn) { return conn->user == user; });

This isn’t bad, but it uses a lambda function, which requires C++11; previously, you’d have to use a functor:

struct when {
    User *user;
    when (User *u) { user = u; }
    bool operator() (PianodConnection *conn) {
        return conn->user == user;
    }
};

But that’s adding up, so unless you wanted to do this a lot, I think it’s clearer to write the ugly loop:

for (PianodService::iterator s = service.begin(); s != service.end(); s++) {
    if ((*s)->user == user) {
        connections.push_back (*s);
    }
}

But in C++11 my preference is to use the new for loop:

for (auto conn : service) {
    if (conn->user == user) {
        connections.push_back (conn);
    }
}

Any language worth using is evolving, both the libraries and the syntax of the language. Language enhancements allow code to be expressed more succinctly, with better error checking by compilers. They may offer new solutions or better or more adaptable ways of approaching problems. Language enhancements should be embraced, not shunned.

Formatting & Indenting

Indent code consistently and in a reasonable way that best improves legibility.

Although an 80-column line limit made sense when 80-column terminals were commonplace, the limit is arbitrary in today’s windowed environments. A day will come when IDEs are smart enough to adaptively wrap code sanely across lines; indent(1) and various automatic indenting adjustments are already offered. It seems reasonable to simply write for a comfortable width, and expect the technology will provide a complete solution soon enough. We are still bent on monospace editors, though there’s no good reason for this anymore: modern IDEs point right at the errors; no longer do we need to repeat :wqmake<enter><esc>?vi<enter><enter>54G37| (inserting the appropriate file and line numbers from the compile output) to find our way to the problem. But I’m digressing.

Comb vs. Toothbrush braces

Consider:

f()
{
    exit (1);
}

vs…

f() {
    exit (1);
}

The second is preferred, but this is a religious preference.

Consistent use of curly braces is also preferred, especially if using an editor that doesn’t have auto indenting. It’s too easy to have:

bool was_negative = false;
if (x < 0)
    x = -x;
    was_negative = true;
int y = pow (5, x);
return was_negative ? -y : y;

Modern editors will adjust indenting, exposing the need for braces, so I’m less adamant about this than previously. However, within a single if statement, consistency is absolutely preferable for readability. This is ugly:

enum { negative, zero, positive } x_state;
if (x > 0)
    x_state = positive;
else if (x < 0) {
    x_state = negative;
    x = -x;
} else
    x_state = zero;
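For comparison, here is the same logic with braces on every branch. The wrapping function classify and the type name State are assumptions added only so the fragment compiles on its own.

```cpp
enum State { negative, zero, positive };

// Hypothetical wrapper: reports the sign of x and makes x non-negative.
State classify (int &x) {
    State x_state;
    if (x > 0) {
        x_state = positive;
    } else if (x < 0) {
        x_state = negative;
        x = -x;
    } else {
        x_state = zero;
    }
    return x_state;
}
```

It costs a few extra lines, but every branch has the same shape, and adding a second statement to any of them cannot silently change the control flow.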

Parameter Checking

assert()

All function arguments that have expected values (non-null values, integers within a specific range, etc.) should be checked with assert() to ensure callers are obeying the function’s contract. Early detection saves time, and forces callers to behave, whereas sloppy contracts allow errors to slip by and cause problems in other pieces of code.

Those other pieces of code are often patched to fix problems caused by the contract violation, but as they are deeper, the checks may not be as straightforward. After a while, code has redundant, obfuscated error checks all over and still isn’t reliable.

Where applicable, return values from functions can also be checked.
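A minimal sketch of what this looks like in practice. The function scale_samples and its contract (a non-null buffer, a volume from 0 to 100) are hypothetical, invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function. Contract: samples must not be null, and volume
// must be a percentage between 0 and 100 inclusive.
int scale_samples (short *samples, std::size_t count, int volume) {
    assert (samples != nullptr);
    assert (volume >= 0 && volume <= 100);
    for (std::size_t i = 0; i < count; i++) {
        samples [i] = static_cast<short> (samples [i] * volume / 100);
    }
    return static_cast<int> (count);   // number of samples processed
}
```

A caller that passes a volume of 150 is stopped at the function’s front door during development, rather than producing distorted audio that someone has to trace back later.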

Additional checking

In addition to assert(), libraries should sanity check their API parameters to ensure application code is behaving when assertions are compiled out.

For calls within a library or program, parameter sanity checking is a judgment call based on the function’s contract with calling functions.

Consider, however, that CPU is cheap, debugging is hard, your time is valuable, and adding a check now may save hours later.
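For a library boundary, that might look like the following sketch. The Status type and library_set_volume are made-up names for an imaginary library. The check is ordinary code rather than an assert(), so it still runs when NDEBUG has compiled the assertions out.

```cpp
// Hypothetical status codes for an imaginary library.
enum class Status { Ok, InvalidArgument };

// Hypothetical API function: rejects bad parameters with an error code
// instead of trusting the application to behave.
Status library_set_volume (int *volume, int requested) {
    if (volume == nullptr || requested < 0 || requested > 100) {
        return Status::InvalidArgument;
    }
    *volume = requested;
    return Status::Ok;
}
```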

Coding Styles

Code should not resemble line noise

Why use an if when you can use a short-circuiting or? This bit of horrible C++ is from a unit test I wrote:

test_parser ("shorter_with_optional_and_longer-a", parser, "fish dinner", 1,
             0, NULL,
             1, NULL,
             CHECK_NEXT_FIELD, NULL,
             CHECK_END) || (success = false);

The goal should not be to fit as much action as possible onto each and every line of code. The goal is to write coherent, understandable code that works reliably. Clever abuses of syntax, complicated run-on statements, these are bad things.
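The plain alternative is not much longer. In this sketch, run_test is a hypothetical stand-in for a test helper that returns true on success; it is not the test_parser function shown above.

```cpp
// Hypothetical stand-in for a test helper; returns true when the test passes.
bool run_test (int expected, int actual) {
    return expected == actual;
}

// Records failures with ordinary if statements instead of
// "|| (success = false)".
bool run_all_tests () {
    bool success = true;
    if (!run_test (4, 2 + 2)) {
        success = false;
    }
    if (!run_test (9, 3 * 3)) {
        success = false;
    }
    return success;
}
```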

If something is getting too complex, you should:

  • Move logic to a function, to break up the complexity into manageable chunks, or

  • Comment the code to explain what it is trying to achieve, and if complicated, how it is doing it. You don’t need to comment every line, and don’t just write in a comment what the code says.

    Bad:

    for (auto const &item : mycollection) {
        total += item.second.rating; // Add rating to total
    }
    float rating = total / float (mycollection.size()); // divide
    

    Add, divide: yeah, I knew that. Why? Here’s a better option:

    // Get the average rating
    for (auto const &item : mycollection) {
        total += item.second.rating;
    }
    float rating = total / float (mycollection.size());
    

Guard ifs vs. nested ifs

Consider:

g() {
    if (a = alloc (object_a)) {
        ...
        if (b = open_file (file_b)) {
            ...
            if (c = alloc (object_c)) {
                ...
                if (d = alloc (object_d)) {
                    ...
                    return true;
                } else {
                    alert ("alloc object_d botched");
                }
                free (c);
            } else {
                alert ("alloc object_c botched");
            }
            close_file (b);
        } else {
            alert ("open file_b failed");
        }
        free (a);
    } else {
        alert ("alloc object_a botched");
    }
    return false;
} /* End of g() */

Versus:

g() {
    if (!(a = alloc (object_a))) {
        alert ("alloc object_a botched");
        return false;
    }
    ...
    if (!(b = open_file (file_b))) {
        alert ("open_file file_b failed");
        free (a);
        return false;
    }
    ...
    if (!(c = alloc (object_c))) {
        alert ("alloc object_c botched");
        close_file (b);
        free (a);
        return false;
    }
    ...
    if (!(d = alloc (object_d))) {
        alert ("alloc object_d botched");
        close_file (b);
        free (a);
        return false;
    }
    ...
    return true;
}

Suppose a change requires adding another allocation between allocating A and opening file B.

  • With guard ifs, the repeated releases of all the previously allocated resources get long and repetitive.
  • When changes are necessary, there is a possibility (likelihood?) of forgetting the resource release in some of the guard-style cases. Such a leak is hard to find.
  • Guard ifs can work well if there’s no leak-risk work being done, such as parameter checking.
  • Nested ifs are preferred for allocations, file opens, or any tasks that require some sort of release/undo if they fail.
  • If success requires releasing resources too, consider storing the result in a variable and returning it at the end of the function, to take advantage of the error-handling releases.

In C, it is up to us to release all resources, and nested ifs are pragmatic to ensure releases balance allocations. In a language like C++, where a variable that goes out of scope is automatically destructed (assuming it’s not a pointer or reference), guard ifs may be clearer. That all said, the best choice depends on the particular goal, and there is no hard-and-fast rule that applies to all situations in any language.

Incidentally, the above guard if code leaks C if D allocation fails. Did you notice that?
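In C++, a leak like that can be avoided by letting destructors do the releasing. The following is only a sketch under assumptions: count_bytes, FileCloser and FilePtr are hypothetical names, and the function is a stand-in for g() rather than a translation of it. Each guard if simply returns, and whatever was acquired before it is released automatically.

```cpp
#include <cstdio>
#include <memory>
#include <new>

// Closes a FILE* when the owning unique_ptr goes out of scope.
struct FileCloser {
    void operator() (std::FILE *f) const { std::fclose (f); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical function: counts the bytes in a file, or returns -1 on failure.
long count_bytes (const char *path) {
    std::unique_ptr<char[]> buffer (new (std::nothrow) char [4096]);
    if (!buffer) {
        return -1;                  // nothing acquired yet
    }
    FilePtr file (std::fopen (path, "rb"));
    if (!file) {
        return -1;                  // buffer is freed automatically
    }
    long total = 0;
    std::size_t got;
    while ((got = std::fread (buffer.get (), 1, 4096, file.get ())) > 0) {
        total += static_cast<long> (got);
    }
    if (std::ferror (file.get ())) {
        return -1;                  // file is closed, buffer is freed
    }
    return total;                   // the same cleanup happens on success
}
```

Adding another resource between the buffer and the file means adding one declaration and one guard; none of the existing returns need to change.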

Case & Type

These conventions are completely arbitrary, and I’m not 100% satisfied with them. I’m not sure a fully satisfactory scheme exists.

Perhaps more important than any particular style is adhering to the style of existing code when making revisions. Conformity to existing style generally overrides any of the following preferences.

  • Enumerations and constants are ALL_CAPITALS. C++ enum classes (whose members must be referenced by EnumClassName::MemberName) are CamelCaseWithLeadingCapitals.
  • Typedefs, classes, and structs used as public classes are CamelCaseWithLeadingCapital.
  • C structures (non-class structs) are lower_case; their corresponding typedef is CamelCase.
  • Member functions are camelCaseWithLeadingSmall.
  • Non-member functions are all_lowercase (C modules) or camelCaseWithLeadingSmall (C++ modules).
  • Variables are all_lowercase. This includes locals, globals, members and parameters.
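Put together, the conventions above look something like this. Every name in the sketch is invented purely to show the casing; none of them comes from pianod.

```cpp
// Constants and plain enumeration members: ALL_CAPITALS.
const int MAX_RATING = 5;
enum { PRIORITY_LOW, PRIORITY_HIGH };

// C++ enum class and its members: CamelCaseWithLeadingCapitals.
enum class PlaybackState { Paused, Playing };

// C structure: lower_case, with a CamelCase typedef.
struct song_info { int rating; };
typedef struct song_info SongInfo;

// Class: CamelCaseWithLeadingCapital. Member functions:
// camelCaseWithLeadingSmall. Member variables: all_lowercase.
class AudioPlayer {
    PlaybackState current_state = PlaybackState::Paused;
public:
    void startPlaying () { current_state = PlaybackState::Playing; }
    bool isPlaying () const {
        return current_state == PlaybackState::Playing;
    }
};

// Non-member function in a C++ module: camelCaseWithLeadingSmall.
// Parameters and locals: all_lowercase.
int clampRating (int raw_rating) {
    int clamped_rating = raw_rating > MAX_RATING ? MAX_RATING : raw_rating;
    return clamped_rating;
}
```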

See Also

The Joel Test: 12 Steps to Better Code by Joel Spolsky of Joel on Software.