Thursday, March 05, 2009

What Makes A Good Program?

This is a continuation of the review of "The Psychology of Computer Programming". You can find the post on Chapter 1 here.

Chapter 2 - What Makes A Good Program?

If we plan to study programming as a human activity, we are going to have to develop some measures of programming performance. That is, we are going to have some idea of what we mean when we say that one programmer is better than another, or one program is better than another. Although we all have opinions on these questions, we shall find that the answers are not as simple as we might wish. For programming is not just human behavior; it is complex human behavior.


What do we mean by complex human behavior? This article at betterexplained.com does a good job highlighting the false dichotomy between simplicity and power. Another explanation my good friend Scott sent me used simple, complicated, and complex. I'll let Dave Nicolette speak for himself.

The spectrum goes from easy to understand and predict to extremely hard to understand and predict (but possible with enough effort), to impossible to understand and predict regardless of the level of effort.

In our studies of the psychology of programming, we shall be hampered by our inability to measure the goodness of programs on an absolute scale.
Plenty of effort has been wasted trying to discover that mythical absolute scale: lines of code, lines per day, runtime, memory usage; the list goes on and on. The only thing agreed upon is that there is no single metric which can be applied to all programs.

Looking honestly at the situation, we are never looking for the best program, seldom looking for a good one, but always looking for one that meets the requirements.
This might be the hidden reason why so much lip service is paid to software quality. Crappy code that is ready right now always beats higher-quality code, however defined, that is late. Unless there is a business reason which drives a requirement connected to quality (like uptime, throughput, etc.) where time-to-market is trumped by proper outcomes, managers will revert to turning a blind eye to quality. Risk deferred is risk ignored.


Specifications
Of all the requirements that we might place on a program, first and foremost is that it be correct. In other words, it should give the correct outputs for each possible input. This is what we mean when we say that a program "works," and it is often and truly said that "any program that works is better than any program that doesn't."
Here follows a good story about a system built for a car maker to handle all the options that prospective car purchasers could order. When the programmer arrived on the scene of a disaster in the making, the long-settled approach was not only overly complicated but also didn't produce correct results. When the programmer exclaimed that the Emperor had no clothes, he was condemned for being uncooperative. On the way home, he couldn't stop thinking about the problem. In a flash of insight, he realized that a workable approach was within reach. Upon returning to the customer site, he described his solution. Naturally the audience gave him a cool reception, but they listened without questions - until the original creator of the old system raised his hand and asked, "And how long does _your_ program take?" The response was, "That varies with the input but on average, about ten seconds per card" (an indication of how old this story is). "Aha," was the triumphant reply. "But _my_ program takes only one second per card." This seemed to settle the issue with the audience until our protagonist observed,

"But your program doesn't work. If the program doesn't have to work, I can write one that takes one millisecond per card - and that's faster than our card reader."
Here is where the sarcastic could say, "We just redefine what 'done' means," or "That isn't a bug, it's an undocumented feature."

In effect, then, there is a difference between a program written for one user and a piece of "software." When there are multiple users, there are multiple specifications. When there are multiple specifications, there are multiple definitions of when the program is working. In our discussions of programming practices, we are going to have to take into account the difference between programs developed for one user and programs developed for many. They will be evaluated differently, and they should be produced by different methods.
Imagine a scene from "I Love Lucy" where Ricky Ricardo and Fred Mertz are trying to rearrange furniture at the behest of Lucy and Ethel. "Move that painting to the other wall," Lucy commands. When she then turns her attention to having the men move the couch "a teensy bit" closer to the wall, Fred moves his side an inch while Ricky moves his side a foot; after all, how many "teensies" are in a foot? Ethel, meanwhile, sneaks over to the painting while everyone else is busy with the couch and moves it back to its original location. "It looked better there anyway," she thinks to herself.

However comical this might be, when this happens in software it can be many, many times worse. With multiple stakeholders, multiple upstream and downstream dependencies, and a large group of developers trying to write code, it should be easy to realize how requirements (and their management) can become a nightmare.

Schedule
One of the recurring problems in programming is meeting schedules, and a program that is late is often worthless.
Most people prefer to wait a fixed ten minutes for the bus each morning than to wait one minute on four days and twenty-six minutes once a week. Even though the average wait is six minutes in the second case, the derangement caused by one long and unexpected delay more than compensates for this. This is where a psychological study would be rewarding. Project management is not a science and is only partially controlled by mathematics. Knowing the psychological landscape in which the project functions can make a big difference in its success.
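
The bus arithmetic is easy to check. Here is a minimal sketch in Python, assuming a five-day week and using the wait times from the example; the variable and function names are mine, not the book's.

```python
import statistics

# Wait times in minutes over an assumed five-day week (figures from the bus example).
fixed_waits = [10, 10, 10, 10, 10]
erratic_waits = [1, 1, 1, 1, 26]


def summarize(waits):
    """Return (mean, population standard deviation) of the wait times."""
    return statistics.fmean(waits), statistics.pstdev(waits)


fixed_mean, fixed_spread = summarize(fixed_waits)
erratic_mean, erratic_spread = summarize(erratic_waits)

print(f"fixed:   mean={fixed_mean:.1f} min, spread={fixed_spread:.1f} min")
print(f"erratic: mean={erratic_mean:.1f} min, spread={erratic_spread:.1f} min")
```

The erratic schedule wins on the average (six minutes against ten), but its spread is ten minutes where the fixed schedule has none. If the point about variability holds, it is the spread, not the mean, that people are reacting to.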

Adaptability


Few programmers of any experience would contradict the assertion that most programs are modified in their lifetime. Why, then, when we are forced to modify programs do we find it such a Herculean task that we often decide to throw them away and start over? Reading programs gives us some insight, for we rarely find a program that contains any evidence of having been written with an eye to subsequent modification. But this is only a symptom, not a disease. Why, we should ask, do programmers, who know full well that programs will inevitably be modified, write programs with no thought to such modifications? Why, indeed, do their programs sometimes look as if they had been devilishly contrived to resist modification - protected like the Pharaohs' tomb against all intruders?
This really goes to the heart of the quality software movement. If it's shown time and again that the majority of software cost is in maintenance, why do we fail so miserably to factor in maintainability while writing software?

Adaptability is not free. Sometimes, to be sure, we get a program that happens to be adaptable as well as satisfactory in all other ways, but we generally pay for what we get - and often fail to get what we pay for.
It's as if the manager figures that maintenance doesn't come out of her budget and she's only being measured by whether the project comes in on time, even if the resulting code will cost twice as much to maintain.


Fisher's Fundamental Theorem states - in terms appropriate to the present context - that the better adapted a system is to a particular environment, the less adaptable it is to new environments.
This has large effects on requirements. If one requirement is to have a database that runs on a specific system, a hidden cost is now attached where if that vendor goes out of business, the code can be next to impossible to move to another platform. You satisfied the requirement but paid a huge price. "An ounce of prevention is worth a pound of cure."

However, the same managers who scream for efficiency are the ones who tear their hair when told the cost of modifications. Conversely, the managers who ask for generalized and easily modified programs are wont to complain when they find out how slow and spacious their programs turn out to be. We must be adult about such matters: neither psychology nor magic is going to help us to achieve contradictory goals simultaneously. Asking for efficiency and adaptability in the same program is like asking for a beautiful and modest wife. Although beauty and modesty have been known to occur in the same woman, we'll probably have to settle for one or the other. At least that's better than neither.
Be careful what you wish for; you just might get it...

Efficiency

Moreover, with the cost per unit of computation decreasing every year and cost per unit of programming increasing, we have long since passed the point where the typical installation spends more money on programming than it does on production work. [like the cost of machine time, ed.] This imbalance is even more striking when all the work improperly classified as "production" is put under the proper heading of "debugging."
Even 30 years ago, development was the largest cost in the field of programming. If you don't have a manager who understands the challenges and complexities of software development, you are facing a two-front war: 1) the challenge of writing software that works in the time required, using the resources allotted, and 2) a struggle to keep a clueless, however well-meaning, boss from making things worse through ignorance, let alone incompetence.

Summary

Questions to ponder:
1. Does the program meet specifications? Or, rather, how well does it meet specifications?
2. Is it produced on schedule, and what is the variability in the schedule that we can expect from particular approaches?
3. Will it be possible to change the program when conditions change? How much will it cost to make the change?
4. How efficient is the program, and what do we mean by efficiency? Are we trading efficiency in one area for inefficiency in another?


I can hear the pointy-haired boss now: "Why are you asking so many (implied -stupid-) questions? I don't understand why you're making it so hard, just go write it already!"

This is a good place to point out that repeating the same behavior and expecting a different outcome is one of the definitions of insanity.

Next is Chapter 3 - How Can We Study Programming?

Tuesday, February 17, 2009

Review - The Psychology of Computer Programming

I got a "new" book in the mail a few weeks ago and finally had time to crack it open: "The Psychology of Computer Programming" by Gerald M. Weinberg, published in 1971. There is a Silver Anniversary edition that recently came out, but I wanted a cheap copy in hardback, so mine came from an online used book site. Why read a computer programming book that is almost as old as I am? Because it addresses the core issue of producing computer programs: the people who read, write, edit, maintain, or otherwise cuss at code (other people's, of course). People are the primary factor in all programs; before hardware, language, specifications, anything. If you don't understand how humans go about creating these abstract entities called programs, you will constantly be mystified by missed deadlines, frustrated by missing functionality, and stymied by endless scope creep. Enough of my pontificating, let's talk about the book.

Table of Contents


  1. Programming as Human Performance
    1. Reading Programs
    2. What Makes a Good Program?
    3. How Can We Study Programming?

  2. Programming as a Social Activity
    1. The Programming Group
    2. The Programming Team
    3. The Programming Project

  3. Programming as an Individual Activity
    1. Variations in the Programming Task
    2. Personality Factors
    3. Intelligence, or Problem-Solving Ability
    4. Motivation, Training, and Experience

  4. Programming Tools
    1. Programming Languages
    2. Some Principles for Programming Language Design
    3. Other Programming Tools

  5. Epilogue

"Computer programming is a human activity." A pretty bold thesis from the intro to Part I. Is there a mystique to programming? "Either you can program or you cannot. Some have it; some don't." Both quotes give you a good idea as to what is in this book, tackled expertly by Mr. Weinberg.

Chapter 1 - Reading Programs

Some years ago, when COBOL was the great white programming hope, one heard much talk of the possibility of executives being able to read programs. With the perspective of time, we can see that this claim was merely intended to attract the funds of executives who hoped to free themselves from bondage to their programmers. Nobody can seriously have believed that executives could read programs. Why should they? Even programmers do not read programs.
I hear a similar story from when assemblers were introduced - as in, "With the development of assemblers, we won't need programmers anymore!" I believe similar statements have been propagated by hordes of 4th generation language and CASE salesmen.

"Programming is, among other things, a kind of writing."
This is not a very mainstream view in the programming world even though we work with constructs which are literally "languages".

"We shall need a method of approach for reading programs, for, unlike novels, the best way to read them is not always from beginning to end. They are not even like mysteries, where we can turn to the penultimate page for the best part -- or like sexy books, which we can let fall open to the most creased pages in order to find the warmest passages. No, the good parts of a program are not found in any necessary places -- although we will later see how we can discover crucial sections for such tasks as debugging and optimization. Instead, we might base our reading on a conceptual framework consisting of the origin of each part. In other words, as we look at each peice of code, we ask ourselves the questions, 'Why is this piece here?'"
The author begins by examining a section of PL/I code, showing how certain machine, language, and human limitations influence a program. Machine limitations include too little memory to hold the entire problem set at once, which forces the use of two loops instead of one. Language limitations include the lack of an end-of-file indicator, which forces the operators to include a special character (or card, in the punch-card-centric world of the 1970s) and the programmer to account for it in code. Programmer limitations include not really understanding the array notation of the language, in this instance PL/I. As a program is modified over time, machines change, languages are updated, and programmers come and go.

"And so, some years later, a novice programmer who is given the job of modifying this program will congratulate himself for knowing more about PL/I than the person who originally wrote this program. Since that person is probably his supervisor, an unhealthy attitude may develop -- which, incidentally, is another psychological reality of programming life which we shall have to face eventually."
Some programs have inscrutable logic like the use of special characters which are ordinarily invalid; "magic numbers" used for interim states for some long-forgotten problem.

"Once the patch was made, it worked so well that everyone forgot about it -- more psychology -- and there it sat until unearthed many years later by two archeologist programmers."
Two different versions of the same PL/I program are compared: the first includes many of the limitations we've already discussed, while the second is much improved by their removal. Of the comparison,

"When we look at the difference between Figures 1-1 and 1-4, we might begin to believe that very little of the coding that is done in the world has much to do with the problems we are trying to solve"
Would we be any better off if we could use the latest code (Figure 1-4) as our spec?

"Specifications evolve together with programs and programmers. Writing a program is a process of learning -- both for the programmer and the person who commissions the program. Moreover, this learning takes place in the context of a particular machine, a particular programming language, a particular programmer or programming team in a particular working environment, and a particular set of historical events that determine not just the form of the code but also what the code does!

In a way, the most important reason for studying the process by which programs are written by people is not to make the programs more efficient, more compact, cheaper, or more easily understood. Instead, the most important gain is the prospect of getting from our programs what we really want -- rather than just whatever we can manage to produce in our fumbling, bumbling way."

Hopefully the reason to study a 30+ year-old book is apparent; those who fail to learn from history are bound to repeat it.

Let's study historical mistakes so we can make our own new mistakes instead of repeating mistakes of programmers long-retired.

The rest of the book is just as rich in lessons we can still take to heart. By approaching the book chapter by chapter, I'm hoping to increase my chances of getting all the way through without writing a 50-page article all at once which would never get finished.

Here is the link for the next chapter: Chapter 2 - What Makes a Good Program

Tuesday, February 03, 2009

Does Anyone Know What Testing Is?

(With apologies to Chicago for the title) I'm a big fan of good design, which should be no surprise to those who know me. It can irritate my wife when I analyze a building and then start describing all the different ways in which you can tell it was poorly designed. I do rave when I see great design, but that is much rarer in these "I don't care if it's a half-baked piece of crap, we need it ready now!" times. I apologize for any management flashbacks that caused.

In any event, I'm also a big fan of testing. I see testing scenarios around me during the day. This story just happens to be one that is easy to tell. The men's room where I work has three sinks, each with a soap dispenser. I assumed one sink was more heavily used than the others, because there was never any soap in its dispenser. You'd push on the plunger a couple of times and when nothing came out you'd shrug and move to another sink. I always figured that it was running out quickly, since I would see the building's cleaning crew in the bathroom a couple of times a week. Surely they were refilling the dispensers, right? After a few weeks of the "soapless sink shuffle" I mentioned the problem to the cleaning guy. I got his attention and showed him how no soap came out. He looked confused and said something to the effect of "it should work because it has plenty of soap". As I looked under the counter where he was examining the half-full bottle of soap, it was clear that lack of soap wasn't the problem. I left him to his work, figuring that maybe the tube from the plunger was disconnected; plus, he didn't need me looking over his shoulder.

Upon reflection, I realized that his 'test' for being out of soap was looking under the counter to see if any of the containers were empty. Our 'test' for being out of soap was trying to get soap from the dispenser. So even though his test passed, it still was failing the end user because he wasn't verifying the expected result; the assumption being that the only reason you wouldn't get soap was if the container was empty.

The bottom line is to always be aware of what your tests are testing. Just because your tests all pass does not guarantee everything is working if you aren't testing the way the system/device is used in real life. So, when you look at your test plan, ask yourself: am I checking that we are out of soap, or am I checking whether I can get soap out of the dispenser? It might seem like a small difference, but it can be the difference between a cleanly running system and one that leaves you all wet.
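
For the programmers in the audience, the two 'tests' can be sketched in a few lines of Python. Everything here is hypothetical: the SoapDispenser class, its fields, and the disconnected tube (which was only my guess at the real fault) are invented to illustrate the point.

```python
from dataclasses import dataclass


@dataclass
class SoapDispenser:
    """Hypothetical model of the sink dispenser from the story."""
    soap_ml: int          # soap left in the bottle under the counter
    tube_connected: bool  # whether the plunger tube reaches the bottle (assumed fault)

    def pump(self) -> int:
        """Return the millilitres of soap that reach the user's hand."""
        if self.tube_connected and self.soap_ml > 0:
            self.soap_ml -= 1
            return 1
        return 0


def bottle_has_soap(dispenser: SoapDispenser) -> bool:
    """The cleaner's check: look under the counter at the bottle."""
    return dispenser.soap_ml > 0


def user_gets_soap(dispenser: SoapDispenser) -> bool:
    """The user's check: press the plunger and see what comes out."""
    return dispenser.pump() > 0


broken = SoapDispenser(soap_ml=250, tube_connected=False)
print("bottle_has_soap:", bottle_has_soap(broken))
print("user_gets_soap: ", user_gets_soap(broken))
```

The first check inspects internal state and passes; the second exercises the dispenser the way a user does and fails. Only the second one tells you whether anyone walks away with clean hands.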