Friday, September 7, 2012

Clean Code by Example: Unit Tests and Mindfulness

In this posting I going to do a bit of a code review. I am trying to apply the guidelines given in Robert C. Martin's «Clean Code» to spot problems in a little sample application of mine, and then try to fix them.

Only recently I applied for a job at a major financial service provider that has its headquarters in the heart of Berlin. It is a real pearl, and I am in high hopes still to get hired. (Update 09/18/2012: On the evening of the very same day I posted this article, I received a phone call that I would not get the job—bummer!) Part of their rather thorough procedure was a little programming challenge. The assignment was to compute the overall score for a bowling game. In an afterward interview, the chief architect of the department went through the code with me. He pointed out spots that could be improved, and in many of them I agree.

The developers of that financial service provider follow Robert C. Martin's book «Clean Code» as a coding guideline. I did not know it at the time, and so, I guess, it is high time I read it now. In this code review I will follow both the comments of the chief architect and the principles of «Uncle Bob»—which may overlap. Also, I will try to mend those parts which I do not like about my code—there are a few.

The program code that went with my job application you can find on Github. For those not familiar with the scoring rules of ten pin bowling, Wikipedia has a sum-up.

The chief architect focused on the following issues. Maybe there were more, but these I remember:
  1. Missing dependencies.
  2. Test-cases are too large.
  3. Test-cases contain redundant code.
  4. Something is fishy about the exceptions thrown. (Posted on 09/24/2012)
  5. There are too much comments.
  6. There are too few comments.
  7. Some methods or to long,
  8. Some methods are hard to understand.
I ordered them by increasing complexity and difficulty to fix. Some of them are related, like the second or fourth, or the last four items. I will try to recollect the point the chief architect made about them. I also will rationalize—reason I mean—why I programmed them that way in the fist place. In this blog posting I will only cover the first three items, covering the other in later articles.

Missing dependencies

This is not really a code issue, but it is an issue nonetheless. The chief architect complained that my solution depended on external libraries (JUnit here), but neither did I package them with my solution nor did I provide a way to easily satisfy the dependency, like a Maven project. The odd fact is, first thing I did, when I had my solution ready, was to put it in a Maven project. Before that, Eclipse did all the magic for me. But I was familiarizing myself with Maven at the time, so it was the natural thing to do.

I deliberately did not send the Maven project along with the source code, and that is the real problem. It is a problem specifically if you are in for an employment, mind you. Because in such a situation you might want to consider, what the employing party is expecting of you. Also you certainly do not want to annoy them in a task they are actually doing for you, like evaluating your working samples. I failed in both aspects, because I neither minded the small but avoidable trouble the missing libraries would cause. Nor did I reason, that a professional Java programmer would be familiar with the prevalent build tool of his field. My line of thinking went more in the direction of avoiding a zip-archive that contained considerably more directories than files, which was the first thing I disliked about Maven in the beginning (but got fairly quick used to).

There nothing to fix here—anymore. So let's go on to the real stuff.

Test-cases are too large and are full of redundant code

Have a look at BowlingGameTest.java. It contains a few short test-cases but also three test-cases that are between 40 and 70 lines long, the largest containing 27 assertions!

When I was asked, why they were so long, I started blabbering something about Python doc-tests and telling stories with a test-case, and non-sense like that. I was taken off my guard; then trying to make up some reasons on the spot because I could not remember or reconstruct my original line of thinking. A better reaction in an job interview would be to either admit that you see the point, or alternatively be honest and say that it seemed reasonable then, but you cannot recollect, why, now.

In truth, I was too lazy to split the test-cases into separated entities. This would have meant to come up with meaningful names. And that would have taken time. Instead I grouped related tests under one name.

The problem with such bloated test-cases is, that you are never sure how many tests are actually failing. Because with JUnit test-cases it is all or nothing. As soon as the first assertion of a test-case is violated the whole test-case will be aborted. You will never now, whether the following assertions would hold, or not. This is not a problem if the assertion are strongly related and test a single aspect of the program. In an aggregated test-case this is rarely the case. Look at this:

@Test
public void gameProgress() throws BowlingException {
  BowlingGame game = new BowlingGame();

  for (int i = 0; i < 9; i++) {
    assertEquals(i + 1, game.getCurrentFrame());
    assertEquals(1, game.getCurrentRoll());
    game.addRoll(2);
    assertTrue(!game.isComplete());

    assertEquals(2, game.getCurrentRoll());
    game.addRoll(3);
    assertTrue(!game.isComplete());
  }

  assertEquals(10, game.getCurrentFrame());
  assertEquals(1, game.getCurrentRoll());
  game.addRoll(4);
  assertTrue(!game.isComplete());

  assertEquals(2, game.getCurrentRoll());
  game.addRoll(5);
  assertTrue(game.isComplete());

  // 10th frame is spare
  game = new BowlingGame();
  for (int i = 0; i < 9; i++) {
    game.addRoll(2);
    game.addRoll(3);
  }
  assertEquals(10, game.getCurrentFrame());
  …
}

…and so on for forty odd lines. There is, of cause, the problem of poor intelligibility. What does the first for-loop up to the comment actually test for? But I will go into this matter later. Clearly, the loop after the comment tests a completely independent aspect, as it creates a whole new BowlingGame object. And thanks to the comment the reader might guess what it actually does. Without it, the reader has to figure out the meaning of the test for himself. But that's the job of good, succinct method name, right?

Let's see what Uncle Bob has to say about the matter. Chapter 9 of «Clean Code» is dedicated to unit testing. Two rules apply for certain: minimize the number asserts per test (he is not advertising strict one-assert-per-test discipline), and a single concept per test. Fewer assertions make «those tests come to a single conclusion that is quick and easy to understand» (page 130). The single concept rule states basically what I wrote in the last paragraph.

So let's split the huge methods up, starting with «invalidRolls()», which originally looked like this:

@Test
public void invalidRolls() {
  BowlingGame game;
  boolean caught;

  game = new BowlingGame();
  caught = false;
  try {
    game.addRoll(-1);
  } catch (BowlingException x) {
    caught = true;
  }
  if (!caught)
    fail("A bowling exception should have been thrown.");

  game = new BowlingGame();
  caught = false;
  try {
    game.addRoll(11);
  } catch (BowlingException x) {
    caught = true;
  }
  if (!caught)
    fail("A bowling exception should have been thrown.");

  game = new BowlingGame();
  caught = false;
  try {
    game.addRoll(2);
    game.addRoll(9);
  } catch (BowlingException x) {
    caught = true;
  }
  if (!caught)
    fail("A bowling exception should have been thrown.");

  game = new BowlingGame();
  caught = false;
  try {
    game.addRoll(2);
    game.addRoll(-2);
  } catch (BowlingException x) {
    caught = true;
  }
  if (!caught)
    fail("A bowling exception should have been thrown.");
}

This one shows with in-your-face directness that laziness and ugliness go hand in hand. Each individual test-case of the four aggregated ones expects an BowlingException to be thrown. JUnit test can express this intend explicitly by annotating the test-case with «@Test(expected=BowlingException.class)». The aggregate test-case cannot make use of this because the first thrown exception would successfully terminate the test-case leaving all the following code un-executed. Refactoring the four kinds of invalid rolls into separate methods will not only make their nature more obvious, it will make the test-case a great deal shorter and more to the point, as the whole exception-catching and flagging construct disappears.

I started to extract the methods using the equally named refactoring tool of the Eclipse IDE, but so much code had become unnecessary that I just did one test-case and copied it three times. The result is a stunning improvement:

@Test(expected=BowlingException.class)
public void invalidRollBelowZero() throws BowlingException {
  BowlingGame game = new BowlingGame();
  game.addRoll(-1);
}

@Test(expected=BowlingException.class)
public void invalidRollAboveTen()  throws BowlingException {
  BowlingGame game = new BowlingGame();
  game.addRoll(11);
}

@Test(expected=BowlingException.class)
public void invalidSecondRollBelowZero() throws BowlingException {
  BowlingGame game = new BowlingGame();
  game.addRoll(2);
  game.addRoll(-2);
}

@Test(expected=BowlingException.class)
public void invalidFrameAboveTen()  throws BowlingException {
  BowlingGame game = new BowlingGame();
  game.addRoll(2);
  game.addRoll(9);
}

Next up is «gameInProgress()», part of which you already saw above. Again the code is highly redundant. First, nine frames are rolled in a loop without spare or strike. The last frame is special, as a spare or a strike allows for extra rolls to compute the bonuses. A strike, a spare, or none of the former make it three test-cases in total. Splitting by «extract method» is straight forward, yielding three almost identical tests, that look like the following:

@Test
public void tenthFrameIsStrike() throws BowlingException {
  BowlingGame game;
  game = new BowlingGame();
  for (int i = 0; i < 9; i++) {
    game.addRoll(2);
    game.addRoll(3);
  }

  assertEquals(10, game.getCurrentFrame());
  assertEquals(1, game.getCurrentRoll());
  game.addRoll(10);
  assertTrue(!game.isComplete());

  assertEquals(10, game.getCurrentFrame());
  assertEquals(2, game.getCurrentRoll());
  game.addRoll(10);
  assertTrue(!game.isComplete());

  assertEquals(10, game.getCurrentFrame());
  assertEquals(3, game.getCurrentRoll());
  game.addRoll(10);
  assertTrue(game.isComplete());
}

One of them differs a bit as it also makes some assertions during the rolling of the first nine frames. The latter test-cases do not do this as the executed code would be exactly the same. Clearly, the former test-case violates the «one concept per test-case» rule, as it test first the correct filling of the first nine frames, and only then its advertised concept, the rolling of the final frame. Resolving this puts us in a tight spot between minimizing redundancy and having one concept per test-case, because these are the options I see:
  1. Make the rolling of the first nine frames a separate test-case. That would solve the one-concept-per-test-case-issue. Still, rolling nine frames is required before testing the rolling of the tenth frame. Which would indicate to…
  2. isolate the rolling of the first nine frames as an utility operation. This utility cannot be a test-case itself. This would violate the I of the F.I.R.S.T. virtues of unit-testing that «Clean Code» declares: Fast, Independent, Repeatable, Self-Validating, and Timely (pp. 132). Independent means, no test-case must depend on another. This again implies that we have a test-case and a utility operation, both having virtually the same code. Which throws us into redundancy hell. The third option would be to…
  3. add some assertions to the utility operation, call it from a test-case that covers the rolling of the first nine-frame, and also use it to prepare the testing of the tenth frame. The price to pay for this is the overhead introduced by implicitly testing the rolling of the first nine frames over and over again.
But what is life without ambivalence? Hence, I went for option three as the lesser of the evils, and ended up with code like this:

protected BowlingGame rollNineFrames() throws BowlingException {
  BowlingGame game = new BowlingGame();
  for (int i = 0; i < 9; i++) {
    assertEquals(i + 1, game.getCurrentFrame());
    assertEquals(1, game.getCurrentRoll());
    game.addRoll(2);
    assertTrue(!game.isComplete());

    assertEquals(2, game.getCurrentRoll());
    game.addRoll(3);
    assertTrue(!game.isComplete());
  }
  assertEquals(10, game.getCurrentFrame());
  assertEquals(1, game.getCurrentRoll());
  return game;
}

@Test
public void firstNineFramesNoStrikeNoSpare() throws BowlingException {
  rollNineFrames();
}

@Test
public void tenthFrameIsStrike() throws BowlingException {
  BowlingGame game = rollNineFrames();

  game.addRoll(10);
  assertTrue(!game.isComplete());

  assertEquals(10, game.getCurrentFrame());
  assertEquals(2, game.getCurrentRoll());
  game.addRoll(10);
  assertTrue(!game.isComplete());

  assertEquals(10, game.getCurrentFrame());
  assertEquals(3, game.getCurrentRoll());
  game.addRoll(10);
  assertTrue(game.isComplete());
}

Now, should the tests that cover the rolling of the final frame be split up further? Uncle Bob advises so, because they still cover more than one concept. The first is the way the final frame is rolled: without clearing all pin, with a strike, or with a spare. The latter two continue the game. The second concept is the correct handling of the bonus rolls.

But is the splitting mandatory? I would argue, it is! But only if following a strike, spare, or none of the former, there would be more than one test-relevant way to roll the bonus balls. As a matter of fact, I realized there is at least one: after a spare in the last frame, a following «strike» should not allow for another bonus roll.

Dutifully I extracted three more utility operations for rolling ten frames: with the last frame being either a spare, a strike, or none of them.

Again the utility methods make assertions. That bothers me for the following reason. In their current form, the utilities test whether the game is complete, which frame is rolled and which roll within a frame is due. Now suppose, we decided that we would like to check the game score as well. Computing the game score is the most expensive operation of the application, and now it would be called again and again, most of the time testing the very same aspect of the application. Admittedly, the scoring function is not really expensive in computation complexity terms, in fact its complexity is O(1), i.e. constant time. But suppose it was not cheap, what then? We would violate the F of the F.I.R.S.T. virtues: fastness. I see no generally viable way out of this dilemma, so I let it rest for now.

Only, two test-cases are left to go. First I split «strikesAndSpares()» in two, right at the «And». This test-case checks the valid flagging of frames as strike or spares. Then I realized there is a third concept tested: neither spare nor strike was rolled. That made it three test-cases, that look like this one:

@Test
public void frameIsSpare() throws BowlingException {
  BowlingGame game = new BowlingGame();

  game.addRoll(5);
  game.addRoll(5);
  assertTrue(game.isSpare(1));
  assertTrue(!game.isStrike(1));
}

The last test-case «score()» feeds a list of rolls into a game. Then it checks where the score is computed correctly:

@Test
public void score() throws BowlingException {
  BowlingGame game = new BowlingGame(new int[] { 10, 10, 10, 10, 10, 10,
      10, 10, 10, 10, 10, 10 });
  assertEquals(300, game.getScore());

  game = new BowlingGame(new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
      0, 0, 0, 0, 0, 0, 0, 0 });
  assertEquals(0, game.getScore());

  game = new BowlingGame(new int[] { 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2,
      3, 4, 5, 1, 2, 3, 4, 5 });
  assertEquals(60, game.getScore());
  …

Eleven combinations of rolls are tested altogether. Again, multiple, though admittedly related, concepts are merged into one test-case. Again, comments are required to signify the role of the rolls. Again, the first failing assert will veil the outcome of the following. Nurse, scissors!

@Test
public void scoreAllZeroLastFrameStrikeExtraRollsStrikes()
    throws BowlingException {
  BowlingGame game = new BowlingGame(new int[] { 0, 0, 0, 0, 0, 0, 0, 0,
      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10, 10 });
  assertEquals(30, game.getScore());
}

@Test
public void midGameScoreInterruptedStrikeStreak() throws BowlingException {
  BowlingGame game = new BowlingGame(new int[] { 10, 10, 10, 5 });
  assertEquals(75, game.getScore());

The resulting eleven test-cases are crisp and short, but not without flaw. The names are awful! They all start with either «score» or «midGameScore». Clearly this is grouping prefix. Classes lend themselves much better for grouping, and since the BowlingGameTest has «grown» from five to twenty eight methods, it seems sensible to break the class up.

I created two classes «CompleteGameScoringTests» and «IncompleteGameScoringTests» and moved the methods by drag-and-drop in the Eclipse package explorer. The method names are much more readable now:

public class IncompleteGameScoringTest {

  @Test
  public void interruptedStrikeStreak() throws BowlingException {
    BowlingGame game = new BowlingGame(new int[] { 10, 10, 10, 5 });
    assertEquals(75, game.getScore());
  }
  …

Done! It is quite possible than further splitting of the BowlingGameTest class would be wise, but I defer this until later, when I know more about «Clean Code.»

Conclusion

Applying the rules of «Clean Code» to the test-cases of even such a simple piece of code as mine, already greatly improved its readability—and yes, its beauty. But there is further room for improvement. Because Robert C. Martin states that the most important thing about automated testing is that the tests are first-class citizen, just like the application code is, and must be kept clean by the same high standards (page 124). That's why I will frequently revisit my tests during the course of cleaning up my Bowling example.

No comments:

Post a Comment