Deciphering test feedback

Marcin Gryszko
Published in Clarity AI Tech · 6 min read · Feb 12, 2022


Photo by Jon Tyson on Unsplash

You may find many statements on the Internet: “tests give you valuable design feedback”, “listen to the test feedback”, and so on. But you seldom find a “test feedback manual” telling you what test feedback actually is and how to react to it. This post gives you a concrete example (taken from a real-life codebase) of:

  • the beacons the test is trying to send
  • reacting to the feedback by changing the design

Make it work

❤️ Quickly reach a working solution covered with tests.

I’m currently working in the domain of ESG ratings. I had to add a feature to an existing service — delete old score versions after recalculation. The service was not covered with tests, so I wrote a minimum test to verify the feature. Having done that (“make it work”), I started to reflect on the design (“make it right”). I found this case especially interesting since: 1) it allowed me to smell some suspicious odors emanating from my implementation, 2) it’s a non-trivial example but with acceptable complexity.

The test of CalculateScoresService is listed below. In the production code, I added calls to the persister and versioner deleteInactive methods. For the sake of readability, I removed some dependencies required to execute the service. You’ll find the test repetitive and quite boring, but that’s fine. It should make you feel uneasy so that it hopefully triggers a redesign. I don’t include the source of the test’s counterpart (namely CalculateScoresService) because I want you to look at the production code only through the prism of the test.
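In outline, the test has the following shape. Treat the snippet as an illustrative sketch: apart from ScorePersister, ScoreVersioner and deleteInactive, the collaborator names, the calculate and activateNewVersion methods, and the service constructor are placeholders of mine rather than the exact production names.

```java
import static org.mockito.Mockito.inOrder;
import static org.mockito.Mockito.mock;

import org.junit.jupiter.api.Test;
import org.mockito.InOrder;

class CalculateScoresServiceTest {

    // Collaborators of the service; only the deletion-related ones matter here.
    ScorePersister persister = mock(ScorePersister.class);
    ScoreVersioner versioner = mock(ScoreVersioner.class);
    Scorer scorer = mock(Scorer.class);

    CalculateScoresService service = new CalculateScoresService(scorer, persister, versioner);

    @Test
    void keepsTwoLastInactiveVersionsForDefaultProfile() {
        service.calculate("default-profile");

        InOrder inOrder = inOrder(persister, versioner, scorer);
        // cleanup before the recalculation
        inOrder.verify(persister).deleteInactive("default-profile", 2);
        inOrder.verify(versioner).deleteInactive("default-profile", 2);
        // recalculation and activation of the new version
        inOrder.verify(scorer).calculate("default-profile");
        inOrder.verify(versioner).activateNewVersion("default-profile");
        // cleanup after the recalculation
        inOrder.verify(persister).deleteInactive("default-profile", 2);
        inOrder.verify(versioner).deleteInactive("default-profile", 2);
    }

    @Test
    void keepsNoVersionHistoryForNonDefaultProfile() {
        service.calculate("custom-profile");

        InOrder inOrder = inOrder(persister, versioner, scorer);
        // cleanup before the recalculation
        inOrder.verify(persister).deleteInactive("custom-profile", 0);
        inOrder.verify(versioner).deleteInactive("custom-profile", 0);
        // recalculation and activation of the new version
        inOrder.verify(scorer).calculate("custom-profile");
        inOrder.verify(versioner).activateNewVersion("custom-profile");
        // cleanup after the recalculation
        inOrder.verify(persister).deleteInactive("custom-profile", 0);
        inOrder.verify(versioner).deleteInactive("custom-profile", 0);
    }
}
```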

Make it right

💚 Change the design. Or at least be conscious about the issues so they will influence your design decisions in future iterations.

Code smells

Code smells are indicators of some deeper problems in the code. Which ones can I spot by looking only at the test?

Duplication in test setup and verification

Differences between test cases are minimal:

  • the first test case uses a default profile, for which I keep the 2 last inactive versions
  • the second test case uses a non-default profile, for which I don’t want to maintain any version history

(Figure caption: test cases have minor differences, marked in red; sequential coupling marked in green.)

To check rather simple conditional logic (keep 2 versions for a default profile, delete old versions for a non-default one) I have to repeat almost identical setup and verification steps.

Sequential cohesion

Sequential cohesion exists when a routine contains operations that must be performed in a specific order, that share data from step to step, and that don’t make up a complete function when done together.

Code Complete. 2nd ed., p. 168

There are two cases of sequential cohesion in my code:

  1. Score deletion must happen before and after the recalculation and the activation of the new version. The Mockito InOrder verification is evidence of that. If I changed the order of the calls to the collaborators, the service wouldn’t delete old versions correctly.
  2. The persister and versioner deleteInactive methods must be called together, and in a fixed order: first I have to delete scores, then versions.

Repeated method signatures

ScorePersister and ScoreVersioner have the same API: they are called with the same arguments and they perform a side effect (i.e. both have the same method signature):

void deleteInactive(String profileId, int inactiveScoresToKeep)

Placement of responsibilities

Are persistence and version switching related to deletion? Should ScorePersister and ScoreVersioner be responsible for score deletion? Or should it be the responsibility of an independent component?

How can I remove those smells?

I have duplication in my test, so my instinctive action is to change the test to DRY my code.

Introduce test helpers

I can extract repeated setup and verification code to test helper methods:
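A sketch of how that might look (to stay close to the spirit of the original, the helper names below are deliberately bland placeholders of mine, not the real codebase’s names):

```java
// Illustrative sketch: the repeated setup and verification pulled into test helpers.
@Test
void keepsTwoLastInactiveVersionsForDefaultProfile() {
    setup("default-profile");

    service.calculate("default-profile");

    verifyCollaborators("default-profile", 2);
}

@Test
void keepsNoVersionHistoryForNonDefaultProfile() {
    setup("custom-profile");

    service.calculate("custom-profile");

    verifyCollaborators("custom-profile", 0);
}

private void setup(String profileId) {
    // stubbing of the omitted dependencies would live here
}

private void verifyCollaborators(String profileId, int versionsToKeep) {
    InOrder inOrder = inOrder(persister, versioner, scorer);
    inOrder.verify(persister).deleteInactive(profileId, versionsToKeep);
    inOrder.verify(versioner).deleteInactive(profileId, versionsToKeep);
    inOrder.verify(scorer).calculate(profileId);
    inOrder.verify(versioner).activateNewVersion(profileId);
    inOrder.verify(persister).deleteInactive(profileId, versionsToKeep);
    inOrder.verify(versioner).deleteInactive(profileId, versionsToKeep);
}
```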

This solution makes the test worse than it was before. It goes against the high-level testing goal of “test as documentation”. To understand how the tested system reacts to the inputs, and how it collaborates with its peers, I have to jump between the test case and the setup/verification methods. The helper method names are quite meaningless, and they don’t increase my understanding of the tested service.

Broadly speaking, I’m increasing accidental test complexity which I define as:

Any unnecessary burden introduced by us that makes it hard to understand how the production code works only by looking at the test and grasping it quickly, without heavy inside-of-your-head parsing and mental model generation.

Parameterized test

Remove test duplication by converting the two test cases into one parameterized test case (in my case, with JUnit 5 @ParameterizedTest).
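For illustration, the parameterized variant could take a shape like the following (a sketch assuming a @CsvSource; the method and parameter names are placeholders of mine):

```java
// Requires org.junit.jupiter.params.ParameterizedTest and
// org.junit.jupiter.params.provider.CsvSource.
@ParameterizedTest
@CsvSource({
        "default-profile, 2", // default profile: keep the 2 last inactive versions
        "custom-profile, 0"   // non-default profile: keep no version history
})
void deletesInactiveScoresAndVersionsAroundCalculation(String profileId, int versionsToKeep) {
    service.calculate(profileId);

    InOrder inOrder = inOrder(persister, versioner, scorer);
    inOrder.verify(persister).deleteInactive(profileId, versionsToKeep);
    inOrder.verify(versioner).deleteInactive(profileId, versionsToKeep);
    inOrder.verify(scorer).calculate(profileId);
    inOrder.verify(versioner).activateNewVersion(profileId);
    inOrder.verify(persister).deleteInactive(profileId, versionsToKeep);
    inOrder.verify(versioner).deleteInactive(profileId, versionsToKeep);
}
```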

This solution looks elegant on the surface and may make you feel good about removing the duplication. But it does you a disservice — it removes the effect of the underlying issues (test duplication) but not the cause (code smells).

Both test refactorings squash the duplicated test cases, but they don’t fix the root cause sitting in the production code. They are like superficial house repairs when the building structure is rotten. In the next two sections, I’ll present two ideas for strengthening the code foundations.

Extract class

Score and version deletion seems cohesive enough to justify a separate collaborator. Moreover, I needed to write two test cases to verify the logic involved in the deletion.

Let’s try to merge the persister and versioner deletion, plus the logic for deciding how many versions to wipe out, into a new ScoreCleaner. First, I’m going to introduce a test double to design the contract with the mock. If I like the contract, I can proceed with the test-first implementation of the collaborator. If I’m not convinced yet, I can revert the code. The ScoreCleaner mock helps me reach a design decision cheaply (it could be even cheaper to perform this experiment with java.util.function.Consumer, or with a function type in a language that supports it, like (String) → Unit in Kotlin).
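Sketched out, the experiment could look like this (ScoreCleaner’s API below is my first guess, driven by the mock, not a finished design):

```java
// ScoreCleaner absorbs the persister/versioner deletion and the rule deciding
// how many inactive versions to keep for a given profile.
interface ScoreCleaner {
    void deleteInactive(String profileId);
}

class CalculateScoresServiceTest {

    ScoreCleaner cleaner = mock(ScoreCleaner.class);
    Scorer scorer = mock(Scorer.class);

    CalculateScoresService service = new CalculateScoresService(scorer, cleaner);

    @Test
    void cleansUpInactiveScoresAroundCalculation() {
        service.calculate("some-profile");

        InOrder inOrder = inOrder(cleaner, scorer);
        // cleanup before and after the recalculation, regardless of the profile type
        inOrder.verify(cleaner).deleteInactive("some-profile");
        inOrder.verify(scorer).calculate("some-profile");
        inOrder.verify(cleaner).deleteInactive("some-profile");
    }
}
```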

With this redesign, I reduced the sequential coupling of the deletion steps inside CalculateScoresService (in other words, I isolated it in a separate class).

Introduce template method

I’d like to ensure that the deletion happens before and after the score calculation. Although I expressed this requirement with the test, I could enforce it in the production code by splitting the workflow into two parts:

  • outer shell performing the cleanup
  • inner core orchestrating the actual calculation

There is already a design pattern for this — template method:

Define the skeleton of an algorithm in an operation, deferring some steps to subclasses. Template Method lets subclasses redefine certain steps of an algorithm without changing the algorithm’s structure.

Design Patterns: Elements of Reusable Object-Oriented Software, p. 325

I’d apply it in a modern form — instead of inheritance, I’ll pass the algorithm step as a lambda function that will be executed by the skeleton.
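A sketch of that idea (my own, building on the ScoreCleaner extracted above): the outer shell owns the cleanup, and the inner core is passed in as a lambda executed by the skeleton.

```java
class CalculateScoresService {

    private final Scorer scorer;
    private final ScoreCleaner cleaner;

    CalculateScoresService(Scorer scorer, ScoreCleaner cleaner) {
        this.scorer = scorer;
        this.cleaner = cleaner;
    }

    void calculate(String profileId) {
        withCleanup(profileId, () -> scorer.calculate(profileId));
    }

    // The "skeleton": cleanup is guaranteed to happen before and after whatever
    // step is passed in, so the ordering no longer relies only on test verification.
    private void withCleanup(String profileId, Runnable calculation) {
        cleaner.deleteInactive(profileId);
        calculation.run();
        cleaner.deleteInactive(profileId);
    }
}
```

With this structure, the ordering that the InOrder verification was guarding becomes a property of the production code itself.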

Recap

I started from working code that you saw through the test mirror. The test highlighted a couple of issues with the production code: duplication and sequential cohesion. It helped me to reflect on the placement of responsibilities — which component should take care of the deletion.

Then, I showed how to remove those smells: on the surface, by merely extracting test helpers or parameterising the test, and at their roots, by shifting responsibilities and applying design patterns. The final design is quite neat: one short test case with less mock verification.

There is a final question: do I have to act upon the test feedback immediately? No, you don’t. You may know that new requirements are coming, so it’s more reasonable to postpone design decisions. Your solution may be a probe to check if it works at scale in a non-local environment. After getting some measurements, you’ll either accept the solution and iterate on it, or you’ll abandon it (and hopefully remove the code). If you decide not to act, at least you know that there is an amber light flashing from the test, reminding you to take action.
