Tuesday, December 17, 2019

Dwindle major effort: Multi-bound tests part 2, introducing handles

In my previous post, I made a small inroad into discussing the multi-binding of tests. In that conversation, I glossed over a lot of details.

One of the most important details I skipped was how I abstracted everything about a sophisticated system so that variations in binding points were invisible to the tests. This discussion will probably span several entries.
One option would simply be to make the test context class manage all variation, as depicted in the previous article:

[Figure: Test context providing access to various test surfaces]
This has some downsides. For one thing, it would result in an overwhelmingly large interface. Furthermore, each concrete test context would be a gigantic nexus of coupling reaching out to the whole system.

Worse, though, the test contexts overlap in which test surfaces they tend to use.

[Figure: Not an ideal design]
I'm not a fan of enormous, duplicated classes, so I'm going to introduce another concept: A category of types I refer to as "handles".

In the design of the Dwindle test infrastructure, a handle is a class that represents a concern from the perspective of the tests themselves. Most importantly, it is an abstraction around that concern. Less importantly (but still of critical importance), it provides an opportunity to inject polymorphism into your test design.

Let's start with a simple concept: a count. Usually, a count is represented with some kind of number. Likewise, an amount of currency might also be modeled with some kind of number. Yet the things you do with these numbers can be very different.

Incrementing an amount of money, in production, is probably not a very meaningful operation. Incrementing a count may be very important (like # of unsuccessful network calls) or maybe it's a side-effect of modifying the collection of whatever is being counted (like # of players in a game). Conversely, you are unlikely to need the ability to compute the compound interest owed on a count but it's something you may need for money.

This might seem like another one of my digressions to you. After all, this is a well-understood point when it comes to production code. Those of you who already adhere to the practice of aggressively modeling things with abstractions in your production code might think I'm just saying to do what you already do.

In fact, there's a large contingent of developers who believe that these abstractions should exist and be used in tests.

That is not what I'm advocating.

I'm not saying we should make the abstractions we need for production code and then use them in our test code, as conventional wisdom may suggest.

What I've noticed, when making tests apply to more than one part of my code, is that tests require their own abstractions. After all, a test's needs may be very different from the needs of the production code it tests.

Let's consider, again, the concept of a count. Imagine that the count being modeled is a player count - the number of players in a lobby or the resulting game.

From the game's perspective, the count is a derivative concept. It's so simple a derivative that I actually don't (presently) need an abstraction for it. A better way of putting it is that I only need the characteristics already provided by the int abstraction.

Yet my tests very much need more than that. They lean heavily on what int already provides:
  • The ability to get the successor or predecessor counts, for testing boundaries.
  • The ability to compare two counts.
Moreover, I need something in my tests that I can't get out of an int derived from the Count property of a collection: the ability to set the count.

In my production system, the count is implied by modifying something else. In my tests, I need to set the count and have the content of that something else be the derivative property. For instance, I need the ability to write "Given there are 3 players" and have the result be that the game or lobby is populated with 3 players.

The number of players in a game is a different concept from the number of players in a test and each concept demands its own abstraction.
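
A minimal sketch of that difference might look like the following. It is written in Python only for brevity (the Dwindle tests themselves run on SpecFlow), and every name in it (`Lobby`, `PlayerCountHandle`, `apply_to`, `measured_from`) is hypothetical, not taken from the real code base. The production side derives its count from its contents; the test side sets the count and lets the contents follow.

```python
from dataclasses import dataclass, field
from functools import total_ordering


@dataclass
class Lobby:
    """Stand-in for production code: the count is derived from the players."""
    players: list = field(default_factory=list)

    @property
    def count(self) -> int:
        return len(self.players)


@total_ordering
class PlayerCountHandle:
    """Hypothetical test-side abstraction for 'a number of players'."""

    def __init__(self, value: int):
        if value < 0:
            raise ValueError("a player count cannot be negative")
        self.value = value

    # Neighbouring counts, for boundary tests.
    def successor(self) -> "PlayerCountHandle":
        return PlayerCountHandle(self.value + 1)

    def predecessor(self) -> "PlayerCountHandle":
        return PlayerCountHandle(self.value - 1)

    # Comparison between two counts; total_ordering fills in the rest.
    def __eq__(self, other):
        if not isinstance(other, PlayerCountHandle):
            return NotImplemented
        return self.value == other.value

    def __lt__(self, other):
        if not isinstance(other, PlayerCountHandle):
            return NotImplemented
        return self.value < other.value

    def __hash__(self):
        return hash(self.value)

    # "Given there are N players": the lobby's content becomes the derivative.
    def apply_to(self, lobby: Lobby) -> None:
        lobby.players = [f"player-{i + 1}" for i in range(self.value)]

    @staticmethod
    def measured_from(lobby: Lobby) -> "PlayerCountHandle":
        return PlayerCountHandle(lobby.count)
```

With something like this, a step such as "Given there are 3 players" could call `PlayerCountHandle(3).apply_to(lobby)`, and a later step could compare `PlayerCountHandle.measured_from(lobby)` against an expected count.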

The importance of these abstractions is amplified by the need to bind a test to multiple surfaces in your application.

Consider the scenario from the entry which precedes this one:

Scenario: Invalid moves are rejected by server
Given client A authenticated with C
And client B authenticated with C
And game X was started on client A
And invalid move M was spoofed by client A
When game X is continued on client B
Then client B does not show the effect of move M

We need the following abilities:
  • Define some credentials.
  • Authenticate a client with specific credentials.
  • Start a new game on a client.
  • Define an invalid move based on the state of a client.
  • Spoof a move from a client.
  • Continue a game of a client.
  • Measure whether or not a client shows the effects of a move.
From the tests' perspective, there are several different entities of note:
  • Credentials
  • Clients
  • Servers (implied)
  • Moves
  • Games
However, as one might imagine, these concepts all vary from one context to another. In fact, they vary independently of each other. For example...
  • When testing the client, the client being tested is a real Android or WebGL application and the server is modeled as a proxy over a RESTful endpoint.
  • When testing the API, both the Client and Server are modeled as proxies for RESTful endpoints.
  • When testing as a unit test, the Client is the set of objects modeling the app with a mocked UI and the Server is the set of objects that contain the behavior for the back end. There's no network involved at all.
In those three cases, the Client can be one of three different things and the server can be one of two different things. Even though the test context is the driver for change, the Client and Server do not change in lockstep with the context.

The solution is to let each handle represent its own polymorphic category, allowing the test contexts to primarily be managers of resources and selectors of handles to those resources:

[Figure: Test contexts become (mostly) abstract factories]
While there is duplication with this design - in that certain contexts select the same kind of handle - there is no redundancy. The duplicated code doesn't actually represent the same intention and won't need to be maintained identically. Having to keep copies in sync is the true origin of cost from redundant code.
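
To make the shape concrete, here is a rough sketch of contexts acting as abstract factories over two independent handle families. It is in Python for brevity rather than the C# that SpecFlow implies, and all class names are assumptions made up for illustration. Note that two contexts pick the same kind of server handle while every context picks a different client handle.

```python
from abc import ABC, abstractmethod


class ClientHandle(ABC):
    """What the tests need from 'a client', however it is realized."""

    @abstractmethod
    def kind(self) -> str: ...


class RealAppClient(ClientHandle):
    def kind(self) -> str:
        return "real Android or WebGL app"


class RestProxyClient(ClientHandle):
    def kind(self) -> str:
        return "proxy for a RESTful endpoint"


class InProcessClient(ClientHandle):
    def kind(self) -> str:
        return "app objects with a mocked UI"


class ServerHandle(ABC):
    """What the tests need from 'a server'."""

    @abstractmethod
    def kind(self) -> str: ...


class RestProxyServer(ServerHandle):
    def kind(self) -> str:
        return "proxy for a RESTful endpoint"


class InProcessServer(ServerHandle):
    def kind(self) -> str:
        return "back-end behavior objects, no network"


class AbstractTestContext(ABC):
    """A context mostly just selects which handle of each family to hand out."""

    @abstractmethod
    def client(self) -> ClientHandle: ...

    @abstractmethod
    def server(self) -> ServerHandle: ...


class ClientTestContext(AbstractTestContext):
    def client(self) -> ClientHandle:
        return RealAppClient()

    def server(self) -> ServerHandle:
        return RestProxyServer()


class ApiTestContext(AbstractTestContext):
    def client(self) -> ClientHandle:
        return RestProxyClient()

    def server(self) -> ServerHandle:
        return RestProxyServer()  # same choice as above, different intention


class UnitTestContext(AbstractTestContext):
    def client(self) -> ClientHandle:
        return InProcessClient()

    def server(self) -> ServerHandle:
        return InProcessServer()
```

The two `RestProxyServer()` lines are the duplication in question: they read the same today, but each context is free to change its choice without the other following.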

The result is that test bindings are mostly coupled to handles. The only code that is coupled to the test contexts is the code that provisions a handle. The rest of the code just uses the handles it is given by SpecFlow.*

The relation between bindings and any given family of handles looks about like the following:

[Figure: A family of handles in relationship to everything else; it doesn't matter which kind of handle]
This basically recognizes the Bridge pattern in the problem. I have two categories of variation: How to interpret a step (Bindings) and how to access a part of the system (Handles). The first variation varies, largely, in how it makes use of the second variation. The second variation varies, largely, in how it realizes the intent expressed by each flavor of the first variation. That's a Bridge, baby.
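
A small sketch of that bridge, again in Python for brevity and with entirely hypothetical names (`GameHandle`, `InProcessGame`, `FakeRemoteGame`, and the step functions are illustrative, not SpecFlow's API or Dwindle's code). The binding functions only express intent against the abstract handle; each handle decides how that intent is realized.

```python
from abc import ABC, abstractmethod


class GameHandle(ABC):
    """Handle side of the bridge: how a game is reached."""

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def is_started(self) -> bool: ...


class InProcessGame(GameHandle):
    """Realizes the intent by flipping state on an in-memory object."""

    def __init__(self):
        self._started = False

    def start(self) -> None:
        self._started = True

    def is_started(self) -> bool:
        return self._started


class FakeRemoteGame(GameHandle):
    """Realizes the intent by recording the request a remote proxy would send."""

    def __init__(self):
        self.sent_requests = []

    def start(self) -> None:
        self.sent_requests.append("start-game")

    def is_started(self) -> bool:
        return "start-game" in self.sent_requests


# Binding side of the bridge: how a step is interpreted.
def given_game_was_started(game: GameHandle) -> None:
    game.start()


def then_game_is_running(game: GameHandle) -> bool:
    return game.is_started()


def run_scenario(game: GameHandle) -> bool:
    """The same steps, unchanged, against whichever handle was provisioned."""
    given_game_was_started(game)
    return then_game_is_running(game)
```

Calling `run_scenario(InProcessGame())` and `run_scenario(FakeRemoteGame())` exercises the same bindings against two different realizations, which is the independence the Bridge is meant to buy.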

For all these reasons, I tend to have a family of classes in my test infrastructure called "Handles". Their job is to provide access to the testable entities in a product with an interface and polymorphism that is convenient for the tests.



It should be noted that, while I'm writing about my own work, here, the first person I heard talking about multi-binding of tests was Amir Kolsky. It's hard to trace the origin of some ideas to their source but, in this case, for me, the source was him.

* More on how to use StepArgumentTransformation and an injected object registry to make SpecFlow pass handles to your bindings, later. Unless this footnote was enough for you. ;)