Exercise:

  • Write down all the types of testing you do.
  • Now write down all the types of testing you think you should be doing.

Chances are you have more in the second list than in the first, and you probably feel bad about it.

I didn’t say what I meant by “type”, but who cares? I’d hazard a guess you put about 5 to 10 things in the first list and nearer 20 in the second.

At Starling Bank, we do continuous delivery of our platform, trusting automated tests to verify that our software is safe to deploy. Virtually all these automated tests are run in one of two contexts: “unit tests” and “integration tests”.

Any time anything goes wrong (a bug!), an engineer devises a way to test for it automatically and it goes into one of those two contexts. So we lazily call it a unit test or an integration test.
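To make that concrete, here is a minimal sketch (hypothetical class and figures, not Starling code) of the kind of small, focussed regression test a bug report typically becomes in the “unit” context:

```java
// A minimal sketch, not Starling code: the class, amounts and bug are hypothetical.
// The point is the shape of a "unit" context regression test: in-process, no I/O,
// pinning down exactly the behaviour that went wrong.
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.math.BigDecimal;
import java.math.RoundingMode;
import org.junit.jupiter.api.Test;

class PenceRoundingTest {

  // Hypothetical helper standing in for the real domain logic under test.
  static BigDecimal toPence(BigDecimal pounds) {
    return pounds.movePointRight(2).setScale(0, RoundingMode.HALF_UP);
  }

  @Test
  void halfPennyRoundsUpRatherThanTruncating() {
    // The imagined bug report: £10.005 was being truncated to 1000 pence.
    assertEquals(new BigDecimal("1001"), toPence(new BigDecimal("10.005")));
  }
}
```

A defect that only shows up when two services talk to each other would instead land as an “integration” test, exercising deployed interfaces rather than a single class.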

Of course there are many more “types” of test we’re doing than this. Doing lots of different types of testing is important. But keeping the number of “ways” we run tests small is also important. Maintaining automated test suites is often as hard as, if not harder than, maintaining the software under test, so it’s worthwhile keeping the approach simple - especially when you’re trying to achieve much (i.e. build a bank) with comparatively modest resources (i.e. head count).

  • We have two “places” to look to ensure we’re verifying the properties we care about.
  • We have only two execution frameworks to understand and two sets of utility code to maintain.

Probably “intra-service” and “trans-service” would better describe the set-up, except that our more traditional naming embodies a useful reminder: keep tests as small, focussed and orthogonal as possible (“unit”) and use more expensive tests only when required (“integration”).

Always look to get the best power-to-weight ratio out of your investment in testing. Test everything at the cheapest level possible, and only there. This applies not just in the back-end but also in more specialised contexts like Android and iOS projects, where you can test code both in and out of emulators.
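As an illustration (hypothetical code, not taken from either of our mobile apps): keeping presentation logic free of android.* dependencies means it can be covered by a plain JVM unit test, the cheapest level available, instead of an emulator-backed instrumented test.

```java
// A sketch of the "cheapest level" idea in a mobile project. Because the
// formatting logic below has no Android dependencies, it runs as a fast local
// JVM test; no emulator or device is involved.
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.Locale;
import org.junit.jupiter.api.Test;

class BalanceFormatterTest {

  // Hypothetical pure function: no Context, no Resources, hence no emulator.
  static String formatBalance(long pence) {
    return String.format(Locale.UK, "£%,.2f", pence / 100.0);
  }

  @Test
  void formatsBalancesWithGroupingAndTwoDecimalPlaces() {
    assertEquals("£1,234.56", formatBalance(123_456));
  }
}
```

Only behaviour that genuinely depends on the platform (views, navigation, lifecycle) then needs the more expensive in-emulator tests.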

Now that’s not all. Around the fringes of pure automation are a variety of areas where automation behaves more like power-assistance than automated regression, for example in high-maintenance test types like Appium UI tests or Gatling load tests. Sometimes these are used on an ad hoc basis to answer specific business needs (can we verify we’re still supporting obsolescent UI usages while we evolve?) or technical concerns (how many transactions a second can we hit and still meet SLOs?).
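For the load-testing case, the sort of thing we mean looks roughly like the following Gatling sketch (Java DSL; the endpoint, payload and thresholds are illustrative assumptions, not our real figures). Run ad hoc, it answers the transactions-per-second question; wired into CI, it becomes another “way” of testing, with all the overhead that implies.

```java
// Illustrative Gatling simulation: drive a hypothetical payments endpoint at a
// steady rate and assert SLO-style thresholds on the results.
import static io.gatling.javaapi.core.CoreDsl.*;
import static io.gatling.javaapi.http.HttpDsl.*;

import io.gatling.javaapi.core.ScenarioBuilder;
import io.gatling.javaapi.core.Simulation;
import io.gatling.javaapi.http.HttpProtocolBuilder;
import java.time.Duration;

public class PaymentLoadSimulation extends Simulation {

  HttpProtocolBuilder httpProtocol =
      http.baseUrl("https://payments.example.internal"); // hypothetical service

  ScenarioBuilder submitPayments = scenario("Submit payments")
      .exec(http("create payment")
          .post("/v1/payments") // hypothetical endpoint and payload
          .body(StringBody("{\"amount\": 100, \"currency\": \"GBP\"}"))
          .asJson()
          .check(status().is(201)));

  {
    setUp(submitPayments.injectOpen(constantUsersPerSec(50).during(Duration.ofMinutes(2))))
        .protocols(httpProtocol)
        .assertions(
            global().responseTime().percentile(95.0).lt(500),   // e.g. p95 under 500 ms
            global().successfulRequests().percent().gt(99.0));  // e.g. fewer than 1% failures
  }
}
```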

It’s clearly valuable to have these become fully automated (non-interactive) test suites as well, but the value must not be eclipsed by the incremental complication and overhead of increasing the number of ways of testing. And it’s an organisational question how well you can manage that - perhaps instituting a new type of automated test would actually require changes to the organisation - exotic animals need specialised zoo-keepers to tend to them.

More often, the most effective way to distil the ongoing value from endeavours like this is to cannibalise them: take the parts of particular value and incorporate them into your existing test regimes and technologies.