During our recent run of talks at JavaOne and XP Days Germany, Benji and I talked about how we use a custom JUnit rule to quarantine non-deterministic failures. Our main use case is isolating the WebDriver tests we run against our ad-units, since automated browser tests can be affected by network traffic blips and are therefore unreliable.
Since we’re keen to give back to the open-source community, we’ve recently released to GitHub a collection of custom rules that align with our development methodology, in the hope that others may find them useful too.
I’ve set out a few examples below to demonstrate how we use some of them.
Isolating non-deterministic tests with QuarantineRule
The quarantine rule stops sporadically failing tests from blocking a deployment: a test that fails every time still fails the build, but one that fails and then passes counts as a pass.
In this situation, we also send ourselves diagnostic information via a QuarantineRuleLogger so we can try to fix the underlying issues. At Unruly we have a custom logger that spams us with emails, but the rule works out of the box by logging to stdout.
The QuarantineRule takes an optional QuarantineRuleLogger as a constructor argument to customise where the diagnostic output is sent.
QuarantineRuleLogger is a single-method interface, so the constructor accepts both lambda expressions and method references.
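To make the retry semantics concrete, here is a minimal sketch of the quarantine behaviour with the JUnit plumbing stripped away. The names (QuarantineSketch, runWithQuarantine, maxAttempts) are illustrative, not the library's actual API; the real rule wraps logic like this in a JUnit TestRule.

```java
import java.util.function.Consumer;

// Illustrative sketch of quarantine-style retry semantics; not the
// library's actual implementation.
public class QuarantineSketch {

    // Runs the test body up to maxAttempts times. A test that passes on
    // any attempt counts as a pass; diagnostics for each failed attempt
    // go to the supplied logger (e.g. System.out::println).
    public static boolean runWithQuarantine(Runnable testBody,
                                            int maxAttempts,
                                            Consumer<String> logger) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                testBody.run();
                return true;      // failing and then passing counts as a pass
            } catch (AssertionError | RuntimeException e) {
                logger.accept("Attempt " + attempt + " failed: " + e.getMessage());
            }
        }
        return false;             // failed every time: still fails the build
    }

    public static void main(String[] args) {
        // A flaky test body: fails on the first call, passes on the second.
        int[] calls = {0};
        boolean passed = runWithQuarantine(() -> {
            if (calls[0]++ == 0) throw new AssertionError("network blip");
        }, 3, System.out::println);
        System.out.println("quarantined test passed: " + passed);
    }
}
```

Because the logger parameter is a functional interface, both a lambda and a method reference such as System.out::println slot straight in, mirroring how the real QuarantineRuleLogger argument works.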
Ignoring tests until a specific date
We’re big fans of getting something into the production codebase as quickly as possible, so using the IgnoreUntil rule we can write our end-to-end acceptance tests as the first step but ignore them until a few days later.
We find this better than JUnit's stock @Ignore annotation, since with @Ignore it's too easy to mark a test as ignored and then forget about it.
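The core of such a rule is just a date comparison: skip the test while today is before the ignore-until date, run it afterwards. A minimal sketch, stripped of the JUnit plumbing; the class and method names (and the dates) are illustrative, not the library's actual API.

```java
import java.time.LocalDate;

// Illustrative sketch of the date check behind an IgnoreUntil-style rule;
// the real rule reads the date from an annotation and skips the test
// (rather than failing it) while today is before that date.
public class IgnoreUntilSketch {

    // The test runs only once today is on or after the ignore-until date.
    public static boolean shouldRun(LocalDate today, LocalDate ignoreUntil) {
        return !today.isBefore(ignoreUntil);
    }

    public static void main(String[] args) {
        // Arbitrary example dates.
        LocalDate ignoreUntil = LocalDate.of(2015, 1, 20);
        System.out.println(shouldRun(LocalDate.of(2015, 1, 15), ignoreUntil)); // still ignored
        System.out.println(shouldRun(LocalDate.of(2015, 1, 20), ignoreUntil)); // now runs
    }
}
```

Unlike @Ignore, the skip expires on its own: once the date passes, the test starts running (and failing, if it's still broken), so it can't be forgotten indefinitely.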
Diagnosing tests that pass unreliably
Even though we have the QuarantineRule, we don't just put it on every test that fails intermittently, as such tests might be failing for a concrete reason: perhaps we're accidentally relying on test ordering, or a test database isn't set up properly.
Before we quarantine a test, we annotate it with a ReliabilityRule. This is the simplest rule in our set: it just runs every test a fixed number of times and logs the state of each run.
If we have suspicions about non-deterministic test failures then this will flag them up; but if a test fails deterministically (for example, only on the first run) then we know it isn't a good candidate for quarantine.
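The distinction shows up clearly in the per-run results. A minimal sketch of the repeated-run idea, with illustrative names rather than the library's actual API: a truly flaky test produces a mixed pass/fail pattern across runs, while a test that only fails on the first run produces the same deterministic pattern every time.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a ReliabilityRule-style check: run the same test
// body a fixed number of times and record the outcome of each run.
public class ReliabilitySketch {

    // Returns one boolean per run: true = passed, false = failed.
    public static List<Boolean> runRepeatedly(Runnable testBody, int runs) {
        List<Boolean> results = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            try {
                testBody.run();
                results.add(true);
            } catch (AssertionError | RuntimeException e) {
                results.add(false);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // A test that relies on shared state: it fails only on the first
        // run, so the pattern is deterministic and this test is a poor
        // candidate for quarantine.
        int[] calls = {0};
        System.out.println(runRepeatedly(() -> {
            if (calls[0]++ == 0) throw new AssertionError("dirty test database");
        }, 5));
    }
}
```

A log of [false, true, true, true, true] on every invocation points to a setup or ordering bug to fix, whereas a pattern that varies between invocations is the genuine non-determinism the QuarantineRule is for.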
We hope to extract more of our useful rules in the near future, so watch this space! :)