Automated testing topics: Code coverage monitoring for automated tests
06.02.2012 13:24 by Alexandra Schladebeck
Recently, I wrote a first blog entry in a new series of “automated testing topics” on using BIRT reports in GUIdancer. In this entry I’m going to continue, as promised, with the sometimes rather tricky aspect of code coverage.
Identifying coverage and risk
I’m sure that we’re not the only team who, once having established a good base of automated regression tests, wanted to know something about their effectiveness. How much do they cover? Where are our risks? Do new tests actually test new areas of the software? How does our safety net look when more code / more tests are added?
It seemed like a fair assumption that the simplest way of getting this kind of information automatically would be to add code coverage monitoring to our tests. As it seemed like something that could interest other Jubula / GUIdancer users, we also added it to the tool, and about a year ago, it became a standard feature of GUIdancer.
Using the JaCoCo integration in GUIdancer
The code coverage feature in GUIdancer uses JaCoCo (a Java Code Coverage library available under the Eclipse Public License). Code coverage monitoring can be added to an Application Under Test (AUT) in the AUT Configuration dialog for Swing and RCP AUTs.
Once code coverage has been activated, a few extra settings should be defined to declare:
- the installation directory for the AUT,
- the place where the source code for the AUT is (so that the code coverage report can be viewed line-for-line),
- a package pattern so that only your own code (and not every single library you use) is monitored.
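To illustrate what a package pattern achieves, here is a minimal sketch in Java. It is not GUIdancer's or JaCoCo's implementation; it only assumes a wildcard syntax in which `*` matches any run of characters and `?` matches a single character, which is the style JaCoCo's include filters use. The class names and the helper methods `toRegex` and `monitored` are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PackagePatternDemo {

    // Translates a wildcard pattern ('*' = any characters, '?' = one character)
    // into a regular expression; every other character is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Keeps only the class names that match the wildcard pattern.
    static List<String> monitored(String wildcard, List<String> classNames) {
        Pattern pattern = toRegex(wildcard);
        return classNames.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical class names: two from our own code, one from a library.
        List<String> loaded = List.of(
                "com.example.shop.ui.CartView",
                "com.example.shop.model.Order",
                "org.apache.commons.lang.StringUtils");
        // Only the classes under com.example.shop remain in the result.
        System.out.println(monitored("com.example.shop.*", loaded));
    }
}
```

With a pattern like this, library classes never enter the report, so the percentages describe only the code you are actually responsible for.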
Once that’s done, you can run your test and get a nice shiny code coverage value and report that can be viewed in the Reporting Perspective.
The philosophical part…
So now you’ve got your report. It’s got some green lines, some red lines and some numbers. And if you’re in a similar situation to our team, you quickly have someone asking about these numbers and wanting to know what they mean. (I’m going to try not to joke about managers in *every* blog entry, but I think we all know who will be asking these kinds of questions.)
One of the first things I read about code coverage was this blog entry. Having now had some experience in various projects, it has become my favorite way of explaining to customers that the numbers gathered by code coverage are not the aim of the exercise. The absolute values delivered by a monitoring report can’t be used as a guide to quality, nor should they be formulated as an aim for the team.
It’s not about the absolute numbers
What code coverage numbers or percentages tell us is only which lines, methods, classes or decisions have been executed. They tell us nothing about whether these were executed correctly, or whether the test that caused them to be executed was a good one. It can be far too easy to attach too much importance to a simple number removed from its context, and this can drive the wrong kind of behavior in our processes.
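The gap between "executed" and "executed correctly" can be sketched in a few lines of Java. Everything here is invented for illustration: `discountedPrice` is a hypothetical method with a deliberate bug, and the two "tests" are plain methods rather than tests written with any particular framework.

```java
class CoverageVersusCorrectness {

    // Intended to return the price after the discount has been applied.
    // Deliberate bug for illustration: it returns the discount amount instead.
    static int discountedPrice(int priceInCents, int percent) {
        int discount = priceInCents * percent / 100;
        return discount; // should be: priceInCents - discount
    }

    // Executes every line of discountedPrice but checks nothing, so a coverage
    // tool would report the method as fully covered and the "test" still passes.
    static boolean testWithoutCheck() {
        discountedPrice(1000, 10);
        return true;
    }

    // Executes exactly the same lines, but compares the result with the
    // expected value, so it fails and exposes the bug.
    static boolean testWithCheck() {
        return discountedPrice(1000, 10) == 900;
    }

    public static void main(String[] args) {
        System.out.println("without check passes: " + testWithoutCheck());
        System.out.println("with check passes:    " + testWithCheck());
    }
}
```

Both tests produce identical coverage for `discountedPrice`, yet only the second one notices that the result is wrong.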
So what is it good for?
It’s nice to have shiny percentages for code coverage, but the absolute values of the individual percentages are not what we should be looking at. Instead, there are two other things worth examining.
The trend of the lines
The first thing of interest is to move away from individual numbers and look at the trend, most especially the development over time of three values in relation to each other: number of tests, code coverage and amount of code. The graph generated by GUIdancer shows the first two numbers as lines over time. The report below shows how code coverage can increase with the addition of tests.
Using graphs such as this, we can see whether a newly added test increases the coverage. If it does, and if we assume no large additions to or removals from the code base, we can reason that the test has executed lines that previous tests did not. If the coverage value does not increase, there are two possible explanations. New code may have been added, which lowers the overall percentage even if the new test did execute new lines; the detailed report tells us whether new code was executed or not (more on this in the next section). Alternatively, the amount of code may have stayed the same, in which case the new test really did not execute any new lines. Even then, the test may still be performing an important use case or workflow through the application; it simply passes through lines that were already being executed.
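A small worked example makes the arithmetic behind this concrete. The sketch below assumes that the coverage value is simply executed lines divided by total monitored lines; all line counts are invented for illustration.

```java
class CoverageTrend {

    // Line coverage as a percentage of all monitored lines.
    static double coveragePercent(int coveredLines, int totalLines) {
        return 100.0 * coveredLines / totalLines;
    }

    public static void main(String[] args) {
        // Starting point: 400 of 1000 lines are executed by the tests.
        double before = coveragePercent(400, 1000);

        // A new test executes 50 previously unexecuted lines; code size unchanged.
        double moreTests = coveragePercent(450, 1000);

        // The same new test, but 200 lines of new, unexecuted code were added too.
        double moreTestsAndCode = coveragePercent(450, 1200);

        System.out.println("before:                 " + before + " %");
        System.out.println("new test only:          " + moreTests + " %");
        System.out.println("new test plus new code: " + moreTestsAndCode + " %");
    }
}
```

In the third case the percentage drops from 40 to 37.5 even though the new test reaches 50 lines that were never executed before, which is why the number alone is not enough and the detailed report is needed.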
What is being missed
Another use of code coverage over and above isolated values is the ability to look at what is not being tested. Don’t look at the green; look at the red. The red areas tell us what parts of the software we’re missing. We can look at class or method names, for example, and decide whether we can design tests that should cover these areas.
When a new test is added we can check whether it covers the areas we thought it would. Instead of just looking at a simple number, we should look at the details.
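Both checks described here, finding what is missed and verifying that a new test reaches what it was meant to reach, come down to a set difference. The following sketch assumes the method names have already been read from a coverage report by some other means; the names themselves and the helper `notExecuted` are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

class MissedAreas {

    // Returns the candidates that do not appear in the set of executed methods.
    // A TreeSet is used so that the result is printed in a stable, sorted order.
    static Set<String> notExecuted(Set<String> candidates, Set<String> executed) {
        Set<String> missed = new TreeSet<>(candidates);
        missed.removeAll(executed);
        return missed;
    }

    public static void main(String[] args) {
        // Hypothetical method names taken from a coverage report.
        Set<String> allMethods = Set.of(
                "Order.addItem", "Order.removeItem", "Order.cancel", "Invoice.print");
        Set<String> executed = Set.of("Order.addItem", "Order.removeItem");

        // The "red" part of the report: methods that no test executed.
        System.out.println("missed: " + notExecuted(allMethods, executed));

        // Methods a newly added test was designed to reach.
        Set<String> expectedByNewTest = Set.of("Order.removeItem", "Order.cancel");
        System.out.println("new test did not reach: "
                + notExecuted(expectedByNewTest, executed));
    }
}
```

The second result is the interesting one when reviewing a new test: any name listed there is an area the test was designed for but never actually touched.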
New questions
As it turns out, the questions about effectiveness and risk can’t really be answered by code coverage. What it can answer, though, is "where are we not performing any testing?", "where are we performing too little testing?" and "what effects have my last changes (more/less code, more/fewer tests) had on the trend?". Based on the answers to these questions, we can design new tests. Code coverage can certainly help us to get better quality software, but perhaps not in the way that is often imagined.
Stay tuned…
The next blog entry in this series will be on Mylyn Integration for task-based working to reduce the amount of context-switching we have to do when automating tests. Until then, happy testing!









