Tuesday, January 22, 2008

Essential Difficulties with Automated UI Test

For developers familiar with unit tests, getting immediate feedback from the tests is a great boost to morale. The experience of writing tests (regardless of whether the tests are written first or last) is so rewarding that one naturally wants to do it for every layer of the software, including the GUI. But the GUI is not so amenable to unit tests; creating automated tests for it requires different tools and different strategies. As shown in the post Accidental Difficulties with Automated UI Test, the current GUI scripting tools are still very primitive, and they are themselves a hindrance to fruitful GUI test automation.

Things would not be so bad if the problems were caused only by the tools and nothing else. But GUI tests are inherently more difficult than unit tests. Even if UI test tools were a lot more advanced, creating high quality UI test cases would still be a lot harder than creating high quality unit test cases.

Here are some of the essential difficulties of GUI tests:
  1. Slow
  2. Environment dependent
  3. Rigid, not modularizable
  4. Fragile. The tests break if unexpected dialog boxes are introduced even though they are not bugs.
Let's go through them one by one.
  • Slow
There is no denying that running UI tests is just plain slow, compared to unit tests, of course. With unit tests one can run hundreds of tests in a few seconds, but with GUI tests a single test takes at least a few seconds to run. I am talking about testing a commercial application, not the usual toy application that bears no resemblance to the real world in either scope or complexity.

Why do UI tests take so long? It has everything to do with the fact that running the GUI components themselves is slow.

The time taken to run GUI tests discourages running them more often than necessary. Typically, one runs them only before a build is created, at most. In contrast, one runs all the unit tests after finishing each feature. The feedback time for code covered by unit tests is short, and hence one is emboldened to do whatever refactoring is needed to keep the design clean. A UI test is more like a regression test: you run it to make sure you didn't break anything.

It is therefore impossible to do TDD for GUI components. No wonder GUI testing is conspicuously absent from the TDD literature.

  • Environment dependent
The execution of GUI tests depends a lot on the environment in which they are run. The hardware conditions and the other applications running at the time can greatly affect the test outcomes. This makes consistent runs hard to achieve.

In a GUI test, it is common for the script to wait for some time before launching the next action. But the right duration for the wait is very subjective. If the computer is under heavy load when the script is written, the delay coded into the script will be longer than usual; if the load is heavier when the script is run, the same delay may be too short. Because the time needed between two actions is volatile, the test may sometimes fail because of a race condition.
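
The timing problem can be illustrated with a small simulation. This is only a sketch and is not tied to any real GUI scripting tool: `FakeClock`, `FakeWindow`, `click_with_fixed_delay`, and `click_with_polling` are hypothetical names, and machine load is simulated as the time a window needs before it accepts clicks. It shows the same hard-coded delay working on an idle machine and failing on a busy one. The polling wait is included only for comparison, as one common way of coping with volatile delays; it is not something the tools discussed here are claimed to offer.

```python
class FakeClock:
    """Simulated clock (an assumption) so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def sleep(self, seconds):
        # Advance simulated time instead of really waiting.
        self.now += seconds


class FakeWindow:
    """Hypothetical dialog that only accepts clicks once it is ready."""

    def __init__(self, clock, ready_after):
        self._clock = clock
        self._ready_at = clock.now + ready_after

    def is_ready(self):
        return self._clock.now >= self._ready_at

    def click(self):
        # A click sent to a window that is not ready yet is simply lost.
        return self.is_ready()


def click_with_fixed_delay(clock, window, delay):
    """Wait a hard-coded time, then click and hope the window is ready."""
    clock.sleep(delay)
    return window.click()


def click_with_polling(clock, window, timeout, interval=0.25):
    """Check repeatedly until the window is ready or the timeout expires."""
    deadline = clock.now + timeout
    while clock.now < deadline:
        if window.is_ready():
            return window.click()
        clock.sleep(interval)
    return False


clock = FakeClock()

# Idle machine: the window is ready after 0.5 s, so a 1 s delay is enough.
idle_fixed = click_with_fixed_delay(clock, FakeWindow(clock, 0.5), delay=1.0)

# Busy machine: the window now needs 3 s, and the same script loses its click.
busy_fixed = click_with_fixed_delay(clock, FakeWindow(clock, 3.0), delay=1.0)

# Polling waits only as long as the window actually needs, up to the timeout.
busy_polled = click_with_polling(clock, FakeWindow(clock, 3.0), timeout=10.0)

print(idle_fixed, busy_fixed, busy_polled)
```

Even the polling version has to pick a timeout, so the dependence on the environment is reduced rather than removed.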

Likewise, other applications or services may disrupt the flow of a run. A UI test involves moving and clicking the mouse all the time. An overeager Automatic Update notification may hijack the flow so that a click lands at an unwanted position, defeating the purpose of the test immediately.

Once or twice, a test runner can put up with this kind of annoyance and rerun the test under clean conditions. But if false positives keep popping up, he (or the management) will eventually lose patience and kill the automation effort under various pretexts.

  • Rigid, not modularizable
You have to test a GUI application as a whole, not as separate components. This severely limits how tests can be combined, because it is breaking a big application up into smaller modules that makes testing easier.

In unit tests you can easily test each method down to the finest detail. The benefit is that initially, when you are less sure which portion of the code is causing the problem, you can call a more general method to reproduce the error. Later, as you debug, you can gradually rule out the correct code and concentrate on the one small faulty function. The next time you run your tests, you don't have to run a list of general methods that take a long time to complete. Instead, you can choose to run only a list of small methods.

But in GUI testing, you can't simply untangle the whole screen into smaller UI controls and test them one by one. If you were to do so, you would need to create separate parent forms to host the controls before you could test them. And testing separated controls is pretty much meaningless, because most of the time the bugs occur in the interaction between different controls.

The rigidity of GUI tests keeps them from being executed as often as they should be, because it inevitably contributes to the slowness of the test runs.
  • Fragile
I think this is the worst problem GUI testing faces. GUI tests can easily break whenever a developer makes an innocuous change, such as changing the caption of a dialog box or the internal variable name of a control. And that is the best case.

In another scenario, where the developer or tester chooses to use screen coordinates to identify a control, the maintenance cost is even greater. Screen coordinates are so unreliable that if your tests contain them, you are better off discarding the tests.
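
To make the difference between these ways of identifying a control concrete, here is a minimal sketch. The `Control` class and the `find_by_*` helpers are hypothetical and model only what a test script might see; they are not the API of any real GUI test tool. The "after" layout stands for an innocuous change in which two buttons swap places and one caption is reworded.

```python
from dataclasses import dataclass


@dataclass
class Control:
    """Hypothetical model of a control as seen by a test script."""
    name: str      # internal variable name
    caption: str   # text shown to the user
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def find_by_point(controls, px, py):
    """Return the control under a screen coordinate, or None."""
    return next((c for c in controls if c.contains(px, py)), None)


def find_by_caption(controls, caption):
    return next((c for c in controls if c.caption == caption), None)


def find_by_name(controls, name):
    return next((c for c in controls if c.name == name), None)


# Layout at the time the test script was written.
before = [Control("btnOk", "OK", 10, 10, 80, 25),
          Control("btnCancel", "Cancel", 100, 10, 80, 25)]

# Innocuous change: the buttons swap places and "OK" is reworded to "Save".
after = [Control("btnCancel", "Cancel", 10, 10, 80, 25),
         Control("btnOk", "Save", 100, 10, 80, 25)]

# Coordinates: the script silently clicks a different button.
by_point_before = find_by_point(before, 20, 20).name
by_point_after = find_by_point(after, 20, 20).name

# Caption: the lookup fails outright, so the test breaks.
by_caption_after = find_by_caption(after, "OK")

# Internal name: survives this change, but would break on a rename.
by_name_after = find_by_name(after, "btnOk").caption

print(by_point_before, by_point_after, by_caption_after, by_name_after)
```

In this toy model the coordinate lookup is the worst case, because it does not fail at all; it just lands on the wrong control.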

In unit tests, the tests will also break if the interfaces are modified, but unit tests are usually fast to execute, so developers can fix the test cases before checking in their code. If a statically typed language (such as C# or Java) is used, the compiler itself will detect the breaking changes. If the program is written in a dynamically typed language, one doesn't have this luxury, but one can still detect the breaking changes by running the tests and fix them instantly.

Again, because GUI tests are slow, no developer can run all of them every time they compile their code. So the GUI test cases are left in a broken state until a build is finished, reinforcing the opinion that GUI tests are very, very fragile.

All these difficulties imply that developing GUI test scripts can never be easy and therefore requires the same discipline and rigor that is applied to software development. In my opinion, GUI test automation is more about automation than about testing. Therefore the quality of the test scripts reflects the skill of a programmer more than that of a tester. Missing this point will have a disastrous effect on your test automation projects.

Like software design, GUI test scripting has its own patterns as well. But this is a topic for another day.

