Reducing Feature Test Time

Back in the day, we would spend weeks, sometimes months, manually testing applications for regressions after adding new features or changing existing ones. So four years ago, we started our Continuous Delivery journey with a new definition of done:

All new feature development must also have automated feature tests.

At the time we used CodedUI (Microsoft’s user interface automation/testing framework) and MSTest to replace the manual testers’ actions in the UI with automated tests. After a few months our developers and testers became proficient in writing the automated UI tests and also began retroactively adding automation for all areas of the applications. The result was a very high percentage of automated regression testing. In most cases, more than 98% of our regression testing was automated in a nightly feature test run.

Looking back, this approach was an easy transition for the developers/testers to understand. We simply replaced the manual test case executions with UI-based automation.

Today all of our actively developed applications have nearly all their regression tests automated with CodedUI or Selenium UI tests. This has allowed us to move much more quickly than if we had performed manual regression testing. These UI tests run nightly for each application and take anywhere from 4 to 16 hours per application. Now we are ready to move even faster, and these tests are the bottleneck. Ideally we would like to run our automated regression tests in 30 minutes or less.

Time to Re-evaluate

This has led us to re-evaluate our testing strategy. Although these tests have provided much value, they are long running and fragile because they have to go through a user interface and many intermediate boundaries. In contrast, a unit test does not cross boundaries and evaluates small, isolated segments of code. As a result, unit tests run much faster and are far less fragile because each one exercises a much smaller surface area.

Unit tests have much value, but for larger systems we also need tests that verify integrations between components/systems. UI tests can serve this need, but we can also test integrations through code and skip the user interface. We call these integration tests. They allow us to test integrations between components or systems but are not as fragile or slow as UI tests.
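To make the distinction concrete, here is a minimal sketch of the two test types side by side. It uses Python's built-in unittest module rather than the MSTest/CodedUI stack mentioned above, and every class in it (PriceCalculator, InMemoryOrderRepository, OrderService) is a hypothetical example, not code from our applications. The unit test exercises one isolated class; the integration test wires several components together and calls the same service a UI would call, without driving any user interface.

```python
import unittest


class PriceCalculator:
    """Hypothetical pure component: no I/O and no collaborators."""

    def total(self, unit_price, quantity, tax_rate):
        return round(unit_price * quantity * (1 + tax_rate), 2)


class InMemoryOrderRepository:
    """Hypothetical stand-in for a real data store."""

    def __init__(self):
        self._orders = {}

    def save(self, order_id, total):
        self._orders[order_id] = total

    def get(self, order_id):
        return self._orders[order_id]


class OrderService:
    """Hypothetical service a UI would call when the user places an order."""

    def __init__(self, calculator, repository):
        self._calculator = calculator
        self._repository = repository

    def place_order(self, order_id, unit_price, quantity, tax_rate):
        total = self._calculator.total(unit_price, quantity, tax_rate)
        self._repository.save(order_id, total)
        return total


class PriceCalculatorUnitTest(unittest.TestCase):
    """Unit test: one class, no boundaries crossed."""

    def test_total_includes_tax(self):
        self.assertEqual(PriceCalculator().total(10.0, 3, 0.1), 33.0)


class OrderServiceIntegrationTest(unittest.TestCase):
    """Integration test: service, calculator and repository together, no UI."""

    def test_placed_order_is_persisted_with_total(self):
        repository = InMemoryOrderRepository()
        service = OrderService(PriceCalculator(), repository)
        service.place_order("A-1", 10.0, 3, 0.1)
        self.assertEqual(repository.get("A-1"), 33.0)
```

Saved as a module, both tests can be run with `python -m unittest`. In a real system the repository would likely be an actual database or service, which is where the extra run time and fragility of integration tests come from.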

Many testing experts suggest a mix of tests across the different types, in the appropriate proportions. This is typically referred to as the testing pyramid.

[Image: TestingTriangle]

The idea is that most of your tests are fast, stable unit tests, followed by a smaller number of integration tests that verify integrations across components/systems. Next come even fewer UI tests that evaluate the system through the user interface. Lastly, manual tests are used only where necessary, for code/functionality that can’t be tested using code. Each test type has value, and in the appropriate proportions they can run as quality gates in a deployment pipeline.

So going forward we are looking to replace our large suite of UI tests with integration tests. But how?

Delete Tests

As you can imagine with the definition of done cited above, we have a lot of UI tests. Many of these tests have never failed over the years. They initially served their purpose, but if they never fail over time, are they really valuable? If not, they become a liability: code that must be fed and cared for, taking time away from valuable work.

So we will move the tests that have never failed to a weekly test execution so they don’t interfere with our deployment pipeline, and eventually delete them if they continue to never fail. If we find one that does fail, we will add it back to the pipeline test execution.
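The rule above can be sketched as a small decision function. This is only an illustration in Python: the UiTestHistory record, the schedule names, and the threshold of 26 clean weekly runs before a test becomes a deletion candidate are all assumptions made for the example, not part of our actual tooling or policy.

```python
from dataclasses import dataclass

# Hypothetical schedule labels used only in this sketch.
PIPELINE = "pipeline"
WEEKLY = "weekly"
DELETE_CANDIDATE = "delete-candidate"


@dataclass
class UiTestHistory:
    """Hypothetical record of how one UI test has behaved so far."""
    name: str
    schedule: str              # where the test currently runs
    pipeline_failures: int     # failures ever seen in the pipeline run
    weekly_runs: int = 0       # runs since the test moved to weekly
    weekly_failures: int = 0   # failures since the test moved to weekly


def next_schedule(history, weekly_runs_before_delete=26):
    """Return where the test should run next.

    weekly_runs_before_delete is an assumed threshold, not a recommendation.
    """
    if history.schedule == PIPELINE:
        # Tests that have never failed move out of the pipeline.
        return WEEKLY if history.pipeline_failures == 0 else PIPELINE
    if history.schedule == WEEKLY:
        if history.weekly_failures > 0:
            # A failure shows the test still catches something: bring it back.
            return PIPELINE
        if history.weekly_runs >= weekly_runs_before_delete:
            return DELETE_CANDIDATE
        return WEEKLY
    return history.schedule
```

For example, `next_schedule(UiTestHistory("login", PIPELINE, pipeline_failures=0))` returns `"weekly"`, while a weekly test with one recorded failure returns `"pipeline"`.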

Obviously there is some risk that we will cause a regression and not catch it.  The good news is if that happens we can also fix it quickly with our pipeline.

Convert to Integration Tests

We will attempt to convert the rest of the UI tests to integration tests wherever possible. We will always have some UI tests, but hopefully far fewer. We will analyze the intent of each UI test to decide how to write the integration test and how many components it should cover.

The more components your test touches the longer it will take and the more fragile it will be.  Striking the right balance of test surface area and coverage can be tricky.  Over time we will learn how best to handle this.

Still Learning

One of the principles of Continuous Delivery is continuous learning/improvement. We aren’t sure this is the best way to handle these tests, but it’s an experiment we are willing to try and then learn from.

What is your experience with these test types?  Leave comments below to continue the conversation and/or provide feedback.
