Continuous Testing

Your fast feedback efforts are in limbo without continuous feedback!

In the previous chapter, we discussed how adding tests in the right layers of the application accelerates feedback cycles. It is vital to receive such fast feedback continuously and not just in random outbursts to seamlessly regulate the application quality throughout the development cycle. This chapter is dedicated to exploring such a Continuous Testing practice.

Continuous Testing (CT) is the process of validating the application quality using both manual and automated testing methods for every incremental change and alerting the team when the change causes deviation from the intended quality outcomes. For example, when a piece of functionality deviates from the expected application performance numbers, the Continuous Testing process immediately notifies the team in the form of failing performance tests as they are run against every change. This gives the team an opportunity to fix issues at the earliest when they are still relatively small and manageable. A lack of such continuous feedback would leave the issue unnoticed for an extended period and, over time, cascade to deeper levels of the code resulting in more effort to prune the issues!

The Continuous Testing process relies heavily on the Continuous Integration (CI) practice to perform automated testing against every change. Adopting CI together with CT allows the team to do Continuous Delivery (CD). Ultimately, the trio of CI, CD, and CT makes the team a high-performing one as measured by the four key metrics! We’ll define these metrics at the end of the chapter, but together, lead time, deployment frequency, mean time to restore, and change fail percentage provide insights into the quality of the team’s delivery practices.

This chapter will equip you with the skills required to establish a Continuous Testing process for your team. You will learn the CI / CD / CT processes and strategies to achieve multiple feedback loops on various quality dimensions. A guided exercise to set up the CI server and integrate the automated tests is included.

Introduction to Continuous Integration

Martin Fowler describes Continuous Integration as “a software development practice where members of a team integrate their work frequently, usually, each person integrates at least daily - leading to multiple integrations per day.” Let’s take a look at an example to understand the benefits of following such a practice.

Two teammates, Allie and Bob, started independently developing a login and home page. Work started in the morning and by noon, Allie had finished a basic login flow, and Bob had completed a basic home page structure. They both tested their respective functionalities on their local machines and continued work. By the end of the day, Allie completed the login functionality by making the application land on an empty home page after successful login since the home page is not available to her yet. Similarly, Bob completed the home page functionality by hardcoding the user name in the welcome message since the user information from the login is not available to him.

The following day, they both report their functionalities as ‘done’! Is it really done? Who among the two is responsible for integrating the pages? Should they play a separate integration user story for every such integration scenario across the application? If so, will they be ready to bear the cost of the duplicated testing effort involved in testing the integration story? Or should they delay testing till the integration is done? These are the questions that get implicitly addressed by the Continuous Integration practice. When the CI practice is followed, Allie and Bob share their work progress throughout the day (after all, both had a basic skeleton of their functionalities ready by noon). Bob would have added the necessary integration code to extract the user name after login (e.g., from a JSON response or a JWT), and similarly, Allie would have made the application land on the actual home page after successful login. The application would then have been really usable and testable!

It may seem like a small additional cost to integrate the two pages the next day with this example. However, when code is accumulated and integrated later in the development cycle, firstly, teams will need to shell out big bucks on integration testing. Secondly, as testing is delayed they may find more issues — some, entangled with the other, resulting in rework, sometimes even rewriting the entire approach. This will lead to fear of integration in teams, which is an unspoken accompaniment of delayed integration!

The Continuous Integration practice essentially tries to reduce such integration risks and saves the team from ad hoc rework and patches. It doesn’t entirely eliminate integration defects but makes it easier to find and fix them early when they are just budding.

The CI / CT / CD Process

Let’s look at the Continuous Integration and testing processes in detail now and later see how they connect to form the Continuous Delivery process. The CI / CT process relies on four individual components:

  • The version control system (VCS) is the component that holds the entire application codebase and serves as a central repository for all team members to pull the latest version of the code and integrate their work continuously.

  • The automated functional and cross-functional tests that validate the application.

  • The CI server which automatically executes the automated tests against the latest version of the application code for every additional change.

  • The infrastructure that hosts the CI server and the application.

The Continuous Integration and testing workflow begins with the developer, who, as soon as they finish a small portion of functionality, pushes their changes into a common version control system (e.g., Git, SVN). The VCS tracks every change submitted to it. The changes are then sent through the Continuous Testing process where the application code is fully built and automated tests are executed against it by a CI server (e.g., Jenkins, GoCD). When all the tests pass, the new changes are considered fully integrated. When there are failures, the respective code owner fixes the issues at the earliest. Sometimes, changes are reverted back from the VCS until the issues are resolved. This is mainly to prevent others from pulling the latest code with issues and integrating their work on top of it.
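The workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Jenkins or GoCD work internally: `Commit`, `BuildResult`, `run_ci`, and `push` are hypothetical names, and each check is a plain function standing in for a real build or test step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Commit:
    sha: str
    author: str


@dataclass
class BuildResult:
    commit: Commit
    failed_checks: List[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # A commit counts as integrated only when no check failed.
        return not self.failed_checks


Checks = Dict[str, Callable[[Commit], bool]]


def run_ci(commit: Commit, checks: Checks) -> BuildResult:
    """Run every check (build, tests, ...) against the commit."""
    failed = [name for name, check in checks.items() if not check(commit)]
    return BuildResult(commit=commit, failed_checks=failed)


def push(mainline: List[Commit], commit: Commit, checks: Checks) -> BuildResult:
    """Push a commit, run CI on it, and revert it if any check fails."""
    mainline.append(commit)
    result = run_ci(commit, checks)
    if not result.passed:
        # Revert so that nobody pulls and builds on top of a broken commit.
        mainline.remove(commit)
    return result
```

In this sketch the revert is automatic; in practice the code owner usually gets a chance to fix the failure first, and the revert is the fallback.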

BENEFITS OF VCS

Ever imagine how teams would have shared their code when there was no VCS? Some teams used shared drives, and others directly patched their code to a central server that hosted the entire codebase! Such was the pain that led to the creation, in the early 1970s, of one of the first VCSs, called SCCS (Source Code Control System). Since then, VCSs have gotten richer with newer features that took away a lot of pain points and offered immense benefits for easier work integration.

A few significant benefits are as follows:

  • VCS keeps track of every version of code pushed into it, be it an addition, deletion, or modification of code, in a separate database. This serves as a long-term history of changes and hence significantly eases root cause analysis of issues.

  • Since the versions are maintained independently, VCS allows teams to roll back to a previously working version of the application when there are issues.

  • Changes in VCS can be tied to a user story or a defect card. This gives the team the ability to trace the changes back to a user story and understand the context behind the code written and the evolution of a feature over time.

  • Sometimes, teams may have to work on a common area of code to build their features. VCS allows team members to create branches of the main codebase, build on top of them, and merge them to the main codebase a little later. However, a long-living feature branch is an antipattern.

The new change Cn triggers a separate pipeline in the CI server. Each pipeline is composed of many sequential stages. The first is the build and test stage, which builds the application and runs automated tests against it. The automated tests refer to all the micro and macro tests discussed in Chapter 3 and the tests that assert on the application’s quality dimensions such as performance and security, which we will discuss in the upcoming chapters. Once this stage is complete, the test results are reported to Allie.

In this case, Allie’s code has been successfully integrated, and she proceeds with her login functionality. Later in the day, Bob pushes commit Cn+1 for the home page feature after pulling the latest changes Cn from the common VCS. Now Cn+1 is a snapshot of the application codebase including both Allie’s and Bob’s new changes. This triggers the build and test stage in CI. The tests when run against Cn+1 ensure that Bob’s new changes haven’t broken any of the previous functionalities, including Allie’s latest commit, as she has also added the login tests. Luckily, he hasn’t. However, we see in Figure 4-1 that Allie’s changes as part of commits Cn+2 & Cn+3 have broken the integration, and the tests have failed. She needs to fix them before proceeding with any of her work as she has introduced a bug into the common VCS. She can push her fix as another commit, and the process will continue.

Imagine the same workflow in a large-scale distributed team, and you can understand how CI makes it a breeze for all team members to share their progress and integrate their work seamlessly. Also, in large-scale applications, there are typically several interdependent components that warrant exhaustive integration testing, and the Continuous Testing process provides the much-needed confidence in the soundness of their integration! With that kind of confidence gained from the fully automated integration and testing processes, the team is placed in a privileged spot to push their code to production whenever demanded by the business: the team can do Continuous Delivery.

Continuous Delivery can be defined as the discipline where the team follows the Continuous Integration and testing processes to keep their application production-ready at any time. Additionally, it dictates having an automated deployment mechanism that can be triggered in a single click to deploy to any environment — be it QA or production. Figure 4-2 shows the Continuous Delivery process.

As we can see in Figure 4-2, the Continuous Delivery process encompasses the CI / CT processes along with the self-service deployment pipelines. The self-service deployment pipelines are stages configured in the CI server as well. They perform the task of deploying the ‘chosen’ version of the application artifacts to the required environment.

The CI server lists all the commits with their test results status. Only if the tests have passed for a commit (or a set of commits), the CI offers the option to deploy that particular application version (V). For example, let’s say Allie’s team wants to receive feedback from the business on the basic login functionality pushed as part of commit Cn. They can push the ‘Deploy Vx’ button, as seen in Figure 4-2, and choose the UAT environment. This will deploy only the changes made until that point to the UAT environment, i.e., Bob’s Cn+1 and later commits will not be deployed. As we can see, the commits Cn+2 & Cn+3 are not available for deployment as the tests have failed.
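The deployment rule in this example can be expressed as a small sketch. The commit labels and results mirror the Allie and Bob scenario; the function names and the `HISTORY` structure are hypothetical and do not correspond to any particular CI server's API.

```python
from typing import List, Tuple

# Hypothetical pipeline history mirroring the example: (commit, tests passed?)
HISTORY: List[Tuple[str, bool]] = [
    ("Cn", True),
    ("Cn+1", True),
    ("Cn+2", False),
    ("Cn+3", False),
]


def deployable_versions(history: List[Tuple[str, bool]]) -> List[str]:
    """Commits for which the CI server would offer a deploy option."""
    return [commit for commit, passed in history if passed]


def changes_included(history: List[Tuple[str, bool]], chosen: str) -> List[str]:
    """Commits from this history that ship when `chosen` is deployed."""
    if chosen not in deployable_versions(history):
        raise ValueError(f"{chosen} is not available for deployment")
    commits = [commit for commit, _ in history]
    # Deploying a version ships everything up to and including that commit.
    return commits[: commits.index(chosen) + 1]
```

Calling `changes_included(HISTORY, "Cn")` leaves out Bob’s Cn+1, and asking for Cn+2 raises an error because its tests failed.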

This kind of Continuous Delivery setup solves many critical issues, the most important being the ability to launch product features to the market at the right time. Often, delays in releasing features to the market result in the loss of money and customers to competitors. Additionally, from the team’s point of view, the deployment processes become fully automated, reducing the dependency on individuals with context on the deployment day to do their magic. Automating deployments also reduces the risk of incompatible libraries, missed configurations, and inadequate documentation, giving anyone the freedom to make a hassle-free deployment to any environment.

CONTINUOUS DEPLOYMENT VS. CONTINUOUS DELIVERY

Continuous Deployment is different from Continuous Delivery. Continuous Deployment means having automated deployment pipelines that push every commit to production automatically after the Continuous Testing process. In other words, the feature you committed just now is available to real end-users in production immediately. Practicing Continuous Delivery, in contrast, means being ready at any time to push the application to production with a self-service deployment option. Continuous Delivery is suitable in cases where businesses have launch dates for features. Sometimes, companies even make public announcements about a feature launch.

Principles and Etiquette

Now that we have discussed the CI / CD / CT processes, it is important to call out that these processes can come to fruition only if all the team members follow a set of well-defined principles and etiquette. After all, it is an automated way to collaborate on their work, be it automated tests, application code, or infrastructure configurations. The team should establish these principles at the beginning of their delivery cycle and keep reinforcing them throughout. Here is a minimum set of principles/etiquette a team will have to uphold for their success:

Do frequent code commits

Team members should make frequent code commits and push them to the VCS as soon as they finish a small piece of functionality so that it is tested and made available for others to build on top of it.

Always commit self-tested code

Whenever a new piece of code is committed, it should be accompanied by automated tests in the same commit. Martin Fowler calls this practice ‘self-testing code.’ For example, as we saw earlier, Allie committed her login functionality along with login tests in the same commit. This ensured that her commit was not broken when Bob committed his code next.

Adhere to Continuous Integration certification test

Each team member should ensure their commit passes the Continuous Testing process before moving on to the next set of tasks. If the tests fail, they need to fix them immediately. According to Martin Fowler’s Continuous Integration certification test, the team should repair a broken build and test stage within ten minutes. If they cannot fix it, they should revert their broken commit, leaving the code stable (or green).

Do not ignore/comment the failing tests

In the rush to make the build and test stage pass, team members should not comment out or ignore the failing tests. As evident as the reasons against this practice are, you will still find it happening in teams.

Do not push to a broken build

The team should not push their code when the build and test stage is broken (or red). Pushing their work on top of an already broken codebase will lead to tests failing again. This will further burden the team with the additional task of finding which changes originally broke the build.

Take ownership of all failures

When tests fail in an area of code that someone didn’t work on, but they fail because of that person’s changes, the responsibility of fixing all of them is on that person. If needed, they could pair with the individual who has the required knowledge, but ultimately they should get the tests fixed before moving on to their next task. This practice is essential because, often, the responsibility of fixing the failed tests is tossed around, causing a delay in resolving the issues. Sometimes, the tests are excluded from running in the CI for days as the issue is not fixed. This results in the Continuous Testing process giving incomplete or false feedback for the changes pushed during that open window.

Many teams also adopt stricter practices for their own benefit, such as mandating that all the micro and macro tests pass on local machines before pushing the commit to the VCS, failing the build and test stage if a commit does not meet the test coverage threshold, publishing the commit’s status (pass or fail) with the name of the individual who made the commit on team communication channels such as Slack, playing loud music in the team area from a dedicated CI monitor whenever a build is broken, and so on. Also, as a QA on the team, I keep an eye on the status of the tests in the CI and on whether they are getting fixed on time. Fundamentally, all these measures are intended to streamline the team practices around CI / CT processes. The foremost measure among all is to empower the team with knowledge of not only ‘the how’ but also ‘the why’ of the process!
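One of the stricter practices mentioned above, failing the stage when coverage drops below a threshold, could look roughly like the following sketch. The function name and the 80% default are assumptions for illustration; real coverage tools expose their own configuration for this.

```python
from typing import Tuple


def coverage_gate(covered_lines: int, total_lines: int,
                  threshold_percent: float = 80.0) -> Tuple[bool, str]:
    """Return (passed, message) for a commit's line coverage.

    The 80% default threshold is an assumed value, not a recommendation.
    """
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    percent = 100.0 * covered_lines / total_lines
    passed = percent >= threshold_percent
    status = "pass" if passed else "fail"
    message = (
        f"coverage {percent:.1f}% "
        f"(threshold {threshold_percent:.1f}%): {status}"
    )
    return passed, message
```

The returned message is the kind of status line a team might publish to its communication channel alongside the committer’s name.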

Continuous Testing Strategy

Now that you know the processes and principles, the next step is to apply strategies custom to your project needs.

In the earlier section, the Continuous Testing process was demonstrated with a single build and test stage that runs all the tests and gives feedback in a single loop. You can also accelerate the feedback cycle with two independent feedback loops: one that runs the tests against the static application code (e.g., all the micro-level tests), and the other that runs the macro-level tests against the deployed application. This, in a way, is a slight shift-left where we leverage the micro-level (unit, integration, contract) tests’ ability to run faster than the macro-level tests (API, UI, end-to-end) to get faster feedback. Figure 4-3 shows a Continuous Testing process with two stages.

As seen in Figure 4-3, a common practice is to combine the application compilation with the micro-level tests as a single stage in CI. This stage is traditionally called the build-and-test stage. When the team adheres to the test pyramid we discussed in Chapter 3, the micro-level tests validate a broad range of application functionalities. As a result, this stage provides quick and substantial feedback on the commit. The build-and-test stage should be swift enough to finish within a few minutes so that the team can wait for it to complete before moving on to the next task, as per their agreed principles and etiquette. If it takes longer, the team should find ways to improve it, for example, by parallelizing the build-and-test stage for each component instead of running a single stage for the entire codebase, and by applying other commonly prescribed CI / CD industry principles. [...] minutes from commit to being ready for self-service deployment with ~470 micro- and macro-level tests.
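The two feedback loops can be sketched as an ordered list of stages where a failure in the fast stage stops the slower one from running. This is a simplified model with hypothetical names (`run_pipeline`, the stage labels), not the configuration format of any CI server.

```python
from typing import Callable, List, Tuple

# A stage is a (name, callable) pair; the callable returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in order and skip everything after the first failure."""
    report: List[Tuple[str, str]] = []
    failed = False
    for name, run in stages:
        if failed:
            # The slow macro-level stage never starts on a broken build.
            report.append((name, "skipped"))
            continue
        ok = run()
        report.append((name, "passed" if ok else "failed"))
        failed = not ok
    return report
```

With `[("build-and-test", ...), ("acceptance-tests", ...)]` as the stages, a failing micro-level test is reported within the first loop and the deployed-application tests are not run at all.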

Also, when this feedback loop is short, the team can still stick to their principle of fixing the issues found in the Continuous Testing process before moving on to the next task. But when it takes longer, the team tends to ignore the failing tests and track them as defect cards to fix later. This is definitely harmful to the team, as it means they are integrating their new code on top of defects, and the new code is not thoroughly tested either, as the tests are ignored. Hence, the team should continue to monitor and adopt ways to quicken the two feedback loops using techniques like parallelizing the test run, implementing the test pyramid, removing duplicate tests, and refactoring the tests to remove waits and abstract common functionalities. The Continuous Testing process can be extended to cross-functional requirements (CFR) testing as well, as seen in Figure 4-4.

For instance, teams can either run the automated performance, security, and accessibility tests as part of the two feedback loops or configure separate stages subsequent to the acceptance tests stage in the CI server, as seen in Figure 4-4. You will also learn their respective shift-left strategies in the upcoming chapters. This way, you can accomplish your goal of receiving fast feedback continuously on multiple quality dimensions!

CONTINUOUS INTEGRATION VS. CONTINUOUS TESTING

As the term implies, the Continuous Integration process ends with the build and test stage, i.e., a commit is considered integrated only when it passes the micro-level tests (at least the unit tests).

The Continuous Testing process refers to validating the holistic application behavior, including its functional and cross-functional aspects for every commit with the goal that it is ready for Continuous Delivery. In fact, Continuous Testing doesn’t stop with executing automated tests alone. It encompasses the manual exploratory testing efforts for every commit after self-service deployment. The Continuous Testing process also requires the team to automate the scenarios found during exploratory testing to call the functionality/commit to be technically ‘done.’

Additional ways you can strategize the Continuous Testing process are splitting the tests into smoke tests and nightly regression tests, as seen in Figure 4-5.

Smoke testing is a term borrowed from the electrical engineering world where electricity is passed after the circuit is completed to assess the end-to-end flow. When there are issues in the circuit, there will be smoke, hence the name ‘smoke testing.’ Similarly, you can choose the tests that cover the end-to-end flow of every feature in the application to form the smoke tests pack and only run them as part of the acceptance test stage. This way, you can get a high-level signal on the status of every commit quickly. As seen in Figure 4-5, the commit is ready for self-service deployment after the smoke test stage.

When you choose smoke testing, you have to complement it with the nightly regression. The nightly regression stage is configured in the CI server to run the entire test suite once every day when the team is off work (e.g., scheduled to run at 7 pm every day). The tests are run against the latest codebase with all the day’s commits. The team has to make a habit of analyzing the nightly regression results the first thing the next day and prioritize defects/failures. Sometimes, it may require test script changes, and that has to be prioritized so that the Continuous Testing process gives the right feedback for the upcoming commits that day.

You can apply these two strategies to split both the functional and cross-functional tests. For example, you can choose to run the performance load test for a single critical endpoint as part of every commit and run the remaining performance tests as part of nightly regression (performance tests are discussed in Chapter 9). Similarly, you can run the static code security scanning tests as part of the build-and-test stage and run the functional security scanning as part of the nightly regression stage (discussed in Chapter 8). As obvious as it may be, the caveat in such an approach is that the feedback is delayed by a day. Consequently, there is a delay in acting on the feedback as well, i.e., the issues are tracked as defects and fixed later. As a result, you should be careful while choosing the type of tests to run as smoke and nightly regression. Also, note that categorizing as smoke tests applies only to the macro-level and cross-functional tests; all the micro-level tests still run as part of the build-and-test stage.

Most often, when the application is young, you will have the privilege of running all the tests for every commit. When the application starts to grow along with your tests, you should simultaneously plan for pipeline optimization, test parallelization, and other optimization efforts to reduce the overall CI execution time so that you can delay going the smoke and nightly regression way for as long as possible.
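As one possible approach to test parallelization, the sketch below splits tests across workers so that the total duration per worker is roughly balanced. It uses a greedy longest-first heuristic, which is an assumption chosen for simplicity; the function name and the idea of feeding it recorded durations are illustrative only.

```python
import heapq
from typing import Dict, List


def shard_tests(durations: Dict[str, float], workers: int) -> List[List[str]]:
    """Greedily assign the longest remaining test to the least-loaded worker."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Heap of (current load in seconds, worker index).
    heap = [(0.0, index) for index in range(workers)]
    shards: List[List[str]] = [[] for _ in range(workers)]
    # Longest tests first; ties broken by name for a stable result.
    ordered = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in ordered:
        load, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (load + seconds, index))
    return shards
```

The wall-clock time of the stage then approaches the load of the busiest worker rather than the sum of all test durations.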

Benefits

When you are making efforts to undertake such a rigorous Continuous Testing process in your team, you will reap a heap of benefits, too — a few of which we touched upon earlier. Here is a consolidated list of benefits (represented in Figure 4-6) that can get your team motivated on your Continuous Testing strategy:

Common Quality Goals

By following the Continuous Testing process, all team members will be aware and work towards a common quality goal — in terms of both functional and cross-functional quality aspects — as their work is continuously evaluated against those goals. This is a concrete way to build quality-in.

Early defect detection

Every team member gets feedback on their commit both in terms of functional and cross-functional aspects immediately. This gives them the opportunity to fix the issues when they have the relevant context as opposed to coming back to the code a few weeks later.

Ready to Deliver

Since the code is continuously tested, the team is always in a ready-to-deploy state for any environment.

Enhanced Collaboration

It is easier for distributed team members to collaborate and share their work, especially because everyone can see whose commit caused which issue instead of escalating issues through emails and pings.

Combined delivery ownership

The delivery ownership is placed equally on all team members instead of just the testing team or senior developers, as everyone is responsible for ensuring their commit is ready for deployment.

If you have been in the software industry for a while, you will surely know how hard it is to achieve some of these benefits otherwise!

The Four Key Metrics

Fantastic! Now, let’s see how all the effort spent on setting up the CI / CD / CT processes and strictly following the etiquette and principles is entirely worthwhile, as it qualifies your team as a ‘high-performing’ team according to the four key metrics!

In the book ‘Accelerate’, Jez Humble, Gene Kim, and Nicole Forsgren define four parameters to measure the performance of software delivery teams in terms of the tempo at which they release software and the stability with which it is released. These four parameters (lead time, deployment frequency, mean time to restore, and change fail percentage) are called the four key metrics (4KM). Google’s DevOps Research and Assessment (DORA) team formulated these parameters based on their research. The research data also help quantify whether a software team is a high, medium, or low-performing team based on these four metrics. You will observe as we proceed that adopting an excellent Continuous Testing strategy along with the CI / CD techniques will push your team to be a high performer.

Here are the four key metrics:

Lead Time

The time taken from code committed to code being ready for production deployment.

Deployment Frequency

The frequency at which the software is deployed to production or an app store.

Mean time to restore

The time taken to restore any service outages or any kind of failures.

Change fail percentage

The percentage of changes released to production that result in degraded service quality or require subsequent remediation, such as a rollback to a previous version or a hot fix.

The first two metrics, the lead time and the deployment frequency, expose the delivery tempo of the team. They measure how quickly a team can deliver value to end-users and how frequently they add value to end-users. However, in the rush to deliver value to customers, the team should not compromise the stability of the software. The last two metrics validate this. The mean time to restore and change fail percentage disclose the stability of the software being released. In today’s world, software failures may be inevitable, and these metrics measure how easy it is to recover from a failure and how often such failures occur due to new releases. As we can see, the four key metrics give a clear picture of the software team’s performance by measuring their delivery tempo and their ability to deliver with quality and stability.
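To make the definitions concrete, here is a rough sketch of how the four metrics could be computed from a team’s own release records. The `Change` record and its fields are hypothetical, and measuring restore time from the deployment timestamp is a simplifying assumption; this is not the DORA team’s methodology.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Change:
    committed_at: datetime
    ready_at: datetime                      # passed CI / CT, ready for production
    deployed_at: datetime
    failed: bool = False                    # degraded service or needed remediation
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _mean(deltas: List[timedelta]) -> Optional[timedelta]:
    return sum(deltas, timedelta()) / len(deltas) if deltas else None


def four_key_metrics(changes: List[Change], period_days: int) -> Dict[str, object]:
    """Summarize the four key metrics for changes deployed within a period."""
    if not changes or period_days <= 0:
        raise ValueError("need at least one change and a positive period")
    failures = [c for c in changes if c.failed]
    return {
        # Commit to ready-for-production, as defined in this chapter.
        "lead_time": _mean([c.ready_at - c.committed_at for c in changes]),
        "deployments_per_day": len(changes) / period_days,
        # Assumption: the failure is treated as starting at deployment time.
        "mean_time_to_restore": _mean(
            [c.restored_at - c.deployed_at for c in failures if c.restored_at]
        ),
        "change_fail_percentage": 100.0 * len(failures) / len(changes),
    }
```

Tracking these numbers over successive periods shows whether changes to the CI / CD / CT setup are moving both tempo and stability in the right direction.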

As per the DORA research, the four key metrics for a high-performing team are represented in Table:

As we discussed earlier, one of the main benefits of having a rigorous CI / CD / CT process is that your team will be able to deliver value to customers ‘on-demand.’ Similarly, as mentioned in an earlier example, when you place the automated tests in the right application layers, you can get your code tested as part of the Continuous Testing process and made ready for deployment within an hour, i.e., your lead time will be less than an hour. Also, by getting good coverage of functional and cross-functional requirements automated and run as part of the Continuous Testing process, your change fail percentage will be well within the quoted range of 0-15%!

Research also indicates that a high-performing team, in turn, contributes to an organization’s success in terms of profit, share price, and other criteria. And when the organization does well, they take care of their employees better, don’t they?!