Automated Functional Testing

Automated testing is the practice of using tools instead of humans to perform user-like actions on an application and verify its expected behavior. The practice has been around since the 1970s, and the techniques and tools in this space have evolved over many decades. In the 1970s, most applications, such as weather prediction and other scientific and engineering programs, were written in FORTRAN, and RXVP was created to perform automated testing on them. In the 1980s, as PCs became widespread, AutoTester was introduced for automated testing. In the 1990s, with the arrival of the World Wide Web, test automation tools like Mercury Interactive's QuickTest became popular, and Apache JMeter, an automated load testing tool, was created. With further advancements in web technologies, the 2000s saw the birth of Selenium! Over the last two decades, the software industry has come to understand the essence of testing even better and has built numerous tools to ease its daily work.

We now have any number of tools to automate every layer of a given application (the UI, services, DB, and so on), which we shall also discuss in this chapter. Moreover, AI/ML technologies are now being deployed to further enhance the experience and ease this decade's automated testing challenges!

The apparent reasons for such continuous innovation are that automated testing significantly reduces the cost of testing and gives software teams faster feedback on their application quality than manual testing can. To elaborate on these advantages, let's take a project where you perform only manual testing throughout application development and see how automated testing compares in the same situation.

Let's say, on average, each feature in your application has 20 test cases, and you take 2 minutes per test case to execute them, i.e., 40 minutes to test one feature manually. Whenever a new feature is developed, you need to test its integration with the existing features and also ensure the existing features are not broken by the new changes, a practice referred to as regression testing. The risk of not doing regression testing early enough is that you will find integration bugs only during release testing, which is very late in the cycle and might delay the release timeline. So, in the example, regression testing along with new feature testing will take 80 minutes when there is a second feature, 120 minutes when there is a third feature, and so on.

Soon enough, when your application has to go live with 15 features, you will have to plan for 600 minutes of testing time. Sometimes, a mature application has to work across different versions of its services, and your testing time increases proportionally with the number of supported versions. For example, if the services have two versions, your application testing time becomes 1200 minutes for every release. Additionally, if you find bugs, then depending on their nature (e.g., a bug that requires a change in the DB schema), you might end up spending another 1200 minutes testing the application before going live! This cycle continues, with testing time increasing as new features are added in every release.

Businesses that don't invest enough in automated testing tackle this problem by increasing their manual testing capacity, but they still fail to get feedback as fast as automated testing provides. For example, even with 12 people testing the previously discussed application in parallel, you still need 100 minutes to finish testing, whereas automated tests in the right layers can run much faster and give quick feedback. It's also important not to forget that if you have automated tests, you don't have to assemble your twelve teammates at midnight to test an urgent production defect fix before releasing it; and even if you dare to, manual testing can be error-prone as it depends heavily on the quality of the test case documentation and its execution.

Of course, there is a cost to creating automated tests and running them regularly. However, it’s a cost that needs to be compared against the value of delivering the product quickly and frequently to the market, the expense of manual testing time and capacity, and finally, the confidence it gives the team during development and while fixing production issues.

In summary, the recommendation for businesses that still rely only on manual testing is this: you need both manual and automated testing to deliver a high-quality product, along with a wise strategy to balance them. Put simply, the strategy could be: use your capacity to perform manual exploratory testing to discover new test cases, and automate them all to cater to regression testing.

At the mention of automated functional testing, some organizations pour their efforts into amping up macro-level tests in the higher layers of the application, i.e., they keep adding more and more UI-driven end-to-end functional tests and entirely miss the micro-level tests in the lower layers of the application. For example, one of the teams I was consulting for had more than 200 UI-driven end-to-end functional tests, which took 8 hours to run every day, only to fail in the end due to the brittle nature of macro-level tests. This is clearly an antipattern: it defeats the goal of getting fast feedback from automated testing, and it doesn't provide stable feedback either. This is why teams need to add both micro- and macro-level tests as part of their automated functional testing efforts. We shall now discuss the different types of micro- and macro-level tests.

Introduction to Micro and Macro Test Types

In order to understand the different micro- and macro-level tests, you essentially need to observe four of their traits: the scope at which they operate, the purpose they fulfill, their swiftness in giving feedback, and the effort needed to create and maintain them. Based on this understanding, you can tailor the automated testing efforts for your project by including or eliminating some of them. We will take the same example eCommerce application, as seen in the figure below, to understand these different test types.

[Figure: the eCommerce application's layers (UI, services, DB), its external integrations, and the micro- and macro-level tests at each layer]

As you can observe from the figure, the eCommerce application has three application layers: the eCommerce UI, the RESTful services (authentication, customer, and order services), and the DB. The UI interacts with the services to process information, and the services communicate with the database to store/retrieve relevant information. The application also integrates with an external product information management (PIM) service and with downstream systems, such as the warehouse management system, in order to fulfill orders.

A typical user flow on the application would be: the user enters their credentials in the eCommerce UI, the credentials are sent to the auth service for verification, and on successful login, the user searches for products and places orders from the eCommerce UI. The responsibility of the order service is to receive the orders placed by the user, validate the product information against the external vendor PIM service, and pass the order on to the warehouse management system to trigger the delivery processes.

With such components and their responsibilities, the figure also shows the different micro- and macro-level tests required at the appropriate layers to fulfill the automated functional testing needs of this application holistically. Let's unfold them one by one.

Unit Tests

In the figure, you will find unit tests in all of the services and also in the UI layer of the application. Unit tests aim to create safety nets at the micro levels of the application. They validate the smallest portion of an application's functionality; for example, a unit test verifies a method's behavior in a class. This is the level at which to add automated tests for most of the basic input validation, not the UI layer.

Let's say we have a method return_order_total(item_prices) in the order service of the eCommerce application, which returns the total order amount. The following are some of the unit tests that can be added to verify its behavior (a sketch follows the list):

  • Return the total amount when items have negative prices due to discounts.

  • Return total amount when item_prices value is empty.

  • Return the total amount when item prices are corrupted, e.g., contain letters, symbols, etc.

  • Return total amount when item prices are sent with different money formats in case the application supports localization.

  • Return a properly-rounded total amount with fixed decimal values.
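
To make this concrete, here is a minimal JUnit 5 sketch of a few of these cases. The Java method below is only a hypothetical stand-in for return_order_total(item_prices) so the sketch is self-contained; the real logic lives in the order service, and the "sum and round to two decimals" rule shown is an assumption.

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import java.math.BigDecimal;
    import java.math.RoundingMode;
    import java.util.List;

    import org.junit.jupiter.api.Test;

    class OrderTotalTest {

        // Hypothetical stand-in for the order service's return_order_total(item_prices),
        // included only so the sketch compiles on its own.
        static BigDecimal returnOrderTotal(List<BigDecimal> itemPrices) {
            return itemPrices.stream()
                    .reduce(BigDecimal.ZERO, BigDecimal::add)
                    .setScale(2, RoundingMode.HALF_UP);
        }

        @Test
        void returnsTotalWhenItemsHaveNegativePricesDueToDiscounts() {
            assertEquals(new BigDecimal("90.00"),
                    returnOrderTotal(List.of(new BigDecimal("100.00"), new BigDecimal("-10.00"))));
        }

        @Test
        void returnsZeroWhenItemPricesIsEmpty() {
            assertEquals(new BigDecimal("0.00"), returnOrderTotal(List.of()));
        }

        @Test
        void roundsTheTotalToTwoDecimalPlaces() {
            assertEquals(new BigDecimal("10.01"),
                    returnOrderTotal(List.of(new BigDecimal("10.005"), new BigDecimal("0.004"))));
        }
    }

Each test pins down one behavior of the method in isolation, which is what keeps unit tests fast and their failures easy to localize.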

Unit tests reside within the application codebase, tightly coupled to the code they test, and are written by developers. In teams that follow test-driven development (TDD), developers write unit tests before the application code, make them fail, and then add just enough application code to fix the test. This practice helps curb unwanted and untested logic in the code. JUnit, TestNG, and NUnit are some of the commonly adopted unit testing frameworks in the backend. Similarly, Jest, MochaJS, and Jasmine are a few front-end unit testing frameworks.

Unit tests are the fastest to run. Since they reside inside the developer codebase, they are easy to create and maintain. They are usually run as part of the application build stage on the local developer machine, aiding shift-left testing and giving quick, early feedback.

Integration Tests

In most medium- and large-scale web applications, there are quite a few integration points between internal components such as the services, UI, databases, caches, file systems, and so on, which may be distributed across network and infrastructure boundaries. Sometimes, applications also integrate with external third-party services and storage drives. In order to test whether all these integration points work as expected, you need to write integration tests that run against the actual integrating systems. The focus of such integration tests should essentially be to verify the positive and negative integration flows, not the detailed end-to-end functionality. As a result, they should be as small as unit tests.

In the eCommerce application example, the order service integrates with internal components such as the eCommerce UI, the database, and other services to exchange information. Similarly, it also integrates with the external vendor product information management (PIM) service and the downstream systems. So, we need to write integration tests in each service to verify if it can properly communicate with other dependent services and with the DB. And specifically, in the order service, integration tests should be added to verify the integration with the external PIM service and the downstream systems.

Integration tests can be written using the same unit testing frameworks along with specific tools to simulate the integration; for example, JUnit along with Spring Boot's Data JPA features can be used to write DB integration tests. Integration tests also reside tightly coupled within the application code, making them relatively easy for developers to create and maintain. Their swiftness depends on the time taken by the external system to respond; therefore, they can be slower than unit tests, which run in complete isolation.
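
For instance, a DB integration test using Spring Boot's Data JPA test slice might look like the following sketch; the Order entity and OrderRepository are hypothetical stand-ins for the order service's real persistence code, assumed to already exist in the production codebase.

    import static org.assertj.core.api.Assertions.assertThat;

    import org.junit.jupiter.api.Test;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.test.autoconfigure.orm.jpa.DataJpaTest;

    // Boots only the JPA slice of the application against an embedded test database,
    // so the test exercises the real mapping and queries without starting the full service.
    @DataJpaTest
    class OrderRepositoryIntegrationTest {

        @Autowired
        private OrderRepository orderRepository; // hypothetical Spring Data JPA repository

        @Test
        void savesAndRetrievesAnOrder() {
            Order saved = orderRepository.save(new Order("customer-42")); // hypothetical entity

            // Verifies the round trip to the database, not the business logic
            assertThat(orderRepository.findById(saved.getId())).isPresent();
        }
    }

The point is the narrow scope: one positive flow (and typically one negative flow) per integration point is enough, since detailed behavior is covered by other test types.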

Contract Tests

Sometimes, integration tests may not be viable if the integrating services are also under development. This is usually the case in large-scale application development, where multiple teams work independently on different services. In such projects, teams agree on a standard contract for every service and work with stubs of the dependent services until they are ready. However, when stubs are used, there is a caveat: you wouldn't know if the actual integrating service's contract changed! You will continue to build new features on top of broken contracts until you figure that out during actual integration testing with real services at the end of the development cycle. This is one of the primary reasons to have contract tests.

Contract tests are written to validate the stubs against the actual contracts of the integrating service and to provide feedback continuously to both teams as they progress with development. Contract tests don’t necessarily check for the exact data returned by the integrating service but rather only focus on the contract structure. In the eCommerce application, contract tests can be added to validate the external vendor PIM service’s contract so that whenever it changes, we can change the order service features accordingly.

Contract tests can also be written between the eCommerce UI and the services. The end-to-end workflow of contract testing involves collaboration between teams and is discussed in detail later in the chapter. Tools like Postman and Pact enable automation of this workflow. Contract tests, in general, run very fast as their scope is still as small as verifying the contract structure. They also reside well coupled with the application codebase and hence are relatively easy for developers to create and maintain, although not as simple as unit tests. The additional complexity comes from the end-to-end setup that requires collaboration from both teams.
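
As an illustration, here is a minimal consumer-side contract test sketch using Pact's JUnit 5 support (exact package names vary across Pact JVM versions); the PIM provider name, endpoint, and response fields are illustrative assumptions, not the real PIM contract.

    import au.com.dius.pact.consumer.MockServer;
    import au.com.dius.pact.consumer.dsl.PactDslJsonBody;
    import au.com.dius.pact.consumer.dsl.PactDslWithProvider;
    import au.com.dius.pact.consumer.junit5.PactConsumerTestExt;
    import au.com.dius.pact.consumer.junit5.PactTestFor;
    import au.com.dius.pact.core.model.RequestResponsePact;
    import au.com.dius.pact.core.model.annotations.Pact;
    import io.restassured.RestAssured;
    import org.junit.jupiter.api.Test;
    import org.junit.jupiter.api.extension.ExtendWith;

    @ExtendWith(PactConsumerTestExt.class)
    @PactTestFor(providerName = "pim-service")
    class PimServiceContractTest {

        // Declares what the order service (consumer) expects from the PIM service (provider)
        @Pact(provider = "pim-service", consumer = "order-service")
        RequestResponsePact productDetailsContract(PactDslWithProvider builder) {
            return builder
                .given("a product with id 42 exists")
                .uponReceiving("a request for product details")
                    .path("/products/42")
                    .method("GET")
                .willRespondWith()
                    .status(200)
                    // Only the structure and types matter, not the exact values
                    .body(new PactDslJsonBody()
                            .stringType("sku")
                            .stringType("name")
                            .decimalType("price"))
                .toPact();
        }

        @Test
        void orderServiceUnderstandsThePimContract(MockServer mockServer) {
            // In a real test, the order service's PIM client would be pointed at the Pact
            // mock server; a plain HTTP call keeps the sketch self-contained.
            RestAssured.get(mockServer.getUrl() + "/products/42").then().statusCode(200);
        }
    }

The pact file generated from such a test is then verified against the real PIM service on the provider side, which is what gives both teams continuous feedback as the contract evolves.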

Service Tests

APIs need to be treated as products themselves and tested thoroughly, independent of the UI behavior. This is the focus of service tests. Services essentially handle all the domain-specific logic such as the business rules, error criteria, retry mechanisms, data storage, and so on. They also validate the structure and value format of incoming requests and reject invalid ones.

These are the kinds of scenarios you can add as service tests. This is where macro-level testing begins, as it covers integrations, domain workflows, and so on. In the eCommerce application, some of the order service tests could be the following:

  • Verify that only an authenticated user can create a new order.

  • Verify that an order gets created only if the items are available at the point of creation.

  • Verify that orders cannot be created if the input request format is invalid.

Similarly, every service needs to have service tests for all the endpoints in it.

Service tests sometimes reside in a separate codebase; however, they are better kept as part of the service component itself to get fast feedback. They are slightly more complex to create and maintain than unit tests, as they involve real test data setup in the DB. Usually, they are owned by the testers in the team. They run quicker than the UI-driven end-to-end tests and slightly slower than the previous three micro-level test types (unit, integration, and contract). Tools like RestAssured, Karate, and Postman can be used to automate API tests.
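
Here is a minimal REST Assured sketch of two of the order service tests listed above; the base URI, endpoint path, payloads, and error body are hypothetical assumptions about the order service's API.

    import static io.restassured.RestAssured.given;
    import static org.hamcrest.Matchers.equalTo;

    import org.junit.jupiter.api.Test;

    class OrderServiceTest {

        private static final String BASE_URI = "https://test.example.com/order-service"; // hypothetical test environment

        @Test
        void rejectsOrderCreationWithoutAuthentication() {
            given()
                .baseUri(BASE_URI)
                .contentType("application/json")
                .body("{\"items\": [{\"sku\": \"sku-123\", \"quantity\": 1}]}")
            .when()
                .post("/orders")
            .then()
                .statusCode(401); // only authenticated users may create orders
        }

        @Test
        void rejectsOrdersWithAnInvalidRequestFormat() {
            given()
                .baseUri(BASE_URI)
                .header("Authorization", "Bearer <valid-test-token>") // assume a token from test setup
                .contentType("application/json")
                .body("{\"items\": \"not-a-list\"}") // malformed payload
            .when()
                .post("/orders")
            .then()
                .statusCode(400)
                .body("error", equalTo("Invalid request format")); // assumed error contract
        }
    }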

UI Functional Tests

UI-driven functional tests run on an actual browser and mimic the user's actions on the application. These tests give us feedback on the integration between multiple components, such as the services, UI, and DB. These macro-level tests should focus on validating all the critical user flows. One example of a critical user flow in the eCommerce application is searching for a product, adding the product to the cart, paying for it, and getting the order confirmation, which can be added as a UI functional test. When writing such tests, avoid validating again the detailed functionalities already covered by the lower-level tests, as this will be redundant and increase their execution time. For instance, verifying the order total for different combinations of item prices should be covered as part of unit tests and needn't be validated again as part of a UI functional test.

UI functional tests are usually kept outside the application code as a separate codebase. They mostly come under the tester's purview, although they are often owned jointly with developers. They take longer to run. They also tend to be brittle, as they depend on the entire application stack, including the infrastructure, network, and so on, being stable. Additionally, they require considerable maintenance effort compared to other types of tests, as failures could stem from anywhere across the entire application, such as a change in an element ID, a delay in page load, or unavailability of services due to environment issues.

Tools like Selenium and Cypress are quite popularly adopted to write automated UI tests. We have exercises for both of them in the chapter.
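
Below is a minimal Selenium WebDriver sketch covering the "search and add to cart" slice of the critical purchase flow; the application URL and element locators are hypothetical.

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import java.time.Duration;

    import org.junit.jupiter.api.AfterEach;
    import org.junit.jupiter.api.BeforeEach;
    import org.junit.jupiter.api.Test;
    import org.openqa.selenium.By;
    import org.openqa.selenium.Keys;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.chrome.ChromeDriver;
    import org.openqa.selenium.support.ui.ExpectedConditions;
    import org.openqa.selenium.support.ui.WebDriverWait;

    class PurchaseFlowUITest {

        private WebDriver driver;

        @BeforeEach
        void openBrowser() {
            driver = new ChromeDriver();
            driver.get("https://test.example.com"); // hypothetical test environment URL
        }

        @Test
        void userCanSearchForAProductAndAddItToTheCart() {
            driver.findElement(By.id("search-box")).sendKeys("headphones", Keys.ENTER);

            // Wait for the search results before interacting with them, to reduce flakiness
            new WebDriverWait(driver, Duration.ofSeconds(10))
                    .until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector(".product-card")));

            driver.findElement(By.cssSelector(".product-card .add-to-cart")).click();

            assertEquals("1", driver.findElement(By.id("cart-count")).getText());
        }

        @AfterEach
        void closeBrowser() {
            driver.quit();
        }
    }

Note how the assertion stays at the level of the user flow (the cart badge shows one item) rather than re-checking price arithmetic that the unit tests already cover.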

End-To-End Tests

As the name suggests, end-to-end tests should validate the entire breadth of your domain workflow, including downstream systems. In the eCommerce application, after the order is placed on the website, the downstream systems such as the warehouse management system, third-party shipping partner services, and so on, actually fulfill the order. This end-to-end domain flow needs to be tested for proper integration.

Depending on the application context, the UI functional tests themselves may turn into end-to-end tests. If not, have separate end-to-end tests that use a combination of UI, service, and DB testing tools to cover the entire integration flow. Naturally, these tests take the longest to run and require more care to maintain, as they need a stable environment and test data setup across various systems. The intent of these tests is to know whether all the components are integrated properly end to end, not to test the components' functionalities. So you can have just a few tests that exercise all your components.
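
For example, the downstream half of such a test could poll the warehouse management system after an order has been placed through the UI (using steps like the Selenium test above); the warehouse base URI, endpoint, and order ID here are hypothetical.

    import static io.restassured.RestAssured.given;
    import static org.junit.jupiter.api.Assertions.fail;

    import java.time.Duration;
    import java.time.Instant;

    import org.junit.jupiter.api.Test;

    class OrderFulfilmentEndToEndTest {

        @Test
        void placedOrderEventuallyReachesTheWarehouseSystem() throws InterruptedException {
            String orderId = "e2e-test-order-123"; // assume this was captured from the UI confirmation page

            // Poll the downstream warehouse system until it acknowledges the order, or time out
            Instant deadline = Instant.now().plus(Duration.ofMinutes(2));
            while (Instant.now().isBefore(deadline)) {
                int status = given()
                        .baseUri("https://warehouse.test.example.com") // hypothetical test instance
                        .when()
                        .get("/shipments/{orderId}", orderId)
                        .statusCode();
                if (status == 200) {
                    return; // the order reached the downstream system; the integration works
                }
                Thread.sleep(5_000);
            }
            fail("Warehouse system never acknowledged order " + orderId);
        }
    }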

That covers all the micro and macro test types and their four essential traits. The next section discusses an automated functional testing strategy that is widely adopted as the north star by software teams. With that, you will be equipped to define an automated functional testing strategy custom to your project's needs.

Automated Functional Testing Strategy

A one-liner strategy that can be applied to automated testing is: add tests to validate the right scope of functionality in the right layers of the application such that they yield the fastest feedback to the team! Mike Cohn crystallized this wisdom with a visual cue, the test pyramid, in his 2009 book Succeeding with Agile. The test pyramid recommends having a broad base of micro-level tests and gradually fewer macro-level tests as their scope increases. For example, if you have 10x unit and integration tests, you should have 5x service tests and only x UI-driven tests. Visualized on paper, this recommendation takes roughly the shape of a pyramid, hence the name. The obvious reason for such a recommendation is that as the scope of the tests increases, they take more time to run and cost more to write and maintain.

A typical test pyramid for a service-oriented web application such as our example eCommerce application may look like the figure below.

[Figure: a typical test pyramid for the eCommerce application]

I have seen the test pyramid work in practice, and so have many practitioners. One prominent example I can quote: after we transformed the earlier-mentioned project that had ~200 UI-driven end-to-end tests to adhere to the test pyramid, the team was able to get feedback within 40 minutes from about 470 tests!

Yet another part of the automation strategy should be to have a way to track automation coverage in order to ensure there is no backlog. Test management tools like TestRail, project management tools like Jira, or something as simple as an Excel sheet can be adopted for tracking the automation coverage of test cases. Tracking automation coverage is essential because, most often, teams omit or set aside the automation efforts from the user story's scope for various reasons, leading to delayed and incomplete feedback. As a side effect, teams lose confidence in the automation suite itself. So track all the test cases and ensure they are automated. An ideal practice, and the one that many Agile teams follow, is to call a user story “done” only if all of its micro and macro tests are automated.

Perspectives

We have delved deeply and broadly into the functional test automation space so far. Before closing the chapter, I would like to draw your attention to antipatterns in automated functional testing, automation test coverage, and specifically, what it means to have 100% automation coverage.

Antipatterns & Tips to Overcome

Even though you will have spent heaps of effort crafting the right automated functional testing strategy and implementing the test frameworks in the right layers, your automated functional testing task has only just begun. Throughout the delivery timeline, you should continue to watch for antipatterns in automated functional testing as the team develops more and more tests. In my observation, it is easy to fall prey to these antipatterns amid the delivery buzz, and thus being watchful for the early symptoms becomes crucial. Let's discuss a couple of antipatterns: the ice cream cone and the cupcake.

[Figure: the ice cream cone and cupcake antipatterns]

The Ice Cream Cone

When you invert the test pyramid, i.e., have more macro-level UI-driven tests and very few micro-level tests, it looks like a cone; this is referred to as the ice cream cone antipattern. You can sense the ice cream cone antipattern when you observe some of these symptoms in the project:

  • Waiting for a long period to get feedback from the tests run.

  • Catching defects later in the cycle, sometimes only during the release testing stage.

  • Elaborate manual testing is required to give feedback despite having automated tests.

  • Frustration in the team with the automated tests as the diligent efforts in automating the UI flows have not been fruitful in giving the right results.

TIP: The earliest sign at which you can prevent your team from drifting steeply towards this antipattern is when you find regression defects during manual story testing. Do a root cause analysis immediately, and fix your team's practices early.

The Cupcake

When you duplicate tests in multiple layers, each layer in your test pyramid becomes a flat same-sized slice — overall, looking like a cupcake. This kind of disorganization generally happens when you have siloed teams of developers and automated testers. For example, the developers would have added unit tests to verify all the invalid login inputs, and the testers would still add the same tests in the UI layer.

You can sense this antipattern when your team takes a long time to release even a tiny feature. You might also notice blame games, as one role will expect the other role to have added the appropriate tests whenever there is a bug.

TIP: A simple way to avoid this antipattern is to have a short discussion among the relevant roles in the team to determine which tests are expected to be written in which layer. The right avenue for such a discussion could be the story kick-off meeting, with the outcome documented in the story card itself.

100% Automation Coverage!

Teams usually track the automation coverage percentage as a metric, and a high percentage is often considered a validation of their good software development practices. The automation coverage percentage is calculated by capturing all the application's test cases, marking each as automated or not, and doing simple arithmetic to arrive at a percentage. Teams also set goals to achieve 100% automation coverage with good intentions. While doing so, let's keep certain pointers in mind.

CODE COVERAGE AND MUTATION TESTING

Code coverage is different from automation test coverage. Code coverage tells us whether there are lines of code that would not be executed by the existing unit tests; in other words, it tells us whether there are untested lines of code. Code coverage tools like JaCoCo and Cobertura are integrated into the CI build pipelines to fail the build when the code coverage percentage is less than a certain threshold, preventing untested code from percolating further. However, high code coverage doesn't necessarily mean all the test cases are automated.

In order to find the missed test cases in unit testing, a technique called mutation testing is employed. Mutation testing changes the application’s code and checks if the tests fail. For example, when there are void method calls, it removes the calls and runs the tests. The mutation is said to be ‘killed’ if the tests fail and ‘survived’ if not. PITest is one such popular mutation testing tool. It can be added as a Maven dependency and executed from the command line. It lists the test cases that survived along with an overall mutation score for the application. Mutation testing, though very effective, is time-consuming; hence, it has to be used wisely.

The first pointer on the automation coverage percentage is that it doesn't guarantee a bug-free application even if you have 100% coverage! The percentage is simply a measure of how many ‘known’ test cases are automated; you will probably discover unknown cases later. So, it is important to call out this key difference to the business and your team, as otherwise a critical bug may lead them to question the reliability of the automation suite and the effort spent on it. It is also crucial to make them understand that the expected outcome of tracking this metric is to disclose the automation backlog and plan capacity in the iterations to complete it. You could also use the tracking wisely to observe whether your team is drifting towards one of the antipatterns.

The second pointer is that when tracking automation coverage, you should observe whether all areas of the application are covered. Especially when you are developing large-scale applications where different teams work on various components, your coverage percentage can still be high (say >80%) even when one module has zero tests, because the other modules have high coverage.

The penultimate pointer on 100% automation coverage is that you should include both functional and cross-functional test cases while calculating it. Most often, cross-functional test cases don't contribute to the percentage, resulting in bugs later (we will learn more about automating cross-functional test cases in the upcoming chapters).

And finally, while you should aim to automate all the test cases, sometimes, depending upon the nature of the application, environments, automation costs, etc., it may be impossible to achieve 100% automation coverage. In such cases, you should track the non-automated test cases properly and add them to your manual testing list. That said, you should not have a tall manual test case list (subtly recalling the 1200 minutes of release testing from the beginning of the chapter!).

The greater benefits of all this meticulous tracking and ensuring proper automation coverage will start to show as the project grows, especially when it extends over a few years. As they say, code outlives people, and often the automated tests end up being the only trustworthy living documentation of the application's functionality. Thus, your efforts in writing good automated tests will prove to be an appreciable investment, not only for the project but also for you and your future teammates.