Infrastructure as Code

Infrastructure as Code (IaC) has become an integral part of many careers in ICT, particularly in relation to the cloud. Software like Kubernetes, Docker, Chef, and Puppet leads the charge in using IaC in daily operations. The HashiCorp suite fits perfectly into this genre: tools like Terraform, Vault, and Vagrant can be used to create fully automated environments. In this chapter, we introduce the basic principles of Infrastructure as Code from a theoretical perspective. This foundation is essential for designing and implementing automation well in real applications.

If we want to define IaC, this line could articulate it: “Infrastructure as Code, IaC, is the process of managing and provisioning computers via machine-readable code.” IaC has grown in tandem with the popularity of the cloud. For instance, IaC supports IaaS (Infrastructure as a Service) but is actually separate from it.

Infrastructure as a Service is one of the three service models of the cloud defined in NIST Special Publication 800-145. This model gives the consumer the capability to provision storage, networks, processing, and other fundamental computing resources. These resources can be used to run arbitrary software, such as operating systems and custom applications.

Infrastructure as Code has the following three advantages:

  • Cost reduction
  • Faster execution when you release the infrastructure
  • Reducing errors during the release of the infrastructure

The first aspect we can improve with IaC is cost. IaC reduces costs by defining the infrastructure in a file. This allows us to define different architecture and infrastructure combinations on the same hardware, without needing to buy new hardware each time. The infrastructure can then be released automatically.

If a problem occurs, you can roll back to a previous configuration. Without this facility and specific software, an engineer would have to spend days or even weeks releasing the infrastructure, creating the VM, and configuring it; IaC removes most of this effort. With IaC software, new infrastructure can be released in a matter of hours, which leads to faster execution. And since the infrastructure is released by the computer rather than by hand, errors during the release are reduced.

Using IaC for your infrastructure essentially involves writing a file that defines the infrastructure. This file can in turn be included in a CI/CD cycle, with a test stage and code review. This improves the quality of the code and potentially reduces errors during the release of the infrastructure.

Principles and Goals for IaC

When a company decides to implement IaC, some changes are necessary to smooth the transition. The following changes to team procedures and mindsets are usually necessary:

  • Support or change in the IT infrastructure becomes the norm and is not seen as an obstacle.
  • Making a change to the system is a routine, not a problem.
  • The release becomes a simple and repetitive task, which frees the engineer responsible for the infrastructure to work on more important tasks.
  • The technical team is able to define and release its own infrastructure.
  • The infrastructure can be easily rolled back in case there is an issue.
  • The infrastructure follows the principle of CI/CD, which means the infrastructure is released continuously in small pieces and not in one big release.

The adoption of IaC requires the following principles:

  • Every system must be reproducible.
  • Every system must be disposable.
  • Every system must be consistent.
  • Every system must be repeatable.
  • The design of the system always changes.

These principles are the basis on which you can build your IaC project. We will briefly describe these principles so you can better understand how to put them in place and how they help your IaC process.

Every System Must Be Reproducible

To facilitate good IaC, the infrastructure must be robust enough that all of it, or a small piece of it, can be spun up and down within a certain time frame. Because the system is defined in a machine-readable file, the infrastructure can easily be installed multiple times. All details, such as the hostname and network name, should be in the configuration file used for the IaC so that the installation is easy to repeat.
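As a minimal sketch of keeping such details in the configuration, the following Terraform fragment moves the host name and the network into input variables. The variable names, the default value, and the resource name are assumptions made for illustration; the AMI and instance type are reused from the example later in this chapter.

```hcl
# Hypothetical inputs: every environment-specific detail lives in the file
variable "hostname" {
  type        = string
  description = "Host name, applied to the instance as its Name tag"
  default     = "hcl-example"
}

variable "subnet_id" {
  type        = string
  description = "ID of an existing subnet that the instance joins"
}

resource "aws_instance" "reproducible" {
  ami           = "ami-21f78e11"
  instance_type = "a1.medium"
  subnet_id     = var.subnet_id

  tags = {
    Name = var.hostname
  }
}
```

Because nothing is typed in by hand at release time, the same file with the same variable values should describe the same server on every installation.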

Every System Must Be Disposable

When we talk about IaC, we imply dynamism: any part of the infrastructure can be resized, removed, and so on. These changes must not affect the correct functioning of the infrastructure, which must remain reliable and consistent across all changes. It is particularly important to have a dynamic infrastructure that can be easily disposed of in case of failure without creating an issue for other infrastructure modules.
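One way to treat a server as disposable in Terraform is to replace it instead of repairing it in place. The sketch below is an illustration under assumptions: the variable and resource names are hypothetical, and the AMI is the one used in the example later in this chapter. Changing the AMI of an `aws_instance` forces a replacement, and the `lifecycle` block asks Terraform to create the new instance before destroying the old one.

```hcl
# Hypothetical input: changing this value replaces the server
variable "ami_id" {
  type        = string
  description = "Machine image used to build the server"
  default     = "ami-21f78e11"
}

resource "aws_instance" "disposable" {
  ami           = var.ami_id
  instance_type = "a1.medium"

  lifecycle {
    # Build the replacement first, then remove the old instance
    create_before_destroy = true
  }
}
```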

Every System Must Be Consistent

When a system is created using IaC, it is essential that, for example, two servers configured with the same script are identical. The only differences should be the IP address and the server name.

This principle prevents inconsistency. For example, when we create a server in GCP or AWS, we define it using a template that specifies the size of the disk, the number of CPUs, etc. This consistency is important in order to avoid disruption of the services.
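A short Terraform sketch of this idea: one resource definition acts as the template, and `count` creates two servers from it. The resource name and the naming scheme are assumptions for illustration; the AMI and instance type are reused from the example later in this chapter.

```hcl
resource "aws_instance" "web" {
  # Two servers built from the same definition
  count         = 2
  ami           = "ami-21f78e11"
  instance_type = "a1.medium"

  tags = {
    # Only the name differs: web-1 and web-2
    Name = "web-${count.index + 1}"
  }
}
```

Apart from the name tag and the addresses assigned by the provider, both instances share every setting in the block.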

Every System Must Be Repeatable

The core principle of IaC is to have a repeatable process, and the other principles of IaC depend on it. When we define an IaC file, it must be possible to run it repeatedly, and the result must be identical on every execution. If the process is not repeatable, that is, if the outcome changes from one execution to the next, the infrastructure becomes unstable.
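One practical contributor to repeatability is pinning the versions of the tool and of its providers, so that every run works with the same plugin versions. The following is a minimal sketch for Terraform with the AWS provider; the version constraints are illustrative assumptions, not values taken from this chapter.

```hcl
terraform {
  # Illustrative constraint on the Terraform CLI itself
  required_version = ">= 1.0.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Illustrative constraint: accept 5.x releases only
      version = "~> 5.0"
    }
  }
}
```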

The Design of the System Always Changes

With a conventional infrastructure, changes tend to incur cost overruns and delays. Due to the nature of IT, no one can foresee how frequently changes will need to be made to the infrastructure. IaC, especially in conjunction with a cloud environment, can result in faster turnarounds. Automation with proven configurations tested in lower environments reinforces this.

Implementing IaC

To successfully implement IaC, a defined process needs to be followed. The cornerstone of IaC is the definition file. The definition file is used to define every single piece of the infrastructure, such as servers, databases, networks, etc. Tools such as Chef, Ansible, and Terraform facilitate this, each with its own domain-specific language (DSL).

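# EC2 instance; the AMI must exist in the chosen region and match the instance type
resource "aws_instance" "hcl_example" {
  ami               = "ami-21f78e11"
  availability_zone = "us-east-1a" # an availability zone, not only the region
  instance_type     = "a1.medium"

  tags = {
    Name = "HCL-EC2"
  }
}

# 1 GiB EBS volume; it must be in the same availability zone as the instance
resource "aws_ebs_volume" "ebs_volume" {
  availability_zone = "us-east-1a"
  size              = 1

  tags = {
    Name = "EBS-volume"
  }
}

# Attach the volume to the instance as /dev/sdh
resource "aws_volume_attachment" "hcl_vol_attachment" {
  device_name = "/dev/sdh"
  volume_id   = aws_ebs_volume.ebs_volume.id
  instance_id = aws_instance.hcl_example.id
}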

The preceding listing presents part of a Terraform configuration file. It defines three resources: an AWS instance, an EBS volume, and the attachment that connects them. Each resource block holds the arguments needed to create that resource, and tags give the instance and the volume a readable name. This code organization helps you isolate different pieces of infrastructure and makes versioning simpler. Defining infrastructure in a file also helps create documentation for the project, since documentation can be easily extracted from the definition file. It also makes testing much easier.

Automatic code release means human interaction is reduced or totally absent, in this context allowing for continuous testing. Another advantage of IaC is that large changes can be broken down into smaller pieces, reducing downtime. Reliability and security are also improved since a small, continuous process produces fast feedback or results. This feedback in turn enhances quality since a quick rollback is possible in case of errors or bugs.

Dynamic Infrastructure and the Cloud

A dynamic infrastructure is a platform that provides computing resources, such as memory, disk space, or network resources, that can be programmatically defined. Reading this definition, you can easily see that the cloud and dynamic infrastructure go hand in hand. Good examples are public IaaS such as AWS or GCP.

A dynamic infrastructure is not limited to the public cloud; it can also be created using software for virtualization, such as OpenStack.

Any dynamic infrastructure must have some specific characteristics:

  • Must be programmable
  • Must be on-demand
  • Must be self-service

Analyzing these characteristics, we find a striking similarity with the NIST definition of the cloud; a dynamic infrastructure can be thought of as a small cloud. NIST defines the cloud by five essential characteristics:

  • On-demand and self-service
  • Broad network access
  • Resource pooling
  • Rapid elasticity
  • Measured service

In a dynamic infrastructure, resource pooling is not immediately apparent or necessary. However, the sharing of infrastructure across different users or business units is a natural evolution.

The first characteristic of the dynamic infrastructure is programmability. To define the dynamic infrastructure, we need an API to create and maintain the infrastructure itself. This API can expose a REST interface or another kind of interface. Every public cloud normally also has an SDK that makes it easier to define and create the infrastructure programmatically.

Another important characteristic of a dynamic infrastructure is that it must be on-demand. Essentially this means we can create and destroy the infrastructure as needed, which is important from a budgetary and flexibility point of view.

Connected with on-demand is self-service. This capability allows users to change or define the infrastructure themselves, based on their specific needs, without waiting for someone else to do it for them. For example, a user may need more storage than most.

Different Types of Dynamic Infrastructures

There are different types of dynamic infrastructures. It is important to understand the different types and definitions to better identify what is best for the current project and what is prudent from a business sense. As stated, the definition of a dynamic infrastructure is essentially similar to the cloud definition. There are three main definitions:

  1. SaaS, Software as a Service: This is software shared across different users. Examples of SaaS are Gmail and Office 365.

  2. PaaS, Platform as a Service: This is a platform used to create software and hence to develop and host the software.

  3. IaaS, Infrastructure as a Service: This is a platform for creating a basic infrastructure.

A dynamic infrastructure is essentially IaaS. IaaS is used to define the basic infrastructure of the system, which in turn allows software to be built on top of that infrastructure. IaaS can be private or public; the difference is the level of access. In a private IaaS, the user can build everything from scratch: the provider offers the basic components and the user builds their own infrastructure. Examples of public IaaS are Google Cloud, AWS, and Azure.

An important decision about the dynamic infrastructure is the business case for private or public cloud infrastructure.

The first consideration is the security of the data, which differs between a private and a public infrastructure. Concerns mostly revolve around the type of data and its corresponding legislation. In a public cloud infrastructure, the data center is not inside the company, so we do not have direct control over the data. We may not know who has access to the data or what level of security is applied to it. Conversely, in a private infrastructure, we retain full control of the data, and access is restricted due to our control of physical security. One possible answer is a hybrid solution. In this configuration, it is possible to keep the data inside the company and have only the OS and the network management segments in the public infrastructure.

Another consideration is scalability relative to need. With a public infrastructure, it is easy to scale the infrastructure up and down and pay only for what is needed. This is important when the system needs to handle a huge number of requests connected with a specific momentary event.

The last point to consider is the cost of the infrastructure and the total cost of operation. A private infrastructure can have higher costs than a public infrastructure. The expense to consider includes not only the cost of the initial hosting but also ongoing maintenance.

Tools for IaC

To design and maintain IaC, it is important to have the right tools. There is a wide range of tools available, and the choice usually comes down to familiarity. The options generally boil down to software like CloudFormation, OpenStack, or Terraform. All Infrastructure as Code processes are based on the automation principles described earlier in the chapter. Some basic requirements are

  1. Scriptable interface
  2. Support for the CLI (command line interface)
  3. Must be reliable
  4. Must have an external configuration file

The first requirement of a good tool is to have a scriptable interface. To enable full automation, the tool must be configured via code using an API or the internal CLI. For example, in Terraform it is possible to use HCL to define the infrastructure, allowing the smooth creation of the actual configuration.

The first requirement drives the second and third requirements. Support for the CLI speeds up maintenance: it allows the system administrator to execute commands directly against the infrastructure created, and rapid changes can also be made through the CLI.

The third requirement, reliability, in this context means that the script produces consistent results on each iteration. To achieve this, some script characteristics can be defined:

  1. Reliable: The script must return the same result every time we execute it with the same input. This is important for building trust.

  2. Input and output check: The script must check the output for each input. If there is an issue with the result, then it must return a clear message to the user. This lends itself to better security and maintenance.

  3. Clear failure: The script must have a clear condition in case of an error. This is important for having a clear status about the code and for understanding what’s happening when the code is running.

  4. Parameterized: The script must be able to accept parameters or arguments to configure an execution, such as the path for the log or the user executing the process.

These characteristics are not new for developers but are a new concept for some system administrators. These characteristics help you write a good and stable script for Infrastructure as Code.
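Several of these characteristics can be expressed directly in a Terraform definition file. The sketch below is an illustration under assumptions: the variable name, its default, and the rule it enforces are hypothetical. It shows a parameter with a default value, a check on the input, and a clear error message when the check fails. Validation blocks require Terraform 0.13 or later.

```hcl
# Hypothetical parameter: where the process writes its log
variable "log_path" {
  type        = string
  description = "Absolute path of the log file"
  default     = "/var/log/iac/run.log"

  validation {
    # Input check: accept absolute paths only
    condition     = substr(var.log_path, 0, 1) == "/"
    error_message = "The log_path value must be an absolute path starting with a slash."
  }
}
```

With a definition like this, an invalid value stops the run before any infrastructure is touched, and the message tells the user what to correct.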

Defining IaC

The main component of Infrastructure as Code is the definition file. The definition file is used to create and maintain the infrastructure. Its purpose is to describe what type of infrastructure you want to build and its parameters. A definition file is a plain text file; the following is an example of a Vault definition file.

storage "consul" { # (1)
  address = "127.0.0.1:8500"
  path    = "vault"
}
listener "tcp" { # (2)
  address     = "127.0.0.1:8200"
  tls_disable = 1
}
telemetry { # (3)
  statsite_address = "127.0.0.1:8125"
  disable_hostname = true
}

If you analyze this configuration file, you can identify the basic characteristic of a definition file: sections (1), (2), and (3) define the basic components of the infrastructure, in this case storage, listener, and telemetry. Every section holds the values used to configure that specific component. So, if you analyze the first section,

storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault"
}

it is possible to see the address and path. The address is the IP address and port of the Consul server, and the path is the location in Consul where the data is stored. A definition file has another advantage: it is self-documenting. This is essential in order to better understand what the file does.

Releasing IaC

It is necessary to adopt some engineering practices to realize better maintenance of the infrastructure. It is possible to identify some basic patterns for the different sections of the definition file. These patterns can be split into the following different phases:

  1. Provisioning
  2. Management
  3. Updating
  4. Definition

These four phases are needed to implement Infrastructure as Code. Provisioning is the process of building an element of infrastructure such as a server, and it is one of the basic processes of Infrastructure as Code. To be effective, provisioning must have the following characteristics:

  1. On-demand
  2. Define once
  3. Transparent

On-demand means the infrastructure can be easily rebuilt without undue effort. The infrastructure can be executed and deployed without any constraints. Define once specifies that every element in IaC is defined once and can then be rebuilt any number of times as needed. This characteristic is important because in case of error we can rebuild the entire infrastructure multiple times without changing the result of the operation. Transparent is the requirement that each element in the definition file must be easy to read and modify.
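In Terraform, one way to approach "define once" is to put the definition in a module and instantiate it as often as needed. The sketch below is hypothetical: the module path `./modules/web_server`, its `environment` input, and the environment names are assumptions made for illustration. Using `for_each` on a module requires Terraform 0.13 or later.

```hcl
# The server is defined once, inside the hypothetical local module
module "web_server" {
  source   = "./modules/web_server"
  for_each = toset(["dev", "test", "prod"])

  # Assumed input variable of the hypothetical module
  environment = each.key
}
```

Each environment is built from the same definition, so a correction made in the module reaches all three the next time the configuration is applied.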

Management is the process of managing and maintaining the servers. It also covers saving and versioning the definition file. Updating is the creation of a new piece of software or the update of the existing infrastructure, for instance with a new version of the OS.

One of the most important changes driven by Infrastructure as Code is to enable the continuous release of the infrastructure. The release of the infrastructure is used mostly to synchronize the infrastructure, correcting its state. There are essentially two patterns for synchronizing the infrastructure:

  1. Pushing the change
  2. Pulling the change

These two patterns are used to apply the changes to the infrastructure. The main difference between them is who initiates the change. With the pushing pattern, there is a central node that manages the changes and pushes them at scheduled times. For example, every 20 minutes the central node pushes the changes to the clients connected to it. With the pulling pattern, the client checks with the central server whether there is a change to receive.

Pushing vs. Pulling

Pushing and pulling are two methods used to update and install the new infrastructure. There are pros and cons to both methods. Push uses a central management server to push the changes to the client. This configuration requires a client running on all nodes. The client receives the configuration to apply on the host. The client normally communicates with the server using an SSH channel. Ansible, for example, uses the SSH daemon in conjunction with Python. The advantage of push is having a centralized server to push the configuration for the infrastructure. One disadvantage lies in the architecture design: every node depends on the central server, whereas a loosely coupled architecture aims to reduce such dependencies.

In comparison, the pull method has an agent on each host which at a specific time checks for a configuration from a central repository. One of the advantages is security. This method requires the host to connect to a central system. This helps with security concerns because the central system talks with the client via just one port and one user. In the push pattern, the system can have more than one user and port to configure and thus much more to secure. Deciding whether to use a push or pull system requires an understanding of the underlying architecture. A pull-based system is generally more scalable than a push-based system, although this depends on the configuration of the system.

A pull-based system doesn’t rely on only one node to pull down the configuration. This means it is possible to add a number of nodes and scale the architecture. On the other hand, the push system has a central node to manage the nodes, which makes it harder to scale than a pull system. To scale up the push system, the design needs more than one central node, so that different parts of the infrastructure get their configuration from different servers. The split can be based, for example, on geographical area, but we then need to be sure all of the central nodes have exactly the same configuration. This increases the complexity of the push system and requires more resources.

Engineering Practices for IaC

The central idea behind Infrastructure as Code is to apply the same principles used in software engineering. The main focus of software engineering practice is to enhance the quality of the system. This is done by following these practices and principles:

  1. To deliver working code at an early stage
  2. To deliver said code in a continuous manner
  3. To build code of a high quality
  4. To build a small piece of code at every iteration in a continuous cycle
  5. To ensure that each release achieves the highest quality
  6. To obtain constructive feedback for every release
  7. To expect change

These principles are inspired by the Agile Manifesto, which drives change and hence the development of Infrastructure as Code. Applying these principles in your IaC development helps you introduce and develop high quality code. The main focus is to improve the quality of the system.

Improving the System Quality

One of the most important engineering principles is system quality. Having good quality is key to having a system that is maintainable and scalable. To build quality into your system, it is necessary to put in place some practices in your development life cycle:

  1. Use version control.
  2. Unit test the code.
  3. Build a CI.
  4. Comment the code.

These practices are common in software development. Applied to IaC, they improve quality, which makes the system more stable and easier to scale or maintain. It is easy to apply these practices to the IaC definition since IaC is defined using a file.

Quality of software is not only concerned with the expected response of functions but also with the capacity to maintain code, comment the code, and unit test the code. All of this helps maintain the code when changes are needed. To enhance overall quality in the system, all of the practices listed above must be applied at each iteration. The result is fast feedback on each iteration, which lets you analyze whether your infrastructure is stable and reliable.

Version control (versioning the infrastructure) is important, and it is easy to apply because the definition of the system is a text file that can be versioned in a source control system like Git. Having the file under version control enables another engineer to perform a code review. This improves the quality of the code and also spreads knowledge of it, encouraging constructive discussion.