Building A Robust and Efficient AWS Cloud Infrastructure with Terraform and GitLab CI/CD.

Last Updated on October 5, 2024 by Editorial Team

Author(s): Julius Nyerere Nyambok

Originally published on Towards AI.

Historically, cloud infrastructure management involved manual configuration through web consoles or command-line interfaces. This approach was prone to human error, inconsistencies, and version-control difficulties. The growing complexity of cloud environments and the demand for faster, more reliable, and reproducible infrastructure management highlighted the need for a more efficient solution.
Infrastructure as Code (IaC) is a DevOps practice that uses code to define and deploy infrastructure. Terraform by HashiCorp is an IaC tool that allows you to define and provision cloud resources using a declarative language called HashiCorp Configuration Language (HCL).
In this article, we will deploy resources on AWS through Terraform and create a CI/CD pipeline on GitLab to automate the deployment process.

Figure 1: Terraform basic flowchart

Part I: Introduction

In this project, we will define our AWS infrastructure, write the Terraform code that describes it, build the infrastructure, and automate its creation with GitLab CI/CD pipelines so that whenever a change is pushed, the pipeline runs the Terraform commands and updates the infrastructure. You require the following tools for this project (you can verify the installations as shown after the list):

  1. AWS account and a user account – our preferred cloud provider, which offers a free tier.
  2. AWS CLI – a command-line interface to authenticate our AWS credentials.
  3. Terraform – the infrastructure-as-code tool to deploy cloud resources via code. You can follow this tutorial to install it.
  4. GitLab account – to store our code in a repository and create our CI/CD pipeline.
  5. Any code editor you prefer, e.g., VS Code.
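
Once installed, a quick sanity check from your terminal confirms everything is on the PATH (a minimal sketch; version numbers will differ on your machine):

aws --version        # prints the AWS CLI version if it is installed correctly
terraform -version   # prints the Terraform version
git --version        # Git is needed to push code to GitLab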

Here is the link to the GitLab repository, which I have also mirrored on my GitHub:

GitHub – Jnyambok/Terraform-CI-CD-Pipeline: AWS infrastructure consisting of a VPC and an Amazon EC2 instance deployed through Terraform (repository mirrored from GitLab).

Part II: Infrastructure Definition

A Virtual Private Cloud (VPC) is a private, isolated section of the AWS cloud where you can launch resources. It’s akin to a private data center within the public cloud that allows you to customize the configuration, including subnets, routing tables, and security groups.
An Elastic Compute Cloud (EC2) instance is a virtual server in the cloud that provides on-demand computing capacity and resources like CPU, memory, and storage.
A security group is a firewall configuration for your services that defines what ports on the machine are open to incoming traffic.

Figure 2: Our basic AWS infrastructure

Imagine you want to create an application on AWS. You would first create a VPC to provide a private network for your web application. Then, you would launch EC2 instances within the VPC to run your application. The VPC, through a security group, would define the network configurations for the EC2 instances to ensure they communicate with each other and the outside world. This infrastructure is what we will build.

An Amazon Machine Image (AMI) is a template for creating EC2 instances. It contains the software and configuration information required to launch an instance. Think of it as a pre-packaged set of instructions for building a virtual server.

Figure 3: AMIs in action

Part III: Terraform structure definition and configuration

Terraform projects are typically structured like this:

Figure 4: Terraform structure definition
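
For reference, the layout we will build up over the next sections looks roughly like this (a plain-text sketch of the same structure; only the files discussed in this article are shown, and backend.tf is added later):

.
├── main.tf          # root module: wires the vpc and ec2 modules together
├── provider.tf      # AWS provider configuration (optional)
├── backend.tf       # remote state configuration (S3 + DynamoDB)
├── vpc/
│   ├── main.tf      # VPC, subnet, and security group
│   └── output.tf    # values exposed to other modules
└── ec2/
    ├── main.tf      # EC2 instance
    └── variables.tf # values expected from the vpc module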

In Terraform, modules are reusable blocks of infrastructure code that encapsulate related resources into a single unit, making your configurations more modular. Our VPC and EC2 configurations live in their own folders within the project; these are our modules. Three main files are defined in the root module:

  1. main.tf – This is the primary Terraform configuration file. Inside a module, it defines the resources you want to provision, e.g., virtual machines, databases, and containers. In the root folder, it acts as a messenger between modules, passing vital information.
  2. provider.tf (optional) – This file configures Terraform providers to interact with specific cloud platforms or services (a sketch follows this list).
  3. variable.tf (optional) – This file helps you define reusable variables with types and optional default values. It's useful if you have a large cloud infrastructure.
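
The article does not show the contents of provider.tf, so here is a minimal sketch of what it could look like for this project, assuming the AWS provider and the eu-north-1 region used later on; adjust the version constraint and region to your setup:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"   # assumption: any recent AWS provider release works for these resources
    }
  }
}

provider "aws" {
  region = "eu-north-1"    # the region used throughout this article
}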

I have provided this basic template for the project on my GitHub. Go ahead and git pull the repository, and go through Terraform's basic syntax to get familiar with it.

Let’s begin by configuring our Virtual Private Cloud. Navigate to your /vpc/main.tf and paste this block.

## We will create 1 VPC, 1 subnet, and 1 security group


# A VPC is a private, isolated section of the AWS cloud where you can launch resources.
resource "aws_vpc" "myvpc" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "myvpc"
  }
}


# A subnet is a division of a VPC that defines a range of IP addresses.
resource "aws_subnet" "pb_sn" {
  vpc_id                  = aws_vpc.myvpc.id
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true
  availability_zone       = "eu-north-1a"

  tags = {
    Name = "pb_sn1"
  }
}


# A security group is a virtual firewall that controls inbound and outbound traffic to resources within a VPC.
resource "aws_security_group" "sg" {
  vpc_id      = aws_vpc.myvpc.id
  name        = "my_sg"
  description = "Public Security"

  # Ingress refers to incoming traffic to a resource within the VPC. It specifies which ports
  # and protocols can be accessed from outside the VPC.
  # This rule allows inbound SSH traffic (port 22) from any IP address.
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Egress refers to outgoing traffic from a resource within the VPC. It specifies which ports
  # and protocols can be accessed from within the VPC.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"          # applies to all protocols (TCP, UDP, ICMP, etc.)
    cidr_blocks = ["0.0.0.0/0"] # applies to all destination IP addresses (the entire internet)
  }
}

# In essence, the egress rule grants the security group complete outbound connectivity, allowing it
# to communicate with any resource on the internet. This is useful in certain scenarios, but it's
# generally considered a security risk as it exposes the resources within the security group to
# potential threats.

We have configured a VPC, a subnet that provides a range of IP addresses, and a security group. Let’s configure our EC2. Navigate to your /ec2/main.tf and paste this block.


# An AMI (Amazon Machine Image) is a template for creating EC2 instances. It contains the software
# and configuration information required to launch an instance.
resource "aws_instance" "server" {
  ami           = "<find-a-suitable-ami>" # replace with the AMI ID you pick below
  instance_type = "m5.large"              # note: m5.large is not covered by the free tier; use t3.micro to stay within free-tier limits
  subnet_id     = var.sn                  # subnet ID, passed in from the vpc module (variables defined below)

  # vpc_security_group_ids (rather than security_groups) is used because the instance lives in a custom VPC
  vpc_security_group_ids = [var.sg]       # security group ID, passed in from the vpc module

  tags = {
    Name = "myserver"
  }
}

You require an AMI ID that fits the basic requirements for this infrastructure. I found a good AMI to use from the AMI Catalog on AWS. Copy the AMI ID (ami-xxxxxxxx) and paste it into the ami argument in your code.

Figure 5: A good AMI to use
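
If you prefer the command line, something like the following (an assumption on my part, not part of the original walkthrough) lists a recent Amazon Linux 2023 AMI in your configured region; the name filter is approximate, so double-check the result against the AMI Catalog:

aws ec2 describe-images \
  --owners amazon \
  --filters "Name=name,Values=al2023-ami-2023*-x86_64" "Name=state,Values=available" \
  --query 'sort_by(Images, &CreationDate)[-1].[ImageId,Name]' \
  --output text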

We require the subnet ID and the security group from our vpc module to use in our EC2 module. Go to /vpc/output.tf and paste the following block.

output "pb_sn" {
value = aws_subnet.pb_sn.id
}

output "sg" {
value = aws_security_group.sg.id

}

This file exposes values from the VPC module so they can be passed to other modules. In /ec2/variables.tf, we declare the variables the EC2 module expects to receive. Paste this block:

variable "sg"{

}


variable "sn" {

}

The main.tf in our root folder is where we pass important information between the two modules. We take the values exposed in /vpc/output.tf and assign them to the variables declared in /ec2/variables.tf. Navigate to your root folder's main.tf and paste:

# This is the root module, where we wire the modules together

module "vpc" {
  source = "./vpc"
}

module "ec2" {
  source = "./ec2"
  sn     = module.vpc.pb_sn
  sg     = module.vpc.sg
}

State management is vital when working with Infrastructure as Code. Terraform state is a JSON file recording the configuration of your infrastructure, enabling Terraform to track changes and maintain consistency. State locking prevents simultaneous writes to the state file, preserving data integrity, synchronization, performance, and consistency: Terraform requests a lock, checks for existing locks, and only applies changes if no lock exists. We configure where the state and lock live in a file called backend.tf.

Imagine you're working in a team of developers on the same infrastructure. Shared access to the state, versioning, and backups become essential, for example when a bad change needs to be rolled back. We can achieve this with an AWS S3 bucket, which acts as the storage location for your Terraform state file (terraform.tfstate) holding metadata about your infrastructure resources, and DynamoDB, which provides the locking mechanism that prevents simultaneous modifications.

Navigate to your AWS homepage and search for S3. Create an S3 bucket as shown below.

Figure 6: Creating a general-purpose S3 bucket

Create a folder within the S3 bucket and give it a name; this will act as the state key in our ./backend.tf file.

Figure 7: Creating a folder (state)

Navigate to your AWS homepage and search for DynamoDB. Create a table with default configurations as shown below. A mistake I made was in naming the partition key: Terraform expects it to be called "LockID", so name your partition key "LockID" and save it.

Figure 8: Creating your DynamoDB table
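
If you would rather create the same resources from the CLI instead of the console, here is a sketch using the bucket, folder, and table names that appear in the backend.tf below (substitute your own):

# Create the state bucket (regions other than us-east-1 need the location constraint)
aws s3api create-bucket --bucket mystatebucket99 --region eu-north-1 \
  --create-bucket-configuration LocationConstraint=eu-north-1

# Create the "state" folder (an empty key ending in /)
aws s3api put-object --bucket mystatebucket99 --key state/

# Create the lock table with the partition key Terraform expects
aws dynamodb create-table --table-name mydynamotable \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST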

In our root, create a backend.tf file and paste the following block:

terraform {
  backend "s3" {
    bucket         = "mystatebucket99" # your S3 bucket's name
    key            = "state"           # the folder you created in your S3 bucket
    region         = "eu-north-1"      # your AWS region
    dynamodb_table = "mydynamotable"   # your DynamoDB table name
  }
}

We require the necessary permissions to write to our newly created resources. Navigate to IAM and attach the following permissions to your user, as shown below (or from the CLI, as in the sketch after this list):
- AmazonEC2FullAccess
- AmazonDynamoDBFullAccess
- AmazonS3FullAccess
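
A CLI equivalent (the user name is a placeholder; the policy ARNs are the AWS-managed policies listed above):

aws iam attach-user-policy --user-name <your-user-name> \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
aws iam attach-user-policy --user-name <your-user-name> \
  --policy-arn arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess
aws iam attach-user-policy --user-name <your-user-name> \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess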

Figure 9: Adding permissions

Once that is set up, your project will look like this:

Figure 10: Final project structure

Part IV: Project initialization

We need to connect to our AWS account through the CLI. Navigate to your user profile and take note of your access credentials. Navigate to your CLI and enter the required credentials:

aws configure   # you will be prompted to enter your credentials
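
The prompts look roughly like this (values redacted; use the access key pair from your IAM user and the region you plan to deploy to):

$ aws configure
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: ****************************************
Default region name [None]: eu-north-1
Default output format [None]: json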

The following Terraform commands are what we will use in our pipeline:

  1. terraform init – Performs backend initialization, child module installation, and plugin installation.
  2. terraform validate – Validates the configuration files in your directory without accessing any remote services.
  3. terraform plan – Shows what actions will be taken without actually performing them.
  4. terraform apply -auto-approve – Creates or updates infrastructure according to the configuration files. Adding -auto-approve applies changes without having to type "yes" to approve the plan.
  5. terraform destroy – Destroys the infrastructure.

Figure 11: Terraform Init
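
Before wiring up the pipeline, you can run the same sequence locally once to confirm the configuration and backend work (a sketch; run from the project root):

terraform init                  # initialize the backend and download the AWS provider
terraform validate              # check the configuration for errors
terraform plan                  # preview the changes
terraform apply -auto-approve   # create the VPC, subnet, security group, and EC2 instance
terraform destroy -auto-approve # tear everything down again when you are done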

Part V: CI/CD pipeline creation and deployment

We will create a CI/CD pipeline on GitLab for automation. Create a GitLab repository (make it public) and copy the link to the repository. Add a .gitignore and copy these contents into it. Navigate to your CLI and push your code to your repository.

git remote add origin <your-repository-link>
git remote -v                  # view the configured remote repositories for your Git project

git checkout -b mlops          # create a branch
git add .                      # stage the project files
git commit -m "Initial commit"
git push -u origin mlops

It’s always a good practice to push your code to a branch and merge later.

Figure 12: Merging your branch into main

We need to create two variables on GitLab from AWS: an access key and a secret access key. Navigate to your IAM -> Users -> your created user -> Security Credentials -> Access keys. Proceed to generate an access key.

Figure 13: Creating your user access keys
Figure 14: Created AWS access keys

Navigate to your GitLab project's settings (Settings -> CI/CD -> Variables) and save the keys you created as shown below. MY_AWS_KEY is the Access key, while MY_AWS_ACCESS_KEY is the Secret Access key.

Figure 15: Saving your keys
Figure 16: Saved keys

Navigate to your GitLab repository. Create a new file and name it .gitlab-ci.yml. This file guides your pipeline on the necessary steps to take.

Figure 17: Creating the .gitlab-ci.yml

Once created, copy this block of code:

image:
  name: registry.gitlab.com/gitlab-org/gitlab-build-images:terraform
  entrypoint:
    - '/usr/bin/env'
    - 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'

# This tells the GitLab CI/CD pipeline to use a Docker image containing Terraform
# and sets up the environment for executing Terraform commands within the container.
# This allows you to run your Terraform scripts without needing to install Terraform
# directly on the GitLab Runner machine.

variables:
  AWS_ACCESS_KEY_ID: ${MY_AWS_KEY}
  AWS_SECRET_ACCESS_KEY: ${MY_AWS_ACCESS_KEY}
  AWS_DEFAULT_REGION: "eu-north-1"

# Initializes your variables.

before_script:
  - terraform --version
  - terraform init

# Steps to take before each job runs, i.e. checking the version and initializing the backend.

stages:
  - validate
  - plan
  - apply
  - destroy

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: plan
  script:
    - terraform plan -out="planfile"
  dependencies:
    - validate
  artifacts:
    paths:
      - planfile

# Creates a planfile and stores it in artifacts.

apply:
  stage: apply
  script:
    - terraform apply -input=false "planfile"
  dependencies:
    - plan
  when: manual

# "when: manual" means this stage only runs after manual intervention.

destroy:
  stage: destroy
  script:
    - terraform destroy --auto-approve
  when: manual

Save your file and commit your changes. You will be prompted to launch the pipeline. Launch it and voila! The magic begins.

Figure 18: Pipeline in progress

Once completed, navigate to your AWS homepage and you will see your infrastructure has been created.

Figure 19: Pipeline is successful

We have successfully built a robust and efficient AWS Cloud Infrastructure with Terraform and GitLab CI/CD.

Thank you for reading my article.


Published via Towards AI
