Written by: Everton Morais & Ritwick Gupta

Continuous Integration and Continuous Deployment (CI/CD) is an increasingly popular approach to software development that promises faster delivery of high-quality code. At Nuvalence, we often recommend this approach for some of our most impactful cloud transformations. While CI/CD offers many benefits, it also brings challenges for developers, and one of the main ones relates to testing: a critical step in the software development process that becomes even more crucial with the adoption of CI/CD.

We recently ran into a great example of these challenges with one of our clients, whose environments were prone to deployment issues like component collisions. Read on to learn more about the problems they had and how we helped solve them.

Background

While working with this client, we were part of a platform team tasked with creating an architecture for a state management service, which enabled other teams to create and manage state machines and state objects externally in a centralized service. We used Terraform for infrastructure management on Google Cloud Platform, a CI/CD pipeline for infrastructure and code deployment, and trunk-based development. Developers used the same Terraform script to deploy their infrastructure and code to a sandbox so they could test everything before merging to main.

The plan was to have the same application running in different environments (we considered each environment a different project). The goal was to use the same code, only changing the specific parameters for each environment.

Exhibit 1: Different environments organized as different projects

Our main.tf script created all of the components in our cloud project: a Cloud Run service with a dedicated service account, plus a Pub/Sub topic and subscription.

resource "google_service_account" "service_account" {
  account_id   = "app-car-management-service"
  display_name = "Service Account for the Car Management Service"
}

resource "google_pubsub_topic" "pubsub_topic" {
  name = "app-car-management-topic"
}

resource "google_pubsub_subscription" "pubsub_subscription" {
  name                 = "app-car-management-subscription"
  topic                = google_pubsub_topic.pubsub_topic.name
  ack_deadline_seconds = 10
}

resource "google_cloud_run_service" "service" {
  depends_on = [google_service_account.service_account, google_pubsub_topic.pubsub_topic]

  name     = "app-car-management-service"
  location = var.gcp_region

  template {
    spec {
      service_account_name = google_service_account.service_account.email

      containers {
        image = "us-central1-docker.pkg.dev/${var.gcp_project_id}/nuvalence-examples/car-management-service"

        env {
          name  = "TOPIC_NAME"
          value = google_pubsub_topic.pubsub_topic.name
        }
      }
    }
  }

  traffic {
    percent         = 100
    latest_revision = true
  }
}

These were the variables:

variable "gcp_project_id" {
  type = string
}

variable "gcp_region" {
  type    = string
  default = "us-east4"
}

Each environment folder was set up with variable values for that environment, referencing the root main.tf as a module. Following is an example of main-dev.tf for the dev environment:

module "base" {
  source = "../.."

  gcp_project_id = "playground-dev"
  gcp_region     = "us-central1"
}
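Other environments followed the same pattern, changing only the parameters. A hypothetical stage folder (the project ID playground-stage is illustrative, not from our actual setup) might look like this:

```hcl
# envs/stage/main-stage.tf -- illustrative stage environment; only the
# parameter values change, the module source stays the same
module "base" {
  source = "../.."

  gcp_project_id = "playground-stage" # illustrative project ID
  gcp_region     = "us-central1"
}
```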

This is the deployment of Cloud Run when Terraform Apply ran in the dev folder:

Exhibit 2: Deployment of Cloud Run when Terraform Apply ran in the dev folder

Because each environment lived in a different project, components sharing the same name was not an issue; names only had to be unique per project. However, when two or more developers wanted to test the same Terraform script in a single project/environment, the resulting components were hard to manage.

The Problem(s)

Terraform scripts give us a great, consistent way to deploy and manage infrastructure as code (IaC). Issues occur when developers working on different features want to test their code by deploying to the same environment. For example:

  • Component Collision: Most components cannot have the same name when deployed to the same project.
  • Component Override: Using the same deployment within the team to test changes can be hectic. Others can override important changes, and the script change management can be hard to deal with.
  • Parallelism: If the team decides to keep a single Terraform deployment for “testing purposes,” developers will need to sync on the usage of the deployment, slowing down the development process.

The Solution: A Custom Stack

Using the power of Terraform, we made some slight changes to leverage our Terraform script to produce multiple independent deployments, avoiding the problems listed earlier.

Considering the same Spring Boot application, here’s how we did it.

The Prefix Variable

A prefix (unique identifier) was created for each team member working on the same code. It could be a name or initials; the most important thing was for it to be unique per person. We first needed to create a Terraform variable for the prefix.

variable "gcp_project_id" {
  type = string
}

variable "gcp_region" {
  type    = string
  default = "us-east4"
}

variable "prefix" {
  type    = string
  default = "app"
}

The default was app, which meant that for dev, stage, and prod deployments we didn’t need to define a unique identifier for the prefix. When a deployment used a prefix other than the default app, we called it a custom stack.
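Since the prefix ends up in resource names, it has to satisfy each service’s naming rules (Cloud Run service names must be lowercase DNS-style labels, for instance). A validation block on the variable can catch a bad prefix at plan time; this is a sketch of what could be added, not part of our original script:

```hcl
variable "prefix" {
  type    = string
  default = "app"

  # Reject prefixes that would produce invalid resource names: lowercase
  # letters and digits only, starting with a letter, kept short so the
  # full resource name stays within service length limits.
  validation {
    condition     = can(regex("^[a-z][a-z0-9]{0,9}$", var.prefix))
    error_message = "The prefix must be 1-10 lowercase alphanumeric characters, starting with a letter."
  }
}
```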

Terraform Script Changes

After creating the prefix variable, we needed to make sure all components used it so their names wouldn’t collide. With the default, the Cloud Run service for dev is app-car-management-service. If Joey, one of the team members, wanted to deploy a custom stack for himself, he would set his prefix so that his Cloud Run service was called joey-car-management-service. Likewise, if Ross ran his own custom Terraform deployment, his service would be called ross-car-management-service.

The final main.tf script looked like this:

resource "google_service_account" "service_account" {
  account_id   = "${var.prefix}-car-management-service"
  display_name = "Service Account for the ${var.prefix} Car Management Service"
}

resource "google_pubsub_topic" "pubsub_topic" {
  name = "${var.prefix}-car-management-topic"
}

resource "google_pubsub_subscription" "pubsub_subscription" {
  name                 = "${var.prefix}-car-management-subscription"
  topic                = google_pubsub_topic.pubsub_topic.name
  ack_deadline_seconds = 10
}

resource "google_cloud_run_service" "service" {
  depends_on = [google_service_account.service_account, google_pubsub_topic.pubsub_topic]

  name     = "${var.prefix}-car-management-service"
  location = var.gcp_region

  template {
    spec {
      service_account_name = google_service_account.service_account.email

      containers {
        image = "us-central1-docker.pkg.dev/${var.gcp_project_id}/nuvalence-examples/${var.prefix}-car-management-service"

        env {
          name  = "TOPIC_NAME"
          value = google_pubsub_topic.pubsub_topic.name
        }
      }
    }
  }

  traffic {
    percent         = 100
    latest_revision = true
  }
}

The Custom Stack Environment

We created a custom folder inside the envs folder to house the infrastructure configuration for our custom stack, and added it to .gitignore to avoid checking it in. As we did for dev, stage, and prod, we set the variable values for this new deployment. The main-custom.tf script looked like this:

module "base" {
  source = "../.."

  gcp_project_id = "playground-dev"
  gcp_region     = "us-central1"
  prefix         = "joey"
}

As you can see, we included a value for the prefix variable: joey. That means every component in Joey’s deployment would start with joey- in the playground-dev project.

Exhibit 3: A custom folder houses the custom stack’s infrastructure configuration
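Keeping the custom folder out of version control is a one-line change. The path below assumes the folder lives at envs/custom, as in our layout:

```shell
# Keep every developer's personal custom-stack folder out of version control
echo "envs/custom/" >> .gitignore
```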

Terraform Apply

To deploy the custom stack, we ran Terraform commands in the folder that contained this main-custom.tf. After applying these changes, Joey’s custom stack was created.
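The exact commands depend on your repository layout; assuming the envs/custom folder described above, the workflow would look something like this:

```shell
cd envs/custom   # the developer's personal, git-ignored folder
terraform init   # download providers and set up state for this stack
terraform plan   # preview the prefixed resources before creating them
terraform apply  # create the custom stack (e.g., joey-car-management-service)
```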

Exhibit 4: Joey’s custom stack

Let’s say Ross also wanted to have his own Terraform deployment. Since we are not committing the custom files to version control, he would just need to create his own main-custom.tf file with his prefix information:

module "base" {
  source = "../.."

  gcp_project_id = "playground-dev"
  gcp_region     = "us-central1"
  prefix         = "ross"
}

After Ross applied the Terraform script, the playground-dev project had three of each component: one for app (the dev deployment), one for Joey, and one for Ross.

Prefixes:

  • app-: The official components for the dev environment.
  • joey-: Custom stack for Joey. All components in Joey’s stack shared this prefix.
  • ross-: Same for Ross. All components in Ross’ stack started with ross-.
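Because every resource in a stack shares its prefix, you can inspect a single developer’s stack by filtering on it. A hypothetical check with the gcloud CLI (assuming you are authenticated against the playground-dev project) might look like:

```shell
# List only Joey's Cloud Run services and Pub/Sub topics via name filters
gcloud run services list --region us-central1 --filter="metadata.name ~ ^joey-"
gcloud pubsub topics list --filter="name ~ joey-"
```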

Service Accounts

Exhibit 5: Service accounts for the dev environment and both custom stacks

Pub/Sub

Exhibit 6: Pub/Sub topics for the dev environment and both custom stacks

Cloud Run

Exhibit 7: Cloud Run services for the dev environment and both custom stacks

Conclusion

Terraform gives engineering teams the ability to deploy and manage infrastructure using scripts. However, teams can run into issues when multiple developers are testing different features or versions in the same environment. We solved this problem with resource prefixes, which let developers deploy custom stacks for testing purposes and work in parallel. The solution is easy to implement, provided you have a good understanding of your Terraform scripts. We estimate it reduced our testing effort by around 90% by eliminating the need to manage resource conflicts. While we used Google Cloud Platform as our cloud provider for this engagement, the same results can be achieved with any other cloud provider.
