Terraform AWS Account Bootstrap

Our goal with this article is to provision a brand new AWS account with a Terraform state backend and a Terraform user, plus one or more roles, that can be used to manage AWS resources. In the past, I've done terrible things like giving full AWS admin access to a single IAM user. From a best-practices standpoint, it's better to use assumable IAM roles and a basic user whose only permission is to assume those roles.

Manual Steps

Several AWS resources must be created by hand before Terraform can manage the account. Let's run through those. For housekeeping, I tag manually created resources with Terraform=false so that I can easily find them and add them to Terraform later, where prudent. You could automate this process if you have many accounts to bootstrap, but I only needed to accommodate two AWS accounts.

Create IAM User

First, we need a user that can use the STS AssumeRole functionality. Create a user of your choice with programmatic access and use the built-in IAM policy generator to create a policy. The policy below allows the user to assume any role in the account. We will tighten this up later.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::<account>:role/*"
        }
    ]
}

Your user ARN should look something like:

arn:aws:iam::<your_account_id>:user/tf_user

Create Terraform S3 Backend Bucket

Before we can properly create more IAM role policies, we need an S3 bucket for those policies to target. We will use the bucket name and ARN later. The ARN will look something like the following:

arn:aws:s3:::<your_terraform_backend_bucket_name>

Create IAM Policies and Roles

Now we need to create some useful roles that the Terraform user can assume. We will need two roles: one for the S3 Terraform state backend and one for AWS resource provisioning and management.

But first, we need some custom policies to attach to the roles. Below is an example of the policy required by Terraform for S3 backend operations [2].

IAM Custom S3 Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucketVersions",
                "s3:ListBucket",
                "s3:GetBucketVersioning",
                "s3:PutBucketVersioning"
            ],
            "Resource": [
                "arn:aws:s3:::<your_s3_backend>",
                "arn:aws:s3:::<your_s3_backend>/*"
            ]
        }
    ]
}

Now we can create the IAM role and attach the above policy. We end up with an ARN for the S3 Terraform backend role, like so:

arn:aws:iam::<your_account_id>:role/tf_s3_backend_role
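
One detail the steps above leave implicit is the role's trust policy, which controls who is allowed to assume the role (the "Trust relationships" tab in the console). As a rough sketch, assuming you later bring this role under Terraform management, it might look something like the following. The variable name backend_external_id is a placeholder, not something from the manual steps, and the external ID condition is optional: include it only if you want the role to require an external ID from the caller.

```hcl
# Look up the current account ID instead of hard-coding it.
data "aws_caller_identity" "current" {}

# Hypothetical variable: the external ID that callers must present.
variable "backend_external_id" {
  type = string
}

resource "aws_iam_role" "tf_s3_backend_role" {
  name = "tf_s3_backend_role"

  # Trust policy: only tf_user may assume this role, and only when it
  # presents the expected external ID.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = "sts:AssumeRole"
        Principal = {
          AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:user/tf_user"
        }
        Condition = {
          StringEquals = {
            "sts:ExternalId" = var.backend_external_id
          }
        }
      }
    ]
  })
}
```

If you stay with the console for now, the object passed to jsonencode corresponds one-to-one to the JSON you would paste into the trust relationship editor.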

Now, for the tricky part. We need to create an assumable role for Terraform provisioning operations. For now, I am going to cheat and give the role the AWS managed AdministratorAccess policy. We will scale this level of access back in another article, as it's a fussy process, but ideally we want to grant as little privilege as possible. With that, we end up with a role for provisioning, like so:

arn:aws:iam::<your_account_id>:role/tf_provisioning_role
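
For Terraform to actually use this role, the AWS provider has to be told to assume it. One common way to do that is the provider's assume_role block. The following is a minimal sketch, assuming Terraform variables named aws_access_key_id, aws_secret_access_key, and account_id are declared; those names, the region, and the session name are illustrative assumptions.

```hcl
# Assumed variable declarations. Terraform fills each one from a
# TF_VAR_<name> environment variable when it is set.
variable "aws_access_key_id" {
  type      = string
  sensitive = true
}

variable "aws_secret_access_key" {
  type      = string
  sensitive = true
}

variable "account_id" {
  type = string
}

provider "aws" {
  region     = "us-west-2"
  access_key = var.aws_access_key_id
  secret_key = var.aws_secret_access_key

  # Exchange the basic user's credentials for temporary credentials
  # on the provisioning role before making any API calls.
  assume_role {
    role_arn     = "arn:aws:iam::${var.account_id}:role/tf_provisioning_role"
    session_name = "terraform-provisioning"
  }
}
```

With this in place, the IAM user's own permissions stay limited to sts:AssumeRole, and everything Terraform creates is done under the role.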

Cleanup User IAM Policy

Go back and modify the IAM policy for your IAM user to explicitly use the role ARNs we just created. Your IAM user policy should look something like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": [
                "arn:aws:iam::<your_account_id>:role/tf_s3_backend_role",
                "arn:aws:iam::<your_account_id>:role/tf_provisioning_role"
            ]
        }
    ]
}

Setup Terraform with Terragrunt

Now we need to tell Terraform how to use the new S3 backend. For my purposes, I use the Terragrunt binary to generate the S3 backend configuration per Terraform workspace. It works pretty well. Below are the contents of a file named terragrunt.hcl, where our Terragrunt configuration resides. More about Terragrunt can be found at https://terragrunt.gruntwork.io/docs/getting-started/quick-start/#keep-your-backend-configuration-dry.

# Write tg-backend.tf into the working directory on each run so the
# S3 backend configuration lives in one place.
generate "s3_backend" {
  path      = "tg-backend.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
terraform {
  backend "s3" {
    # Credentials of the basic IAM user, read from the environment.
    access_key  = "${get_env("TF_VAR_aws_access_key_id")}"
    secret_key  = "${get_env("TF_VAR_aws_secret_access_key")}"
    bucket      = "${get_env("TF_VAR_s3_backend_bucket")}"
    key         = "terraform.tfstate"
    region      = "us-west-2"
    encrypt     = true
    # Role assumed for all state reads and writes.
    role_arn    = "arn:aws:iam::${get_env("TF_VAR_account_id")}:role/tf_s3_backend_role"
    external_id = "<random_identifier_can_be_anything>"
  }
}
EOF
}

From the above file contents, we can see that sensitive items are pulled from the shell's environment variables. We can also see our previously configured S3 backend role being referenced. This configuration works well for solo developers or when run from a CI/CD pipeline. Now, we should be able to run terragrunt init, like so:

> terragrunt init
Initializing modules...

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/aws from the dependency lock file
- Using previously-installed hashicorp/aws v3.65.0

Terraform has been successfully initialized!

After init, we can try an apply to ensure it all works:

> terragrunt apply

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # module.vpc_usw2_network_public.aws_internet_gateway.this[0] will be created
  + resource "aws_internet_gateway" "this" {
      + arn      = (known after apply)
      + id       = (known after apply)
      + owner_id = (known after apply)
      + tags     = {
          + "Name" = "vpc-us-west-2-public"
        }
      + tags_all = {
          + "Name" = "vpc-us-west-2-public"
        }
      + vpc_id   = (known after apply)
    }

 <...redacted...>
 
  # module.vpc_usw2_network_public.aws_vpc.this[0] will be created
  + resource "aws_vpc" "this" {
      + arn                              = (known after apply)
      + assign_generated_ipv6_cidr_block = false
      + cidr_block                       = "10.0.1.0/24"
      + default_network_acl_id           = (known after apply)
      + default_route_table_id           = (known after apply)
      + default_security_group_id        = (known after apply)
      + dhcp_options_id                  = (known after apply)
      + enable_classiclink               = (known after apply)
      + enable_classiclink_dns_support   = (known after apply)
      + enable_dns_hostnames             = true
      + enable_dns_support               = true
      + id                               = (known after apply)
      + instance_tenancy                 = "default"
      + ipv6_association_id              = (known after apply)
      + ipv6_cidr_block                  = (known after apply)
      + main_route_table_id              = (known after apply)
      + owner_id                         = (known after apply)
      + tags                             = {
          + "Name" = "vpc-us-west-2-public"
        }
      + tags_all                         = {
          + "Name" = "vpc-us-west-2-public"
        }
    }

Plan: 8 to add, 0 to change, 0 to destroy.

Do you want to perform these actions in workspace "prod"?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: 

Looks like it all works!

References

  1. https://8thlight.com/blog/mike-knepper/2021/05/11/minimally-privileged-terraform.html
  2. https://www.terraform.io/docs/language/settings/backends/s3.html