Using the Serverless framework and AWS sts:assume-role to cross-deploy to different AWS accounts

To assume a role in another account, the account that owns the role needs to grant a ‘trust relationship’ to the principals allowed to assume it. This is done by referencing the IAM user or role in the other account that is allowed to assume the role.

You can do this in the Console using the Trust Relationships tab.

A policy to grant access to a specific IAM user looks like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::ACCOUNT-ID:user/USER-ID"
      },
    "Action": "sts:AssumeRole"
    }
  ]
}
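
If you prefer the CLI to the Console, here’s a sketch of applying this trust policy with aws iam update-assume-role-policy (assuming the policy above is saved as trust-policy.json; the role name is a placeholder):

aws iam update-assume-role-policy --role-name ROLE-NAME --policy-document file://trust-policy.json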

To assume this role, use the ‘aws sts assume-role’ CLI command:

aws sts assume-role --role-arn arn:aws:iam::ACCOUNT-ID:role/ROLE-NAME --role-session-name SESSION-NAME

If this is successful, you’ll see a response containing temporary values for the following AWS credentials, which can be used from this point on:

  • AccessKeyId
  • SecretAccessKey
  • SessionToken
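
The response looks something like this (values trimmed and replaced with placeholders):

{
  "Credentials": {
    "AccessKeyId": "ASIA...",
    "SecretAccessKey": "...",
    "SessionToken": "...",
    "Expiration": "2020-01-01T12:00:00Z"
  },
  "AssumedRoleUser": {
    "AssumedRoleId": "AROA...:SESSION-NAME",
    "Arn": "arn:aws:sts::ACCOUNT-ID:assumed-role/ROLE-NAME/SESSION-NAME"
  }
}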

The returned values can be used to set env vars to use with the CLI and other AWS SDK apps:

  • export AWS_ACCESS_KEY_ID=
  • export AWS_SECRET_ACCESS_KEY=
  • export AWS_SESSION_TOKEN=
  • export AWS_DEFAULT_REGION=
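
To avoid copy/pasting each value by hand, here’s a minimal sketch that extracts the values with jq (assuming jq is installed; the role ARN, session name and region are placeholders):

# Capture the assume-role response, then export each temporary credential
CREDS=$(aws sts assume-role --role-arn arn:aws:iam::ACCOUNT-ID:role/ROLE-NAME --role-session-name SESSION-NAME)
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r '.Credentials.SessionToken')
export AWS_DEFAULT_REGION=us-east-1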

For Serverless to deploy into another account, the assumed role also needs the right permissions. If you attempt a Serverless deploy at this point, you’ll see errors like:

User: arn:aws:sts::ACCOUNT-ID:assumed-role/ServerlessLambdaDeployRole/lambdadeploy is not authorized to perform: cloudformation:CreateStack on resource: arn:aws:cloudformation:us-east-1:TARGET-ACCOUNT-ID:stack/deploy-demo/*

In this case cloudformation:CreateStack is missing from the assumed role. If you incrementally work through the additional permissions needed for a deploy, you’ll find you also need to add the following (a combined policy sketch follows the list):

  • cloudformation:DescribeStackEvents
  • cloudformation:DescribeStackResource
  • cloudformation:ValidateTemplate
  • cloudformation:UpdateStack
  • cloudformation:DeleteStack
  • apigateway:POST
  • iam:CreateRole
  • iam:PutRolePolicy
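
Here’s a sketch of a policy statement granting these actions; the wildcard Resource is for illustration only, so scope it down to your stack and role ARNs where you can (ValidateTemplate is the exception, covered next):

{
  "Sid": "ServerlessDeployPermissions",
  "Effect": "Allow",
  "Action": [
    "cloudformation:CreateStack",
    "cloudformation:DescribeStackEvents",
    "cloudformation:DescribeStackResource",
    "cloudformation:UpdateStack",
    "cloudformation:DeleteStack",
    "apigateway:POST",
    "iam:CreateRole",
    "iam:PutRolePolicy"
  ],
  "Resource": "*"
}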

ValidateTemplate appears to throw an error unless its Resource is the wildcard ‘*’; with anything more specific you’ll see this error:

Error: The CloudFormation template is invalid: User: arn:aws:sts::ACCOUNT-ID:assumed-role/ServerlessLambdaDeployRole/lambdadeploy is not authorized to perform: cloudformation:ValidateTemplate

To grant permission for ValidateTemplate, specify a Resource of “*”:

{
  "Sid": "CreateCloudFormationStackValidate",
  "Effect": "Allow",
  "Action": [
    "cloudformation:ValidateTemplate"
  ],
  "Resource": "*"
}

The STS temporary credentials will expire after 1 hour, so if you see this error:

An error occurred (ExpiredToken) when calling the AssumeRole operation: The security token included in the request is expired

then you’ll need to rerun the ‘aws sts assume-role’ command. If you previously set the session token in AWS_SESSION_TOKEN, you’ll need to clear it (along with AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) before you run the command again. When you get the refreshed values, remember to set the env vars with the updated values.
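
Clearing the stale values is a one-liner:

unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN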

At this point, if you’ve run ‘aws sts assume-role’ and set the env vars for the returned temporary credentials, you’ll be able to run a ‘serverless deploy’ into the other account where the assumed role has the permissions to deploy. This covers the permissions for creating a new Lambda and API Gateway; if you’re deploying anything else from your serverless config, you’ll need to add those permissions to the role you’re assuming.

Using different profiles with the AWS CLI

With the AWS CLI you can configure a number of named profiles with credentials for different accounts. These are stored in ~/.aws/config (with the access keys themselves in ~/.aws/credentials).

To show the profile currently in use:

aws configure list

To view all configured profiles:

aws configure list-profiles

The default profile is used when you don’t specify a named profile. The others are used when you pass the --profile profile-name parameter.
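
For example, a ~/.aws/config with a default and one named profile looks something like this (the profile name and regions are placeholders):

[default]
region = us-west-2

[profile dev-account]
region = us-east-1

Any command can then target the named profile:

aws s3 ls --profile dev-account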

Summary: What you need to deploy a Docker container to AWS ECS Fargate

In my previous post I walked through a couple of tutorials to deploy a test Docker container to AWS ECS Fargate. As a summary, here are the various parts that you need to have in place to deploy a Docker container using Fargate:

  • A Docker image, deployed to a Docker repository, e.g. either Docker Hub, or AWS ECR
  • A VPC with either a public or private subnet (or both)
  • A Security Group to define what traffic is allowed in and out to your running Container
  • An ELB load balancer, assuming you’re running more than 1 instance of a container and aren’t accessing a single instance directly with a public IP
  • An ECS Cluster
  • An ECS Task Definition
  • An ECS Fargate Service Definition to create the running instance of your task

Deploying Docker containers to AWS ECS Fargate

The interesting feature of AWS ECS Fargate is that it’s ‘serverless for containers’. Serverless broadly means you don’t need to be concerned with the provisioning and maintenance of the servers or compute that are running your code. With Fargate, you don’t have to provision compute for your Docker Containers, AWS manages the compute for you.

If you’re working with Docker containers, AWS has multiple runtime options, each with their own pros and cons:

  • running Docker on your own EC2 instances – the roll-your-own approach: you provision instances and manage everything yourself
  • AWS ECS with EC2 launch type – you still need to provision a pool of available EC2 instances on which AWS will run your containers
  • AWS EKS – managed Kubernetes
  • AWS ECS with Fargate launch type – you don’t need to provision any compute (e.g. EC2), AWS manages the compute for you

I’m taking a look at AWS ECS Fargate to see what it takes to deploy a Docker container.

An ECS cluster needs a VPC in which your container instances will run, with at least 1 public or private subnet. Steps to create a new VPC with subnets are covered here.

Following these steps from the VPC section in the ECS tutorials using the AWS Console, I created:

  • an Elastic IP to associate with my cluster for public access
  • a new VPC with 1 private subnet and 1 public subnet

I created these with the VPC Wizard, using the option for a VPC with both public and private subnets.

Apparently your public subnet doesn’t get assigned a public IP by default, so follow these steps in the guide to change this default behavior: select your public subnet, and under Actions choose the option to modify the auto-assign IP settings.

My public subnet was created in AZ us-west-2a and my private subnet is in the same AZ. The guide recommends creating an additional public and private subnet in a different AZ for high availability.

To create an ECS Fargate cluster you can use the AWS CLI like this:

aws ecs create-cluster --cluster-name your-fargate-cluster-name

This will return some stats about your newly created cluster, like:

"clusterName": "fargate-cluster1",
"status": "ACTIVE",
"registeredContainerInstancesCount": 0,
"runningTasksCount": 0,
"pendingTasksCount": 0,
"activeServicesCount": 0,
"statistics": [],
"tags": [],
"settings": [
{
"name": "containerInsights",
"value": "disabled"
}
],
"capacityProviders": [],
"defaultCapacityProviderStrategy": []

However, I’m not sure at this point how to configure the new cluster to specify the VPC and subnets I just created, so for my first cluster I’m going to use the ECS wizard in the AWS Console first, and then come back to the CLI later.

Using the wizard I selected the Networking Only option with Fargate.

I don’t need to select the ‘Create VPC’ option because I’ve already created one.

It turns out there aren’t any options to associate the VPC at this point; tasks are associated with your VPC and subnets when you create them next. So using the CLI step earlier would have created exactly the same cluster.

You need to define an ECS Task Definition that describes the task that will run on the ECS cluster. Following the tutorial here, the provided example JSON file looks like this:

{
  "family": "sample-fargate",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "fargate-app",
      "image": "httpd:2.4",
      "portMappings": [
        {
          "containerPort": 80,
          "hostPort": 80,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "entryPoint": [
        "sh",
        "-c"
      ],
      "command": [
        "/bin/sh -c \"echo '<html><head><title>Amazon ECS Sample App</title><style>body {margin-top: 40px; background-color: #333;}</style></head><body><div style=\\\"color:white;text-align:center\\\"><h1>Amazon ECS Sample App</h1><h2>Congratulations!</h2><p>Your application is now running on a container in Amazon ECS.</p></div></body></html>' > /usr/local/apache2/htdocs/index.html && httpd-foreground\""
      ]
    }
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "256",
  "memory": "512"
}

Since we’re deploying a Docker container, we need to specify a Docker image to pull from somewhere. This example provides the name of a Docker image to pull from Docker Hub, in this case httpd:2.4. To deploy your own apps, you configure your own Dockerfile for your app and publish the image to a Docker repo like Docker Hub or AWS ECR.

Register this task definition with:

aws ecs register-task-definition --cli-input-json file://task-def.json

When --cli-input-json reads your config file, it opens the output in whatever your shell’s default pager is. On my Mac in zsh it appears to open in vim with a ‘:’ prompt at the bottom of the screen; pressing ‘q’ quits and continues registering the Task Def.

You can list registered Task Definitions with:

aws ecs list-task-definitions

By default, your ECS service will only have a private IP, and would typically be exposed publicly via an ELB. You can configure the task to get allocated its own public IP by adding this config:

"networkConfiguration": {
"awsvpcConfiguration": {
"assignPublicIp": "ENABLED",
"securityGroups": [ "sg-12345678" ],
"subnets": [ "subnet-12345678" ]
}
}

This is where we specify the subnets that were created earlier. I’m going to publicly expose this container, so I’m associating it with the 2 public subnets I created (added to the above config snippet).

I also need a Security Group for the config, so I’ll create that too and allow incoming traffic on port 80.
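
If you’re creating the Security Group from the CLI rather than the Console, here’s a sketch (the group name, VPC id and group id are placeholders):

# Create a Security Group in the VPC, then allow inbound HTTP on port 80
aws ec2 create-security-group --group-name fargate-sg --description "Allow inbound HTTP" --vpc-id vpc-abcd1234
aws ec2 authorize-security-group-ingress --group-id sg-abcd1234 --protocol tcp --port 80 --cidr 0.0.0.0/0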

It’s not obvious from the docs where this networkConfiguration section gets specified, but it doesn’t go in the Task Definition json; it gets passed when you create the Service using the Task Definition.

To create a Service, use this cli command:

aws ecs create-service --cluster fargate-cluster --service-name fargate-service --task-definition sample-fargate:1 --desired-count 1 --launch-type "FARGATE" --network-configuration "awsvpcConfiguration={subnets=[subnet-abcd1234],securityGroups=[sg-abcd1234],assignPublicIp=ENABLED}"

Using this command to plug in the subnet ids and Security Group id, from the ECS Console you’ll now see your service running! If you drill down to the task you can find the assigned public IP. Hit the IP to call the service; since we’re running an httpd container with a sample web page, we see the Amazon ECS sample app page.
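
If you’d rather find the public IP from the CLI instead of the Console, here’s a sketch (the task ARN and ENI id are placeholders taken from the previous commands’ output):

# List the running tasks, then describe one to find its attached ENI
aws ecs list-tasks --cluster fargate-cluster
aws ecs describe-tasks --cluster fargate-cluster --tasks TASK-ARN

# The describe-tasks output includes a networkInterfaceId attachment;
# describe that ENI to get its association's public IP
aws ec2 describe-network-interfaces --network-interface-ids eni-abcd1234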

Awesome, up and running!