Scalable ECS Cluster

You are viewing article number 6 of 12 in the series Scalable Self-Hosted GitHub Runners on AWS Cloud

The GitHub Actions runner containers will run in a dedicated ECS cluster.

To simplify the management of compute capacity, Amazon ECS cluster auto scaling will be enabled. With this method, Amazon manages the scaling in and out of EC2 instances within the cluster. All we need to supply are a few basic parameter values when enabling the feature.

ECS Cluster Creation

Create a cluster named GitHub-Actions-Runners:

aws ecs create-cluster \
--region us-east-1 \
--cluster-name GitHub-Actions-Runners

EC2 Launch Template

Amazon Linux 2023 will be specified as the Amazon Machine Image (AMI) to use in the launch template.

ECS Optimized AMI (Amazon Linux 2023)

To retrieve the ECS-optimized “Amazon Linux 2023” image_id from SSM Parameter Store for region us-east-1, run the following:

aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2023/recommended --region us-east-1

Output:

{
    "Parameters": [
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2023/recommended",
            "Type": "String",
            "Value": "{\"ecs_agent_version\":\"1.77.0\"\
                ,\"ecs_runtime_version\":\"Docker version 20.10.23\",
            ...
                 \"image_id\":\"ami-09e56bbbf745f34c3\"
            ...
            ...

}

This returns an image_id = ami-09e56bbbf745f34c3.
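The AMI ID can also be extracted directly instead of being read out of the nested JSON. The following is a minimal sketch, assuming the image_id sub-parameter that AWS publishes beneath the same path; the helper name get_ecs_ami_id is hypothetical and only used for illustration.

```shell
# Print only the recommended ECS-optimized Amazon Linux 2023 AMI ID.
# get_ecs_ami_id is an illustrative helper name, not part of the AWS CLI.
# Argument 1: AWS region (defaults to us-east-1, as used in this article).
get_ecs_ami_id() {
  aws ssm get-parameters \
    --names /aws/service/ecs/optimized-ami/amazon-linux-2023/recommended/image_id \
    --region "${1:-us-east-1}" \
    --query 'Parameters[0].Value' \
    --output text
}
```

For instance, AMI_ID=$(get_ecs_ami_id us-east-1) would capture the value in a shell variable for use when writing the launch template data.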

Userdata

Create the following file to generate the Userdata for the template.

cat <<'UDATA' > $HOME/userdata.txt
#!/bin/bash
cat <<'EOF' >> /etc/ecs/ecs.config
ECS_CLUSTER=GitHub-Actions-Runners
EOF
UDATA

The Userdata value must be supplied as base64-encoded input to the CLI command that creates the template.
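Before embedding the encoded value, it can be worth confirming that the script survives the encoding unchanged. This is a small sketch assuming GNU coreutils (base64 -w 0 and base64 -d); the function names encode_userdata and verify_userdata are hypothetical.

```shell
# Encode a user data file as a single line of base64 (no line wrapping).
# Argument 1: path to the user data file.
encode_userdata() {
  base64 -w 0 < "$1"
}

# Decode the encoded value again and compare it byte-for-byte with the
# original file; returns 0 when the round trip is lossless.
verify_userdata() {
  encode_userdata "$1" | base64 -d | cmp -s - "$1"
}
```

Running verify_userdata $HOME/userdata.txt && echo OK would print OK only if the decoded content matches the file.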

Launch Template JSON

The launch template allows an instance profile (IAM role) to be specified, which contains the permissions required for the instance to interact with associated services.

For the example that follows, the instance profile ecsInstanceRole was specified. Instructions for creating the profile can be found at https://docs.aws.amazon.com/AmazonECS/latest/developerguide/instance_IAM_role.html. Instance profiles that lack the necessary permissions can result in undesired behaviour, for example, an EC2 container instance failing to launch in the specified cluster/ASG.

Create a JSON file with the launch template data.

cat << EOF > $HOME/launch-template-data.json
{
  "ImageId": "ami-09e56bbbf745f34c3",
  "IamInstanceProfile": {
    "Name": "ecsInstanceRole"
  },
  "InstanceType": "t2.micro",
  "TagSpecifications": [
    {
      "ResourceType": "instance",
      "Tags": [
        {
          "Key": "Name",
          "Value": "AmazonLinux2023"
        }
      ]
    }
  ],
  "UserData": "$(cat $HOME/userdata.txt | base64 -w 0)"
}
EOF

Create the Template

To create the launch template named GitHub-Actions-Launch-template:

aws ec2 create-launch-template \
--launch-template-name GitHub-Actions-Launch-template \
--launch-template-data file://$HOME/launch-template-data.json

Auto Scaling group (ASG)

Create ASG

To create an ASG named GitHub-Actions-asg, specifying the template we just created, run the following:

aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name GitHub-Actions-asg \
    --launch-template LaunchTemplateName=GitHub-Actions-Launch-template,Version='$Latest' \
    --min-size 0 \
    --max-size 5 \
    --vpc-zone-identifier "subnet-f7f6ec4d,subnet-74b07fb6"

In the above:

  • minimum number of instances is set to 0
  • maximum number of instances is set to 5
  • the subnets subnet-f7f6ec4d, subnet-74b07fb6 belong to the Default VPC and reside in different availability zones
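The subnet IDs above are specific to one account. As a hedged sketch, the default subnets of a region (one per availability zone) could be looked up and joined into the comma-separated form that --vpc-zone-identifier expects; the helper name default_subnets_csv is hypothetical.

```shell
# Print the default subnets of a region as a comma-separated list.
# default_subnets_csv is an illustrative helper name.
# Argument 1: AWS region (defaults to us-east-1).
default_subnets_csv() {
  aws ec2 describe-subnets \
    --region "${1:-us-east-1}" \
    --filters Name=default-for-az,Values=true \
    --query 'Subnets[].SubnetId' \
    --output text | tr -s '\t\n' ',,' | sed 's/,$//'
}
```

The text output separates IDs with tabs, so tr converts tabs and newlines to commas and sed drops the trailing comma. The result could be passed as --vpc-zone-identifier "$(default_subnets_csv us-east-1)", trimmed to the subnets you actually want.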

Retrieve ASG ARN

Make a note of the ASG's ARN; it will be required for setting up Managed cluster scaling (in the next section).

To list the ARN of the ASG:

aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names GitHub-Actions-asg

Output:

"AutoScalingGroups": [
{
...
...
 "AutoScalingGroupARN": "arn:aws:autoscaling:us-east-1:xxxxxxxxxxxx:autoScalingGroup:211b111c-8f76-1115-b6f4-765463g4f7e:autoScalingGroupName/GitHub-Actions-asg"
...
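Rather than copying the ARN out of the full response, it can be selected with a JMESPath query. A minimal sketch, where get_asg_arn is a hypothetical helper name:

```shell
# Print only the ARN of the named Auto Scaling group.
# get_asg_arn is an illustrative helper name.
# Argument 1: Auto Scaling group name.
get_asg_arn() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].AutoScalingGroupARN' \
    --output text
}
```

For example, ASG_ARN=$(get_asg_arn GitHub-Actions-asg) stores the ARN in a variable that can then be substituted into the capacity provider definition.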

Enable Amazon Managed Cluster Scaling

A capacity provider will need to be created for Managed Cluster scaling to work.

CLI Capacity Provider Definition File

Create the JSON definition file for the capacity provider. The new provider's name is GitHubActionsCapacity.

cat << EOF > $HOME/capacity_provider.json
{
    "name": "GitHubActionsCapacity",
    "autoScalingGroupProvider": {
        "autoScalingGroupArn": "arn:aws:autoscaling:us-east-1:xxxxxxxxxxxx:autoScalingGroup:211b111c-8f76-1115-b6f4-765463g4f7e:autoScalingGroupName/GitHub-Actions-asg",
        "managedScaling": {
            "instanceWarmupPeriod": 300,
            "maximumScalingStepSize": 1,
            "minimumScalingStepSize": 1,
            "status": "ENABLED",
            "targetCapacity": 80
        },
        "managedTerminationProtection": "DISABLED"
    },
    "tags": [{
            "key": "Name",
            "value": "GitHubActionsCapacity"
        }
    ]
}
EOF

Attributes related to Managed scaling are:

...
"managedScaling": {
"instanceWarmupPeriod": 300,
"maximumScalingStepSize": 1,
"minimumScalingStepSize": 1,
"status": "ENABLED",
"targetCapacity": 80
},
...

Descriptions of each of these can be found in the official ECS documentation. In the above, managedScaling["status"] has been set to “ENABLED”, indicating that Amazon will manage the compute/scaling.

Create Capacity Provider

Create a capacity provider named GitHubActionsCapacity.

aws ecs create-capacity-provider \
    --cli-input-json file://$HOME/capacity_provider.json

Output:

{
    "capacityProvider": {
        "capacityProviderArn": "arn:aws:ecs:us-east-1:xxxxxxxxxxxx:capacity-provider/GitHubActionsCapacity",
        "name": "GitHubActionsCapacity",
        "status": "ACTIVE",
        "autoScalingGroupProvider": {
            "autoScalingGroupArn": "arn:aws:autoscaling:us-east-1:xxxxxxxxxxxx:autoScalingGroup:211b111c-8f76-1115-b6f4-765463g4f7e:autoScalingGroupName/GitHub-Actions-asg",
            "managedScaling": {
                "status": "ENABLED",
                "targetCapacity": 80,
                "minimumScalingStepSize": 1,
                "maximumScalingStepSize": 1,
                "instanceWarmupPeriod": 300
            },
            "managedTerminationProtection": "DISABLED"
        },
        "tags": [
            {
                "key": "Name",
                "value": "GitHubActionsCapacity"
            }
        ]
    }
}

Cluster Default Capacity Provider

Associate the capacity provider we just created with the cluster and make it the cluster's default capacity provider strategy:

aws ecs put-cluster-capacity-providers \
  --cluster GitHub-Actions-Runners \
  --capacity-providers GitHubActionsCapacity \
  --default-capacity-provider-strategy capacityProvider=GitHubActionsCapacity,weight=1

Output:

{
    "cluster": {
        "clusterArn": "arn:aws:ecs:us-east-1:xxxxxxxxxxxx:cluster/GitHub-Actions-Runners",
        "clusterName": "GitHub-Actions-Runners",
        "status": "ACTIVE",
        "registeredContainerInstancesCount": 0,
        "runningTasksCount": 0,
        "pendingTasksCount": 0,
        "activeServicesCount": 0,
        "statistics": [],
        "tags": [],
        "settings": [
            {
                "name": "containerInsights",
                "value": "disabled"
            }
        ],
        "capacityProviders": [
            "GitHubActionsCapacity"
        ],
        "defaultCapacityProviderStrategy": [
            {
                "capacityProvider": "GitHubActionsCapacity",
                "weight": 1,
                "base": 0
            }
        ],
        "attachments": [
            {
                "id": "*******-*****-****-****-************",
                "type": "as_policy",
                "status": "PRECREATED",
                "details": [
                    {
                        "name": "capacityProviderName",
                        "value": "GitHubActionsCapacity"
                    },
                    {
                        "name": "scalingPolicyName",
                        "value": "ECSManagedAutoScalingPolicy-********-****-****-****-************"
                    }
                ]
            }
        ],
        "attachmentsStatus": "UPDATE_IN_PROGRESS"
    }
}
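The response above reports attachmentsStatus as UPDATE_IN_PROGRESS, so the association is not yet complete when the command returns. As a sketch under that assumption, the cluster could be polled until the status settles; wait_for_attachments and its retry/delay arguments are hypothetical.

```shell
# Poll the cluster until the capacity provider attachment settles.
# wait_for_attachments is an illustrative helper name.
# Arguments: cluster name, maximum attempts (default 30), delay in seconds (default 10).
wait_for_attachments() {
  local cluster="$1" tries="${2:-30}" delay="${3:-10}" status
  while [ "$tries" -gt 0 ]; do
    status="$(aws ecs describe-clusters \
      --clusters "$cluster" \
      --include ATTACHMENTS \
      --query 'clusters[0].attachmentsStatus' \
      --output text)"
    echo "attachmentsStatus: $status"
    case "$status" in
      UPDATE_COMPLETE) return 0 ;;   # attachment finished successfully
      UPDATE_FAILED)   return 1 ;;   # attachment failed; stop polling
    esac
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1                           # gave up while still in progress
}
```

Calling wait_for_attachments GitHub-Actions-Runners returns 0 once the status reaches UPDATE_COMPLETE and non-zero on failure or timeout.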

Verify Resources via AWS Console

Using the AWS console, check the resources created in previous sections.

Autoscaling Group

Details

  • ASG created and associated with Launch template
  • capacities are as expected
    • Minimum=0
    • Maximum=5
    • Desired: in the case of Managed Cluster scaling, the desired instance count is automatically adjusted to maintain the targetCapacity specified for the capacity provider
  • notice how a tag has been automatically created for the ASG, with key=AmazonECSManaged
  • this tag should not be deleted or edited
    • refer to https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cluster-auto-scaling.html

[Figure: Autoscaling Group Details]
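The AmazonECSManaged tag mentioned above can also be checked from the command line. This is a minimal sketch; asg_ecs_managed_tag is a hypothetical helper name.

```shell
# Look up the AmazonECSManaged tag on the named Auto Scaling group.
# asg_ecs_managed_tag is an illustrative helper name.
# Argument 1: Auto Scaling group name.
asg_ecs_managed_tag() {
  aws autoscaling describe-tags \
    --filters "Name=auto-scaling-group,Values=$1" "Name=key,Values=AmazonECSManaged" \
    --query 'Tags[].Key' \
    --output text
}
```

Running asg_ecs_managed_tag GitHub-Actions-asg should print the tag key when the tag is present on the group.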

Scaling Policy

For the ASG, an automatically generated policy should appear, with a name prefixed by ECSManagedAutoScalingPolicy-.

[Figure: Managed Scaling Policy]

Instances

New instances should appear under the ASG's “Instance management” tab.

[Figure: ASG Instances]

Activity History

Under the ASG's Activity history, you should notice instances being added/removed, with the cause being related to the Amazon Managed scaling policy (ECSManagedAutoScalingPolicy). This policy initially scaled the ASG out from 0 to 2 instances, before subsequently scaling it in from 2 instances to 1.

[Figure: ASG Activity History]
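The same activity history is available through the CLI. The sketch below prints the first few activities returned for the group; recent_scaling_activities and its row-limit argument are hypothetical.

```shell
# Print start time, status and description for the first N scaling
# activities returned for the named Auto Scaling group.
# recent_scaling_activities is an illustrative helper name.
# Arguments: Auto Scaling group name, number of rows to show (default 10).
recent_scaling_activities() {
  aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name "$1" \
    --query 'Activities[].[StartTime,StatusCode,Description]' \
    --output text | head -n "${2:-10}"
}
```

For example, recent_scaling_activities GitHub-Actions-asg 5 would list up to five rows, one activity per line.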

Cluster

  • new cluster with associated capacity provider
  • “Managed Scaling” is set to “yes”
  • one container instance is running (as described in the previous section)

[Figure: Cluster Capacity Provider]
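The cluster checks listed above can be approximated from the CLI as well. A minimal sketch that reports the registered container instance count and the first associated capacity provider; cluster_summary is a hypothetical helper name.

```shell
# Print the number of registered container instances and the first
# capacity provider associated with the named cluster, on one line.
# cluster_summary is an illustrative helper name.
# Argument 1: ECS cluster name.
cluster_summary() {
  aws ecs describe-clusters \
    --clusters "$1" \
    --query 'clusters[0].[registeredContainerInstancesCount,capacityProviders[0]]' \
    --output text
}
```

Running cluster_summary GitHub-Actions-Runners gives a quick way to confirm that the capacity provider is attached and that an instance has registered.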