Dynamic Runner Sizing (CPU/Memory)

You are viewing article number 8 of 8 in the series GitHub just-in-time (JIT) self-hosted Runners on AWS Fargate

During the creation of the Fargate task, the CPU and memory allocations (task size) were set to the following values:

CPU: 1 vCPU
Memory: 3 GB

This “static” sizing may be insufficient for more resource-intensive workflow jobs.

In this post, we will explore an approach for dynamically overriding task CPU and Memory values based on the name of the workflow job label(s).

The aim is to derive a naming convention that maps each label to a specific runner size. EventBridge rule(s) can then be created, with corresponding event pattern(s) to filter for the label name received in the payload. A TaskOverride is then applied to allocate the label-specific CPU and Memory values.

We’ll be writing EventBridge rules whose patterns filter on the labels event field (an array). EventBridge treats an array field as a match when any one of its elements equals a value listed in the pattern, so it is strongly advised to review event matching behaviour before writing more complex multi-value patterns.

Label Naming Convention for Task Sizing

The following table shows the valid combinations of CPU and Memory supported by Fargate.

CPU value	Memory value	Operating systems supported for AWS Fargate
256 (.25 vCPU)	512 MiB, 1 GB, 2 GB	Linux
512 (.5 vCPU)	1 GB, 2 GB, 3 GB, 4 GB	Linux
1024 (1 vCPU)	2 GB, 3 GB, 4 GB, 5 GB, 6 GB, 7 GB, 8 GB	Linux, Windows
2048 (2 vCPU)	Between 4 GB and 16 GB in 1 GB increments	Linux, Windows
4096 (4 vCPU)	Between 8 GB and 30 GB in 1 GB increments	Linux, Windows
8192 (8 vCPU) Note: This option requires Linux platform `1.4.0` or later.	Between 16 GB and 60 GB in 4 GB increments	Linux
16384 (16 vCPU) Note: This option requires Linux platform `1.4.0` or later.	Between 32 GB and 120 GB in 8 GB increments	Linux

Assume we decide to adopt the following label naming convention for requesting a specific runner size:

vcpu-ccccc-mem-nnnnnn

where ccccc is the CPU value in CPU units (1024 units = 1 vCPU), and nnnnnn is the amount of memory in MiB.
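As a minimal sketch, a label that follows this convention could be parsed and checked against the Fargate table above before it is used. The function name `parse_size_label` and the `VALID_MEMORY_MIB` lookup are illustrative and not part of the original setup; the sketch assumes the memory part of the label is expressed in MiB.

```python
import re

GIB = 1024  # MiB per GiB


def _gib_range(low_gb, high_gb, step_gb=1):
    """Memory sizes in MiB from low_gb to high_gb (inclusive) in step_gb steps."""
    return {gb * GIB for gb in range(low_gb, high_gb + 1, step_gb)}


# CPU units -> allowed memory values (MiB), transcribed from the Fargate table.
VALID_MEMORY_MIB = {
    256: {512, 1 * GIB, 2 * GIB},
    512: _gib_range(1, 4),
    1024: _gib_range(2, 8),
    2048: _gib_range(4, 16),
    4096: _gib_range(8, 30),
    8192: _gib_range(16, 60, 4),
    16384: _gib_range(32, 120, 8),
}

LABEL_RE = re.compile(r"vcpu-(\d+)-mem-(\d+)")


def parse_size_label(label):
    """Return (cpu_units, memory_mib) for a sizing label, or raise ValueError."""
    match = LABEL_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"not a sizing label: {label!r}")
    cpu, memory = int(match.group(1)), int(match.group(2))
    if memory not in VALID_MEMORY_MIB.get(cpu, set()):
        raise ValueError(f"unsupported Fargate combination: cpu={cpu}, memory={memory}")
    return cpu, memory


print(parse_size_label("vcpu-512-mem-2048"))
```

A check like this could run in CI against workflow files, so that a label such as vcpu-512-mem-8192 (not a valid Fargate combination) is caught before a job is queued.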

A basic task/runner size to label mapping could look something like the following:

Runner size	CPU	Memory	Label
small	512	2048	vcpu-512-mem-2048
medium	2048	4096	vcpu-2048-mem-4096
large	4096	8192	vcpu-4096-mem-8192

So, to request a small runner for a GitHub workflow job, we would add the corresponding label from the above table to runs-on:

name: SMALL sized runner

jobs:
  execute-on-small-runner:
    runs-on: [..., vcpu-512-mem-2048]
    steps:
      - uses: actions/checkout@v4
...
...

Likewise, to request a medium or large runner, we would simply substitute the appropriate label, i.e.,

name: MEDIUM sized runner

jobs:
  execute-on-medium-runner:
    runs-on: [..., vcpu-2048-mem-4096]
    steps:
      - uses: actions/checkout@v4
...
...

name: LARGE sized runner

jobs:
  execute-on-large-runner:
    runs-on: [..., vcpu-4096-mem-8192]
    steps:
      - uses: actions/checkout@v4
...
...

EventBridge Rule(s)

Three EventBridge rules are required, one for each task size (small, medium, large).

We will walk through the steps required to create the rule for our small runner.

Most of the configuration will be identical to the previously configured rule, with the exception of the rule name, event pattern and input transformer.

Small Runner Rule

Rule name: trigger-jit-self-hosted-runner-small

Event Pattern

{
  "detail": {
    "organization": {
      "login": ["foo-organisation"]
    },
    "workflow_job": {
      "status": ["queued"],
      "labels": ["vcpu-512-mem-2048"]
    }
  },
  "detail-type": ["workflow_job"],
  "source": ["github.com"]
}
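To see how this pattern behaves against the labels array, here is a simplified model of EventBridge exact-value matching. It is a sketch for reasoning only, not EventBridge's implementation: it handles only exact values and nested objects (no prefix, numeric, or anything-but operators). The `matches` and `make_event` helpers and the `self-hosted` label are hypothetical, introduced just for this illustration.

```python
def matches(pattern, event):
    """Simplified exact-value matcher: every pattern key must match the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested object: recurse into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        else:
            # Leaf: 'expected' is a list of allowed values. An array field in
            # the event matches if any one of its elements is allowed.
            actual_values = actual if isinstance(actual, list) else [actual]
            if not any(value in expected for value in actual_values):
                return False
    return True


SMALL_PATTERN = {
    "detail": {
        "organization": {"login": ["foo-organisation"]},
        "workflow_job": {"status": ["queued"], "labels": ["vcpu-512-mem-2048"]},
    },
    "detail-type": ["workflow_job"],
    "source": ["github.com"],
}


def make_event(labels):
    """Build a trimmed-down workflow_job event carrying the given labels."""
    return {
        "source": "github.com",
        "detail-type": "workflow_job",
        "detail": {
            "organization": {"login": "foo-organisation"},
            "workflow_job": {"status": "queued", "labels": labels},
        },
    }


print(matches(SMALL_PATTERN, make_event(["self-hosted", "vcpu-512-mem-2048"])))   # True
print(matches(SMALL_PATTERN, make_event(["self-hosted", "vcpu-2048-mem-4096"])))  # False
```

Because a single matching element is enough, a job that carries two sizing labels would match two of the sizing rules at once, so each job should carry exactly one sizing label.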

Configure Input Transformer

Input path

The input path value remains unchanged.

{"org":"$.detail.organization.login"}

Input Template

First, we confirm the JSON format accepted by the ECS run-task --overrides option:

{
  "containerOverrides": [
    {
      "name": "string",
      "environment": [
        {
          "name": "string",
          "value": "string"
        }
      ]
    }
  ],
  "cpu": "string",
  "memory": "string"
}

Referring back to the existing Input Template, and adding the overrides for CPU and Memory values, our revised template becomes:

{
    "containerOverrides": [
        {
            "name": "runner",
            "environment": [
                {
                    "name": "ORGANIZATION",
                    "value": "<org>"
                },
                {
                    "name": "SQS_QUEUE_NAME",
                    "value": "git-actions-wf-job-queue"
                }              
            ]
        }
    ],
    "cpu": "512",
    "memory": "2048"
}
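The medium and large rules follow the same shape, with only the label and the CPU and memory values changing. As a rough sketch, the event pattern and input template for each size could be derived from its label as shown here. The `RUNNER_SIZES` mapping and the two builder functions are illustrative names, not part of the original setup; the organization, container name, and queue name are copied from the small rule above.

```python
import json

# Runner size -> label, taken from the mapping table earlier in this post.
RUNNER_SIZES = {
    "small": "vcpu-512-mem-2048",
    "medium": "vcpu-2048-mem-4096",
    "large": "vcpu-4096-mem-8192",
}


def build_event_pattern(label):
    """Event pattern that matches queued workflow jobs carrying the label."""
    return {
        "detail": {
            "organization": {"login": ["foo-organisation"]},
            "workflow_job": {"status": ["queued"], "labels": [label]},
        },
        "detail-type": ["workflow_job"],
        "source": ["github.com"],
    }


def build_input_template(label):
    """Input template whose task-level cpu/memory come from the label."""
    # "vcpu-512-mem-2048" -> ["vcpu", "512", "mem", "2048"]
    _, cpu, _, memory = label.split("-")
    template = {
        "containerOverrides": [
            {
                "name": "runner",
                "environment": [
                    {"name": "ORGANIZATION", "value": "<org>"},
                    {"name": "SQS_QUEUE_NAME", "value": "git-actions-wf-job-queue"},
                ],
            }
        ],
        # The overrides format expects cpu and memory as strings.
        "cpu": cpu,
        "memory": memory,
    }
    return json.dumps(template, indent=4)


for size, label in RUNNER_SIZES.items():
    print(f"--- {size} runner ({label}) ---")
    print(json.dumps(build_event_pattern(label), indent=2))
    print(build_input_template(label))
```

The printed JSON can be pasted into the event pattern and input template fields of each rule; everything else in the rule configuration stays the same as for the small runner.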
