ECS and EventBridge - A Good Way of Creating and Tracking Jobs / Tasks

Chapter 1 - Background Jobs and Available Options

Background jobs are an essential part of product engineering, especially in products where long-running operations need to be offloaded so they do not slow down the customer-facing interface (APIs).

There are a couple of options for doing so. The most obvious one is a persistent queue. Redis, together with the modules written around it, provides a wonderful interface for this. Examples include Kue, Bull, and Bee-Queue.

These libraries take a simple approach: store the operation and its metadata in Redis, continuously watch the queue, and execute the next operation with its metadata as soon as nothing is running or the number of active operations drops below the configured concurrency.
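The watch-and-execute loop described above can be sketched without Redis. This is only an in-memory illustration of the concurrency rule, not how any of the libraries above is implemented; the class and method names are hypothetical.

```javascript
// In-memory stand-in for a Redis-backed queue (illustration only).
class InMemoryQueue {
  constructor(concurrency) {
    this.concurrency = concurrency; // maximum number of jobs running at once
    this.waiting = [];              // jobs stored with their metadata
    this.active = new Map();        // jobs currently being executed
  }

  // Store the operation and its metadata.
  add(id, metadata) {
    this.waiting.push({ id, metadata });
  }

  // Start waiting jobs while the number of active jobs is below the limit.
  // Returns the jobs that were started by this call.
  startNext() {
    const started = [];
    while (this.active.size < this.concurrency && this.waiting.length > 0) {
      const job = this.waiting.shift();
      this.active.set(job.id, job);
      started.push(job);
    }
    return started;
  }

  // Mark a job as finished, which frees a slot for the next waiting job.
  complete(id) {
    this.active.delete(id);
    return this.startNext();
  }
}

const queue = new InMemoryQueue(2);
queue.add("resize-1", { file: "a.png" });
queue.add("resize-2", { file: "b.png" });
queue.add("resize-3", { file: "c.png" });

console.log(queue.startNext().map((job) => job.id));
console.log(queue.complete("resize-1").map((job) => job.id));
```

The third job only starts once one of the first two completes, which is the behaviour the concurrency setting controls.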

The other way is to leverage a messaging protocol such as MQTT or AMQP, or a messaging system such as EMQ or NATS. These systems provide more control over the flow of messages, with features such as QoS. You can again use Redis for the underlying storage, but they tend to be more flexible and let you plug in different storage backends.

The challenge with both approaches is where the jobs will run. They do offload work from the primary, client-facing interface, but they still need compute to execute the instructions stored in them.

Now, this is not about scale, but more about cost, management, and speed. Scaling can be worked out based on different criteria and rules. But you need at least one instance running all the time; it can be either the instance running the API (which seems unlikely) or a separate instance.

Then come the other things that need to be figured out, such as:

How long jobs will take to complete (this depends on how many concurrent jobs are running and how powerful the instance is)

How we will ensure that jobs are not executed more than once when multiple instances are watching the queues

How we will keep the cost under control while still being able to scale

Most of you will be able to answer the above questions one way or another, but what we are trying to do here is find another way of approaching the problem.

Chapter 2 - ECS Fargate and EventBridge

I have mentioned ECS Fargate a couple of times in the previous article, and I am mentioning it again here. Fargate provides us with compute on demand, but what we need with Fargate is a way of tracking the status of our job (which is a Fargate task).

AWS EventBridge allows us to track the status of different AWS resources through its event bus. We are going to use it to track our Fargate task.

The approach looks like this.

We have two serverless functions:

funcCreateJob - Triggers an ECS job

funcEventBridge - Triggered by EventBridge; stores the status in the database

CreateJob (Lambda)

This is a simple function that uses the AWS SDK to call runTask, with its parameters taken from environment variables. The runTask function creates an ECS task inside your cluster.

The parameters needed for runTask are:

{
  cluster: process.env.ECS_CLUSTER,
  taskDefinition: process.env.ECS_TASK_DEFINITION,
  networkConfiguration: {
    awsvpcConfiguration: {
      assignPublicIp: "ENABLED",
      subnets: process.env.SUBNET.split(","),
    },
  },
}

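cluster - The name of the ECS cluster

taskDefinition - The name of the task definition, optionally with its revision (family:revision)

subnets - The IDs of the subnets the task should run in, read here from a comma-separated environment variable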
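A minimal sketch of how the CreateJob function might wrap these parameters, assuming the AWS SDK for JavaScript v3 (`@aws-sdk/client-ecs`). The function names are hypothetical, and `launchType: "FARGATE"` is an addition that is not part of the parameters listed above; it may be unnecessary if your cluster already has a default capacity provider strategy.

```javascript
// Builds the runTask parameters from a set of environment variables.
function buildRunTaskParams(env) {
  return {
    cluster: env.ECS_CLUSTER,
    taskDefinition: env.ECS_TASK_DEFINITION,
    // Assumption: not in the original parameter list. Asks ECS to run the task on Fargate.
    launchType: "FARGATE",
    networkConfiguration: {
      awsvpcConfiguration: {
        assignPublicIp: "ENABLED",
        subnets: env.SUBNET.split(","),
      },
    },
  };
}

// Lambda handler sketch; export it as the handler of your function.
async function funcCreateJob() {
  // Loaded lazily so buildRunTaskParams can be used without the SDK installed.
  const { ECSClient, RunTaskCommand } = require("@aws-sdk/client-ecs");
  const ecs = new ECSClient({});

  const response = await ecs.send(new RunTaskCommand(buildRunTaskParams(process.env)));

  // runTask reports placement problems in `failures` instead of throwing.
  if (!response.tasks || response.tasks.length === 0) {
    throw new Error(`runTask failed: ${JSON.stringify(response.failures)}`);
  }

  // The last segment of the task ARN is the task ID, used here as the job ID.
  const taskArn = response.tasks[0].taskArn;
  return { jobId: taskArn.split("/").pop(), taskArn };
}
```

Returning the task ID from the function gives the caller a job ID it can later match against the status updates.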

EventBridge (Lambda)

This is a simple serverless function that receives the ECS task status, along with its metadata, in the event object. You use the data inside the event object to update your database.

console.log(JSON.stringify(event));
// The task ID is the last segment of the task ARN; we use it as the job ID.
const taskParts = event.detail.taskArn.split("/");
const jobId = taskParts[taskParts.length - 1];
const status = event.detail.lastStatus;
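Put together, the handler might look like the sketch below. The sample event is trimmed to a few fields of an "ECS Task State Change" event, with a made-up account ID, cluster name, and task ID. `saveJobStatus` is a hypothetical placeholder for whatever database write you use.

```javascript
// Extracts the fields a job table needs from an "ECS Task State Change" event.
function parseTaskEvent(event) {
  const detail = event.detail;
  const taskParts = detail.taskArn.split("/");
  return {
    jobId: taskParts[taskParts.length - 1],
    status: detail.lastStatus,
    // Only present once the task has stopped.
    stoppedReason: detail.stoppedReason || null,
  };
}

// Hypothetical placeholder: replace with a write to your own database.
async function saveJobStatus(record) {
  console.log("saving", JSON.stringify(record));
}

// Lambda handler sketch; export it as the handler of your function.
async function funcEventBridge(event) {
  const record = parseTaskEvent(event);
  await saveJobStatus(record);
  return record;
}

// Trimmed sample event with made-up identifiers, for local experimentation.
const sampleEvent = {
  source: "aws.ecs",
  "detail-type": "ECS Task State Change",
  detail: {
    taskArn: "arn:aws:ecs:us-east-1:111122223333:task/my-cluster/0123456789abcdef",
    lastStatus: "STOPPED",
    desiredStatus: "STOPPED",
    stoppedReason: "Essential container in task exited",
  },
};

console.log(parseTaskEvent(sampleEvent));
```

Keeping the parsing in its own function makes it easy to try against saved events before wiring up the database.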

The last part of the chain is to link your funcEventBridge function to the AWS EventBridge service.

To do so, go to AWS EventBridge, create a rule, and perform the following two tasks:

1) Define the event pattern

2) Select the target resource (which will be the Lambda function in our case)
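As an illustration of step 1, an event pattern that matches ECS task state changes could look like the object below, written as JavaScript so it can be printed as the JSON you paste into the rule. The cluster ARN is a made-up example; the `detail` filter is optional and only narrows the rule to a single cluster.

```javascript
// Event pattern for ECS task state changes (the cluster ARN is hypothetical).
const taskStateChangePattern = {
  source: ["aws.ecs"],
  "detail-type": ["ECS Task State Change"],
  detail: {
    clusterArn: ["arn:aws:ecs:us-east-1:111122223333:cluster/my-cluster"],
  },
};

// Print the pattern as JSON, ready to paste into the rule definition.
console.log(JSON.stringify(taskStateChangePattern, null, 2));
```

Without the `detail` filter, the rule would match task state changes from every cluster in the account and region.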

And that's it. The Lambda function will receive an update each time the Fargate task changes its status.

You can attach custom metadata as environment variables when you are creating the task. The same metadata will be received by the EventBridge Lambda when it is triggered.
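One way this could work is through the `overrides.containerOverrides` parameter of runTask. The sketch below assumes that the event's `detail.overrides` mirrors the overrides passed at creation time. Check this against a real event from your own account before relying on it. The container name and helper names are hypothetical.

```javascript
// Builds the runTask `overrides` value that carries metadata as environment variables.
function buildOverrides(containerName, metadata) {
  return {
    containerOverrides: [
      {
        name: containerName, // must match a container name in the task definition
        environment: Object.entries(metadata).map(([name, value]) => ({
          name,
          value: String(value), // environment values are strings
        })),
      },
    ],
  };
}

// Reads the metadata back in the EventBridge Lambda.
// Assumption: the event exposes the overrides under detail.overrides.
function readMetadata(event, containerName) {
  const overrides = (event.detail.overrides && event.detail.overrides.containerOverrides) || [];
  const match = overrides.find((override) => override.name === containerName);
  const environment = (match && match.environment) || [];
  return Object.fromEntries(environment.map(({ name, value }) => [name, value]));
}

// Local round trip with a hand-built event (not a real AWS payload).
const overrides = buildOverrides("job", { ORDER_ID: 42, REQUESTED_BY: "api" });
console.log(readMetadata({ detail: { overrides } }, "job"));
```

Because environment values are strings, anything numeric has to be converted back after it is read from the event.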

That's all for this article. Let me know if there are any questions/suggestions.

Thank you. Peace.