{"id":57715,"date":"2023-07-30T10:06:52","date_gmt":"2023-07-30T04:36:52","guid":{"rendered":"https:\/\/www.tothenew.com\/blog\/?p=57715"},"modified":"2024-01-02T18:23:29","modified_gmt":"2024-01-02T12:53:29","slug":"boosting-ecs-task-monitoring-with-cloudwatch-input-transformer","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/boosting-ecs-task-monitoring-with-cloudwatch-input-transformer\/","title":{"rendered":"Boosting ECS Task Monitoring with CloudWatch Input Transformer"},"content":{"rendered":"<h2 id=\"h.bnr8tpcruv6z\" class=\"c3 c20\"><span style=\"text-decoration: underline;\"><span class=\"c7\">Introduction<\/span><\/span><\/h2>\n<p class=\"c3\"><span class=\"c2\">In the fast-paced world of application delivery, ensuring the health and reliability of our ECS tasks is crucial. Without a reliable alerting mechanism, there\u2019s a risk of overlooking critical task failures that can have a bad impact on our production environment. Just imagine a situation where application tasks fail silently, resource constraints go unnoticed, or container failures go unattended. This can result in costly downtime and leave end users frustrated.<\/span><\/p>\n<p class=\"c3\"><span class=\"c2\">But fear not! In this brief article, we will delve into the journey to revolutionize our ECS Cluster monitoring capabilities. By actively detecting and addressing task failures, we can minimize downtime, optimize resource utilization, and ensure a seamless experience for our end users.<\/span><\/p>\n<p class=\"c3\"><span class=\"c6\">Join us as we dive deep into the world of\u00a0<\/span><span class=\"c11\">Amazon EventBridge<\/span><span class=\"c6\">,\u00a0<\/span><span class=\"c11\">CloudWatch Input Transformers<\/span><span class=\"c6\">, and\u00a0<\/span><span class=\"c11\">SNS (Simple Notification Service)<\/span><span class=\"c6\">. 
We will unravel the step-by-step implementation of task failure alerts, illuminating the path to actionable and targeted notifications. Get ready to witness the magic unfold as we unlock the true potential of ECS task monitoring.<\/span><\/p>\n<h2 id=\"h.90ukvmspyxkw\" class=\"c3 c20\"><span style=\"text-decoration: underline;\"><span class=\"c7\">Problem Statement<\/span><\/span><\/h2>\n<p class=\"c3\"><span style=\"text-decoration: underline;\"><strong><span class=\"c23 c11\">Lack of Task Failure Alerts in ECS Cluster Impacts Production Applications.<\/span><\/strong><\/span><\/p>\n<p class=\"c3\"><span class=\"c11\"><span style=\"text-decoration: underline;\">Description<\/span>:<\/span><span class=\"c6\">\u00a0Our production ECS Cluster experienced a critical issue when tasks within our main application started failing unexpectedly. Unfortunately, we discovered that we were not receiving any alerts specifically for task failures, relying only on the 5xx alert generated by the Application Load Balancer. This absence of task failure alerts hindered our ability to promptly identify and address the root cause of the failures, prolonging the impact on the application&#8217;s capabilities.<\/span><\/p>\n<h2 id=\"h.xypx1rnihyub\" class=\"c3 c20\"><span style=\"text-decoration: underline;\"><span class=\"c7\">Architecture<\/span><\/span><\/h2>\n<p class=\"c19 c28\"><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2024\/01\/3bXgYP1wJcIxMh_BKtj2s2tGBJD0eoxlknm9l1mWtVUStx3Gw52UXDXry6O-Kl3nv0crjoUcjlE-EHhD4ayv2Oo7yEzeaU_kR_G5z0L_NjO98aT_zqodikqn6zx9L0pP_eSHqHrqukNDSTIv0NdoqrA.png\" alt=\"\" \/><\/p>\n<p class=\"c3\"><span class=\"c23 c11\">Understanding the Components:-<\/span><\/p>\n<p>1. <strong>Amazon Simple Notification Service (SNS):<\/strong> SNS is a fully managed pub\/sub messaging service that allows you to publish, subscribe, and send messages to various endpoints. 
In our case, we will use SNS to send a notification whenever an ECS task fails.<\/p>\n<p>2. <strong>Amazon EventBridge:<\/strong> Amazon EventBridge is a serverless event bus that makes it easy to connect AWS services and trigger actions based on events. We will use EventBridge to capture and process ECS task failure events.<\/p>\n<p><strong>3. CloudWatch Input Transformer:<\/strong> The input transformer is a feature of Amazon EventBridge (formerly Amazon CloudWatch Events) that lets you extract fields from an incoming event and rearrange them into custom text before the event is sent to targets such as SNS topics or AWS Lambda functions. Using the input transformer, we will parse ECS task failure events and extract the details we need for alerting.<\/p>\n<h2 id=\"h.c1fe3px5hxxe\" class=\"c3 c20\"><span style=\"text-decoration: underline;\"><span class=\"c7\">Prerequisites<\/span><\/span><\/h2>\n<p><span class=\"c2\">1. This demo uses an ECS Fargate cluster. Please ensure that you have an active ECS Fargate cluster running.<\/span><\/p>\n<p><span class=\"c2\">2. Additionally, you will require a task definition and an ECS service. Make sure at least one task of the service is running.<\/span><\/p>\n<h2 id=\"h.f4b3qunh7z4o\" class=\"c3 c20\"><span style=\"text-decoration: underline;\"><span class=\"c24\">D<\/span><span class=\"c7\">eployment Of Alerting Mechanism:<\/span><\/span><\/h2>\n<p><span class=\"c6\">1. In the AWS Management Console, search for &#8220;<\/span><span class=\"c11\">SNS<\/span><span class=\"c2\">&#8221; in the services search bar and click on &#8220;Simple Notification Service&#8221; when it appears.<\/span><\/p>\n<p><span class=\"c6\">2. Click &#8220;Topics&#8221; in the left navigation pane of the SNS console and open the topic creation form<\/span><span class=\"c2\">.<\/span><\/p>\n<p><span class=\"c2\">3. Provide the name of your topic in the &#8220;Name&#8221; field. 
You can choose a display name that helps you identify the purpose of the topic.<\/span><\/p>\n<p class=\"c3 c8\"><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2024\/01\/P2p-sQk5_TklhwPlYBuzRZuOPkuRIQBvpxXtb4mx1C-vjmLhXuWNfR6TAqsVQFQvSKQpioalQ8CJXRFK4mL6HbVsHDmGxeAfGYkQBf0kL0bsK8xfD-rnGwSIf1ffiKklhoNr9dbgsGkuVcuKnFB79Es.png\" alt=\"\" \/><\/p>\n<p><span class=\"c6\">4. Click on the &#8220;<\/span><span class=\"c11\">Create topic<\/span><span class=\"c2\">&#8221; button to create the SNS topic.<\/span><\/p>\n<p class=\"c3 c8\"><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2023\/07\/Boosting-ECS-Task-Monitoring-with-CloudWatch-Input-Transformer.png\" alt=\"\" \/><\/p>\n<p><span class=\"c6\">5. Select the topic by clicking on its name. In the topic details view, click on the &#8220;<\/span><span class=\"c11\">Create subscription<\/span><span class=\"c6\">&#8221; button. Choose &#8220;<\/span><span class=\"c11\">Email<\/span><span class=\"c6\">&#8221; as the protocol from the dropdown menu. Enter the email address where you want to receive the alerts in the &#8220;<\/span><span class=\"c11\">Endpoint<\/span><span class=\"c2\">&#8221; field.<\/span><\/p>\n<p class=\"c3 c8\"><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2023\/07\/Boosting-ECS-Task-Monitoring-with-CloudWatch-Input-Transformer-1.png\" alt=\"\" \/><\/p>\n<p><span class=\"c2\">6. Check the inbox of the subscribed email address for a confirmation message from Amazon SNS. Click the confirmation link in that email to subscribe to the notifications.<\/span><\/p>\n<p><span class=\"c6\">7. Once confirmed, the subscription status will change to &#8220;<\/span><span class=\"c11\">Confirmed<\/span><span class=\"c2\">&#8221; in the SNS console.<\/span><\/p>\n<p class=\"c3 c8\"><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2023\/07\/Boosting-ECS-Task-Monitoring-with-CloudWatch-Input-Transformer-2.png\" alt=\"\" \/><\/p>\n<p><span class=\"c6\">8. 
Now, create the event rule. Open the CloudWatch console by visiting <\/span><span class=\"c6 c29 c50\"><a class=\"c30\" href=\"https:\/\/console.aws.amazon.com\/cloudwatch\/\">https:\/\/console.aws.amazon.com\/cloudwatch\/<\/a><\/span><span class=\"c6 c14\">.<\/span><\/p>\n<p><span class=\"c2\">9. In the navigation pane, click on &#8220;Events&#8221; and then click on &#8220;Create rule&#8221;. Provide a name and an optional description for your rule. Click on &#8220;Next&#8221; to proceed with the configuration of your rule.<\/span><\/p>\n<p class=\"c3\"><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2023\/07\/Boosting-ECS-Task-Monitoring-with-CloudWatch-Input-Transformer-3.png\" alt=\"\" \/><\/p>\n<p><span class=\"c34\">10. Under the creation method, select &#8220;Custom event pattern&#8221; (JSON editor).<\/span><\/p>\n<p><span class=\"c6\">11. <\/span><span class=\"c6 c56\">By default, a rule on ECS task state changes tracks every task state (such as Pending, Running, and Stopped), but <\/span><span class=\"c2\">we only want alerts for stopped tasks. Enter the following pattern in the custom event pattern tab. 
Click on next.<\/span><\/p>\n<table class=\"c42\">\n<tbody>\n<tr class=\"c47\">\n<td class=\"c53\" colspan=\"1\" rowspan=\"1\">\n<pre class=\"c19\"><span style=\"color: #993366;\"><span class=\"c0 c17\">{\r\n\u00a0<\/span><span class=\"c1 c17\">\"detail\"<\/span><span class=\"c0 c17\">: {\r\n\u00a0 \u00a0<\/span><span class=\"c1 c17\">\"lastStatus\"<\/span><span class=\"c0 c17\">: [<\/span><span class=\"c1 c17\">\"STOPPED\"<\/span><span class=\"c0 c17\">],\r\n\u00a0 \u00a0<\/span><span class=\"c1 c17\">\"stoppedReason\"<\/span><span class=\"c0 c17\">: [{\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1 c17\">\"anything-but\"<\/span><span class=\"c0 c17\">: {\r\n\u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1 c17\">\"prefix\"<\/span><span class=\"c0 c17\">:\u00a0<\/span><span class=\"c1 c17\">\"Scaling activity initiated by\"<\/span><span class=\"c0 c17\">\r\n\u00a0 \u00a0 \u00a0}\r\n\u00a0 \u00a0}]\r\n\u00a0},\r\n\u00a0<\/span><span class=\"c1 c17\">\"detail-type\"<\/span><span class=\"c0 c17\">: [<\/span><span class=\"c1 c17\">\"ECS Task State Change\"<\/span><span class=\"c0 c17\">],\r\n\u00a0<\/span><span class=\"c1 c17\">\"source\"<\/span><span class=\"c0 c17\">: [<\/span><span class=\"c1 c17\">\"aws.ecs\"<\/span><span class=\"c0 c17\">]\r\n}<\/span><\/span><\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p class=\"c3 c8\"><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2023\/07\/Boosting-ECS-Task-Monitoring-with-CloudWatch-Input-Transformer-4.png\" alt=\"\" \/><\/p>\n<blockquote>\n<p class=\"c3\"><em><span class=\"c9\">Note:<\/span><span class=\"c6 c35\">&#8211;<\/span><span class=\"c6 c35\">One thing you might notice is that we have used prefix matching with\u00a0<\/span><span class=\"c6 c35\">anything-but<\/span><span class=\"c2 c35\">\u00a0to ignore alerts when tasks are stopped during deployment or autoscaling. 
This approach ensures that we don&#8217;t receive alerts for task stops initiated during deployment, as it doesn&#8217;t make sense to be alerted for those specific cases.<\/span><\/em><\/p>\n<p><em><span class=\"c2 c35\">In the provided custom event pattern, `anything-but` and `prefix` are used as matching rules for the `stoppedReason` field. Here&#8217;s what both of them mean:<\/span><\/em><\/p>\n<ul class=\"c10 lst-kix_20xfrx301e5e-0 start\">\n<li class=\"c3 c8 li-bullet-0\"><em><span style=\"text-decoration: underline;\"><strong><span class=\"c9\">anything-but<\/span><\/strong><\/span><span class=\"c2 c35\">: This is a logical operator used in EventBridge patterns. It specifies that the condition should be true for any value of the field except for the specified value or pattern. In this case, it means that the `stoppedReason` field should not have a prefix match of &#8220;Scaling activity initiated by&#8221;.<\/span><\/em><\/li>\n<li class=\"c3 c8 li-bullet-0\"><em><span style=\"text-decoration: underline;\"><strong><span class=\"c9\">prefix<\/span><\/strong><\/span><span class=\"c2 c35\">: This is a comparison operator used in EventBridge patterns. It checks if the value of the field starts with the specified prefix. In the given pattern, it checks whether the `stoppedReason` field starts with the prefix &#8220;Scaling activity initiated by&#8221;. If there is a match, it will be excluded from triggering the alerts.<\/span><\/em><\/li>\n<\/ul>\n<\/blockquote>\n<p><span class=\"c6\">12. After configuring the custom event pattern, click on &#8220;Next&#8221; to proceed. Then, select the SNS Target for the alert. 
Choose the SNS Topic that you created in Step 4 as the target for the alert.<\/span><\/p>\n<p><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2024\/01\/fQwuc741UVisexPVD3-TIU2RH7qov5ZQknJ0ZhWciFmCvsx-cn_9Si4JlgghgYXUDe14Z_mIMrFtXEYEbNyEsFx1HwsRdz1ImhY1Y0pkWmyGj_jOk2cKlaT2X6adOfZhusMrqE5B4euoCzqWvg5cH1U.png\" alt=\"\" \/><span class=\"c48\"><br \/>\n<\/span><\/p>\n<p><span class=\"c34\">13. A<\/span><span class=\"c33 c37\">fter selecting the SNS Target, click on &#8220;Additional settings&#8221;. In this section, we will use the input transformer to extract the required values from the event and present them in the desired format.<\/span><\/p>\n<p><span class=\"c32\"><br \/>\n<\/span><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2024\/01\/FxXGUGQS-0QKkrdLvSM7zkmNphHm2gbDOriTXfoG20ROlG4LPzIBykS3JwUAx19EhGmIUurRwZYjJ6Hqi9vkXHXH8g9UG6skAd06PIkij0IhNnJD9ZMGN2BoMABBsXZWMQZsX58tXybBFdSaUKd-1Zk.png\" alt=\"\" \/><\/p>\n<p><span class=\"c2\">14. Click on &#8220;Configure input transformer&#8221; under the &#8220;Additional settings&#8221; section. In the &#8220;Target input transformer&#8221; field, enter the following input path:<\/span><\/p>\n<table class=\"c42\">\n<tbody>\n<tr class=\"c40\">\n<td class=\"c54\" colspan=\"1\" rowspan=\"1\">\n<pre class=\"c19\"><span class=\"c5\" style=\"color: #993366;\">{<\/span>\r\n<span class=\"c5\" style=\"color: #993366;\">  \"TASK_ARN\": \"$.detail.taskArn\",<\/span>\r\n<span class=\"c5\" style=\"color: #993366;\">  \"PROBLEM\": \"$.detail-type\",<\/span>\r\n<span class=\"c5\" style=\"color: #993366;\">  \"STOP_CODE\": \"$.detail.stopCode\",<\/span>\r\n<span class=\"c5\" style=\"color: #993366;\">  \"STOPPED_REASON\": \"$.detail.stoppedReason\",<\/span>\r\n<span class=\"c5\" style=\"color: #993366;\">  \"STOPPED_TIME\": \"$.detail.stoppedAt\",<\/span>\r\n<span class=\"c5\" style=\"color: #993366;\">  \"AZ\": \"$.detail.availabilityZone\",<\/span>\r\n<span class=\"c5\" style=\"color: 
#993366;\">  \"SERVICE\": \"$.detail.group\",<\/span>\r\n<span class=\"c5\" style=\"color: #993366;\">  \"ECS_CLUSTER_ARN\": \"$.detail.clusterArn\",<\/span>\r\n<span class=\"c5\" style=\"color: #993366;\">  \"REGION\": \"$.region\"<\/span>\r\n<span class=\"c5\" style=\"color: #993366;\">}<\/span><\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span class=\"c2\">15. Under the Template section, please enter the following content:<\/span><\/p>\n<table class=\"c42\">\n<tbody>\n<tr class=\"c40\">\n<td class=\"c53\" colspan=\"1\" rowspan=\"1\">\n<pre class=\"c19\"><span style=\"color: #993366;\"><span class=\"c6 c21\">\"ECS TASK FAILURE ALERT\"<\/span>\r\n<span class=\"c6 c21\">\"Problem: &lt;PROBLEM&gt;\"<\/span>\r\n<span class=\"c6 c21\">\"Region: &lt;REGION&gt;\"<\/span>\r\n<span class=\"c6 c21\">\"Availability-zone: &lt;AZ&gt;\"<\/span>\r\n<span class=\"c6 c21\">\"ECS Cluster Arn: &lt;ECS_CLUSTER_ARN&gt;\"<\/span>\r\n<span class=\"c6 c21\">\"Service Name: &lt;SERVICE&gt;\"<\/span>\r\n<span class=\"c6 c21\">\"Task Arn: &lt;TASK_ARN&gt;\"<\/span>\r\n<span class=\"c6 c21\">\"Stopped Reason: &lt;STOPPED_REASON&gt;\"<\/span>\r\n<span class=\"c6 c21\">\"Stop Code: &lt;STOP_CODE&gt;\"<\/span>\r\n<span class=\"c6 c21\">\"Stopped Time: &lt;STOPPED_TIME&gt;\"<\/span><\/span><\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span class=\"c2\">16. This template defines the format of the alert message that will be sent to the SNS topic. The placeholders enclosed in angle brackets (&#8220;&lt;&gt;&#8221; symbols) will be replaced with the actual values extracted from the event payload. Once you have entered this template content, click on &#8220;Next&#8221; to proceed.<\/span><\/p>\n<p><span class=\"c6\">17. 
You can optionally add tags to your CloudWatch rule.<\/span><\/p>\n<p><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2024\/01\/PCXHyMewr0lD8GAkSEBq_yOc3CPL_9JI_VOODLE95I8CKYdPYQ0nrDbbsW1AYnKnXq-dDgmq_XQT8BKk_iOKM39dNxAJ1N0yGLFSdONMWh2SQgsNtFwjTT2L_6tD6stVJUblL3rLYbI6-K2pB6KSOrA.png\" alt=\"\" \/><\/p>\n<p><span class=\"c43\">18. Verify everything and click on &#8220;Create rule&#8221;. The finished rule will look like <\/span><span class=\"c6\">this<\/span><span class=\"c37 c33\">.<\/span><\/p>\n<p class=\"c3 c8\"><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2024\/01\/cNyWbL5uCFtLFjJpSVZKiKn9zom7jE4gtS62QlO9M42xSQkq7PQqYF0GxrjK4pmdJPUucbCpau8K7818KcFToPGzuHdeERruSzbnaaf6db3X0scpqZc9SyNGhN6SQObImXal8KSo-qnYmye10RWR2vE.png\" alt=\"\" \/><\/p>\n<p class=\"c3\"><span class=\"c2\">Now, whenever an ECS Fargate task fails, you will receive a notification via email, as shown in the picture below.<\/span><\/p>\n<p class=\"c3\"><img decoding=\"async\" title=\"\" src=\"\/blog\/wp-ttn-blog\/uploads\/2024\/01\/pKZpTgSmW8DDLCyiKXInZcvcwkupjvHrJIx7049bEGJBT-On9C94vfYzG1PlxNTb-Jq38QnTh1qtV6UTKCWqGit4lcuGB-P1LDLh8QWc_u5bZIUMBJsuOqrxLDbEwhik7xaLIAHefqFTF6WpYw-RC0w.png\" alt=\"\" \/><\/p>\n<h2 id=\"h.7ead8e3zvt9c\" class=\"c3 c20\"><span style=\"text-decoration: underline;\"><span class=\"c24\">Bonus Section<br \/>\n<\/span><span class=\"c11 c29\">Automating the setup using Terraform<\/span><\/span><\/h2>\n<p class=\"c3\"><span class=\"c2\">If you would like to automate the setup process using infrastructure as code, Terraform can be a useful tool. With Terraform, you can declare and manage your AWS resources. 
Let&#8217;s see how we can implement the ECS task failure alerting mechanism using Terraform:<\/span><\/p>\n<p class=\"c3\"><span style=\"text-decoration: underline;\"><span class=\"c23 c11\">Step 1: Install Terraform<\/span><\/span><\/p>\n<ul class=\"c10 lst-kix_x9yb6k8x1vn1-0 start\">\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c6\">Download and install Terraform from the official website:\u00a0<\/span><span class=\"c6 c26\"><a class=\"c30\" href=\"https:\/\/www.terraform.io\/downloads.html\">https:\/\/www.terraform.io\/downloads.html<\/a><\/span><\/li>\n<\/ul>\n<ul class=\"c10 lst-kix_wr040l7mljb8-0 start\">\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c2\">Make sure to add Terraform to your system&#8217;s PATH.<\/span><\/li>\n<\/ul>\n<p class=\"c3\"><span style=\"text-decoration: underline;\"><span class=\"c23 c11\">Step 2: Initialize Terraform<\/span><\/span><\/p>\n<ul class=\"c10 lst-kix_7mc62maw3zlf-0 start\">\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c2\">Create a new directory for your Terraform project.<\/span><\/li>\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c2\">Open a terminal and navigate to the project directory.<\/span><\/li>\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c6\">Run the command\u00a0<\/span><span class=\"c11\">terraform init<\/span><span class=\"c2\">\u00a0to initialize the project. 
Terraform will download the necessary provider plugins.<\/span><\/li>\n<\/ul>\n<p class=\"c3\"><span style=\"text-decoration: underline;\"><span class=\"c11 c23\">Step 3: Create a Terraform Configuration File<\/span><\/span><\/p>\n<ul class=\"c10 lst-kix_nx3f5pfe230z-0 start\">\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c6\">Create a new file named\u00a0<\/span><span class=\"c11\">main<\/span><span class=\"c6\">.tf<\/span><span class=\"c2\">\u00a0in your project directory.<\/span><\/li>\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c2\">Make sure you have enough permissions &amp; necessary access rights to run Terraform commands.<\/span><\/li>\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c6\">Copy and paste the following Terraform code into\u00a0<\/span><span class=\"c11\">main.tf<\/span><span class=\"c2\">.<\/span><\/li>\n<\/ul>\n<table class=\"c39\">\n<tbody>\n<tr class=\"c51\">\n<td class=\"c45\" colspan=\"1\" rowspan=\"1\">\n<blockquote>\n<pre class=\"c19\"><span style=\"color: #993366;\"><span class=\"c12\">######################### Provider Configuration ###################<\/span><span class=\"c4\">\r\nprovider \"aws\" {<\/span><span class=\"c0\">\r\n\u00a0region =\u00a0<\/span><span class=\"c1\">\"us-west-2\"<\/span><span class=\"c0\">\u00a0# Replace with your desired region\r\n}\r\n\r\n<\/span><span class=\"c12\">########################## SNS #####################################<\/span><span class=\"c4\">\r\n\r\nresource \"aws_sns_topic\" \"ecs_task_failure_sns\" {<\/span><span class=\"c0\">\r\n\u00a0name =\u00a0<\/span><span class=\"c1\">\"ecs_task_failure_sns\"<\/span><span class=\"c0\">\r\n}\r\n\r\n<\/span><span class=\"c12\">###################### ECS Task Failure CW Event Rule ##############<\/span><span class=\"c4\">\r\n\r\nresource \"aws_cloudwatch_event_rule\" \"ecs_task_failure_alert\" {<\/span><span class=\"c0\">\r\n\u00a0name \u00a0 \u00a0 \u00a0 \u00a0=\u00a0<\/span><span class=\"c1\">\"ecs_task_failure_alert_rule\"<\/span><span 
class=\"c0\">\r\n\u00a0description =\u00a0<\/span><span class=\"c1\">\"ECS Task Failure Alerts\"<\/span><\/span>\r\n\r\n<span style=\"color: #993366;\"><span class=\"c0\">\r\n\u00a0event_pattern = &lt;&lt;EOF\r\n{\r\n\u00a0<\/span><span class=\"c1\">\"source\"<\/span><span class=\"c0\">: [<\/span><span class=\"c1\">\"aws.ecs\"<\/span><span class=\"c0\">],\r\n\u00a0<\/span><span class=\"c1\">\"detail-type\"<\/span><span class=\"c0\">: [<\/span><span class=\"c1\">\"ECS Task State Change\"<\/span><span class=\"c0\">],\r\n\u00a0<\/span><span class=\"c1\">\"detail\"<\/span><span class=\"c0\">: {\r\n\u00a0 \u00a0<\/span><span class=\"c1\">\"lastStatus\"<\/span><span class=\"c0\">: [<\/span><span class=\"c1\">\"STOPPED\"<\/span><span class=\"c0\">],\r\n\u00a0 \u00a0<\/span><span class=\"c1\">\"stoppedReason\"<\/span><span class=\"c0\">: [{\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"anything-but\"<\/span><span class=\"c0\">: {\r\n\u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"prefix\"<\/span><span class=\"c0\">:\u00a0<\/span><span class=\"c1\">\"Scaling activity initiated by\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0}\r\n\u00a0 \u00a0}]\r\n\u00a0}\r\n}\r\nEOF\r\n}\r\n\r\n\r\nresource\u00a0<\/span><span class=\"c1\">\"aws_cloudwatch_event_target\"<\/span><span class=\"c0\">\u00a0<\/span><span class=\"c1\">\"sns\"<\/span><span class=\"c0\">\u00a0{\r\n\u00a0rule = aws_cloudwatch_event_rule.ecs_task_failure_alert.name\r\n\u00a0arn \u00a0= aws_sns_topic.ecs_task_failure_sns.arn\r\n\u00a0input_transformer {\r\n\u00a0 \u00a0input_paths = {\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"AZ\"<\/span><span class=\"c0\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 =\u00a0<\/span><span class=\"c1\">\"$.detail.availabilityZone\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"ECS_CLUSTER_ARN\"<\/span><span class=\"c0\">\u00a0=\u00a0<\/span><span class=\"c1\">\"$.detail.clusterArn\"<\/span><span class=\"c0\">\r\n\u00a0 
\u00a0 \u00a0<\/span><span class=\"c1\">\"PROBLEM\"<\/span><span class=\"c0\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0=\u00a0<\/span><span class=\"c1\">\"$.detail-type\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"REGION\"<\/span><span class=\"c0\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 =\u00a0<\/span><span class=\"c1\">\"$.region\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"SERVICE\"<\/span><span class=\"c0\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0=\u00a0<\/span><span class=\"c1\">\"$.detail.group\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"STOPPED_REASON\"<\/span><span class=\"c0\">\u00a0 =\u00a0<\/span><span class=\"c1\">\"$.detail.stoppedReason\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"STOPPED_TIME\"<\/span><span class=\"c0\">\u00a0 \u00a0 =\u00a0<\/span><span class=\"c1\">\"$.detail.stoppedAt\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"STOP_CODE\"<\/span><span class=\"c0\">\u00a0 \u00a0 \u00a0 \u00a0=\u00a0<\/span><span class=\"c1\">\"$.detail.stopCode\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"TASK_ARN\"<\/span><span class=\"c0\">\u00a0 \u00a0 \u00a0 \u00a0 =\u00a0<\/span><span class=\"c1\">\"$.detail.taskArn\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0}\r\n\u00a0 \u00a0input_template = &lt;&lt;EOT\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"ECS TASK FAILURE ALERT\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"Problem: &lt;PROBLEM&gt;\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"Region: &lt;REGION&gt;\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"Availability Zone: &lt;AZ&gt;\"<\/span><span 
class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"ECS Cluster Arn: &lt;ECS_CLUSTER_ARN&gt;\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"Service Name: &lt;SERVICE&gt;\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"Task Arn: &lt;TASK_ARN&gt;\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"Stopped Reason: &lt;STOPPED_REASON&gt;\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"Stop Code: &lt;STOP_CODE&gt;\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"c1\">\"Stopped Time: &lt;STOPPED_TIME&gt;\"<\/span><span class=\"c0\">\r\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0EOT\r\n\u00a0}\r\n}<\/span><\/span><\/pre>\n<\/blockquote>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p class=\"c3\"><span style=\"text-decoration: underline;\"><span class=\"c23 c11\">Step 4: Initialize and Apply Changes<\/span><\/span><\/p>\n<ul class=\"c10 lst-kix_qk3c1etn4dq0-0 start\">\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c6\">Run the command\u00a0<\/span><span class=\"c11 c44\">terraform init<\/span><span class=\"c2\">\u00a0to initialize Terraform once again (this time in your project directory).<\/span><\/li>\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c6\">Run the command\u00a0<\/span><span class=\"c11 c44\">terraform apply<\/span><span class=\"c2\">\u00a0to create the AWS resources specified in your Terraform configuration.<\/span><\/li>\n<li class=\"c3 c8 li-bullet-0\"><span class=\"c6\">Terraform will prompt for confirmation. 
Enter<\/span><span class=\"c11\">\u00a0<\/span><span class=\"c11 c44\">yes<\/span><span class=\"c2\">\u00a0to proceed.<\/span><\/li>\n<\/ul>\n<p class=\"c3\"><span class=\"c2\">Once the terraform apply command completes successfully, the SNS topic and the event rule with its input transformer will be set up in your AWS account. Note that this configuration does not create an email subscription, so subscribe an endpoint to the topic as described in steps 5 to 7. If no alerts arrive, also check that the topic&#8217;s access policy allows EventBridge (events.amazonaws.com) to publish to it.<\/span><\/p>\n<h2 id=\"h.1ufsrmxi53sy\" class=\"c3 c20\"><span style=\"text-decoration: underline;\"><span class=\"c24\">Conclusion<\/span><\/span><\/h2>\n<p class=\"c19 c28\"><span class=\"c6\">To conclude, using AWS services such as Amazon SNS, EventBridge, and the CloudWatch Input Transformer provides a complete solution for Amazon ECS task failure alerting. By combining these services, you can easily <\/span><span class=\"c11\">capture<\/span><span class=\"c6\">,<\/span><span class=\"c11\">\u00a0parse<\/span><span class=\"c6\">, and\u00a0<\/span><span class=\"c11\">deliver<\/span><span class=\"c2\"> meaningful notifications about task failures, enabling you to maintain the availability and stability of your containerized applications. Embrace these AWS services and take advantage of their capabilities to improve the reliability and uptime of your ECS application deployments. 
Refer to our other blogs for deeper insights.\u00a0<\/span><\/p>\n<h2 class=\"c19 c28\"><span style=\"text-decoration: underline;\"><strong><span class=\"c24\">References<\/span><\/strong><\/span><\/h2>\n<p class=\"c3\"><span class=\"c26 c6\"><a class=\"c30\" href=\"https:\/\/docs.aws.amazon.com\/AmazonECS\/latest\/developerguide\/ecs_cwet2.html\">https:\/\/docs.aws.amazon.com\/AmazonECS\/latest\/developerguide\/ecs_cwet2.html<\/a><\/span><\/p>\n<div class=\"ap-custom-wrapper\"><\/div><!--ap-custom-wrapper-->","protected":false},"excerpt":{"rendered":"<p>Introduction In the fast-paced world of application delivery, ensuring the health and reliability of our ECS tasks is crucial. Without a reliable alerting mechanism, there\u2019s a risk of overlooking critical task failures that can have a bad impact on our production environment. Just imagine a situation where application tasks fail silently, resource constraints go unnoticed, 
[&hellip;]<\/p>\n","protected":false},"author":1601,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":70},"categories":[1174,4308,4682,2348],"tags":[5278,248,1892,1499],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/57715"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/1601"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=57715"}],"version-history":[{"count":7,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/57715\/revisions"}],"predecessor-version":[{"id":59914,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/57715\/revisions\/59914"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=57715"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=57715"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=57715"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}