{"id":70300,"date":"2025-03-19T12:36:13","date_gmt":"2025-03-19T07:06:13","guid":{"rendered":"https:\/\/www.tothenew.com\/blog\/?p=70300"},"modified":"2025-03-19T16:18:59","modified_gmt":"2025-03-19T10:48:59","slug":"optimizing-application-performance-with-datadog-continuous-profiler","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/optimizing-application-performance-with-datadog-continuous-profiler\/","title":{"rendered":"Optimizing Application Performance with Datadog Continuous Profiler"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>Modern applications behave quite differently in production compared to development or testing environments. Outlier requests and accounts, edge cases, configuration changes, security features, and request spikes can make an application behave in unexpected ways. This may lead to poor CPU and memory performance, which can be costly and result in an undesirable end-user experience. Datadog Continuous Profiler enables you to quickly uncover costly bugs and identify code improvements to help you reduce infrastructure costs and enhance end-user experience.<\/p>\n<h2>Objective<\/h2>\n<p>By the end of this blog, you&#8217;ll be able to do the following:<\/p>\n<ul>\n<li>Determine when and how to use continuous profiling for diagnosing application performance problems.<\/li>\n<li>Apply a performance diagnostic methodology to troubleshoot code performance issues using profile types, endpoint profiling, and comparing profiles in Continuous Profiler.<\/li>\n<\/ul>\n<h2>Continuous Code Profiling<\/h2>\n<ol>\n<li>Continuous code profiling allows you to measure code performance at all times in any environment.<\/li>\n<li>In production, an application process is mostly a closed box. You can observe external behaviors but not internal ones. Profiling is a way to look inside the box, observe these internal behaviors, and measure the application\u2019s code performance. 
You can detect and optimize the most time-consuming and resource-intensive lines of code that affect costs and end-user experience.<\/li>\n<li>Profiling becomes much more effective when done in production because it\u2019s usually difficult and time-consuming to simulate production behavior or reproduce specific bottlenecks and outages in non-production environments.<\/li>\n<\/ol>\n<h2>Continuous profiling in Datadog<\/h2>\n<div id=\"attachment_70299\" style=\"width: 926px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-06-19-04-57.png\"><img aria-describedby=\"caption-attachment-70299\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-70299\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-06-19-04-57.png\" alt=\"Example of a CPU Time profiler flame graph in Continuous Profiler.\" width=\"916\" height=\"607\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-06-19-04-57.png 916w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-06-19-04-57-300x199.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-06-19-04-57-768x509.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-06-19-04-57-624x414.png 624w\" sizes=\"(max-width: 916px) 100vw, 916px\" \/><\/a><p id=\"caption-attachment-70299\" class=\"wp-caption-text\">Example of a CPU Time profiler flame graph in Continuous Profiler.<\/p><\/div>\n<p>Datadog Continuous Profiler is an always-on, production code profiler that enables you to analyze code-level performance across your entire environment, with minimal overhead. Profiles reveal which methods\/functions consume the most resources, such as CPU, memory allocation, wall time, and I\/O time spent. 
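To collect profiles like these, the profiler must be enabled alongside the tracer. For a JVM service, one way to do this is sketched below. This is a minimal example based on the dd-java-agent flags documented by Datadog; it assumes a Datadog Agent is already running on the host, and the jar name, service, and env values are illustrative placeholders.

```shell
# Download the latest Datadog Java tracer, which includes the profiler
# (download URL per Datadog documentation).
wget -O dd-java-agent.jar https://dtdg.co/latest-java-tracer

# Start the service with APM tracing and Continuous Profiler enabled.
# Service and env names below are placeholders for your own values.
java -javaagent:dd-java-agent.jar \
  -Ddd.service=movies-api-java \
  -Ddd.env=production \
  -Ddd.profiling.enabled=true \
  -jar movies-api.jar
```

Equivalently, the profiler can be switched on with the DD_PROFILING_ENABLED=true environment variable. Profiles should begin appearing on the Profiles page a few minutes after the service restarts.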
With this information, you can optimize your code to reduce end-user latency and cloud provider costs.<\/p>\n<p>Continuous profiling in Datadog allows you to do the following:<\/p>\n<ul>\n<li>Perform code-level tracing with zero instrumentation<\/li>\n<li>Visualize all your stack traces in one place<\/li>\n<li>Discover bottlenecks in your code at a glance<\/li>\n<li>Filter profile data using tags<\/li>\n<li>Get actionable insights for performance improvements<\/li>\n<\/ul>\n<h2>Continuous Profiling Goes Beyond Distributed Tracing<\/h2>\n<p>Suppose you have a service called movies-api-java with multiple endpoints that return movie metadata stored in a database. You learn that the service isn\u2019t performing well and that users are experiencing increased latency. You want to optimize the service\u2019s performance to ensure that users have the best experience.<\/p>\n<p>You start using Datadog APM and distributed tracing to investigate the performance issues. For example, if you see additional spans in traces for unnecessary, repeated calls to a database when requests are made to an endpoint, you can use this information to determine a fix. However, for some issues, you find that you need to go beyond distributed tracing to diagnose the problem and optimize performance. 
In the example below, the trace shows that it takes over 2 seconds to respond to a request, but there are no child spans or other indicators that can tell you more about what may be causing the slow performance.<\/p>\n<p>&nbsp;<\/p>\n<div id=\"attachment_70432\" style=\"width: 923px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-08-55-14.png\"><img aria-describedby=\"caption-attachment-70432\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-70432\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-08-55-14.png\" alt=\"Example of a high-latency trace with no spans.\" width=\"913\" height=\"334\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-08-55-14.png 913w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-08-55-14-300x110.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-08-55-14-768x281.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-08-55-14-624x228.png 624w\" sizes=\"(max-width: 913px) 100vw, 913px\" \/><\/a><p id=\"caption-attachment-70432\" class=\"wp-caption-text\">Example of a high-latency trace with no spans.<\/p><\/div>\n<p>For issues that you can\u2019t solve using APM, you can use Continuous Profiler to investigate at a more granular level. The difference between distributed tracing and continuous profiling is that traces tell you which requests were slow at the service level, whereas profiles tell you why they were slow at the code level.<\/p>\n<h2>Profiles<\/h2>\n<p>In Datadog, trace data and profiling data are automatically linked for application processes that have both APM and Continuous Profiler enabled. 
In the trace details panel in APM, you can move directly from a trace, or from a selected span within it, to the associated profiling data using the Profiles tab.<\/p>\n<div id=\"attachment_70433\" style=\"width: 925px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-00-02.png\"><img aria-describedby=\"caption-attachment-70433\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-70433\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-00-02.png\" alt=\"The Profiles UI found in the details panel for a trace.\" width=\"915\" height=\"492\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-00-02.png 915w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-00-02-300x161.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-00-02-768x413.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-00-02-624x336.png 624w\" sizes=\"(max-width: 915px) 100vw, 915px\" \/><\/a><p id=\"caption-attachment-70433\" class=\"wp-caption-text\">The Profiles UI found in the details panel for a trace.<\/p><\/div>\n<h2>Interpreting the Profiler Flame Graph<\/h2>\n<p>Similar to traces, profiles are represented as flame graphs. However, profiler flame graphs are interpreted differently because the data they represent is different. 
The profiler flame graph pictured below represents CPU Time by method.<\/p>\n<div id=\"attachment_70434\" style=\"width: 925px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-04-19.png\"><img aria-describedby=\"caption-attachment-70434\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-70434\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-04-19.png\" alt=\"Profiler flame graph with important details explained.\" width=\"915\" height=\"489\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-04-19.png 915w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-04-19-300x160.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-04-19-768x410.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-04-19-624x333.png 624w\" sizes=\"(max-width: 915px) 100vw, 915px\" \/><\/a><p id=\"caption-attachment-70434\" class=\"wp-caption-text\">Profiler flame graph with important details explained.<\/p><\/div>\n<ul>\n<li>The x-axis represents the total CPU consumption.<\/li>\n<li>Each horizontal bar is a frame.<\/li>\n<li>Each frame represents a method. The frames are arranged from top to bottom, in the order that each method was called during the program\u2019s execution.<\/li>\n<li>Each color represents a different package.<\/li>\n<li>The top frame is usually called the \u201croot frame\u201d, and its value is the sum of its child frames. If you compare the graph to a pie chart, the root frame is the whole pie and each stack trace is a slice of it.<\/li>\n<li>The width of each frame corresponds to its resource consumption. The wider the frame, the more CPU Time was used.<\/li>\n<li>Two methods appearing side by side could have been called in parallel or in any order. 
Frames are ordered alphabetically from left to right.<\/li>\n<li>The bottom frame is called the leaf frame and represents the last method called in the stack. The leaf frame only represents its self time because it has no child frames.<\/li>\n<\/ul>\n<h2>Breaking Down Resource Consumption Using Profile Types<\/h2>\n<p>There are a variety of profile types that you can view in Continuous Profiler to help you understand and investigate different aspects of your code\u2019s performance.<\/p>\n<div id=\"attachment_70435\" style=\"width: 925px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-29-25.png\"><img aria-describedby=\"caption-attachment-70435\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-70435\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-29-25.png\" alt=\"Example of available profile types based on the filtered profile data.\" width=\"915\" height=\"489\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-29-25.png 915w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-29-25-300x160.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-29-25-768x410.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-29-25-624x333.png 624w\" sizes=\"(max-width: 915px) 100vw, 915px\" \/><\/a><p id=\"caption-attachment-70435\" class=\"wp-caption-text\">Example of available profile types based on the filtered profile data.<\/p><\/div>\n<h2>Understanding Profile Types and Resource Consumption<\/h2>\n<p>Each profile type represents a type of resource consumption, such as CPU, wall time, or memory. The profile types available for you to select may differ based on the language being profiled. 
The following are the most common profile types:<\/p>\n<ul>\n<li><strong>CPU<\/strong> profiling measures which methods consume the most CPU in an application.<\/li>\n<li><strong>Allocation<\/strong> profiling measures the amount of memory allocated by a given method.<\/li>\n<li><strong>Heap<\/strong> profiling measures the amount of heap memory allocated by each function that hasn\u2019t been garbage collected. This is useful for investigating the overall memory usage of your service and identifying potential memory leaks.<\/li>\n<li><strong>Lock<\/strong> profiling measures the amount of time a thread spends waiting to acquire a lock, during which it does no useful work.<\/li>\n<li><strong>Wall Time<\/strong> profiling measures the elapsed time spent by methods. It\u2019s useful for a first look at latency before digging into the other profile types to find out what was causing it. The wall time profile is the most similar to the associated APM flame graph.<\/li>\n<li><strong>File I\/O and Socket I\/O<\/strong> profiling measure the amount of time methods spend on disk and socket operations, respectively.<\/li>\n<li><strong>Exceptions<\/strong> profiling measures the number of exceptions thrown. The profiler doesn\u2019t catch or handle exceptions; it only tracks their creation.<\/li>\n<\/ul>\n<h2>Investigating Slow Endpoints with Endpoint Profiling<\/h2>\n<p>Endpoint profiling allows you to scope profiler flame graphs by any API endpoint of a service, so you can find endpoints that are slow and degrading the end-user experience. Debugging and understanding why an endpoint has high latency can be tricky. 
For example, high latency could be caused by a CPU-heavy method that sits, unnoticed, on the critical path of a latency-sensitive request.<\/p>\n<p>You can do the following with endpoint profiling:<\/p>\n<ul>\n<li>Identify the bottleneck methods that are slowing down the endpoint\u2019s overall response time.<\/li>\n<li>Isolate the top endpoints that are responsible for consuming resources like CPU and memory. This is particularly helpful when you\u2019re trying to optimize your service for performance gains.<\/li>\n<li>Understand whether third-party code or runtime libraries are the reason for endpoints being slow or heavy on resource consumption.<\/li>\n<\/ul>\n<div id=\"attachment_70436\" style=\"width: 925px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-37-11.png\"><img aria-describedby=\"caption-attachment-70436\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-70436\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-37-11.png\" alt=\"Example of list of endpoints that appears in the summary list for a profile.\" width=\"915\" height=\"277\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-37-11.png 915w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-37-11-300x91.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-37-11-768x232.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-37-11-624x189.png 624w\" sizes=\"(max-width: 915px) 100vw, 915px\" \/><\/a><p id=\"caption-attachment-70436\" class=\"wp-caption-text\">Example of list of endpoints that appears in the summary list for a profile.<\/p><\/div>\n<p>In general, it\u2019s valuable to track which endpoints consume the most resources, such as CPU and memory. 
The list can help you identify whether your endpoints have regressed, or whether newly introduced endpoints are consuming drastically more resources than expected and slowing down your overall service.<\/p>\n<h2>Comparing Profiles to Gain More Insights into Code Performance<\/h2>\n<p>The Compare feature lets you compare two profiles or profile aggregations. This can help you identify code performance improvements, regressions, and structural changes as you troubleshoot issues. For example, you can see if the service you\u2019re profiling is taking more or less time, using more or less memory, making more or fewer allocations, throwing more or fewer exceptions, or including more or less code and calls than it did in the past.<\/p>\n<div id=\"attachment_70437\" style=\"width: 926px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-44-43.png\"><img aria-describedby=\"caption-attachment-70437\" decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-70437\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-44-43.png\" alt=\"Screenshot of the comparison between two profiler flame graphs.\" width=\"916\" height=\"493\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-44-43.png 916w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-44-43-300x161.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-44-43-768x413.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/03\/Screenshot-from-2025-03-13-09-44-43-624x336.png 624w\" sizes=\"(max-width: 916px) 100vw, 916px\" \/><\/a><p id=\"caption-attachment-70437\" class=\"wp-caption-text\">Screenshot of the comparison between two profiler flame graphs.<\/p><\/div>\n<h2>Conclusion<\/h2>\n<p>Continuous Profiler is a powerful tool that allows you to continuously profile your production code so that you 
can investigate and optimize its performance. You can discover and fix inefficient parts of your code to reduce resource consumption, lower costs, and improve app performance and end-user experience.<\/p>\n<p>You\u2019re now able to do the following:<\/p>\n<ul>\n<li>Determine when and how to use continuous profiling to diagnose application performance problems.<\/li>\n<li>Apply a performance diagnostic methodology to troubleshoot code performance issues using profile types, endpoint profiling, and comparing profiles in the Continuous Profiler.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Introduction Modern applications behave quite differently in production compared to development or testing environments. Outlier requests and accounts, edge cases, configuration changes, security features, and request spikes can make an application behave in unexpected ways. This may lead to poor CPU and memory performance, which can be costly and result in an undesirable end-user experience. 
[&hellip;]<\/p>\n","protected":false},"author":1906,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":124},"categories":[2348],"tags":[1915,1892],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/70300"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/1906"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=70300"}],"version-history":[{"count":10,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/70300\/revisions"}],"predecessor-version":[{"id":70661,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/70300\/revisions\/70661"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=70300"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=70300"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=70300"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}