{"id":20855,"date":"2015-06-10T12:07:43","date_gmt":"2015-06-10T06:37:43","guid":{"rendered":"http:\/\/www.tothenew.com\/blog\/?p=20855"},"modified":"2015-07-29T20:36:35","modified_gmt":"2015-07-29T15:06:35","slug":"logentries-search-and-analysis-using-regex-named-capture-group","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/logentries-search-and-analysis-using-regex-named-capture-group\/","title":{"rendered":"Logentries &#8211; Search and Analysis using RegEx named capture group"},"content":{"rendered":"<p>Most of us use some tool to centralize logs for analysis and storage. Logentries is one such tool.<\/p>\n<p>Logentries provides a straightforward way to analyse logs that contain KVPs, i.e. key-value pairs. When KVPs are not present, however, analysing the logs becomes tedious.<\/p>\n<p>To handle such cases, Logentries supports RegEx named capture groups, which we can use to create KVPs dynamically as required. We will walk through one using the scenario below.<\/p>\n<p><strong>Scenario:<\/strong> Count the occurrences of identical URLs that returned a 404 error in the nginx access log.<\/p>\n<p>Let\u2019s start with the following log entry from my Logentries account. I have changed the URL randomly. All nginx access logs arrive in this format, and the format will not change unless I change it.
We will build our KVP based on this format.<\/p>\n<p>[js]2015-05-27T08:34:01.557805Z web-server1 nginx-access-log - - - hostname=web-server1 appname=nginx-access-log 0.350 117.247.13.187 - - [27\/May\/2015:14:04:01 +0530] &quot;GET \/undefined\/uekf=12ed1 HTTP\/1.1&quot; 404 14657 &quot;http:\/\/www.mydomain.com\/?abc=xyz&quot;[\/js]<\/p>\n<p>Now we will write a RegEx named capture group, which is simply a way of creating a KVP dynamically.<br \/>\nThe syntax of a RegEx named capture group is as follows.<\/p>\n<p>[js]\/some anchor text to find the key location in the log (?P&lt;KEY&gt;regEx_to_find_the_Value)\/[\/js]<\/p>\n<p>Here, whatever \u201cregEx_to_find_the_Value\u201d matches is assigned to \u201cKEY\u201d.<br \/>\n\u201cregEx_to_find_the_Value\u201d should match the URL, which in this particular case is \u201c\/undefined\/uekf=12ed1\u201d but could be any URL.<\/p>\n<p>In the log, the URL is \u201c\/undefined\/uekf=12ed1\u201d, and the text \u2018&quot;GET \u2019 that appears just before it can be used as the anchor text. Any string can serve as the key; let it be \u201cURL\u201d. Next we need a RegEx for the URLs. URLs can contain word characters, digits and spaces, so we define the RegEx accordingly.<br \/>\nSo, the RegEx would be \u201c<strong>[\\\/\\w\\d\\s.]+<\/strong>\u201d,<br \/>\nwhere \u201c\\\u201d is the escape character for \u201c\/\u201d,<br \/>\n\\w matches a word character (letter, digit or underscore),<br \/>\n\\s matches a whitespace character,<br \/>\n\\d matches a digit (already covered by \\w),<br \/>\n. inside square brackets matches a literal dot.<br \/>\nThe [ ] brackets define a character class: any one of the characters listed inside them is accepted.<br \/>\n\u201c+\u201d means the character class can repeat one or more times.<br \/>\nNote that \u201c=\u201d is not in this class, so for the sample entry the capture stops at \u201c\/undefined\/uekf\u201d; add any further characters your URLs need to the class.<br \/>\nThis gives us the following RegEx so far.<\/p>\n<p>[js]\/&quot;GET (?P&lt;URL&gt;[\\\/\\w\\d\\s.]+)\/[\/js]<\/p>\n<p>Our dynamic KVP is now ready.<br \/>\nAs per our scenario, we want all URLs that returned a 404 error, so we first need to search for 404.
Then we will apply the RegEx named capture group to this result. We need to group identical URLs and count the occurrences of each. So the final expression comes out as follows.<\/p>\n<p>[js]404 AND \/&quot;GET (?P&lt;URL&gt;[\\\/\\w\\d\\s.]+)\/ groupby(URL) calculate(count)[\/js]<\/p>\n<p>Here we are using the \u201cAND\u201d operator and the groupby and calculate functions provided by Logentries.<\/p>\n<p>Below is the graphical view I get when searching with the above expression in Logentries.<br \/>\n<img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-20862\" src=\"\/blog\/wp-ttn-blog\/uploads\/2015\/06\/log-blog.png\" alt=\"log-blog\" width=\"1310\" height=\"653\" \/><\/p>\n<p>Reference:<br \/>\nhttps:\/\/logentries.com\/doc\/regex\/#regex-field-extraction<br \/>\nhttps:\/\/github.com\/logentries\/le\/blob\/master\/README.md#follow-log-files-through-your-configuration-file<br \/>\nhttp:\/\/www.regexr.com\/<\/p>\n<p>Thanks,<br \/>\nNavjot Singh<br \/>\nTeam AWS, TO THE NEW Digital.<br \/>\nnavjot[dot]singh[at]tothenew[dot]com<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Most of us use some tool to centralize logs for analysis and storage. Logentries is one such tool. Logentries provides a straightforward way to analyse logs that contain KVPs, i.e. key-value pairs. When KVPs are not present, however, analysing the logs becomes tedious.
[&hellip;]<\/p>\n","protected":false},"author":154,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":6},"categories":[1174],"tags":[1783,431],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/20855"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/154"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=20855"}],"version-history":[{"count":0,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/20855\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=20855"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=20855"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=20855"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}