{"id":17,"date":"2013-01-31T08:00:57","date_gmt":"2013-01-31T16:00:57","guid":{"rendered":"http:\/\/www.ossintegrators.com\/blog\/?p=17"},"modified":"2013-02-05T10:13:11","modified_gmt":"2013-02-05T18:13:11","slug":"splunk-aternity-part-1","status":"publish","type":"post","link":"http:\/\/www.ossintegrators.com\/blog\/splunk-aternity-part-1\/","title":{"rendered":"Importing Aternity Log Data in to Splunk, Part 1"},"content":{"rendered":"<p>In this post I will be going over how to import unstructured data in to Splunk, extract fields from the data, and use those fields to create a simple dashboard. This example can be followed using a free trial of Splunk, available <a href=\"http:\/\/www.splunk.com\/download\">here<\/a>. The sample data I will be using is available <a href=\"http:\/\/www.ossintegrators.com\/blog\/examples\/aternity-statistics.rar\">here<\/a>. 
For this post I&#8217;ve used a Windows instance of Splunk, but the interfaces are largely the same, so you should have no trouble following along if you choose to use Linux instead.<br \/>\n<!--more--><\/p>\n<p><strong>Table of Contents<br \/>\n<\/strong><\/p>\n<ul>\n<li><a href=\"#_Importing_the_raw\">Importing the raw data<\/a><\/li>\n<li><a href=\"#_Verify_the_data\">Verify the data<\/a><\/li>\n<li><a href=\"#_Working_with_field\">Working with field extractions<\/a><\/li>\n<li><a href=\"#_Testing_regex_in\">Testing regex in Splunk<\/a><\/li>\n<\/ul>\n<h1>Importing the raw data<\/h1>\n<p>First we&#8217;ll get the actual data in to Splunk so that we can begin working with it. In this case we&#8217;ll import the data directly on the Splunk server, but in the real world you&#8217;d likely be using Splunk&#8217;s Universal Forwarder or sending syslog messages to get the data from another server. Additional information on the Universal Forwarder can be found <a href=\"http:\/\/docs.splunk.com\/Documentation\/Splunk\/latest\/Deploy\/Introducingtheuniversalforwarder\">here<\/a>.<\/p>\n<p>Log in to the Splunk web interface and click on the &#8220;Manager&#8221; link at the top right of the main page.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Manager\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt1.png\" \/><\/p>\n<p>Select &#8220;Data inputs&#8221; from the &#8220;Data&#8221; section on the right-hand side of the page.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Data Inputs\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt2.png\" \/><\/p>\n<p>Select the &#8220;Add new&#8221; action in the &#8220;Files &amp; Directories&#8221; row. (Note: If you are using Linux you will see a slightly different view.)<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Add New Data Input\" 
src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt3.png\" \/><\/p>\n<p>Browse to where you saved the sample file, and then click &#8220;Continue&#8221;.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Preview Data\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt4.png\" \/><\/p>\n<p>Splunk will then pop up a window saying it could not determine a source type. Since this is our first time bringing Aternity data in to Splunk, we will tell it to &#8220;Start a new source type&#8221; and click &#8220;Continue&#8221;.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Start New Source Type\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt5.png\" \/><\/p>\n<p>A quick peek at the event viewer shows that Splunk is handling highly verbose multi-line log entries like a champ. If the data were being improperly broken up (either multiple messages being treated as a single entry or one message being broken in to multiple parts), you would need to adjust the timestamp and event break settings, which is outside the scope of this post. More information on that can be found <a href=\"http:\/\/docs.splunk.com\/Documentation\/Splunk\/latest\/Data\/Indexmulti-lineevents\">here<\/a>. Given that we&#8217;re happy with the way the data looks, click &#8220;Continue&#8221; to move on to the next step.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Data Preview\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt6.png\" \/><\/p>\n<p>Splunk will pop up another window asking us to name our source type. Because there are multiple potential log file types we might want to bring in to Splunk from Aternity, I will use the name &#8220;<code>aternity_stats<\/code>&#8221; so that we can add other &#8220;<code>aternity_<\/code>&#8221; source types later and keep things consistent. 
Click &#8220;Save source type&#8221; when you&#8217;re done reviewing the settings.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Review Sourcetype\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt7.png\" \/><\/p>\n<p>The next page gives us a host of options, but we really only need to concern ourselves with the first one. Splunk will ask us if we want to continually index the data, or if we just want to index the file once. For cases like ours, where we just want to import a fixed data sample, it&#8217;s important to remember to select the &#8220;Index once&#8221; option; otherwise Splunk will continue to monitor the file, regardless of whether any new data is being added to it. For a demo environment this might not be much of a concern, but in a production environment, things like this can add up and have major performance impacts.<\/p>\n<p>We&#8217;ll leave the rest of the settings alone. Click &#8220;Save&#8221; once you&#8217;ve had a chance to read over them.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Index File Once\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt8.png\" \/><\/p>\n<h1>Verify the data<\/h1>\n<p>We have now successfully imported the data in to Splunk. We&#8217;ll perform a couple of steps to verify this. First, click on &#8220;App&#8221; in the upper right-hand corner and select &#8220;Search&#8221;.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Search App\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt9.png\" \/><\/p>\n<p>On the main Search app page we&#8217;ll see a summary of the data in Splunk. If this is a clean install of Splunk, just seeing a non-zero entry for &#8220;Events indexed&#8221; is a good indication. 
We can also see that it has specifically indexed the source we gave it, and that &#8220;<code>aternity_stats<\/code>&#8221; is showing up as a sourcetype.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Search Main Page\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt10.png\" \/><\/p>\n<p>If this were an already running instance, the data volume might not make the Top lists, so we would have to verify it by searching &#8220;<code>sourcetype=aternity_stats<\/code>&#8221; and confirming that results are returned.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Search sourcetype=atternity_stats\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt11.png\" \/><\/p>\n<h1>Working with field extractions<\/h1>\n<p>If you haven&#8217;t already, run the search above. We&#8217;ll see that Splunk automatically identifies any fields that match its key\/value pair intelligence and lists them to the left of the search results. This can often allow you to start putting together useful data visualizations right out of the box.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Side Bar\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt12.png\" \/><\/p>\n<p>We&#8217;re going to extract data that Splunk doesn&#8217;t recognize right away. There are a few ways to do this, including using Splunk&#8217;s <a href=\"http:\/\/docs.splunk.com\/Documentation\/Splunk\/5.0.1\/Knowledge\/ExtractfieldsinteractivelywithIFX\">Interactive Field Extractor<\/a> (IFX), or you can write your own regex (which I prefer). If you are unfamiliar with regex, a good set of tutorials can be found <a href=\"http:\/\/www.regular-expressions.info\">here<\/a>. 
If you are uncomfortable working with regex, I would suggest trying out the IFX, or you can just skip to the next section and work with the pre-built regex that I provide.<\/p>\n<p>Add &#8220;Connected EPMs&#8221; to your search string so that it reads &#8220;<code>sourcetype=aternity_stats Connected EPMs<\/code>&#8221;. It is possible to display the log in its entirety in the results by clicking &#8220;Show all 15 lines&#8221; and copy from there, but because Splunk makes the text in the results clickable for navigation, copying from there can be a bit tricky.<\/p>\n<p>Instead, click the down arrow next to one of the results and select &#8220;Show source&#8221;.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Show Source\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt13.png\" \/><\/p>\n<p>This will display the full source in the context of the log file it was taken from. From there we want to copy all the data that is highlighted. <img decoding=\"async\" alt=\"\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt14.png\" \/><\/p>\n<p>Next you&#8217;ll want to fire up the regex tester of your choice. I am personally a fan of <a href=\"http:\/\/www.regexbuddy.com\/\">RegexBuddy<\/a>, but there are also free web-based tools such as <a href=\"http:\/\/rubular.com\/\">rubular<\/a> or <a href=\"http:\/\/regexpal.com\/\">regexpal<\/a> (regexpal doesn&#8217;t understand the Splunk ?&lt;field&gt; syntax, so you will need to add it after the fact). Our goal is to extract the following fields:<\/p>\n<ul>\n<li>Load Percentage<\/li>\n<li>Connected EPs<\/li>\n<li>Fully-Connected Agents<\/li>\n<\/ul>\n<p>The syntax of <code>(?&lt;field_name&gt;value)<\/code> is used to tell Splunk what to extract and what to name it, replacing field_name with the name you&#8217;d like Splunk to use for the field. 
(See the next section for examples.)<\/p>\n<p>The key is to write the expression you feel most closely matches the data and make sure to test, test, test. In our case, we need to pay particularly close attention to how numbers get formatted in the data: Are they comma separated? Do they have a decimal point? Etc. If you want to take a crack at the regex, stop now and try it out.<\/p>\n<h1>Testing regex in Splunk<\/h1>\n<p>The following regexes extract each of the fields:<\/p>\n<ul>\n<li><code>Load Percentage\\(rounded\\): (?&lt;load_perc&gt;\\d+)%<\/code><\/li>\n<li><code>Connected EPs:\\s+(?&lt;connected_eps&gt;[\\d,]+)<\/code><\/li>\n<li><code>Fully-Connected Agents:\\s+(?&lt;connected_agents&gt;[\\d,]+)<\/code><\/li>\n<\/ul>\n<p>We can test these in Splunk by piping them via the rex command in to our existing search, for example:<\/p>\n<p><code>sourcetype=aternity_stats Connected EPMs | rex \"Connected EPs:\\s+(?&lt;connected_eps&gt;[\\d,]+)\"<\/code><\/p>\n<p>The above search lets us see the connected_eps field and verify that the data is being extracted properly. As you can see in the screenshot below, the max value never goes above 933. Since it never breaks in to the thousands, we don&#8217;t know if it will be comma separated or not, which is why I&#8217;ve included a comma as a valid character just in case.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Rex Search\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt15.png\" \/><\/p>\n<p>Once we&#8217;re satisfied that our regex is extracting what we&#8217;ve intended, we can set these up as permanent field extractions. 
Click on the &#8220;Manager&#8221; link on the top right of the page and select &#8220;Fields&#8221; from the &#8220;Knowledge&#8221; section.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Knowledge Fields\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt16.png\" \/><\/p>\n<p>Then select &#8220;Field Extractions&#8221;.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Field Extractions\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt17.png\" \/><\/p>\n<p>And click on &#8220;New&#8221; on the next page.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk New Field Extraction\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt18.png\" \/><\/p>\n<p>We are then presented with the form below, which we want to fill out as pictured:<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Add New Field Extraction\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt19.png\" \/><\/p>\n<p>Note that the &#8220;Name&#8221; field is simply a name for our field extraction, and is separate from the name of the field we&#8217;re extracting. In this case, Splunk will extract the field as &#8220;<code>load_perc<\/code>&#8221;, which we&#8217;ll see in a minute. In most cases it&#8217;s easiest to keep the names identical for simplicity; I&#8217;ve just tweaked it in this case to illustrate the point. Click the &#8220;Save&#8221; button in the lower right when finished. You will be returned to the main Field extractions page. 
Click &#8220;New&#8221; and repeat for the other two extractions with the following details:<\/p>\n<p>Name: <code>connected_eps<\/code><br \/>\nSourcetype: <code>aternity_stats<\/code><br \/>\nExtraction: <code>Connected EPs:\\s+(?&lt;connected_eps&gt;[\\d,]+)<\/code><\/p>\n<p>Name: <code>connected_agents<\/code><br \/>\nSourcetype: <code>aternity_stats<\/code><br \/>\nExtraction: <code>Fully-Connected Agents:\\s+(?&lt;connected_agents&gt;[\\d,]+)<\/code><\/p>\n<p>Once this is done we can spot check our work by selecting &#8220;Admin&#8221; (or whatever user name you&#8217;re using) from the &#8220;Owner&#8221; drop-down.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Check Field Extractions\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt20.png\" \/><\/p>\n<p>Next click on &#8220;&lt;&lt; Back to Search&#8221; in the upper left-hand corner. Type &#8220;<code>sourcetype=aternity_stats<\/code>&#8221; in the main search bar, and click on &#8220;View all 70 fields&#8221; in the Fields pane to the left of the results.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk View All Fields\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt21.png\" \/><\/p>\n<p>This will present us with a list of all fields known to Splunk that relate to the search data. You can look up the fields we have entered by typing them in to the &#8220;Keyword&#8221; box or by just finding them on the list. Click the green arrow button to the left of the name to add each to the Selected Fields.<\/p>\n<p><img decoding=\"async\" alt=\"Splunk Select Fields\" src=\"http:\/\/www.ossintegrators.com\/blog\/wp-content\/uploads\/2013\/01\/013113_1916_ImportingAt22.png\" \/><\/p>\n<p>Click &#8220;Save&#8221; once you&#8217;re done. You should now see the selected fields show up in the Selected Fields pane. 
You can now quickly access these fields to visualize your data in new ways.<\/p>\n<p>This concludes Part 1 of 2. In this example we&#8217;ve imported new unstructured data in to Splunk, extracted fields from that data using regex, and made sure we didn&#8217;t screw up along the way. In the next post we will cover how to use these fields to create dashboards, perform field lookups, and wrap all of this up in to a neat application that you can distribute as you see fit.<\/p>\n<p>Check out our YouTube channel for additional videos on Splunk <a href=\"http:\/\/www.youtube.com\/user\/OSSIntegrators\">here<\/a>.<\/p>\n<p>If you&#8217;re local to the Seattle area and would like to try this in a guided session, be sure to check out our events page <a href=\"http:\/\/www.ossintegrators.com\/events\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this post I will be going over how to import unstructured data in to Splunk, extract fields from the data, and use those fields to create a simple dashboard. This example can be followed using a free trial of Splunk, available here. The sample data I will be using is available here. 
For this [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[6],"tags":[11,8,10,19,20,15,13,21,14,9,53,12],"class_list":["post-17","post","type-post","status-publish","format-standard","hentry","category-splunk","tag-application-management","tag-aternity","tag-big-data","tag-business-intelligence","tag-data-processing-unstructured-data-data-solutions","tag-it-operations","tag-linux","tag-machine-data","tag-operational-intelligence","tag-regex","tag-splunk","tag-windows"],"_links":{"self":[{"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/posts\/17","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/comments?post=17"}],"version-history":[{"count":19,"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/posts\/17\/revisions"}],"predecessor-version":[{"id":165,"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/posts\/17\/revisions\/165"}],"wp:attachment":[{"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/media?parent=17"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/categories?post=17"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.ossintegrators.com\/blog\/wp-json\/wp\/v2\/tags?post=17"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}