Saturday, July 15, 2017

How-to: Capture and Ingest Windows Event Logs Using NiFi

One of the use cases I wanted to prove out was the consumption of Windows Event logs. For this proof-of-concept I am using Apache NiFi. The walk-through will reference other posts that cover individual components of this approach.

This article focuses on collecting event data from the local host and landing the data in both the local Windows file system and HDFS.

Future posts will dive into using MiNiFi for remote hosts, as well as methods to visualize the event data.

My setup:

  • Hadoop cluster based on Hortonworks HDP 2.6
    • CentOS 7
  • Windows 10 PC
    • Windows Subsystem for Linux enabled
    • MobaXterm
    • WinSCP

Following are the basic steps:
  1. Verify that the Java JDK is installed and the JAVA_HOME environment variable is set
  2. Download, install, and configure NiFi on a Windows host
  3. Develop and configure the NiFi process
  4. Run the NiFi job
  5. Validate that data are captured as expected

Download, Install, Configure, and Run NiFi on a Windows Host

The "Getting Started Guide" on Apache's website is straightforward - I've abbreviated the portions needed for this use case.

From the Downloads page, select the appropriate version of the binary .zip (for this example I used 1.3.0). If you use a different version, modify the paths accordingly.
  1. Extract .zip to your desired location
  2. The default port for NiFi is 8080
    • Since 8080 is a popular port for web-enabled applications, you may want to change the port on which NiFi listens
    • The port can be configured in the nifi.properties file:

      <install directory>\nifi-1.3.0\conf\nifi.properties

      # web properties #

      nifi.web.http.port=8080
       
  3. Copy the core-site.xml and hdfs-site.xml configuration files to the Windows file system.
    • This is only necessary if you are landing the data in HDFS
    • In my setup, these files are located in /etc/hadoop/conf/. Depending on your Hadoop implementation, there may be slight variations
    • Windows Subsystem for Linux now has support for SCP. Alternatively, you can use MobaXterm or WinSCP to transfer files between Windows and Linux systems
    • I used WinSCP to copy core-site.xml and hdfs-site.xml to Windows and placed them in <install directory>\nifi-1.3.0\conf\ with the rest of the NiFi configuration files. It doesn't matter where you place them, as long as you can navigate to them later
  4. Start NiFi 
    • Open a CMD prompt with administrator permissions
    • Navigate to <install directory>\nifi-1.3.0\bin\ and execute run-nifi.bat 
    • You should see a screen like this:
       
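If you would rather script the port change from step 2 than edit the file by hand, here is a minimal Python sketch. It assumes the property key is nifi.web.http.port, as in a stock nifi.properties. The function name set_http_port and the port 9090 are placeholders of mine, not anything NiFi ships.

```python
import re

# Matches the whole "nifi.web.http.port=<value>" line without touching line endings.
_PORT_LINE = re.compile(r"^(nifi\.web\.http\.port[ \t]*=)[^\r\n]*", re.MULTILINE)


def set_http_port(properties_text: str, port: int) -> str:
    """Return a copy of the properties text with nifi.web.http.port set to `port`."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # A function is used as the replacement so the digits of the port are
    # never misread as part of a regex group reference.
    updated, count = _PORT_LINE.subn(lambda m: f"{m.group(1)}{port}", properties_text)
    if count == 0:
        raise KeyError("nifi.web.http.port not found in properties text")
    return updated


# Hypothetical fragment of nifi.properties, for illustration only.
sample = "# web properties #\nnifi.web.http.host=\nnifi.web.http.port=8080\n"
print(set_http_port(sample, 9090))
```

To apply it for real, read <install directory>\nifi-1.3.0\conf\nifi.properties into a string, pass it through the function, and write the result back while NiFi is stopped.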

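After copying core-site.xml in step 3, it can be worth confirming that the file is intact and points at the cluster you expect. The sketch below reads a property out of Hadoop's standard <configuration>/<property> layout. The helper name read_hadoop_property and the sample host name are assumptions for illustration; your own file will contain your cluster's values.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_hadoop_property(config_xml: str, name: str) -> Optional[str]:
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Hypothetical core-site.xml content; the host name is made up.
sample_core_site = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""

print(read_hadoop_property(sample_core_site, "fs.defaultFS"))
```

If the lookup returns None for fs.defaultFS on your copied file, the transfer or the source file is probably not what you intended.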
Develop and Configure the NiFi Process

NiFi uses Processors to do work. Creating a combination of Processors provides a powerful way to manage data flows. Windows event log data is presented as XML. This process will take the XML and transform it to JSON, flatten that JSON, and store that data for future use.
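To make the XML-to-JSON-to-flat-JSON idea concrete, here is a rough Python illustration of the shape of the data at each stage. This is not how NiFi does it (inside NiFi the work is done by the processors configured in the flow); it is only a sketch, and the sample event, the "@" and "#text" key conventions, and the dotted key names are my own assumptions.

```python
import json
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on namespaced tags."""
    return tag.split("}", 1)[-1]


def element_to_dict(elem):
    """Convert an element to nested dicts ('@' marks attributes, '#text' marks text)."""
    children = list(elem)
    text = (elem.text or "").strip()
    if not children and not elem.attrib:
        return text
    node = {f"@{key}": value for key, value in elem.attrib.items()}
    if text:
        node["#text"] = text
    for child in children:
        key = local_name(child.tag)
        value = element_to_dict(child)
        if key in node:
            # Repeated child tags are collected into a list.
            if not isinstance(node[key], list):
                node[key] = [node[key]]
            node[key].append(value)
        else:
            node[key] = value
    return node


def flatten(value, prefix=""):
    """Collapse nested dicts and lists into a single level of dotted keys."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}.{key}" if prefix else key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}[{index}]"))
    else:
        flat[prefix] = value
    return flat


# Made-up event, trimmed to a few fields for illustration.
sample_event = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-Security-Auditing"/>
    <EventID>4624</EventID>
    <Channel>Security</Channel>
    <Computer>DESKTOP-EXAMPLE</Computer>
  </System>
  <EventData>
    <Data Name="TargetUserName">alice</Data>
  </EventData>
</Event>"""

nested = element_to_dict(ET.fromstring(sample_event))
flat = flatten(nested)
print(json.dumps(flat, indent=2))
```

The flat form is the one that is easiest to load into a table or search index later, which is why the flow bothers with the extra flattening step.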

Processors are added by dragging the processor icon onto the NiFi canvas. At that point you are prompted to select the processor to be used.



  1. Get set up to consume Windows Event data
    • Apply permissions in Windows to allow programmatic access to the event data channel(s)
      • There are instructions linked on the processor help screen, but to keep as much of this in one place as possible, I've included a brief explanation here
      • Open a new CMD prompt with administrator permissions and use these commands to get info and apply permissions
      • wmic useraccount get name,sid

        (in some cases you may still not have permission to access this data. If so, use this command instead to show the logged-in user's info)

        whoami /user

        ...at least one of these commands will return the info needed - your SID value
      • wevtutil gl <CHANNEL>

        ...where <CHANNEL> is the event channel to which NiFi will listen. This command displays the current permissions for listening to the channel provided. You should see a result something like this: 









      • wevtutil sl Security /ca:O:BAG:SYD:(A;;0xf0005;;;SY)(A;;0x5;;;BA)(A;;0x1;;;S-1-5-32-573)
      • Channels are equivalent to the logs you see in Windows Event Viewer (Application, Security, etc.). For my case, I used the 'Security' channel


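      • wevtutil sl Security /ca:O:BAG:SYD:(A;;0xf0005;;;SY)(A;;0x5;;;BA)(A;;0x1;;;S-1-5-32-573)(A;;0x1;;; <SID

        ...this is the same access string with an extra entry for your SID appended to the end, which grants that account read access to the channel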
    • Develop the XPath Query for the events of interest
      • The easiest way to do this is to open the Windows Event Viewer and select the desired options by using Filter Current Log


      • After making your filter selections, click on the XML tab and copy the XPath query




  2. Add ConsumeWindowsEventLog processor

  3. Add TransformXml processor

  4. Add JoltTransformJSON processor

  5. Add a second JoltTransformJSON processor

  6. Add PutHDFS processor

  7. Add PutFile processor
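For the permissions part of step 1, the access string passed to wevtutil sl is the channel's existing value with one more entry of the form (A;;0x1;;;SID) on the end. Here is a small Python sketch that builds that string and the matching command line. The function names are mine, and the SID in the example is a made-up placeholder; substitute the value returned by whoami /user.

```python
import re

# Loose shape check for a SID such as S-1-5-32-573.
_SID_PATTERN = re.compile(r"^S-1-\d+(-\d+)+$")


def grant_read_access(channel_access: str, sid: str) -> str:
    """Append an allow-read entry for `sid` unless it is already present."""
    if not _SID_PATTERN.match(sid):
        raise ValueError(f"does not look like a SID: {sid!r}")
    entry = f"(A;;0x1;;;{sid})"
    if entry in channel_access:
        return channel_access
    return channel_access + entry


def build_wevtutil_command(channel: str, channel_access: str) -> str:
    """Format the wevtutil command that applies the access string to a channel."""
    return f"wevtutil sl {channel} /ca:{channel_access}"


# Access string taken from the commands in step 1.
current = "O:BAG:SYD:(A;;0xf0005;;;SY)(A;;0x5;;;BA)(A;;0x1;;;S-1-5-32-573)"
# Placeholder SID, for illustration only.
my_sid = "S-1-5-21-1111111111-2222222222-3333333333-1001"

print(build_wevtutil_command("Security", grant_read_access(current, my_sid)))
```

The sketch only prints the command. Run the printed line yourself from the administrator CMD prompt after checking it against the output of wevtutil gl for your channel.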

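For the XPath query from step 1, filtering by event ID in Event Viewer typically yields an expression of the form *[System[(EventID=... or EventID=...)]]. As a sketch, the same expression can be assembled in Python. The function name is hypothetical, and the IDs 4624 and 4625 (successful and failed logons in the Security log) are only example choices.

```python
from typing import Iterable


def build_event_id_xpath(event_ids: Iterable[int]) -> str:
    """Build an XPath filter that matches any of the given event IDs."""
    # dict.fromkeys removes duplicates while keeping the original order.
    ids = list(dict.fromkeys(int(event_id) for event_id in event_ids))
    if not ids:
        return "*"  # no filter: match every event in the channel
    clause = " or ".join(f"EventID={event_id}" for event_id in ids)
    return f"*[System[({clause})]]"


print(build_event_id_xpath([4624, 4625]))
# prints: *[System[(EventID=4624 or EventID=4625)]]
```

Whatever you generate this way, compare it with what the XML tab in Event Viewer shows for the same filter before relying on it.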

Run the NiFi Job


Validate that Data are Captured as Expected



How-to: Set Up the Java JDK on Windows for Development Purposes

Most of the posts I make here will have a Java component. With that in mind, it makes sense to include some information about making sure Java is installed and configured correctly. You can always go directly to the source, but since you are here, I've included the basics.

If you are going to be doing development on a Windows computer, these are the basic steps you need to complete to ease you along:
  1. Download the JDK needed
  2. Install the JDK
  3. Set the JAVA_HOME environment variable
  4. Add the JDK's /bin folder to the PATH environment variable

Download the JDK

Oracle's Java Downloads page has a lot of good information that can help you determine what to do next, but sometimes all of that info is too much. In this case, we are going to install JDK 8.
  1. Navigate to Oracle's Java Downloads page
  2. Click the Java icon for Java Platform (JDK) 8uXXX 
  3. Check the radio button to 'Accept the License Agreement' in the middle of the page
  4. Click on the JDK you want to download (Windows x86 or x64), in this case, jdk-8u131-windows-x64.exe

Install the JDK

  1. Once the .exe installer is downloaded, run it
  2. Unless you have a compelling reason, it is easiest to stick with the default install path
  3. Select the default selections, unless you know why you need to make changes

Set the JAVA_HOME Environment Variable

  1. Get the path to the JDK root folder

  2. Navigate to the environment variables. In Windows 10, right-click the Start icon, select System, and then choose Advanced System Settings



    ~or~

    left-click Start and start typing 'Environment' until Edit the System Environment Variables appears as an option. Select that option


  3. Create and set the environment variable.
    Variable Name: JAVA_HOME
    Variable Value: C:\Program Files\Java\jdk1.8.0_131

  4. Click OK two times to save the changes
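Once the variable is saved, a quick way to confirm it took effect is to check it from a newly opened CMD window (windows that were already open will not see the change). The Python sketch below does that check. The function name is mine, and it simply assumes a JDK keeps its java executable under the bin subfolder.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_java_home_problem(env: Mapping[str, str]) -> Optional[str]:
    """Return a description of what is wrong with JAVA_HOME, or None if it looks fine."""
    java_home = env.get("JAVA_HOME", "").strip()
    if not java_home:
        return "JAVA_HOME is not set"
    # The executable is java.exe on Windows and plain java elsewhere.
    exe_name = "java.exe" if os.name == "nt" else "java"
    java_exe = Path(java_home) / "bin" / exe_name
    if not java_exe.is_file():
        return f"expected to find {java_exe}, but it does not exist"
    return None


problem = find_java_home_problem(os.environ)
print(problem or "JAVA_HOME looks good")
```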

Add JDK /bin folder to the PATH

  1. To add the /bin folder to the PATH variable, return to the same dialog box where you created the JAVA_HOME variable
  2. Select the PATH variable and click Edit...
  3. Add the same path used in the JAVA_HOME variable, but add the subfolder /bin to the end of that path. Your new value should be something like this:

    C:\Program Files\Java\jdk1.8.0_131\bin
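To double-check the PATH edit, the sketch below tests whether the JDK's bin folder appears among the semicolon-separated PATH entries. It is only an illustration: the function name is hypothetical, and the comparison leans on Python's PureWindowsPath so that differences in letter case or a trailing backslash do not cause a false mismatch.

```python
from pathlib import PureWindowsPath


def bin_on_path(java_home: str, path_value: str) -> bool:
    """Return True if <java_home>\\bin is one of the entries in a Windows PATH string."""
    target = PureWindowsPath(java_home) / "bin"
    # PATH entries are separated by semicolons and may be wrapped in quotes.
    entries = [entry.strip().strip('"') for entry in path_value.split(";")]
    return any(PureWindowsPath(entry) == target for entry in entries if entry)


java_home = r"C:\Program Files\Java\jdk1.8.0_131"
example_path = r"C:\Windows\system32;C:\Program Files\Java\jdk1.8.0_131\bin"
print(bin_on_path(java_home, example_path))
```

On your own machine, pass in the real values of the JAVA_HOME and PATH environment variables from a newly opened CMD window.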