This article focuses on collecting event data from the local host and landing the data in both the local Windows file system and HDFS.
Future posts will dive into using MiNiFi for remote hosts, as well as methods to visualize these event data.
My setup:
- Hadoop cluster based on Hortonworks HDP 2.6
- CentOS 7
- Windows 10 PC
- Windows Subsystem for Linux enabled
- MobaXterm
- WinSCP
Following are the basic steps:
- Verify that Java JDK is installed and JAVA_HOME environment variable is set
- Download, install, and configure NiFi on a Windows host
- Develop and configure the NiFi process
- Run the NiFi job
- Validate that data are captured as expected
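The first step can be checked with a short script. This is a minimal sketch, not part of NiFi: the function name check_java is made up for illustration, and it only assumes that a JDK keeps its launcher under the bin directory of JAVA_HOME.

```python
import os
from pathlib import Path


def check_java(env=None):
    """Return a list of problems found with the Java setup (empty if none)."""
    env = os.environ if env is None else env
    problems = []

    java_home = env.get("JAVA_HOME")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif not Path(java_home).is_dir():
        problems.append(f"JAVA_HOME points to a missing directory: {java_home}")
    else:
        # A JDK keeps its launcher under bin/ (java.exe on Windows, java elsewhere).
        bin_dir = Path(java_home) / "bin"
        if not any((bin_dir / name).is_file() for name in ("java.exe", "java")):
            problems.append(f"No java executable found under {bin_dir}")
    return problems


if __name__ == "__main__":
    issues = check_java()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("JAVA_HOME looks usable:", os.environ["JAVA_HOME"])
```

An empty list means the variable is set and points at something that looks like a Java install. It does not confirm the Java version.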
Download, Install, Configure, and Run NiFi on a Windows Host
The "Getting Started Guide" on Apache's website is straightforward - I've abbreviated the portions needed for this use case.
From the Downloads page, select the appropriate version of the binary .zip (for this example I used 1.3.0). If you use a different version, modify the paths used accordingly.
- Extract .zip to your desired location
- The default port for NiFi is 8080
- Since 8080 is a popular port for web-enabled applications, you may want to change the port on which NiFi listens
- The port can be configured in the nifi.properties file:
<install directory>\nifi-1.3.0\conf\nifi.properties
# web properties #
nifi.web.http.port=8080
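If you would rather script the port change than edit the file by hand, something like the following could work. It is a sketch only: set_http_port is a hypothetical helper, and port 9090 is just an example value. It operates on the text of nifi.properties, so reading and writing the file is left to you.

```python
import re


def set_http_port(properties_text, port):
    """Return the properties text with nifi.web.http.port set to the given port."""
    # Match the whole property line, whatever value it currently holds.
    pattern = re.compile(r"^(nifi\.web\.http\.port)=[^\r\n]*", flags=re.MULTILINE)
    new_text, count = pattern.subn(rf"\g<1>={port}", properties_text)
    if count == 0:
        raise ValueError("nifi.web.http.port not found in properties text")
    return new_text


if __name__ == "__main__":
    sample = "# web properties #\nnifi.web.http.host=\nnifi.web.http.port=8080\n"
    # 9090 is an arbitrary example port, not a NiFi recommendation.
    print(set_http_port(sample, 9090))
```

NiFi reads nifi.properties at startup, so restart it after changing the port.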
- Copy the core-site.xml and hdfs-site.xml configuration files to the Windows file system.
- This is only necessary if you are landing the data in HDFS
- In my setup, these files are located in /etc/hadoop/conf/. Depending on your Hadoop implementation, there may be slight variations
- Windows Subsystem for Linux now has support for SCP. Alternatively, you can use MobaXterm to transfer files between Windows and Linux systems
- Use WinSCP to get core-site.xml and hdfs-site.xml to Windows. I placed mine in <install directory>\nifi-1.3.0\conf\ with the rest of the NiFi configuration files. It doesn't matter where you place them, as long as you can navigate to them later
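After copying the files, it can be worth confirming that they arrived intact and point at the cluster you expect. The sketch below reads a property from Hadoop's standard configuration XML layout. The helper name read_hadoop_property and the host name in the sample are made up for illustration; fs.defaultFS is the Hadoop property that names the default file system.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_data, name):
    """Return the value of a named property from Hadoop *-site.xml content, or None."""
    root = ET.fromstring(xml_data)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    # Sample content only; namenode.example.com is a placeholder host.
    sample = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""
    print(read_hadoop_property(sample, "fs.defaultFS"))
```

For a real check, pass in the bytes of the copied core-site.xml instead of the sample string.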
- Start NiFi
- Open a CMD prompt with administrator permissions
- Navigate to <install directory>\nifi-1.3.0\bin\ and execute run-nifi.bat
- You should see a screen like this:
Develop and Configure the NiFi Process
NiFi uses Processors to do work. Creating a combination of Processors provides a powerful way to manage data flows. Windows event log data is presented as XML. This process will take the XML and transform it to JSON, flatten that JSON, and store that data for future use.
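To make the XML-to-JSON-to-flat-JSON idea concrete, here is a rough standalone sketch of the same transformation in Python. It is not what the NiFi processors run internally, and the sample event is a trimmed, made-up record rather than real log output.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed, made-up event for illustration only.
SAMPLE_EVENT = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID>4624</EventID>
    <Channel>Security</Channel>
  </System>
  <EventData>
    <Data Name="TargetUserName">alice</Data>
  </EventData>
</Event>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def element_to_dict(elem):
    """Convert an element into nested dicts; leaf elements become their text."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        # <Data Name="X"> entries are keyed by their Name attribute.
        key = child.attrib.get("Name", local_name(child.tag))
        result[key] = element_to_dict(child)
    return result


def flatten(nested, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


if __name__ == "__main__":
    nested = element_to_dict(ET.fromstring(SAMPLE_EVENT))
    print(json.dumps(flatten(nested), indent=2))
```

This simplification ignores most element attributes, and repeated child names overwrite each other, so treat it as an illustration of the shape of the data rather than a replacement for the flow.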
Processors are added by dragging the processor icon onto the NiFi canvas. At that point you are prompted to select the processor to be used.
- Get set up to consume Windows Event data
- Apply permissions in Windows to allow programmatic access to the event data channel(s)
- There are instructions linked on the processor help screen, but to keep as much of this in one place as possible, I've included a brief explanation here
- Open a new CMD prompt with administrator permissions and use these commands to get info and apply permissions
- wmic useraccount get name,sid
(in some cases you may still not have permission to access this data. If so, use this command instead to show the logged-in user's info)
whoami /user
...at least one of these commands will return the info needed: your SID value
- wevtutil gl <CHANNEL>
...where <CHANNEL> is the event channel to which NiFi will listen. This command displays the current permissions for listening to the channel provided. You should see a result something like this:
- wevtutil sl Security /ca:O:BAG:SYD:(A;;0xf0005;;;SY)(A;;0x5;;;BA)(A;;0x1;;;S-1-5-32-573)
- Channels are equivalent to the logs you see in Windows Event Viewer (Application, Security, etc.). For my case, I used the 'Security' channel
- wevtutil sl Security /ca:O:BAG:SYD:(A;;0xf0005;;;SY)(A;;0x5;;;BA)(A;;0x1;;;<SID>)
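The permission change amounts to appending one access entry for your SID to the channel's existing access string. The sketch below only builds the command text; it does not run wevtutil. The helper names and the SID are hypothetical, and it assumes the current access string ends with its allow entries, as in the example above.

```python
def grant_read_access(channel_access, sid):
    """Append an allow entry with access mask 0x1 for the SID to the access string."""
    ace = f"(A;;0x1;;;{sid})"
    if ace in channel_access:
        return channel_access  # entry already present, nothing to add
    return channel_access + ace


def build_wevtutil_command(channel, channel_access, sid):
    """Build the 'wevtutil sl' command line that applies the updated access string."""
    return f"wevtutil sl {channel} /ca:{grant_read_access(channel_access, sid)}"


if __name__ == "__main__":
    current = "O:BAG:SYD:(A;;0xf0005;;;SY)(A;;0x5;;;BA)(A;;0x1;;;S-1-5-32-573)"
    # Placeholder SID; substitute the value returned by whoami /user.
    my_sid = "S-1-5-21-1111111111-2222222222-3333333333-1001"
    print(build_wevtutil_command("Security", current, my_sid))
```

Building the string this way keeps the existing entries untouched, which matters because wevtutil sl replaces the whole access string rather than adding to it.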
- Develop the XPath Query for the events of interest
- The easiest way to do this is to open the Windows Event Viewer and select the desired options by using Filter Current Log
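For simple filters, the query that Event Viewer produces follows a regular pattern, so it can also be assembled by hand. The sketch below builds a filter on event IDs. The function name is made up, and the IDs shown (4624 for a successful logon, 4625 for a failed logon) are only examples.

```python
def event_id_xpath(event_ids):
    """Build an XPath filter selecting events with any of the given event IDs."""
    if not event_ids:
        return "*"  # no filter: select every event
    clause = " or ".join(f"EventID={int(event_id)}" for event_id in event_ids)
    return f"*[System[({clause})]]"


if __name__ == "__main__":
    print(event_id_xpath([4624, 4625]))
```

Compare the output against what Filter Current Log shows on its XML tab before relying on it, especially for filters that involve more than event IDs.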
- Add ConsumeWindowsEventLog processor
- Add TransformXML processor
- Add JoltTransform processor
- Add a second JoltTransform processor
- Add PutHDFS processor
- Add PutFile processor