Month of PowerShell: Working with Log Files

In this article we'll look at how we can leverage PowerShell's object-passing pipeline to parse and retrieve data from an IIS web server log file.

Authored by Joshua Wright

#monthofpowershell

When I started working with PowerShell, I immediately wanted to use PowerShell to do things the way I do them in Bash and Zsh shells. I wanted to find the PowerShell versions of [code]cut[/code]/[code]awk[/code]/[code]sed[/code]/[code]tr[/code]/[code]grep[/code]. While PowerShell can do many of the things those tools do, it's not really The PowerShell Way.

For example, let's look at working with an IIS web server log file:

PS C:\Users\Sec504> gci .\u_ex220608.log


    Directory: C:\Users\Sec504


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----          6/9/2022  12:00 AM         416083 u_ex220608.log


PS C:\Users\Sec504> Get-Content .\u_ex220608.log | Select-Object -First 5
#Software: Microsoft Internet Information Services 10.0
#Version: 1.0
#Date: 2022-06-08 16:25:40
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
2022-06-08 16:25:40 ::1 GET / - 80 - ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/102.0.5005.63+Safari/537.36+Edg/102.0.1245.33 - 200 0 0 1008

After the first 4 header lines, each IIS log entry looks similar to other ASCII log files, capturing the date, source, HTTP verb, URI, and more. I want to identify the IP address with the most requests from this log file.

I started to collect information by cutting columns and treating the log file as text:

PS C:\Users\Sec504> Get-Content .\u_ex220608.log  | Select-Object -Skip 4 | ForEach-Object { ($_ -Split(" "))[8] }
...
172.30.48.149
172.30.48.149
172.30.48.149
172.30.48.149

Let's break down this command, piece-by-piece:

  • [code]Get-Content .\u_ex220608.log |[/code]: Retrieve the content of the IIS server log file, start a pipeline
  • [code]Select-Object -Skip 4 |[/code]: Skip the first 4 lines of the log file (these are header lines in IIS log files)
  • [code]ForEach-Object {[/code]: Start a loop where the commands within the [code]{}[/code] block will execute for each line in the log file
  • [code]([/code]: Open a parenthesis so that the enclosed expression is evaluated first
  • [code]$_ -Split(" ")[/code]: Using [code]$_[/code] to refer to the current line of log data, split the line into multiple elements delimited by a space
  • [code])[/code]: Close the parenthesis, ending the grouped expression
  • [code][8][/code]: From the array returned by the [code]()[/code] expression, access the 9th element (index 8, counting from 0), which is the IP address of the client in the log file
  • [code]}[/code]: Close the code block executing in the [code]ForEach-Object[/code] loop
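
To make the indexing concrete, here is a minimal sketch that applies the same split to the single data line shown earlier. The [code]$line[/code] and [code]$parts[/code] variables are names chosen for this illustration, not part of the original command:

```powershell
# The data line from the log excerpt above, stored in a variable for illustration.
$line = '2022-06-08 16:25:40 ::1 GET / - 80 - ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/102.0.5005.63+Safari/537.36+Edg/102.0.1245.33 - 200 0 0 1008'

# Split on single spaces; the result is an array of strings.
$parts = $line -split ' '

$parts.Count   # 15 elements, one per field named in the #Fields header
$parts[8]      # ::1 (the c-ip field)
```

The User-Agent value stays in one piece because IIS writes it with [code]+[/code] in place of spaces.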

This approach to accessing the log file data isn't ideal though, since it only produces a list of IP addresses. As an alternative, let's look at converting the log file into a PowerShell object that allows us to interrogate it using standard PowerShell commands and the pipeline.

First, let's look at the first few header lines in the log file:

PS C:\Users\Sec504> Get-Content .\u_ex220608.log | Select-Object -First 4
#Software: Microsoft Internet Information Services 10.0
#Version: 1.0
#Date: 2022-06-08 16:25:40
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken

The last line shows us the field names for each of the rows of data that follow. We can use this data to create an array of field names:

PS C:\Users\Sec504> $fields = "date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken" -Split " "
PS C:\Users\Sec504>

Here I declared a variable, [code]$fields[/code], and pasted the list of field names within quotation marks to treat it as a string. Using the [code]-Split[/code] operator, PowerShell converts this string to an array, where each field name is a separate element.
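
As an alternative to pasting the names by hand, the array could be derived from the header line itself. This is a rough sketch, not the approach used above; the [code]$header[/code] variable is a stand-in for the [code]#Fields:[/code] line, which in practice might be pulled from the file with [code]Get-Content[/code] and [code]Where-Object[/code]:

```powershell
# The #Fields line from the log header, stored in a variable for illustration.
# Against a real file this could be obtained with something like:
#   Get-Content .\u_ex220608.log | Where-Object { $_ -like '#Fields:*' } | Select-Object -First 1
$header = '#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken'

# Strip the "#Fields: " prefix, then split the remainder on spaces.
$fields = ($header -replace '^#Fields:\s+', '') -split ' '

$fields.Count   # 15
$fields[8]      # c-ip
```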

Once we have the column names in an array, we can convert the log file to an array of custom PowerShell objects using [code]ConvertFrom-String[/code]. We also need to skip the lines that begin with [code]#[/code] as part of our pipeline with [code]Select-String -NotMatch[/code]:

PS C:\Users\Sec504> $weblog = Get-Content .\u_ex220608.log | Select-String -NotMatch "^#" | ConvertFrom-String -PropertyNames $fields
PS C:\Users\Sec504>

Let's break down this command:

  • [code]$weblog =[/code]: Declare a variable [code]$weblog[/code] which will hold the array of custom PowerShell objects
  • [code]Get-Content .\u_ex220608.log |[/code]: Read the log file, start the pipeline
  • [code]Select-String -NotMatch "^#" |[/code]: Skip the lines that begin with [code]#[/code] using the regular expression anchor [code]^[/code] (which means the beginning of the line), followed by [code]#[/code]
  • [code]ConvertFrom-String -PropertyNames $fields[/code]: Convert each remaining line to a custom PowerShell object using the property names defined in the [code]$fields[/code] array
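
As far as I know, [code]ConvertFrom-String[/code] is available in Windows PowerShell 5.1 but not in PowerShell 7. On systems without it, one possible substitute is [code]ConvertFrom-Csv[/code] with a space delimiter. The sketch below is an assumption-laden illustration rather than the method used above: the [code]$lines[/code] array holds invented sample lines standing in for the [code]Get-Content[/code] output.

```powershell
# Field names, as defined earlier in the article.
$fields = 'date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken' -split ' '

# Hypothetical sample lines standing in for Get-Content output (values invented for illustration).
$lines = @(
    '#Version: 1.0'
    '2022-06-08 16:25:40 ::1 GET / - 80 - ::1 - - 200 0 0 1008'
    '2022-06-08 17:54:55 172.30.48.149 GET /status - 80 - 172.30.48.149 - - 200 0 0 12'
)

# Drop comment lines, then parse each remaining line as space-delimited fields.
$weblog = @($lines | Where-Object { $_ -notmatch '^#' } | ConvertFrom-Csv -Delimiter ' ' -Header $fields)

$weblog.Count        # 2
$weblog[1].'c-ip'    # 172.30.48.149
```

One design difference to keep in mind: [code]ConvertFrom-Csv[/code] returns every property as a string, so numeric fields such as [code]time-taken[/code] need a cast before sorting or comparing.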

Now, [code]$weblog[/code] is an array of custom PowerShell objects that we can access using the pipeline and [code]Select-Object[/code]:

PS C:\Users\Sec504> $weblog | Select-object -property c-ip

c-ip
----
::1
::1
172.30.48.149
172.30.48.149
...
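
The same objects can be filtered before selecting properties. The following is a small sketch using a hypothetical stand-in for [code]$weblog[/code] with invented values, showing [code]Where-Object[/code] dropping the loopback entries:

```powershell
# Hypothetical stand-in for $weblog (values invented for illustration).
$weblog = @(
    [pscustomobject]@{ 'c-ip' = '::1';           'cs-uri-stem' = '/';       'sc-status' = 200 }
    [pscustomobject]@{ 'c-ip' = '172.30.48.149'; 'cs-uri-stem' = '/status'; 'sc-status' = 200 }
    [pscustomobject]@{ 'c-ip' = '172.30.48.149'; 'cs-uri-stem' = '/DBL1';   'sc-status' = 404 }
)

# Property names containing a hyphen must be quoted when accessed with dot notation.
$remote = @($weblog |
    Where-Object { $_.'c-ip' -ne '::1' } |
    Select-Object -Property 'c-ip', 'cs-uri-stem', 'sc-status')

$remote.Count   # 2
```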

We can count the requests from each unique IP address in the [code]c-ip[/code] (client IP) property using [code]Group-Object[/code]:

PS C:\Users\Sec504> $weblog | Group-Object c-ip

Count Name                      Group
----- ----                      -----
  635 172.30.48.149             {@{date=6/8/2022 12:00:00 AM; time=17:54:55; s-ip=172.30.48.149; cs-method=GET; cs-u...
  412 172.30.48.1               {@{date=6/8/2022 12:00:00 AM; time=18:53:29; s-ip=172.30.48.149; cs-method=GET; cs-u...
   26 172.30.48.148             {@{date=6/8/2022 12:00:00 AM; time=22:15:42; s-ip=172.30.48.149; cs-method=GET; cs-u...
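
To answer the original question directly (which IP address made the most requests), the grouped output can be sorted by its [code]Count[/code] property. A minimal sketch, using a hypothetical three-record stand-in for [code]$weblog[/code] with invented values:

```powershell
# Hypothetical stand-in for $weblog (values invented for illustration).
$weblog = @(
    [pscustomobject]@{ 'c-ip' = '172.30.48.149' }
    [pscustomobject]@{ 'c-ip' = '172.30.48.1' }
    [pscustomobject]@{ 'c-ip' = '172.30.48.149' }
)

# Group by client IP, sort the groups by size, and keep the largest.
# -NoElement omits the Group column, which is not needed for a simple count.
$top = $weblog |
    Group-Object -Property 'c-ip' -NoElement |
    Sort-Object -Property Count -Descending |
    Select-Object -First 1 -Property Name, Count

$top.Name    # 172.30.48.149
$top.Count   # 2
```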

We can build complex queries, such as identifying the top 5 endpoint URIs that took the longest to process:

PS C:\Users\Sec504> $weblog | Sort-Object -Property time-taken -Descending | Select-Object -First 5 -Property cs-uri-stem

        cs-uri-stem
        -----------
                  /
/dashboard/overview
               /DBL1
               /DBL2
             /status


PS C:\Users\Sec504>
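
One caveat when sorting: if [code]time-taken[/code] was parsed as a string (for example, when the objects were built with [code]ConvertFrom-Csv[/code]), [code]Sort-Object[/code] compares the values as text rather than as numbers. A sketch of one way to guard against that, using a script block that casts to [code][int][/code]; the sample records are invented for illustration:

```powershell
# Hypothetical stand-in for $weblog with time-taken stored as strings.
$weblog = @(
    [pscustomobject]@{ 'cs-uri-stem' = '/';       'time-taken' = '1008' }
    [pscustomobject]@{ 'cs-uri-stem' = '/status'; 'time-taken' = '95' }
    [pscustomobject]@{ 'cs-uri-stem' = '/DBL1';   'time-taken' = '20000' }
)

# Cast inside a script block so the sort is numeric regardless of the stored type.
$slowest = $weblog |
    Sort-Object -Property { [int]$_.'time-taken' } -Descending |
    Select-Object -First 2 -Property 'cs-uri-stem', 'time-taken'

$slowest[0].'cs-uri-stem'   # /DBL1
$slowest[1].'cs-uri-stem'   # /
```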

The ability to convert text into custom PowerShell objects is really powerful, since it allows us to leverage all of the other PowerShell commands to interrogate and process the data. For the first time, I'm seeing something in PowerShell that makes me think I could make the switch away from Unix text processing tools.

-Joshua Wright



Joshua Wright is the author of SANS SEC504: Hacker Tools, Techniques, and Incident Handling, a faculty fellow for the SANS Institute, and a senior technical director at Counter Hack.
