Line by Line Data Parsing - Retrieving minimal data for csv export

I have extracted the SMBServer Audit log entries for a number of servers and saved those to a text file. The structure of the saved text file looks like this.

--------------------------- SERV01---------------------------


TimeCreated : 4/11/2022 12:41:46 PM
Message     : SMB1 access
              
              Client Address: XXX.XXX.XXX.XXX
              
              Guidance:
              
              This event indicates that a client attempted to access the server using SMB1. To stop auditing SMB1 access, use the Windows PowerShell cmdlet 
              Set-SmbServerConfiguration.



--------------------------- SERV02---------------------------
04/22/2022 15:08:28 - SERV02, Received Error: No events were found that match the specified selection criteria.
--------------------------- SERV03---------------------------

I’m now trying to parse these lines to gather all the access dates and sources for each server that has SMB1 access events.

I’m having trouble finding the best way to keep the data together. I’ve tried using switch, but I couldn’t figure out how to ensure the date/source data stayed with the server name. I’m having a tough time figuring out the logic of retrieving each line from the file and then keeping the right data associated with the right server.

Here is my current (unfinished) attempt at parsing the data.

$Results = Get-Content -Path '.\SMBServerAudit.txt'

$ServerData = @()
$Server = [PSCustomObject]@{
    Name = 'UNKNOWN'
    SMB1AccessDates = 'UNKNOWN'
    SMB1AccessSources = 'UNKNOWN'
}
Write-Host $Results.Count
$Results | ForEach-Object -Process {
    if ($_ -like '*--*') {
        $temp = $_.Trim('-')
        $temp = $temp.trim()
        $Server.Name = $temp
    }
    elseif ($_ -like 'TimeCreated*') {
        $temp = $_.SubString(14,9)
        $server.SMB1AccessDates = $temp
    }
    elseif ($_ -like '*Client*') {
        $temp = $_.SubString(30)
        $Server.SMB1AccessSources = ''
        $Server.SMB1AccessSources = $Server.SMB1AccessSources + ",$($temp)"
    }
    #save to the global serverdata
    if (($Server.Name -ne 'UNKNOWN') -and ($Server.SMB1AccessDates -ne 'UNKNOWN') -and ($Server.SMB1AccessSources -ne 'UNKNOWN')) {
        $ServerData += $Server
        $Server.Name = 'UNKNOWN'
        $Server.SMB1AccessDates = 'UNKNOWN'
        $Server.SMB1AccessSources = 'UNKNOWN'
    }
    else {
        #Do nothing with the data
    }
}

Ultimately what I want to accomplish is to export minimal data to a csv file in the format of:
ServerName,AccessDate,AccessDate(etc),AccessSource,AccessSource(etc)
I filtered the eventlog using a max of 10 events. So, there could be up to 10 dates and 10 sources for each server. Of course there could be none (and there are a few with none).

I’m looking for suggestions on how to approach this as I can’t seem to think out the logic I need for this.

My first suggestion would be not to use plain text files to collect structured data. :wink: If you have the chance you should try to get the desired data in a standard format like CSV, XML, YAML, JSON or any other standardized format.

If that’s not possible it will be hard to get reasonable results from an inconsistent data source. In the sample data you shared there are actually already 2 different formats. One for SERV01 and another for SERV02.
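If the text file really is all you have, here is a rough sketch of how the state could be carried from line to line with switch -Regex. It assumes the banner, TimeCreated, and Client Address lines always look like the sample above. Convert-SmbAuditText and the output property names are made up for illustration. It emits one object per event, so the server name always travels with its date and source, and servers without events simply produce no rows.

```powershell
# Sketch only: Convert-SmbAuditText is a hypothetical helper name.
function Convert-SmbAuditText {
    param([string[]]$Line)

    $server = $null   # server named by the most recent banner line
    $time   = $null   # TimeCreated value still waiting for its Client Address

    switch -Regex ($Line) {
        # Banner such as '--------------------------- SERV01---------------------------'
        '^-{3,}\s*(\S+?)\s*-{3,}\s*$' {
            $server = $Matches[1]
            $time   = $null
            continue
        }
        # 'TimeCreated : 4/11/2022 12:41:46 PM'
        '^TimeCreated\s*:\s*(.+?)\s*$' {
            $time = $Matches[1]
            continue
        }
        # '              Client Address: XXX.XXX.XXX.XXX'
        '^\s*Client Address:\s*(\S+)' {
            if ($server -and $time) {
                # One flat record per event; the server name is copied into each one.
                [PSCustomObject]@{
                    ServerName   = $server
                    AccessDate   = $time
                    AccessSource = $Matches[1]
                }
            }
            $time = $null
            continue
        }
    }
}

# Possible usage (the output path is an assumption):
# Convert-SmbAuditText -Line (Get-Content -Path '.\SMBServerAudit.txt') |
#     Export-Csv -Path '.\SMB1Access.csv' -NoTypeInformation
```

Because a new object is created for every event, nothing is shared between records. The "Received Error" lines match none of the patterns and are skipped.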

A good overview of PowerShell's plain text parsing capabilities was presented by Tobias Weltner some years ago. It is worth taking the time to watch the video to the end.


I’m using Get-WinEvent to retrieve the data remotely. It gives me only two fields with the information I’m interested in: the time the event was created and the message, from which I only want the source. I’m retrieving a maximum of 10 events for each server.

Thank you. That is a great video. I’ll have a re-think on extracting the data. Maybe I can figure out a way to parse the information at extraction - now that I’ve got some more ideas.

Thanks again.

You should take a step back and start by showing your event collection. That should be modified to output structured data.

Also, the way you show your expected CSV is problematic. You show a line containing a server, two access dates, and two access sources. What if there are three or even more for a server? And for servers that have none or only one, the rows end up with different numbers of columns, so the CSV is broken.
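To illustrate both points, here is a minimal sketch of one row per event instead of one row per server. Every row then has the same three columns no matter how many events a server has. ConvertTo-Smb1AccessRow, its parameters, and the column names are hypothetical. The helper only assumes each event object has TimeCreated and Message properties, as described earlier in the thread.

```powershell
# Sketch only: ConvertTo-Smb1AccessRow is a hypothetical helper name.
function ConvertTo-Smb1AccessRow {
    param(
        [string]$ServerName,
        [object[]]$EventRecord   # objects exposing TimeCreated and Message
    )

    foreach ($e in $EventRecord) {
        # Pull the address out of the 'Client Address: ...' line of the message text.
        $source = if ($e.Message -match 'Client Address:\s*(\S+)') { $Matches[1] } else { $null }

        [PSCustomObject]@{
            ServerName   = $ServerName
            AccessDate   = $e.TimeCreated
            AccessSource = $source
        }
    }
}

# Assumed usage: $servers and the log name are NOT from this thread, adjust them to your collection.
# $rows = foreach ($s in $servers) {
#     $events = Get-WinEvent -ComputerName $s -LogName 'Microsoft-Windows-SMBServer/Audit' -MaxEvents 10 -ErrorAction SilentlyContinue
#     ConvertTo-Smb1AccessRow -ServerName $s -EventRecord $events
# }
# $rows | Export-Csv -Path '.\SMB1Access.csv' -NoTypeInformation
```

A server without events contributes no rows. If those servers need to appear in the file as well, one option would be to emit a single row with empty AccessDate and AccessSource for them.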


I have re-worked the data collection and have the script getting the minimal information I want from the audit logs. I’m now outputting it to a file in csv format.

Thanks Everyone.

Great. Good for you. :+1: :clap:

You may share your solution to help others looking for the same or a similar solution. :wink: