Read text file and split line by line into hash table

by OliAdams at 2013-02-13 12:04:40

I have the information below. I am trying to split it at the : and keep part one as a header and part two as the value which I have done with substring.

I cannot figure out how to run through it line by line and save it into a hashtable to use later. Any help would be appreciated. These entries can appear multiple times in the text file. The Data will end up in SQL but I think I need to create an ID for each block to keep it all together. Sorry for the long post.

JOB ID: {2392EB74-0A2C-4A52-A19B-ED8B63A4359A}
JOB NAME: Backup 00002
JOB STATUS: 19 (Completed success)
SERVER NAME: WIN-F60V0SA33SA
DEVICE NAME: Backup-to-Disk Folder 0001
START TIME: 02/09/2013 14:48:34
ELAPSED TIME: 00:00:32
BYTES PROCESSED: 1,636
ESTIMATED BYTES: 1,636
LOGFILE: C:\Program Files\Symantec\Backup Exec\Data\BEX_WIN-F60V0SA33SA_00001.xml
DIRECTORIES PROCESSED: 2
FILES PROCESSED: 8
FILES SKIPPED: 0
FILES CORRUPT: 0
FILES IN-USE: 0
by JeffH at 2013-02-13 12:12:47
What have you tried so far? The thing about a hash table is that the keys need to be unique which is going to be problematic.
by JeffH at 2013-02-13 12:22:03
Assuming $data are the lines of text, here’s one way to parse it out.


$data | foreach {
    #split each line at the first :
    $split = $_.Split(":",2)
    #trim off extra spaces
    $name=$split[0].Trim()
    $value=$split[1].Trim()
    $name
    $value
}
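
The limit of 2 in the Split call matters because several values in the sample (START TIME, ELAPSED TIME, LOGFILE) contain colons themselves. A quick check on one line from the sample, as a sketch only (the variable names are illustrative):

```powershell
# One line from the sample output; the value itself contains colons.
$line = 'START TIME: 02/09/2013 14:48:34'

# Limiting the split to 2 parts keeps everything after the first colon together.
$split = $line.Split(':', 2)
$name  = $split[0].Trim()
$value = $split[1].Trim()

# Without the limit, the time of day is broken into extra pieces.
$unlimited = $line.Split(':')

"$name = $value"
"Pieces without a limit: $($unlimited.Count)"
```

With the limit, $value holds the whole date and time; without it, the line falls apart into four pieces.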


But it seems like you will have this same block of text multiple times in the same file, one block for each job ID. Is that a fair interpretation?
by OliAdams at 2013-02-13 12:26:42
A perfect understanding, that is the way the data is pulled out of backup exec. I could try to modify the earlier part of the script to create an individual text file for each backupjob ID instead if that would make things easier.
by JeffH at 2013-02-13 12:37:58
If each job had its own file that would definitely make it easier because you could parse the file and turn it into a PowerShell object. Otherwise you have to parse the larger file and group the correct number of lines together. You say you are eventually going to write data to SQL. What did you have in mind before then? It might make a difference in how you reformat the data.
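
A minimal sketch of grouping the lines of one large file into per-job blocks, assuming every block starts with a JOB ID: line (an assumption about the file layout, not something bemcmd guarantees). Using one hash table per block also sidesteps the unique-key problem. The names $jobs and $current are illustrative.

```powershell
# Sample input: two shortened job blocks back to back.
$data = @(
    'JOB ID: {2392EB74-0A2C-4A52-A19B-ED8B63A4359A}'
    'JOB NAME: Backup 00002'
    'FILES IN-USE: 0'
    'JOB ID: {AE4EDCDC-F1E9-447B-8F0A-411E3D8C5C43}'
    'JOB NAME: Backup 00002'
    'FILES IN-USE: 0'
)

$jobs    = New-Object System.Collections.ArrayList   # one hash table per job
$current = $null

foreach ($line in $data) {
    # Skip blank lines and anything without a name/value separator.
    if ($line -notmatch ':') { continue }

    $split = $line.Split(':', 2)
    $name  = $split[0].Trim()
    $value = $split[1].Trim()

    if ($name -eq 'JOB ID') {
        # Assumption: a JOB ID line marks the start of a new block.
        $current = @{}
        [void]$jobs.Add($current)
    }

    if ($null -ne $current) { $current[$name] = $value }
}

$jobs.Count
$jobs[1]['JOB ID']
```

Each element of $jobs can then be handled on its own, for example turned into an object or written out as one row.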
by OliAdams at 2013-02-13 12:47:02
I think it might be easier to take you through the process in case I am doing something stupid earlier on, which could help with this part.

These are the commands I have to run in BEMCMD to get the data out.

C:\Program Files\Symantec\Backup Exec>bemcmd -o12 -i

JOB ID: {6EC746DF-97E9-4C6D-A125-654561774D92}

JOB ID: {6EC746DF-97E9-4C6D-A125-654561774D92}

JOB ID: {8E170CB8-CBC0-496D-84BD-2CF12275646A}

JOB ID: {8E170CB8-CBC0-496D-84BD-2CF12275646A}

NUMBER OF OBJECTS: 4
RETURN VALUE: 1

C:\Program Files\Symantec\Backup Exec>bemcmd -o21 -i{6EC746DF-97E9-4C6D-A125-65
561774D92}

JOB ID: {2392EB74-0A2C-4A52-A19B-ED8B63A4359A}
JOB NAME: Backup 00002
JOB ACTUAL START TIME: 02/09/13 14:48:34

JOB ID: {AE4EDCDC-F1E9-447B-8F0A-411E3D8C5C43}
JOB NAME: Backup 00002
JOB ACTUAL START TIME: 02/09/13 14:49:44

NUMBER OF OBJECTS: 2
RETURN VALUE: 1

C:\Program Files\Symantec\Backup Exec>bemcmd -o21 -hi:{2392EB74-0A2C-4A52-A19B-
D8B63A4359A}

JOB ID: {2392EB74-0A2C-4A52-A19B-ED8B63A4359A}
JOB NAME: Backup 00002
JOB STATUS: 19 (Completed success)
SERVER NAME: WIN-F60V0SA33SA
DEVICE NAME: Backup-to-Disk Folder 0001
START TIME: 02/09/2013 14:48:34
ELAPSED TIME: 00:00:32
BYTES PROCESSED: 1,636
ESTIMATED BYTES: 1,636
LOGFILE: C:\Program Files\Symantec\Backup Exec\Data\BEX_WIN-F60V0
A33SA_00001.xml
DIRECTORIES PROCESSED: 2
FILES PROCESSED: 8
FILES SKIPPED: 0
FILES CORRUPT: 0
FILES IN-USE: 0
RETURN VALUE: 1


So first I get a list of the job IDs, then run each job ID to get a list of backup IDs. Then I run each backup ID to get the actual job information. I have also been struggling to store the data at each stage, so I have been using a text file as temp storage. The PowerShell code is below. Please be gentle, I am quite new to this.

$JobType = '-o12'
$ID = '-i'
$Command = "C:\program files\Symantec\Backup Exec\BEMCMD"
$Jobs = & $Command $JobType $ID
$jobType = '-o21'
$Query = '-i'

Set-Content 'c:\Scripts\Test.txt' $Jobs

[array]$arrIDS = Get-Content "C:\Scripts\Test.txt" | Sort -Unique
Clear-Content "C:\Scripts\Test.txt"

ForEach ($ID in $arrIDS) {
    If ($ID -match 'JOB ID:') {
        $ID = $ID -Replace('JOB ID: ')
        $ID = $ID.Trim()
        $ID = "-i$ID"

        $ArrJobs = & $command $JobType $ID

        Add-Content 'C:\Scripts\Test.txt' $ArrJobs
    }
}

[array]$arrJobs = Get-Content "C:\Scripts\Test.txt" | Sort -Unique

Clear-Content "C:\Scripts\Test.txt"

ForEach ($ID in $arrJOBS) {
    If ($ID -match 'JOB ID:') {
        $ID = $ID -Replace('JOB ID: ')

        $ID = "-hi:$ID"

        $BackupInfo = & $Command $JobType $ID

        Add-Content 'C:\Scripts\Test.txt' $BackupInfo
    }
}


I understand this is a lot to ask and really do appreciate any feedback.
by JeffH at 2013-02-13 13:58:17
Does the bemcmd tool allow you to format the output in any way such as CSV or XML? That would help tremendously. Otherwise, you are stuck with some gnarly text parsing. There’s just no easy way to do it.
by OliAdams at 2013-02-13 14:09:54
There is no way, I'm afraid. The latest version of Backup Exec has a PowerShell module, but only 1 of the 6 areas I support currently has that. Could I extract the ID number from the first line, create a new text file named ID.txt, store 17 lines of the file in it, and repeat that until the file is empty?
by mjolinor at 2013-02-13 14:16:12
Does this help any?
I believe it should split your file into an array of text blocks, one per job.

$file = <path to file>

$regex = '(?ms)^JOB ID: .+?^FILES IN-USE: \d+\s*?'

$text = [IO.File]::ReadAllText($file)

$Jobs = [regex]::matches($text,$regex) |
foreach {$_.groups[0].value}

EDIT: That method call should have been ReadAllText, not ReadAllLines. -Fixed.

And looking at that data, it appears that you should be able to do a -replace to replace the first colon with = and then run it through ConvertFrom-StringData.
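
A rough sketch of that idea, with one caveat: ConvertFrom-StringData treats the backslash as an escape character, so paths such as the LOGFILE value need their backslashes doubled first. The block below is shortened sample data, and the LOGFILE path in it is a made-up example.

```powershell
# A shortened job block (sample data; the LOGFILE path is hypothetical).
$block = @'
JOB NAME: Backup 00002
START TIME: 02/09/2013 14:48:34
LOGFILE: C:\Temp\example.xml
'@

$lines = $block -split "`r?`n" | ForEach-Object {
    # Double every backslash so ConvertFrom-StringData does not read it as an escape.
    $escaped = $_ -replace '\\', '\\'
    # Swap only the first colon on the line for '='.
    $escaped -replace '^([^:]+):', '$1='
}

# The result is a hash table with one key per line.
$job = ConvertFrom-StringData -StringData ($lines -join "`n")

$job['JOB NAME']
$job['START TIME']
$job['LOGFILE']
```

Anchoring the replacement at the start of the line keeps the colons inside the time and the drive letter untouched.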
by OliAdams at 2013-02-13 14:30:08
I have not used regular expressions before, so I will have to go and read up on what this part does. I thought the ? means a replacement for any one character, but I have not seen the ^ symbol before. Thanks a lot for all the help, I will report back on my progress :smiley:
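
For reference: in wildcard patterns ? stands for any one character, but in a regular expression a ? after a quantifier makes it lazy (match as little as possible), and ^ anchors the match to the start of a line when the m option is on. A small sketch against two shortened sample blocks; the trailing \s*? from the pattern above is left out here for brevity.

```powershell
# Two shortened job blocks in one string (sample data, trimmed for brevity).
$text = @'
JOB ID: {2392EB74-0A2C-4A52-A19B-ED8B63A4359A}
JOB NAME: Backup 00002
FILES IN-USE: 0
JOB ID: {AE4EDCDC-F1E9-447B-8F0A-411E3D8C5C43}
JOB NAME: Backup 00002
FILES IN-USE: 0
'@

# (?ms)  m: ^ matches at the start of every line; s: . also matches newlines
# .+?    lazy: take as few characters as possible, so each match stops at the
#        first FILES IN-USE line instead of running on to the last one
$regex = '(?ms)^JOB ID: .+?^FILES IN-USE: \d+'

$blocks = @([regex]::Matches($text, $regex) | ForEach-Object { $_.Value })
$blocks.Count
```

Dropping the ? from .+? would make the match greedy and return a single block that spans both jobs.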
by mjolinor at 2013-02-13 14:31:53
Note I had to edit that script. If you don’t get any results in $Jobs, re-copy it and try it again.
by mjolinor at 2013-02-13 16:48:55
And, not knowing when to leave well enough alone:

$file = 'c:\Scripts\test.txt'

$File_regex = '(?ms)^JOB ID: .+?^FILES IN-USE: \d+\s*?'

$text = [IO.File]::ReadAllText($file)

$Jobs = [regex]::matches($text,$file_regex) |
  foreach {
    $_.groups[0].value
  }

$job_regex = '^([^:]+?):(.+)\s*?'

$Job_Hashes =
  foreach ($job in $jobs)
  {
    $job_ht = @{}

    $job.split("`n") |
      foreach {
        $_ -match $job_regex > $null
        $job_ht[$matches[1].trim()] = $matches[2].trim()
      }

    $Job_ht
  }

$job_hashes

Note - if you have V3, you can trade that [IO.File]::ReadAllText for Get-Content -Raw, and you might want to change the $job_ht = @{} line to be an ordered hash table ( $job_ht = [ordered]@{} ).
by OliAdams at 2013-02-14 00:02:54
I must be missing something because I cannot get any output. Am I correct in thinking that if I put Write-Output $Job before $job_ht = @{} it should display some output? I cannot get any output at all. Also, is $job_regex used? It gets declared but doesn't seem to be used anywhere.
by mjolinor at 2013-02-14 03:19:23
The missing $job_regex has been fixed. I wrote that based on the block of lines in your first post, and tested it against a file that contained that block of lines repeated a few times. If your data doesn't really look like that, then it may not match, and the regex will need to be adjusted.
by OliAdams at 2013-02-14 05:04:47
I haven't used regex much before and don't like putting anything in a script I don't understand, if that makes any sense. I am about to start reading the section of Learn PowerShell 3 in a Month of Lunches I just got which is related to regular expressions. I thought I would have another go whilst I waited for your reply. Not implying I had to wait a long time, of course; your reply was very fast :D. I came up with the below and IT WORKS, and it solves the other problem I had where, if I just export it all as is, there is no way of identifying which parts should be with what.

This is a very rough draft as I have not had time to clean it yet; I just got it working.

$file = 'c:\Scripts\test.txt'

$Test = Get-Content $File -TotalCount -1 -Delimiter 'RETURN VALUE: 1'

$Job_ht = @{}

ForEach ($Object in $Test) {

    If ($object.IndexOf(':') -lt 1){Continue}

    $FileName = ($Object.Substring($Object.IndexOf(':') + 1, 56)).TrimStart()
    $FileName = $FileName -Replace ('{')
    $FileName = $FileName -Replace ('}')
    $Filename = $Filename.Substring(0,$Object.IndexOf('-')-1)

    $Object = $Object.Replace(': ',' =')

    $Object.Split("`n") | ForEach{

        If ($_ -match('JOB ID:')) {$ID = ($_.Substring($_.IndexOf('{'),($_.IndexOf('}'))))}

        If ($_.IndexOf('=') -lt 1) {} Else {

            $Split = $_.Split('=',2)

            $Name = $Split[0].Trim()
            $Value = $Split[1].Trim()

            $obj = New-Object System.Object
            $Obj | Add-Member -MemberType NoteProperty -Name Name -Value $Name
            $Obj | Add-Member -MemberType NoteProperty -Name Value -Value $Value
            $Obj | Add-Member -MemberType NoteProperty -Name ID -Value $ID

            $obj | Export-csv "C:\NewBackupInfo.csv" -Append -NoTypeInformation


            Get-Variable Name | Export-Csv -Path 'C:\Scripts\Data.csv' -Append -NoTypeInformation -NoClobber

        }
    }
}


Thanks to all. Without your ideas and assistance I could never have done this. Now I'm off to do the same for Backup Exec 2012; luckily that has PowerShell support.