Strings!!

See below for what I have tried.

What is the best way to turn this data into simple XML, CSV, or an object with properties?

I want to take files like this, and capture these pieces of data:

A. Fileid
B. Type
C. Reqdate
D. Reqtime
E. Gendate
F. Gentime
G. Progid
H. Desc

Example file 1

FILEID: "JPG139727"
PATH: "/optical/incoming/JPG139727"
TYPE: "JPG"
SECLEV: ""
STATID: "123-12-1212"
USRID: "IDV"
REQDATE: "01/15/2016"
REQTIME: "10:39:51"
GENDATE: "01/15/2016"
GENTIME: "10:39:51"
PROGID: "8219"
GROUPID: "Photo"
DESC: "JOHN DOE"

Example file 2

FILEID: "TXT4553675"
PATH: "/optical/incoming/TXT4553675"
TYPE: "TXT"
SECLEV: "1"
STATID: ""
USRID: "tlr1"
REQDATE: "11/23/2013"
REQTIME: "16:23:15"
GENDATE: "11/23/2013"
GENTIME: "22:30:11"
PROGID: "glvrpt"
GROUPID: "Miscellaneous"
DESC: "G/L Voucher Report"

[pre]

gci .\ -Include *.arf -File -Recurse | ?{$_.psparentpath -like "*temp"} | %{$temp = $_

$temp.FullName.Replace('.arf', '.csv')
$temp.FullName

get-content $_ | export-csv -Path $temp.FullName -Force
#THIS RESULTS IN A SINGLE COLUMN INSTEAD OF TWO

}
#This example does much the same as above

gci .\ -Include *.arf -File -Recurse | ?{$_.psparentpath -like "*temp"} | %{import-csv $_ | export-csv ($_.FullName.Replace('arf','csv')) -NoTypeInformation -Delimiter : -Force}

#FINALLY I tried this and it's closer but I don't understand how to work with the split out results

gci .\ -Include *.arf -File -Recurse | ?{$_.psparentpath -like "*temp"} | %{$array = (get-content $_ -Raw).Split('/(:)/')
$array
}
[/pre]

An easy way to turn this data “structure” into something usable could be something like this:

$RawData = @'
FILEID: "TXT4553675"
PATH: "/optical/incoming/TXT4553675"
TYPE: "TXT"
SECLEV: "1"
STATID: ""
USRID: "tlr1"
REQDATE: "11/23/2013"
REQTIME: "16:23:15"
GENDATE: "11/23/2013"
GENTIME: "22:30:11"
PROGID: "glvrpt"
GROUPID: "Miscellaneous"
DESC: "G/L Voucher Report"
'@
$NewData = $RawData -replace ':', ' =' | ConvertFrom-StringData
$NewData

Now you have a hashtable, and you can access the individual elements by name.
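
One caveat with replacing every colon is that values which contain colons themselves, such as REQTIME, get rewritten too, and the double quotes stay in the values. Here is a minimal sketch of a variant that rewrites only the first colon on each line and strips the quotes. It assumes every line has the NAME: "value" shape shown in the question; `$Sample` and `$Parsed` are names made up for this illustration.

```powershell
# Two sample lines in the same NAME: "value" shape as the question's files.
$Sample = @'
FILEID: "TXT4553675"
REQTIME: "16:23:15"
'@

# (?m) lets ^ match at the start of every line, so only the colon that
# directly follows the field name is turned into '='. Colons inside the
# value (the time) are left alone. The double quotes are removed first.
$Parsed = ($Sample -replace '"', '') -replace '(?m)^(\w+):', '$1=' |
    ConvertFrom-StringData

$Parsed['FILEID']    # access by key
$Parsed.REQTIME      # or with dot notation
```

Keep in mind that ConvertFrom-StringData treats a backslash as an escape character, so this would need more care with Windows-style paths than with the forward-slash paths in the examples.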

#region input

$InFolder = '.\in'
$OutFile = '.\myoutfile1.csv'

#endregion

#region Process

$myOutput = foreach ($File in (Get-ChildItem $InFolder)) {
    
    $Content = Get-Content $File.FullName

    #Parse content
    $myObj = [PSCustomObject]@{}
    foreach ($Line in $Content) {
        $myObj | Add-Member -MemberType NoteProperty -Name (($Line.Split(':')[0]).Trim()) -Value (($Line.Split(':')[1]).Replace('"','').Trim())
    }
    
    $myObj
}

#endregion

#region Output

$myOutput | FT -a # out to console
$myOutput | Export-Csv $OutFile -NoType 

#endregion

Note:

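  • Parsing depends on the ':' colon as the separator between field name and field value. If a line in an input file has no colon, parsing fails for that line. If the value itself contains a colon, it is cut off at that colon, so a time such as 10:39:51 becomes 10.
  • You may end up with objects having different properties if the input files do not have exactly the same field names. Export-Csv takes its columns from the first object, so properties that only appear on later objects are left out of the file.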
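
The colon limitation in the first note can be sidestepped by splitting each line on the first colon only. A minimal sketch, again assuming the NAME: "value" line shape; the `ConvertTo-ArfObject` function name is made up for this example, and lines without a colon are simply skipped rather than reported.

```powershell
function ConvertTo-ArfObject {
    param([string[]]$Line)

    # An ordered dictionary keeps the properties in file order.
    $props = [ordered]@{}
    foreach ($l in $Line) {
        # The ", 2" limit makes -split stop after the first colon,
        # so a value such as 10:39:51 stays in one piece.
        $name, $value = $l -split ':', 2
        if ($null -eq $value) { continue }   # no colon on this line
        $props[$name.Trim()] = $value.Replace('"', '').Trim()
    }
    [PSCustomObject]$props
}

$obj = ConvertTo-ArfObject -Line 'FILEID: "JPG139727"', 'REQTIME: "10:39:51"', ''
$obj.REQTIME
```

With `Get-Content $File.FullName` supplying the lines, a function like this could stand in for the inner foreach loop.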

I typically go for the keep-it-simple approach, but basically you need to parse the data and create a PSObject.

$files = Get-ChildItem C:\Scripts\*.txt

$results = foreach ($file in $files) {
    $content = Get-Content -Path $file.FullName

    $props = @{}
    foreach ($row in $content) {
        $line = ($row -replace '"') -split ':'
        
        $props.Add($line[0].Trim(), $line[1].Trim())
        
    }
    
    New-Object -TypeName PSObject -Property $props
}

$results | 
Select-Object -Property FileId,Type,ReqDate,ReqTime,GenDate,GenTime,ProgId,Desc |
Export-CSV -Path C:\Scripts\my.csv -NoTypeInformation

Once it’s an object, you can filter it, export it or whatever you need to do.

Output:

PS C:\Users\rasim> $results 


USRID   : IDV
TYPE    : JPG
FILEID  : JPG139727
DESC    : JOHN DOE
SECLEV  : 
REQTIME : 10
GENTIME : 10
REQDATE : 01/15/2016
GENDATE : 01/15/2016
GROUPID : Photo
PROGID  : 8219
STATID  : 123-12-1212
PATH    : /optical/incoming/JPG139727

USRID   : IDV
TYPE    : JPG
FILEID  : JPG139727
DESC    : JOHN DOE
SECLEV  : 
REQTIME : 10
GENTIME : 10
REQDATE : 01/15/2016
GENDATE : 01/15/2016
GROUPID : Photo
PROGID  : 8219
STATID  : 123-12-1212
PATH    : /optical/incoming/JPG139727




PS C:\Users\rasim> $results | 
Select-Object -Property FileId,Type,ReqDate,ReqTime,GenDate,GenTime,ProgId,Desc


FILEID  : JPG139727
TYPE    : JPG
REQDATE : 01/15/2016
REQTIME : 10
GENDATE : 01/15/2016
GENTIME : 10
PROGID  : 8219
DESC    : JOHN DOE

FILEID  : JPG139727
TYPE    : JPG
REQDATE : 01/15/2016
REQTIME : 10
GENDATE : 01/15/2016
GENTIME : 10
PROGID  : 8219
DESC    : JOHN DOE

Or, inspired by Olaf's idea above, this can be done as a one-liner:

gci .\in | % { [PSCustomObject]((Get-Content $_.FullName -Raw) -replace ':', ' =' | ConvertFrom-StringData) } | Export-Csv .\myoutfile2.csv -NoType 

Thank you all very much, I was just reading your respective posts about date formatting on another thread!

I will post both results once finished, but I have confirmed this does what I need. I may have a ton of these to process, so the dynamic naming threw me off at first (working with keys, I needed to strip out the "").

 

[pre]

gci .\ -Include *.arf -File -Recurse | ?{$_.psparentpath -like "*temp"} | %{
$RawData = get-content $_ -raw
$NewData = $RawData -replace ':', ' =' | ConvertFrom-StringData
#$NewData.Keys
#$NewData.Values -match '[a-z]{3}\w+'
$fname = $NewData['Fileid'].replace('"','')
pause
$NewData | Export-Clixml -Depth 2 -Path C:\$fname-export.xml -Force -verbose
}
[/pre]
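
For anyone who needs to read those exports back later, Import-Clixml restores the hashtable. A minimal round-trip sketch that writes to the temp folder; the file name and the sample values are placeholders, not taken from a real run.

```powershell
# Stand-in for the hashtable that ConvertFrom-StringData produces.
$data = @{ FILEID = 'JPG139727'; TYPE = 'JPG' }
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'arf-sample-export.xml'

# Write the hashtable out, then read it back in.
$data | Export-Clixml -Path $path -Force
$restored = Import-Clixml -Path $path

$restored['FILEID']

# Clean up the temporary file.
Remove-Item $path
```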