ncrooks
February 22, 2019, 9:01am
1
See below for what I have tried.
What is the best way to turn this data into simple XML, CSV, or an object with properties?
I want to take files like this, and capture these pieces of data:
A. Fileid B. Type C. Reqdate D. Reqtime E. Gendate F. Gentime G. Progid H. Desc
Example File 1
FILEID: "JPG139727"
PATH: "/optical/incoming/JPG139727"
TYPE: "JPG"
SECLEV: ""
STATID: "123-12-1212"
USRID: "IDV"
REQDATE: "01/15/2016"
REQTIME: "10:39:51"
GENDATE: "01/15/2016"
GENTIME: "10:39:51"
PROGID: "8219"
GROUPID: "Photo"
DESC: "JOHN DOE"
Example file 2
FILEID: "TXT4553675"
PATH: "/optical/incoming/TXT4553675"
TYPE: "TXT"
SECLEV: "1"
STATID: ""
USRID: "tlr1"
REQDATE: "11/23/2013"
REQTIME: "16:23:15"
GENDATE: "11/23/2013"
GENTIME: "22:30:11"
PROGID: "glvrpt"
GROUPID: "Miscellaneous"
DESC: "G/L Voucher Report"
[pre]
gci .\ -Include *.arf -File -Recurse | ?{$_.psparentpath -like "*temp"} | %{$temp = $_
$temp.FullName.Replace('.arf', '.csv')
$temp.FullName
get-content $_ | export-csv -Path $temp.FullName -Force
#THIS RESULTS IN A SINGLE COLUMN INSTEAD OF TWO
}
#This example does much the same as above
gci .\ -Include *.arf -File -Recurse | ?{$_.psparentpath -like "*temp"} | %{import-csv $_ | export-csv ($_.FullName.Replace('arf','csv')) -NoTypeInformation -Delimiter : -Force}
#FINALLY I tried this and it's closer but I don't understand how to work with the split out results
gci .\ -Include *.arf -File -Recurse | ?{$_.psparentpath -like "*temp"} | %{$array = (get-content $_ -Raw).Split('/(:)/')
$array
}
[/pre]
Olaf
February 22, 2019, 9:40am
2
An easy way to turn this data “structure” into something usable could be something like this:
$RawData = @'
FILEID: "TXT4553675"
PATH: "/optical/incoming/TXT4553675"
TYPE: "TXT"
SECLEV: "1"
STATID: ""
USRID: "tlr1"
REQDATE: "11/23/2013"
REQTIME: "16:23:15"
GENDATE: "11/23/2013"
GENTIME: "22:30:11"
PROGID: "glvrpt"
GROUPID: "Miscellaneous"
DESC: "G/L Voucher Report"
'@
# Replace only the first colon on each line, so values such as "16:23:15" keep theirs
$NewData = $RawData -replace '(?m)^([^:\r\n]+):', '$1 =' | ConvertFrom-StringData
$NewData
Now you have a hashtable and can access the individual elements by name, for example $NewData.FILEID. Note that the values keep their surrounding double quotes.
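To make the hashtable access a bit more concrete, here is a small sketch. The variable names `$Sample`, `$Record` and `$Clean` are only illustrative, the sample text is shortened to two fields, and replacing only the first colon on each line is an assumption made here so that values containing colons (such as times) survive:

```powershell
# Illustrative sample: two fields in the same NAME: "value" layout as the files above.
$Sample = @'
FILEID: "TXT4553675"
REQTIME: "16:23:15"
'@

# Turn the first colon on each line into '=' and let ConvertFrom-StringData build a hashtable.
$Record = $Sample -replace '(?m)^([^:\r\n]+):', '$1=' | ConvertFrom-StringData

# Access a single element by name; the double quotes are still part of the value.
$Record['FILEID']            # "TXT4553675"
$Record.REQTIME.Trim('"')    # 16:23:15

# Build an object with the quotes stripped from every value.
$Clean = @{}
foreach ($Key in $Record.Keys) { $Clean[$Key] = $Record[$Key].Trim('"') }
[PSCustomObject]$Clean
```

Property order on the resulting object follows the hashtable, so it will not necessarily match the order in the file.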
#region input
$InFolder = '.\in'
$OutFile = '.\myoutfile1.csv'
#endregion
#region Process
$myOutput = foreach ($File in (Get-ChildItem $InFolder)) {
    $Content = Get-Content $File.FullName
    #Parse content
    $myObj = [PSCustomObject]@{}
    foreach ($Line in $Content) {
        # Split on the first colon only, so values such as 10:39:51 stay intact
        $Name, $Value = $Line -split ':', 2
        if ($null -eq $Value) { continue } # skip lines without a colon
        $myObj | Add-Member -MemberType NoteProperty -Name ($Name.Trim()) -Value ($Value.Replace('"','').Trim())
    }
    $myObj
}
#endregion
#region Output
$myOutput | Format-Table -AutoSize # out to console
$myOutput | Export-Csv $OutFile -NoTypeInformation
#endregion
Note:
Parsing depends on the colon (':') as the separator between field name and field value. Only the first colon on a line is treated as the separator, so values that contain colons (such as times) are preserved; lines without a colon are skipped.
You may end up with objects that have different properties if the input files do not all use exactly the same field names. Export-Csv takes its columns from the first object, so properties that appear only in later files are left out of the CSV.
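The colon caveat is easy to see in isolation. This is only a sketch: the helper name `ConvertFrom-FieldLine` is hypothetical, and it assumes the NAME: "value" layout of the sample files. It splits on the first colon only and returns nothing for a line that has no colon:

```powershell
# Hypothetical helper: split one 'NAME: "value"' line into a name/value pair.
function ConvertFrom-FieldLine {
    param([string]$Line)

    # Limit the split to 2 parts so colons inside the value are preserved.
    $Parts = $Line -split ':', 2
    if ($Parts.Count -lt 2) { return }   # no colon: nothing to parse

    [PSCustomObject]@{
        Name  = $Parts[0].Trim()
        Value = $Parts[1].Replace('"', '').Trim()
    }
}

ConvertFrom-FieldLine 'REQTIME: "10:39:51"'   # Name = REQTIME, Value = 10:39:51
ConvertFrom-FieldLine 'no separator here'     # returns nothing
```

By comparison, `$Line.Split(':')[1]` on the same REQTIME line would give only the hour, because everything after the second colon ends up in later array elements.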
I typically go for the keep-it-simple approach: parse the data and create a PSObject.
$files = Get-ChildItem C:\Scripts\*.txt
$results = foreach ($file in $files) {
    # Read the current file ($file), not the whole collection ($files)
    $content = Get-Content -Path $file.FullName
    $props = @{}
    foreach ($row in $content) {
        # Split on the first colon only so times such as 10:39:51 are kept whole
        $line = ($row -replace '"') -split ':', 2
        if ($line.Count -eq 2) {
            $props.Add($line[0].Trim(), $line[1].Trim())
        }
    }
    New-Object -TypeName PSObject -Property $props
}
$results |
    Select-Object -Property FileId,Type,ReqDate,ReqTime,GenDate,GenTime,ProgId,Desc |
    Export-CSV -Path C:\Scripts\my.csv -NoTypeInformation
Once it's an object, you can filter it, export it, or do whatever else you need.
Output:
PS C:\Users\rasim> $results
USRID : IDV
TYPE : JPG
FILEID : JPG139727
DESC : JOHN DOE
SECLEV :
REQTIME : 10:39:51
GENTIME : 10:39:51
REQDATE : 01/15/2016
GENDATE : 01/15/2016
GROUPID : Photo
PROGID : 8219
STATID : 123-12-1212
PATH : /optical/incoming/JPG139727
USRID : tlr1
TYPE : TXT
FILEID : TXT4553675
DESC : G/L Voucher Report
SECLEV : 1
REQTIME : 16:23:15
GENTIME : 22:30:11
REQDATE : 11/23/2013
GENDATE : 11/23/2013
GROUPID : Miscellaneous
PROGID : glvrpt
STATID :
PATH : /optical/incoming/TXT4553675
PS C:\Users\rasim> $results |
Select-Object -Property FileId,Type,ReqDate,ReqTime,GenDate,GenTime,ProgId,Desc
FILEID : JPG139727
TYPE : JPG
REQDATE : 01/15/2016
REQTIME : 10:39:51
GENDATE : 01/15/2016
GENTIME : 10:39:51
PROGID : 8219
DESC : JOHN DOE
FILEID : TXT4553675
TYPE : TXT
REQDATE : 11/23/2013
REQTIME : 16:23:15
GENDATE : 11/23/2013
GENTIME : 22:30:11
PROGID : glvrpt
DESC : G/L Voucher Report
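As a rough sketch of what filtering can look like once the records are objects: the sample record below is built by hand from Example File 1, the property name `Requested` is made up for illustration, and the `MM/dd/yyyy HH:mm:ss` format is an assumption based on the two sample files.

```powershell
# Hand-built sample record mirroring Example File 1 (illustration only).
$results = @(
    [PSCustomObject]@{ FILEID = 'JPG139727'; TYPE = 'JPG'; REQDATE = '01/15/2016'; REQTIME = '10:39:51' }
)

# Filter on a property.
$jpgs = $results | Where-Object { $_.TYPE -eq 'JPG' }

# Add a calculated [datetime] property built from REQDATE and REQTIME.
$jpgs | Select-Object FILEID, TYPE, @{
    Name       = 'Requested'
    Expression = { [datetime]::ParseExact("$($_.REQDATE) $($_.REQTIME)", 'MM/dd/yyyy HH:mm:ss', [cultureinfo]::InvariantCulture) }
}
```

A real `[datetime]` value sorts and compares correctly, which the raw strings would not do across different years.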
Or, inspired by Olaf's idea above, this can be done as a one-liner (replacing only the first colon on each line so that time values are not broken up):
gci .\in | % { [PSCustomObject]((Get-Content $_.FullName -Raw) -replace '(?m)^([^:\r\n]+):', '$1 =' | ConvertFrom-StringData) } | Export-Csv .\myoutfile2.csv -NoType
ncrooks
February 22, 2019, 10:20am
6
Thank you all very much. I was just reading your respective posts about date formatting on another thread!
I will post both results once finished, but I have confirmed this does what I need. I may have a ton of these to process, so the dynamic naming threw me off at first (working with keys, and needing to strip out the "").
[pre]
gci .\ -Include *.arf -File -Recurse | ?{$_.psparentpath -like "*temp"} | %{
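$RawData = get-content $_.FullName -raw
# Replace only the first colon on each line so time values such as "10:39:51" are not mangled
$NewData = $RawData -replace '(?m)^([^:\r\n]+):', '$1 =' | ConvertFrom-StringData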
#$NewData.Keys
#$NewData.Values -match '[a-z]{3}\w+'
$fname = $NewData['Fileid'].replace('"','')
pause
$NewData | Export-Clixml -Depth 2 -Path C:\$fname-export.xml -Force -verbose
}
[/pre]