Manipulating imported CSV data points for new CSV export

As a newbie to manipulating CSV files within PowerShell I need a foot up with the following problem.

stats.csv contains data for a time series of two objects A,B. I need to create a report referencing some values from this and also need to calculate new values from the data it contains.

stats.csv available here: stats.csv file · GitHub

The first value on each row is the time series (PT). There are two rows for each time event in the file, one for each object (ON) A and B. The second value for each row is the object (ON). As you can see if you scroll through the CSV, there will be two entries for each PT - one for ON A and another for ON B. Each ON has it’s own unique data for values for the PT for TT,RIO,TB,RS,WS,RT that need to be calculated or passed directly to a new CSV.

The following calculations are needed for the report (excuse the crudeness below):

$total_tt = sum TT where PT value is the same for ON=A And ON=B
e.g. for the same PT entries @ 02/06/2017 15:45
$total_tt would be 12804.49537

$total_tb = sum TB where PT value is the same for ON=A And ON=B
e.g. for the same PT entries @ 02/06/2017 15:45
$total_tb would be 180.226005

$avg_rp = (if $total_tt>0(sum RIO where same PT for ON=A and=B)/$total_tt)
e.g. for PT entries @ 02/06/2017 15:45
$avg_rp would be 53.81826021 (percentage)

$avg_rs = (if sum of RT from same PT value for ON=A And ON=B>0(RIORS for ON=A+RIORS where ON=B)/sum of RIO for A and B)
e.g. for PT entries @ 02/06/2017 15:45
$avg_rs would be 9.944376086

$avg_ws = within the same TS(if $total_tt-RIOA-RIOB>0((TTA-RIOA)*WSA+(TTB-RIOB)*WSB/(RIOA-RIOB)
e.g. for PT entries @ 02/06/2017 15:45
$avg_ws would be 17.33865279

$avg_rt = average RT where same PT value is observed for ON=A And ON=B
e.g. for PT entries @ 02/06/2017 15:45
$avg_rt would be 0.293554

Create a 1st output file: report_detail.csv with the following values (some newly calculated, some from the original csv):

PT,$total_tt,$total_tb,$avg_rp,$avg_rs,$avg_ws,$avg_rt,RT(ON:A from stats.csv),RT(ON:B from stats.csv)

Next, return the row numbers from report_detail.csv that contain the maximum and 95th percentile value for $total_tt and send them to report_headlines.csv with first row headers for the csv:

measurement,PT,$total_tt,$total_tb,$avg_rp,$avg_rs,$avg_ws,$avg_rt,RT(ON:A from original CSV),RT(ON:B from original CSV).

Any help is appreciated to get me started.

You could start with “reading” your csv file with Import-CSV. The you iterate through all lines with a Foreach loop and calculate whatever you need. In the end you can output all this to the console or export it to a new csv with Export-CSV

BTW: A few lines of your csv file would have been enough to show what it’s about! :wink:

Hi. No issues. I can trim back the csv :wink:

Would you be able to show me a working example of what you suggest of what I’m asking for with $total_tt? If I get a foot up I can work the remainder.

Here you go:

# region Input
$Data = Import-Csv .\stats.csv 
#endegion


#region Process
$A = ($Data | group ON | ? { $_.Name -eq 'A' }).Group
$B = ($Data | group ON | ? { $_.Name -eq 'B' }).Group

$Analysis = foreach ($ONA in $A) { 
    Remove-Variable ONB -ErrorAction SilentlyContinue
    $ONB = $B | ? { $_.PT -eq $ONA.PT }
    if ($ONB) {
        $SumTT = [decimal]$ONA.TT + [decimal]$ONB.TT
        $SumRT = [decimal]$ONA.RT + [decimal]$ONB.RT
        New-Object -TypeName PSObject -Property @{
            PT         = [DateTime]$ONA.PT
            RTA        = [decimal]$ONA.RT
            RTB        = [decimal]$ONB.RT
            SumTT      = $SumTT
            SumTB      = [decimal]$ONA.TB + [decimal]$ONB.TB
            AvgRB      = $(
                if ($SumTT -gt 0) {
                    ([decimal]$ONA.RIO + [decimal]$ONB.RIO)/$SumTT
                } else { 0 }
            )
            'AvgRB(%)' = $(
                if ($SumTT -gt 0) {
                    '{0:P2}' -f (([decimal]$ONA.RIO + [decimal]$ONB.RIO)/$SumTT) 
                } else { 0 }
            )
            AvgRS      = $(
                if ($SumRT -gt 0) {
                    (([decimal]$ONA.RIO * [decimal]$ONA.RS) + ([decimal]$ONB.RIO * [decimal]$ONB.RS)) /
                    ([decimal]$ONA.RIO * [decimal]$ONB.RIO) * 100                  
                } else { 0 }
            )
            AvgWS      = $(
                if ($SumTT - ([decimal]$ONA.RIO + [decimal]$ONB.RIO)) {
                    (([decimal]$ONA.TT-[decimal]$ONA.RIO)*[decimal]$ONA.WS + 
                        ([decimal]$ONB.TT-[decimal]$ONB.RIO)*[decimal]$ONB.WS) / 
                            ([decimal]$ONA.RIO-[decimal]$ONB.RIO)
                } else { 0 }
            )
            AvgRT      = ([decimal]$ONA.RT+[decimal]$ONB.RT)/2
        } 
    }
} 
$Analysis = $Analysis | Sort SumTT -Descending 
#endregion


#region Output
$Analysis | select PT, SumTT, SumTB, AvgRB, AvgRS, AvgWS, AvgRT, RTA, RTB | 
    Export-Csv .\report_detail.csv -NoType -Force

$Index95 = [math]::ceiling($Analysis.Count * 0.95)
$95PercentileValue = ($Analysis | Sort SumTT)[$Index95]
$Maximum = $Analysis | select -First 1 
$Maximum | Add-Member -MemberType NoteProperty -Name 'Measurement' -Value 'Maximum'
$Maximum | select Measurement, PT, SumTT, SumTB, AvgRB, AvgRS, AvgWS, AvgRT, RTA, RTB | 
    Export-Csv .\report_headlines.csv -NoType -Force 
$95PercentileValue | Add-Member -MemberType NoteProperty -Name 'Measurement' -Value '95Percentile'
$95PercentileValue | select Measurement, PT, SumTT, SumTB, AvgRB, AvgRS, AvgWS, AvgRT, RTA, RTB | 
    Export-Csv .\report_headlines.csv -NoType -Append
#endregion

You owe me $50…lol

I think than grouping will be much more effective if
instead of grouping by ON we directly group by PT

$Data = Import-Csv D:\stats.csv | Group-Object -Property PT | Foreach-Object {
  $a = $_.Group | Where-Object { $_.ON -eq 'A' }
  $b = $_.Group | Where-Object { $_.ON -eq 'B' }
  [PSCustomObject]@{
     PT = $_.Name
     A = $a
     B = $b
  }
  #manipulate data 1
}
#manipulate data 2