CSV import - Group, sort and finally Sum string values

Hi All,

I am trying to minimise the time it takes me to monitor the amount time spent on certain tasks. I have a CSV file:
Project,StartDate,StartTime,EndDate,EndTime,Duration
Project 1,19122018,16:39:26,19122018,16:39:28,00:01.8
Project 2,19122018,16:39:28,19122018,16:39:29,00:00.3
Project 3,19122018,16:39:29,19122018,16:39:29,00:00.4
Project 4,19122018,16:39:29,19122018,16:39:29,00:00.3
Project 5,19122018,16:39:29,19122018,16:39:30,00:00.3
The projects and dates will change week to week and are not known ahead of time. There may be multiple occurances of each project/date combination.

I am trying to find the total duration of each project for each day.

So far I am able to list each duration for each project/day but as they are strings I am struggling to generate a sum.

$output = (Import-CSV -Path -Delimiter ",") | Group-Object -property StartDate,Project

foreach ($name in $output)
{
write-host $name.name "," $name.Group.Duration
}

Ideally I would like
StartDate,Project,Sum of Duration

Thanks in advance for any assistance.

If your CSV had a normal dat code it would be easier, but here is what I came up with based on your CSV

$output = Import-CSV -path C:\TEMP\stuff.csv
foreach ($project in $output)
{
    $WTFStart = $project.StartDate.Insert(2,'-').Insert(5,'-')
    $Day = $WTFStart.Split('-')[0]
    $Month = $WTFStart.Split('-')[1]
    $Year = $WTFStart.Split('-')[2]
    $StartTime = $project.StartTime
    $StartDate = (Get-Date -Day $Day -Month $Month -Year $Year) + $StartTime
    $WTFEnd = $project.EndDate.Insert(2,'-').Insert(5,'-')
    $EndDay = $WTFEnd.Split('-')[0]
    $EndMonth = $WTFEnd.Split('-')[1]
    $EndYear = $WTFEnd.Split('-')[2]
    $EndTime = $project.EndTime
    $EndDate = (Get-Date -Day $EndDay -Month $EndMonth -Year $EndYear) + $EndTime
    $Elapsed = $EndDate - $StartDate
    $duration = [ordered]@{'StartDate'=$StartDate;
    'Name'=$project.Project;
    'Duration'= $Elapsed     
    }
$obj = New-Object -TypeName PSObject -Property $duration
Write-Output -InputObject $obj
}

Vern,
obviously Rich posted this question three times … :wink:

https://powershell.org/forums/topic/grouping-sum-imported-csv-text/#post-130863

https://powershell.org/forums/topic/grouping-sum-csv-content/#post-131100

… now he can choose the solution he likes. :wink:

You can use [TimeSpan]::ParseExact to convert the text…

[TimeSpan]::ParseExact($duration, $format, $null)

edit: Olaf was faster than me.

… unfortunately the duration Rich showed in his CSV data has an “unconventional” format. Do you know a format string we could use to use with [TimeSpan]::ParseExact ?

The last one might be “f: The tenths of a second”…

[TimeSpan]::ParseExact($duration, 'mm\:ss\.f', $null)

So no more than an hour work per entry.

Wow … cool … thanks a lot. I hope Rich will see this. That makes his task much easier.

Happy holidays! :wink:

You too :slight_smile:

I’m thinking it might be a bug in the csv… Only an hour per entry?

Hi, perhaps I’ve got a bit carried away, but you can try this:

<#
.Synopsis
   Aggregate project activity durations.
.DESCRIPTION
   The script measures activity durations, based on a CSV time sheet entries.
.EXAMPLE
   Measure-ProjectActivityDuration -Path .\TimeSheet.csv
.EXAMPLE
   Measure-ProjectActivityDuration -TimeSheet .\TimeSheet_CurrentWeek.csv -Delimiter ';'
.INPUTS
   [System.String]
.OUTPUTS
   [System.Management.Automation.PSCustomObject]
.NOTES
   In case the total duration of activities for a project exceeds 24 hours, the number of days will be set as a dot-delimited prefix to the HH:mm:ss entry, as shown below:
   TotalDuration StartDate Project

1.04:41:53 19122018 Project 2
02:00:06 19122018 Project 1
00:00:01 19122018 Project 5
00:00:00 19122018 Project 4
00:15:00 19122018 Project 3

.COMPONENT
The component this cmdlet belongs to
.ROLE
The role this cmdlet belongs to
.FUNCTIONALITY
The functionality that best describes this cmdlet

#>

[CmdletBinding()]
[Alias()]
[OutputType([System.Management.Automation.PSCustomObject])]

Param
(
# Path to the timesheet file.
[Parameter(Mandatory = $true, Position = 0)]
[ValidateNotNull()]
[ValidateNotNullOrEmpty()]
[Alias(“TimeSheet”)]
[String]$Path,

# Symbol used to delimit the columns in the CSV file.
[Parameter(Mandatory = $false, Position = 1)]
[ValidateNotNull()]
[ValidateNotNullOrEmpty()]
[String]$Delimiter = ','

)

Begin {
# Import and sort the time sheet.
$timeSheet = Import-Csv -LiteralPath $Path -Delimiter $Delimiter | Sort-Object StartDate, StartTime;
# Create and empty duration table.
[System.Collections.ArrayList]$durationTable = @();
}
Process {
foreach ($entry in $timeSheet) {
# Create an entry in the duration table if the project has not been listed there yet.
if ([bool]($durationTable | Where-Object {$_.Project -eq $entry.Project}) -eq $false) {
$tableEntry = New-Object -TypeName PSCustomObject -Property @{Project = $entry.Project; StartDate = $entry.StartDate; TotalDuration = New-TimeSpan};
$durationTable.Add($tableEntry) | Out-Null;
}
# Calculate the duration of the activities for the current entry.
$startTime = [datetime]::ParseExact(“$($entry.StartDate) $($entry.StartTime)”, “ddMMyyyy HH:mm:ss”, $null);
$endTime = [datetime]::ParseExact(“$($entry.EndDate) $($entry.EndTime)”, “ddMMyyyy HH:mm:ss”, $null);
$activityDuration = New-TimeSpan -Start $startTime -End $endTime;

    # Add the duration for the current entry to the total duration for the project.
    $projectEntry = $durationTable | Where-Object {$_.Project -eq $entry.Project};
    $projectEntry.TotalDuration += $activityDuration;
}

}
End {
Return $durationTable;
}

 

Thanks Olaf, I was unaware of the other 2, this is the only one Don Jones tweeted about though.

Example of using the duration column

$csvfile = 'Project,StartDate,StartTime,EndDate,EndTime,Duration
Project 1,19122018,16:39:26,19122018,16:39:28,00:01.8
Project 2,19122018,16:39:28,19122018,16:39:29,00:00.3
Project 3,19122018,16:39:29,19122018,16:39:29,00:00.4
Project 4,19122018,16:39:29,19122018,16:39:29,00:00.3
Project 1,22122018,16:39:26,19122018,16:39:28,00:03.8
Project 1,24122018,16:39:26,19122018,16:39:28,00:11.1
Project 5,19122018,16:39:29,19122018,16:39:30,00:00.3'

$csv = ConvertFrom-Csv $csvfile

# parse information
$durationformat = 'mm\:ss\.f'

# group by project
$grouped = $csv | Group-Object -Property Project

$result = foreach($prj in $grouped) {
    $sum = [TimeSpan]::new(0)

    # process all durations for this project
    foreach($entry in $prj.Group) {
        $sum += [TimeSpan]::ParseExact($entry.Duration, $durationformat, $null)
    }

    [pscustomobject]@{
        # !! needs to sort on startdate, if entries are not in order
        StartDate = $prj.Group[0].StartDate
        Project = $prj.Name
        'Sum of Duration' = $sum.ToString($durationformat)
    }
}

$result | ConvertTo-Csv -NoTypeInformation