Code Efficiency - CSV Trim and Replace Operations

Hi

My code below finds all CSVs recursively in a directory, loads located CSVs, converts the CSVs into TSVs, trims first and end double quotes and then removes all double quotes between a tab space.

My Question:

Is there a way to make the code more efficient? I can’t seem to find a way to remove the ForEach-Object loop. Please help. Thanks. (For some reason the accent symbol does not work to display the code below)

$content = "C:\somedir"

$files = Get-ChildItem -Path $content -Recurse -Filter *.csv | 
	Select-Object -Property *, @{
		expression ={
			Write-Host $_.FullName
			$source = $_.FullName
			$dest = $_.DirectoryName + "\" + $_.BaseName + ".tsv"
			
			Import-Csv -Path $source | 
			convertto-csv -NoTypeInformation -Delimiter "`t" | 
			% { 
				ForEach-Object {
					$_.Trim('"') |
					% { $_ -replace (""t"" ,"t") }
				}
			} | Set-Content "$dest"
		}
	}

Which ForEach-Object loop are you trying to eliminate? I count 3 of them in your code (line 12, 13, 15). % is an alias for foreach-object. Without samples of your csv, I think you might be better working with the raw data that multiple conversions (import-csv then convertto-csv). I’m not sure if my example below is more efficient, but you could use measure-command to see for sure. Either way I find yours a little hard to read with nested loops, but that is just a preference.

$content = ".\documents"
$files = Get-ChildItem -Path $content -Recurse -Filter *.csv

foreach ($item in $files)
{
   (Get-Content $item.fullname -Raw) -replace ',',"`t" -replace '"','' |
       Set-Content -Path "$($item.fullname).tsv"
}

Loops are traditionally avoided if possible, but here are a couple of options. This is the CSV to mess with:

$csv = @"
"EmployeeId","FirstName","LastName","Manager","Password","Username"
"tst00001","Angela","Tester","rsimmers","<RANDOM>","atester"
"tst00002","Angelo","Tester","rsimmers","<RANDOM>","atester"
"tst00003","Bob","Tester","rsimmers","<RANDOM>","btester"
"tst00004","Chris","Tester","rsimmers","<RANDOM>","ctester"
"tst00005","Derek","Tester","rsimmers","<RANDOM>","dtester"
"@

This would parse using Import-CSV, so if there are “rogue” double qoutes in the content, this is parsing it as a CSV and then gluing it together with string builder. While this is using for loops, there is no Replace operations, it’s just parse and glue it together:

$csv = Import-Csv -Path 'C:\Scripts\temp.csv'

$myNewCSV = [System.Text.StringBuilder]::new()

#Header Row
foreach ($row in $csv) {
    [void]$myNewCSV.Append("{0}rn" -f ($row.PSObject.Properties.Name -join ';'))
    break
}

#Data Row
foreach ($row in $csv) {
    [void]$myNewCSV.Append("{0}rn" -f ($row.PSObject.Properties.Value -join ';'))
}

$myNewCSV.ToString() #Set Content

Output:

EmployeeId;FirstName;LastName;Manager;Password;Username
tst00001;Angela;Tester;rsimmers;<RANDOM>;atester
tst00002;Angelo;Tester;rsimmers;<RANDOM>;atester
tst00003;Bob;Tester;rsimmers;<RANDOM>;btester
tst00004;Chris;Tester;rsimmers;<RANDOM>;ctester
tst00005;Derek;Tester;rsimmers;<RANDOM>;dtester

#Measure Command
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 15
Ticks             : 152401
TotalDays         : 1.76390046296296E-07
TotalHours        : 4.23336111111111E-06
TotalMinutes      : 0.000254001666666667
TotalSeconds      : 0.0152401
TotalMilliseconds : 15.2401

Another approach is using Get-Content -Raw, which should bring the content in as raw content and not parse lines. This is available Powershell 4+.

$content = Get-Content -Path 'C:\Scripts\temp.csv' -Raw
$content.Replace('"','')

Output:

PS C:\Users\rasim> $content.Count
1

PS C:\Users\rasim> $content = Get-Content -Path 'C:\Scripts\temp.csv' -Raw
$content.Replace('"','')
EmployeeId,FirstName,LastName,Manager,Password,Username
tst00001,Angela,Tester,rsimmers,<RANDOM>,atester
tst00002,Angelo,Tester,rsimmers,<RANDOM>,atester
tst00003,Bob,Tester,rsimmers,<RANDOM>,btester
tst00004,Chris,Tester,rsimmers,<RANDOM>,ctester
tst00005,Derek,Tester,rsimmers,<RANDOM>,dtester


PS C:\Users\rasim> Measure-Command {
$content = Get-Content -Path 'C:\Scripts\temp.csv' -Raw
$content.Replace('"','')
}


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 12
Ticks             : 127013
TotalDays         : 1.47005787037037E-07
TotalHours        : 3.52813888888889E-06
TotalMinutes      : 0.000211688333333333
TotalSeconds      : 0.0127013
TotalMilliseconds : 12.7013