Remove a single CR/LF from a CSV file

First post so apologies if I break any rules.

We receive sales orders in CSV format from our ecommerce website. I have written an import procedure that imports the csv data and formats it ready for entry into our internal software. It saves a huge amount of keying and the associated typing errors. It works perfectly EXCEPT when the customer adds a special delivery instruction (leave it with a neighbour etc). The website app doesn’t just include the special instruction, it prefixes it with the screen prompt offered to the user AND adds a CR/LF which our import procedure recognizes as a new record. We can fix orders with special instructions manually and then they import cleanly, but this involves locating the defect using notepad and removing the CR/LF, then resaving.

Is it possible to locate the string (always the same):

“please enter them here.” followed by a CR/LF
and remove the CR/LF then save the file under its original name. That’s what we do manually but I’d like to automate it. I’m thinking that there is a (get-content) and (set-content) combination that might work but I’m not a powershell user and am not familiar with the exact syntax.

Hi, welcome to the forum :wave:

This will take all lines that don’t start with ‘please’ and write them to a new file.
You can write this all on one line, I have made it multiline for readability.

Get-Content myFile.csv | 
    Where-Object {$_ -notlike 'please*'} | 
        Set-Content myFileClean.csv

Phil,
Welcome to the forum. :wave:t3:

When you crosspost the same question at the same time to different forums you should at least post links to the other forums along with your question to avoid people willing to help you potentially wasting their time.

Thanks

1 Like

This will remove the CrLf (\r\n) sequence from the end of the “Please enter them here.\r\n” string.

$x = get-content f1.csv -Raw
$x -replace "(please enter them here\.)\s*\r\n",'$1';

That is getting very close to what I need. Can you help with a couple of enhancements:

  1. This script outputs the contents of the file to screen. How do I write the contents back into the original filename (or a modified filename) instead?
  2. The script removes the CRLF sequence perfectly. Could the script be adapted to remove the “Please enter them here.” text as well?

This should do it:

$x = Get-Content f1.csv -Raw
$x = $x -replace "please enter them here\.\s*\r\n",'';
Set-Content f1.csv -Value $x

EDIT: Oops . . . I’ve been working with Perl for the last couple of months and mistakenly coded the line with the regex as if it were Perl. Using Perl the original line would have changed the contents of $x “in place”. PowerShell, however, leaves the original contents of $x in place and sends the modifications to either a pipe or requires the results to be assigned (i.e. “consumed”) in some way. Without the assignment the results are simply sent to the success stream, which defaults to the console.

Thanks to people’s help, I’ve settled on a solution that seems to work.

Script (csvfix.ps1) is as follows (the script is generated on the fly by our import procedure):

$csvFilePath = “W:\WEBCSV\800.csv”
$cleanedFilePath = “W:\WEBCSV\800.csv”
$csvData = Get-Content -raw $csvFilePath | ForEach-Object { $_ -replace “If you have any special delivery requirements, please enter them here.rn”,“”}
$csvData | Set-Content $cleanedFilePath

Powershell is launched via a batch file that references the script. I had a lot of failures because the ececution policy outside of the powershell session wasn’t respected inside the session (powershell would not run the script). So the batch file sets this at the start of the session as follows:

powershell -ExecutionPolicy Unrestricted w:\webcsv\csvfix.ps1

I did find that the script adds a new blank line at the end of the file but i’m able to automatically remove that within our own software.

Thanks once again for all the help.

1 Like

There’s absolutely no need for using either of those pipes. Because you’re using the “-Raw” switch with Get-Content the variable $csvData contains just one string. That string is the entire contents of the file. Get-Content swallows the whole file in one gulp.