Regexp (^) in replace

What’s wrong with:

$files = Get-ChildItem -Path "$env:_R_SPS\*.bat" -Recurse
foreach( $file in $files )
{
     ((Get-Content -LiteralPath $file -Raw) -replace '^rem','@rem') | Set-Content -LiteralPath $file
}

I thought / hoped that e.g. “^rem” would replace every “rem” at the start of a line with “@rem”, but it does not work as such.

Many thanks - Michael

This only works at the start of the file because you used -Raw, which returns the content as one long string instead of line by line, so ^ matches only once, at the very beginning. Not sure if that’s the only issue, but that’s what stands out instantly. Also, what does happen?
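
To illustrate the point about -Raw, here is a minimal sketch that uses a made-up three-line sample string instead of a real file. It shows that ^ anchors only to the start of the whole string by default, and two possible ways around that: the inline (?m) multiline flag, or dropping -Raw so that -replace is applied to each line separately. The variable names are only for illustration.

```powershell
# Hypothetical sample standing in for (Get-Content -Raw): one string with embedded line breaks
$raw = "rem first`r`necho hello`r`nrem last"

# Without multiline mode, ^ matches only at the very start of the string
$onlyFirst = $raw -replace '^rem', '@rem'

# (?m) switches on multiline mode, so ^ also matches after every line break
$allLines = $raw -replace '(?m)^rem', '@rem'

# Without -Raw, Get-Content returns an array of lines; -replace then runs once per element
$lines   = $raw -split "`r`n"
$perLine = $lines -replace '^rem', '@rem'
```

Here $onlyFirst changes only the first line, while $allLines and $perLine change both lines that start with rem.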

This works:

# using directive must be at the start of a script
using namespace System.Text.RegularExpressions

Get-Content -LiteralPath $file -Raw | ForEach-Object {
       [regex]::Replace($_, "^rem", "@rem", [RegexOptions]::Multiline)
}
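
One difference worth keeping in mind when switching from -replace to [regex]::Replace: the -replace operator ignores case by default, while the .NET method is case-sensitive unless IgnoreCase is passed. A small sketch with a made-up two-line string (the variable names are hypothetical):

```powershell
$text = "REM upper`nrem lower"

# -replace ignores case by default, so both lines change (and take the replacement's casing)
$viaOperator = $text -replace '(?m)^rem', '@rem'

# [regex]::Replace is case-sensitive by default, so only the lowercase line changes
$multiline = [System.Text.RegularExpressions.RegexOptions]::Multiline
$viaStatic = [regex]::Replace($text, '^rem', '@rem', $multiline)

# Adding IgnoreCase restores case-insensitive matching; $0 re-inserts the text exactly as matched
$both = $multiline -bor [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
$viaStaticIgnoreCase = [regex]::Replace($text, '^rem', '@$0', $both)
```

Batch files often mix REM and rem, so which variant fits depends on whether the original casing should be kept.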

Many thanks! Next “problem”: What is the best way to store the changed version into the original file:

Get-Content -LiteralPath $file -Raw | ForEach-Object {
       [regex]::Replace($_, "^rem", "@rem", [RegexOptions]::Multiline) | Set-Content -LiteralPath $file
}

gives me “Set-Content : The process cannot access the file “D:\temp\replace\test.xxx” because it is being used by another process.” Do I need an intermediate file?

Thanks again - Michael

The process cannot access the file “D:\temp\replace\test.xxx” because it is being used by another process

While Get-Content | ForEach-Object is doing its job, the file is still open and cannot be written to until it is closed. You therefore need to save the contents to a variable first and update the file afterwards:

# Save modifications to variable, at this time file is opened and cannot be modified
$data = Get-Content -LiteralPath $file -Raw | ForEach-Object {
       [regex]::Replace($_, "^rem", "@rem", [RegexOptions]::Multiline)
}

# At this point file is closed and can be modified
Set-Content -Value $data -LiteralPath $file

Btw. both of these need to be inside your foreach loop, because you need to save the modifications to one file before moving on to the next.

It would probably also not be a bad idea to make a copy of your files before modifying them directly, just in case.
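
Put together, the loop could look roughly like the sketch below. It assumes a throwaway sandbox folder under the temp directory and a .bak suffix for the copies; both are arbitrary choices for illustration, not something from the original script.

```powershell
# Hypothetical sandbox folder so the sketch does not touch real files
$root = Join-Path ([System.IO.Path]::GetTempPath()) 'replace-demo'
New-Item -ItemType Directory -Path $root -Force | Out-Null
Set-Content -LiteralPath (Join-Path $root 'test.bat') -Value "rem one`r`necho two`r`nrem three" -NoNewline

$files = Get-ChildItem -Path (Join-Path $root '*.bat') -Recurse
foreach ($file in $files) {
    # Keep a copy next to the original before touching it (.bak is an arbitrary suffix)
    Copy-Item -LiteralPath $file.FullName -Destination ($file.FullName + '.bak') -Force

    # Read fully into a variable first, so the file is closed before it is written
    $data = Get-Content -LiteralPath $file.FullName -Raw
    $data = $data -replace '(?m)^rem', '@rem'

    # -NoNewline stops Set-Content from appending an extra line break to the string
    Set-Content -LiteralPath $file.FullName -Value $data -NoNewline
}
```

Because the copies end in .bat.bak, they are not picked up by the *.bat filter on a second run.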

Many thanks! Quite clear, but I have to learn to switch from procedural programming (FORTRAN, VB etc.) to object-oriented thinking and doing.

Your contributions really help me on that way!

Michael

Next question / problem: I have to make several changes at once. If I do it that way:

# using directive must be at the start of a script
using namespace System.Text.RegularExpressions

$file = "d:\temp\replace\test.xxx"

# Save modifications to variable, at this time file is opened and cannot be modified
$data = Get-Content -LiteralPath $file -Raw | ForEach-Object {
       [regex]::Replace($_,  "!",        "@",                      [RegexOptions]::Multiline)
       [regex]::Replace($_, "^rem",      "@rem",                   [RegexOptions]::Multiline)
       [regex]::Replace($_, "^SETLOCAL", "@rem --- to be deleted", [RegexOptions]::Multiline)
       [regex]::Replace($_, "^ENDLOCAL", "@rem --- to be deleted", [RegexOptions]::Multiline)
}

# At this point file is closed and can be modified
Set-Content -Value $data -LiteralPath $file

the contents of the file are multiplied (every replace seems to create a new block).

Thanks again

BTW: “Of course” I am working in a test environment, and your code block will be inside my foreach file loop.

This is because each [regex]::Replace call returns its own full copy of the input (with -Raw, the whole file as one string), and every value produced inside the ForEach-Object block is written to the output. The result is as many copies of the file as there are [regex]::Replace statements.

There are two ways to implement this. One option is to run ForEach-Object multiple times, which is not good for performance; the other option, using an evaluator (faster), is below:
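
The duplication can be reproduced on a small made-up string. The sketch below also shows one simple way to avoid it, namely chaining the -replace operator so that each step works on the result of the previous one. Note that -replace ignores case, unlike [regex]::Replace, so this is a related technique rather than the exact code from the thread.

```powershell
$text = "rem a!`nSETLOCAL"

# Two separate calls on the same input emit two independent strings
$duplicated = $text | ForEach-Object {
    $_ -replace '!', '@'
    $_ -replace '(?m)^rem', '@rem'
}

# Chaining feeds each result into the next replacement, so only one string comes out
$chained = $text -replace '!', '@' -replace '(?m)^rem', '@rem' -replace '(?m)^SETLOCAL', '@rem --- to be deleted'
```

$duplicated holds two strings, each with only one of the changes applied, while $chained is a single string with all of them.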

# using directive must be at the start of a script
using namespace System.Text.RegularExpressions

# Regex evaluator that runs once for each match (not once per line)
[MatchEvaluator] $Evaluator = {
	param ($Match)

	# $Match.Value is only the matched text, e.g. "rem", so ^ anchors to its start here
	$result = [regex]::Replace($Match.Value, "!", "@")
	$result = [regex]::Replace($result, "^rem", "@rem")
	$result = [regex]::Replace($result, "^SETLOCAL", "@rem --- to be deleted")

	# finally return result
	[regex]::Replace($result, "^ENDLOCAL", "@rem --- to be deleted")
}

$file = "d:\temp\replace\test.xxx"

# Save modifications to variable, at this time file is opened and cannot be modified
$data = Get-Content -LiteralPath $file -Raw | ForEach-Object {
    # OR together everything you want to replace, then update $Evaluator above to handle each alternative
	[regex]::Replace($_, "!|^rem|^SETLOCAL|^ENDLOCAL", $Evaluator, [RegexOptions]::Multiline)
}

# At this point file is closed and can be modified
Set-Content -Value $data -LiteralPath $file

The following page will help you learn more about advanced regex:
Regex Class (System.Text.RegularExpressions) | Microsoft Learn
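
A MatchEvaluator is called once per match and is handed a Match object, so another way to write it is as a simple lookup from matched text to replacement. The following is only a sketch of that idea; the $map table and the sample text are made up for illustration.

```powershell
# Map from matched text to its replacement (PowerShell hashtable keys are case-insensitive)
$map = @{
    '!'        = '@'
    'rem'      = '@rem'
    'SETLOCAL' = '@rem --- to be deleted'
    'ENDLOCAL' = '@rem --- to be deleted'
}

# The evaluator is invoked once per match and receives a Match object, not a line
$evaluator = [System.Text.RegularExpressions.MatchEvaluator] {
    param ($Match)
    $map[$Match.Value]
}

$text      = "rem hi!`nSETLOCAL`necho done`nENDLOCAL"
$multiline = [System.Text.RegularExpressions.RegexOptions]::Multiline
$result    = [regex]::Replace($text, '!|^rem|^SETLOCAL|^ENDLOCAL', $evaluator, $multiline)
```

Adding another replacement then means extending the pattern and adding one entry to the table.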

Great! Thank you so much for your effort!

Concerning performance of the “multiple times solution”: Isn’t it possible to store the result of “Get-Content -LiteralPath $file -Raw” in an object and then run “ForEach-Object” against this stored object multiple times?

Michael

No, because the performance issue is not with Get-Content -Raw but rather with ForEach-Object.

When you run Get-Content -Raw, the entire file content is returned as a single string rather than an array of lines, and this happens only once.
You can still match line by line because -Raw preserves the line breaks (\r\n or \n) inside that string, and the Multiline regex option makes ^ match after each \n as well as at the start of the string.

It follows that Get-Content -Raw is a single operation. It is not iterated over and over, so it is not a performance issue.

ForEach-Object, however, is needed only once. If you save the data to a variable and run ForEach-Object multiple times like this:
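
The difference between the two read modes can be checked directly. A minimal sketch, assuming a small temporary file created just for this purpose (the file name is made up):

```powershell
# Hypothetical temp file with three lines
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'raw-demo.txt'
Set-Content -LiteralPath $path -Value 'rem one', 'echo two', 'rem three'

$asLines  = Get-Content -LiteralPath $path        # array of strings, one per line
$asString = Get-Content -LiteralPath $path -Raw   # one string, line breaks included

$lineCount = @($asLines).Count     # 3
$rawCount  = @($asString).Count    # 1

# With Multiline, ^ matches at the start of the string and after each line break
$multiline = [System.Text.RegularExpressions.RegexOptions]::Multiline
$starts    = [regex]::Matches($asString, '^rem', $multiline).Count   # 2
```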

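$data = Get-Content -LiteralPath $file -Raw
$data = $data | ForEach-Object { [regex]::Replace($_, "!", "@") }                                                      # regex operation 1
$data = $data | ForEach-Object { [regex]::Replace($_, "^rem", "@rem", [RegexOptions]::Multiline) }                     # regex operation 2
$data = $data | ForEach-Object { [regex]::Replace($_, "^SETLOCAL", "@rem --- to be deleted", [RegexOptions]::Multiline) } # regex operation 3
$data = $data | ForEach-Object { [regex]::Replace($_, "^ENDLOCAL", "@rem --- to be deleted", [RegexOptions]::Multiline) } # regex operation 4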

These are 4 ForEach-Object operations plus 4 regex operations, 8 in total, compared with my sample code, which uses only one ForEach-Object plus 5 regex calls (the outer one and the four in the evaluator), 6 in total.

So it is 8 vs 6 operations, or 25% fewer. This is only a rough count: the evaluator runs once per match, so the real amount of work depends on the file.

You can find many discussions about ForEach-Object hurting performance, for example:
Speed up Foreach-Object and Where-Object by using Steppable Pipeline · Issue #10982 · PowerShell/PowerShell (github.com)

Note that you could omit -Raw and regex multiline option like this:

$data = Get-Content -LiteralPath $file | ForEach-Object {
	[regex]::Replace($_, "!|^rem|^SETLOCAL|^ENDLOCAL", $Evaluator)
}

However, this is much worse, because now the ForEach-Object script block runs once for every line in the file, resulting in many more operations and worse performance.

btw. here is an improvement of my previous sample code, it doesn’t use ForEach-Object at all:

$data = Get-Content -LiteralPath $file -Raw
$data = [regex]::Replace($data, "!|^rem|^SETLOCAL|^ENDLOCAL", $Evaluator, [RegexOptions]::Multiline)

# save modifications (-NoNewline keeps Set-Content from appending an extra line break to the raw string)
Set-Content -Value $data -LiteralPath $file -NoNewline

This is now only 5 operations instead of the 8 of the multiple ForEach-Object version, about 38% fewer by the same rough count.
So I suggest you update your code for this small improvement.
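
Operation counts are only a rough guide, so it may be worth timing the variants on your own machine. The sketch below uses Measure-Command on a made-up workload of 20,000 generated lines; the numbers will differ between systems and PowerShell versions, so no timings are claimed here.

```powershell
# Hypothetical workload: 20,000 lines, every other one starting with rem
$raw   = (1..20000 | ForEach-Object { if ($_ % 2) { "rem line $_" } else { "echo line $_" } }) -join "`r`n"
$lines = $raw -split "`r`n"

# Variant 1: one regex call over the whole string
$wholeString = { [regex]::Replace($raw, '(?m)^rem', '@rem') }

# Variant 2: one regex call per line through the pipeline
$perLine = { $lines | ForEach-Object { [regex]::Replace($_, '^rem', '@rem') } }

# Both variants should produce the same text
$a = & $wholeString
$b = (& $perLine) -join "`r`n"

$t1 = Measure-Command $wholeString
$t2 = Measure-Command $perLine
'{0:N0} ms whole string, {1:N0} ms per line' -f $t1.TotalMilliseconds, $t2.TotalMilliseconds
```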

https://en.m.wiktionary.org/wiki/help_vampire#Noun

Interesting…

I don’t often visit help forums to provide answers; lately I ask for help more often than I answer.

There were times when I was very active on some PC help forums and BSOD sections, spending time on how to fix broken computers, but my interest was in boosting my skills by solving issues rather than in helping.

Often it’s easy to profile people, but “help vampire” is certainly a new term and categorization to me.

Just a friendly warning not to get all your helper blood sucked out of ya!

And your help is appreciated, I’ve found you quite helpful.

I want to say thanks for these overwhelming and extremely helpful hints on my way to object-oriented programming (being an old FORTRAN/VB etc. guy).
I certainly did not want to drain anyone’s energy, and I have a bit of a bad conscience because maybe I should rather take a PS course. On the other hand, I find that learning with real-life examples is more effective.

Michael

I’m glad to hear it worked for your problem!

No, you didn’t drain my energy, that’s not an issue at all, but marking the post that was helpful as “solution” would be great.

Other users who visit a thread with a solution in the future will have an easy time focusing on the relevant portion of the thread rather than reading through the entire thread trying to figure out the point.

Done - Thanks again and stay as you are (knowledgeable and ready to help) - Michael