Manipulating string (was: What am I doing wrong?)

Hello,

This will probably be one of those “of course” moments, but I spent the last 2 hours trying to figure out why initializing a string variable changes the output of the simple script below.

#$batch=""
$SetToReplace = get-content C:\Users\me\Desktop\bb.txt
for ($index = 0; $index -lt 2; $index++) {
    $batch += $SetToReplace -replace "%SERVERNAME%", "DEL \\$index"
}
write-output $batch

If I initialize the $batch variable, the output has no carriage returns and everything is on one line, but if I don’t initialize $batch, the output has carriage returns.

With initialization

DEL \d$\inetpub\ DEL \d$\inetpub\ DEL \d$\inetpub\DEL \1\d$\inetpub\ DEL \1\d$\inetpub\ DEL \1\d$\inetpub

Without initialization

DEL \d$\inetpub
DEL \d$\inetpub
DEL \d$\inetpub
DEL \1\d$\inetpub
DEL \1\d$\inetpub
DEL \1\d$\inetpub

The contents of the bb.txt file are below:

%SERVERNAME%\d$\inetpub
%SERVERNAME%\d$\inetpub
%SERVERNAME%\d$\inetpub

Not quite.

Get-Content doesn’t return a single string; each line in the text file is a String object, so $SetToReplace is technically a collection, or array, of strings. When you ask PowerShell to display an array, it displays one item per line. The insertion of the carriage return is part of the shell’s display-to-screen process; the carriage return isn’t “in” the string.

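When you “initialize” $batch, you’re making it a single string, and adding to it with += concatenates: the array on the right is converted to one string (its items joined with spaces) and tacked onto the end. When you don’t, $batch starts out empty ($null), so += builds an array and appends the new items to it. So in one case you have a single string, in the other case you have an array of string objects.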
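A quick way to see the difference is to run += against both starting values. This is only a minimal sketch: the $lines array and the $asString / $asArray names are made up for illustration, with $lines standing in for what Get-Content returns.

```powershell
# Stand-in for the array of strings that Get-Content returns (hypothetical data).
$lines = 'one', 'two', 'three'

# Case 1: start from an empty string, so += does string concatenation.
# The array on the right is converted to a single string first
# (its items joined with a space by default).
$asString = ""
$asString += $lines
$asString.GetType().Name   # String

# Case 2: start from $null, so += with an array on the right builds an array.
$asArray = $null
$asArray += $lines
$asArray += $lines
$asArray.GetType().Name    # Object[]
$asArray.Count             # 6
```

Displaying $asArray prints one item per line, while $asString prints as a single line, which matches the two outputs in the question.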

Remember that everything in PowerShell is object-oriented. You’re rarely dealing with text, although at first glance it can look like text because the shell obviously has to display text when rendering objects to the screen for you to see.

Where your approach probably first went wrong was in treating $SetToReplace as a single string. You’d normally use a foreach construct to enumerate through those lines, one at a time, and execute your -replace operation on each line individually.
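For reference, the foreach pattern might look roughly like this. It is a sketch only: the sample lines are inlined so it runs without the file, and the $counter and $result names are illustrative, not from the original script.

```powershell
# Inline sample standing in for: Get-Content C:\Users\me\Desktop\bb.txt
$SetToReplace = '%SERVERNAME%\d$\inetpub', '%SERVERNAME%\d$\inetpub'

$counter = 0
# foreach hands back one line (one String) at a time; collecting the
# loop's output gives an array of strings.
$result = foreach ($line in $SetToReplace) {
    $line -replace '%SERVERNAME%', "DEL \\$counter"
    $counter++
}

Write-Output $result
```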

It looks like you were trying to enumerate the strings in the $SetToReplace array, using a for loop instead of foreach, which is okay. Your loop is hard-coded to process only the first two lines of the file, though; is that what you intended? What seems to be missing is an array index operator in this line:

# instead of this:
$batch += $SetToReplace -replace "%SERVERNAME%", "DEL \\$index"

# try this:
$batch += $SetToReplace[$index] -replace "%SERVERNAME%", "DEL \\$index"

I’m not sure what you want $batch to be, when this code is done. Should it be an array of strings, just like $SetToReplace? Or should it be a single String object that happens to contain multiple lines? Assuming you want it to be an array, you’ll want to initialize it as an empty array first:

$batch = @()
$SetToReplace = get-content .\test.txt
for ($index = 0; $index -lt 2; $index++) {
    $batch += $SetToReplace[$index] -replace "%SERVERNAME%", "DEL \\$index"	
}

write-output $batch

If instead you wanted $batch to be a string, you’d need to put the line terminators in yourself, something like this (I initialized $batch to an empty string, though strictly speaking, that wasn’t necessary in this example):

$batch = ""
$SetToReplace = get-content .\test.txt
for ($index = 0; $index -lt 2; $index++) {
    if (-not [string]::IsNullOrEmpty($batch)) {
        $batch += "`r`n"
    }

    $batch += $SetToReplace[$index] -replace "%SERVERNAME%", "DEL \\$index"
}

write-output $batch
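Another way to get a single multi-line string, shown here as a hedged alternative rather than what the code above does, is to collect the lines first and let the -join operator insert the terminators. The sample lines are inlined and the $lines name is illustrative.

```powershell
# Inline sample standing in for: Get-Content .\test.txt
$SetToReplace = '%SERVERNAME%\d$\inetpub', '%SERVERNAME%\d$\inetpub'

# Collect the replaced lines as an array first...
$lines = for ($index = 0; $index -lt 2; $index++) {
    $SetToReplace[$index] -replace '%SERVERNAME%', "DEL \\$index"
}

# ...then join them into one string with CRLF between the lines.
$batch = $lines -join "`r`n"

Write-Output $batch
```

This avoids the IsNullOrEmpty check, since -join only places the separator between items.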

Yes, thanks. My assumption was that Get-Content gives me the entire file as one huge string with \r\n between lines, as when you use the native System.IO ReadToEnd() method.
So what would be the best approach if you want to replace certain text in a text file without going through a foreach loop over every single line? If I have a big file, I assume that will take much more time and CPU power than a pattern match against the entire string. I would also like to preserve the \r\n line endings as they were in the original file.

If you’re using PowerShell 3.0 or later, the simplest way to do that is to add the -Raw switch to Get-Content. This returns a single String object containing all of the text in the file (including line terminators), instead of an array of strings. Performance doesn’t seem to be a big deal; I ran test code on a ten-thousand-line file and it finished in less than half a second either way.
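To make the difference concrete, here is a small sketch that reads the same file both ways. The temp file name raw-demo.txt and the variable names are assumptions made up for the example.

```powershell
# Write a small throwaway file so the sketch is self-contained (hypothetical path).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'raw-demo.txt'
[System.IO.File]::WriteAllText($path, "first`r`nsecond`r`nthird`r`n")

# Default behaviour: one String per line, returned as an array.
$asLines = Get-Content $path
$asLines.Count              # 3

# With -Raw (PowerShell 3.0 or later): one String, line terminators included.
$asText = Get-Content $path -Raw
$asText.GetType().Name      # String
$asText.Contains("`r`n")    # True

Remove-Item $path
```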

If you do want to do this without a loop in your PowerShell code (though there’s still going to be some looping done behind the scenes by the regular expression engine either way), you can try it like this. It’s a little bit tricky to get the incrementing “index” value into the text without your own loop, but it works:

$text = get-content .\test.txt -Raw
$script:index = 0

$batch = [regex]::Replace($text, "(?i)%SERVERNAME%", { "DEL \\$((++$script:index))" })

Write-Output $batch

There’s more than one way to avoid that loop:

$index=0
filter batch { $_ -replace '(?i)%SERVERNAME%',"DEL \\$((++$index))" }
$batch = get-content .\test.txt | . batch
Write-Output $batch

Interesting. I didn’t know you could dot source a function or filter, or do it in the middle of a pipeline like that.
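A small sketch of what the dot changes, as far as the caller's variable goes. The Step-Counter filter and the $afterNormalCall / $afterDotSourcedCall names are hypothetical, invented for this illustration.

```powershell
$index = 0

# Hypothetical filter that increments a counter it did not create.
filter Step-Counter { ++$index }

# Normal call: the filter runs in a child scope, so the increment lands on
# a local copy and the caller's $index is left untouched.
'a', 'b', 'c' | Step-Counter
$afterNormalCall = $index         # 0

# Dot-sourced call: the filter runs in the caller's scope and changes its $index.
'a', 'b', 'c' | . Step-Counter
$afterDotSourcedCall = $index     # 3
```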

Maybe we need an article about scoping :).