Multi-line pattern looking for contained content in file

I stand corrected on the above. Encoding is the issue. If I open the files in Notepad++, the encoding shows UCS-2 Little Endian on the file that DOES NOT parse properly and UCS-2 LE BOM on the file that DOES.

I have tested converting back and forth in Notepad++, and this is the issue.
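
For anyone who wants to check for the BOM without Notepad++, here is a minimal sketch that looks at the first two bytes of a file. The function name Test-Utf16LeBom is made up for illustration, and it only checks for the UTF-16 LE mark (FF FE), nothing else.

```powershell
# Hypothetical helper (the name is illustrative): returns $true when the file
# starts with the UTF-16 LE byte order mark FF FE. Only the first two bytes
# are checked.
function Test-Utf16LeBom {
    param([Parameter(Mandatory)][string]$Path)

    # Resolve to a full path first: .NET file APIs use the process working
    # directory, which can differ from the current PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $bytes = [IO.File]::ReadAllBytes($fullPath)

    return ($bytes.Length -ge 2 -and $bytes[0] -eq 0xFF -and $bytes[1] -eq 0xFE)
}
```

Calling Test-Utf16LeBom on the two files should give $false for the one that does not parse and $true for the one that does, if the BOM really is the only difference.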

I have not found an option in PowerShell to convert a file to UCS-2 LE BOM. I tried each of the listed -Encoding values, and none of them produced a file that parses properly.

The fix was not to write back to the same file I was reading from. If I did, it would purge the data. I had to use Out-File to write to a new file, encoded as Unicode:

gc -encoding unknown $file | out-file $new_file -encoding unicode

Once I did this and ran my code, it produced the correct counts.

If posting code or text fails for you here, set up a free gist or GitHub account, post it there, and post the link here.

Hmm, I couldn’t see the encoding of a UCS-2 Little Endian (no BOM) file in Notepad++. It looks like with Get-Content there’s a null character ($null) between each letter. Notepad can handle Unicode with no BOM, but Get-Content can’t without specifying the encoding.

get-content -encoding unicode $file | select-string whatever

For future reference, you can convert the file like this and output to the same file; the parentheses make sure the first part is done before the second part starts. The default for Out-File is Unicode anyway in PS 5. Set-Content defaults to ANSI, which should also be fine. PS 6 defaults to UTF-8 with no BOM for all commands. The code tags look like < pre > < /pre > with no spaces.

(get-content -encoding unknown $file) | set-content $file
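
If the cmdlet encodings give you trouble, another way to do the same conversion is to go through .NET directly. This is just a sketch, assuming the file really is UTF-16 LE without a BOM; the function name Add-Utf16LeBom is made up for illustration.

```powershell
# Hypothetical helper (the name is illustrative): rewrites a UTF-16 LE file
# in place so that it starts with the FF FE byte order mark.
function Add-Utf16LeBom {
    param([Parameter(Mandatory)][string]$Path)

    # .NET file APIs need a full path; they ignore the PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    # Decode explicitly as UTF-16 LE, so no BOM is needed to read correctly.
    $text = [IO.File]::ReadAllText($fullPath, [Text.Encoding]::Unicode)

    # The whole file is in memory by now, so overwriting it is safe.
    # [Text.Encoding]::Unicode writes the FF FE BOM at the start of the file.
    [IO.File]::WriteAllText($fullPath, $text, [Text.Encoding]::Unicode)
}
```

Like the parentheses in the one-liner, reading everything into $text first is what keeps the write from clobbering data that has not been read yet.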

Example of taking the initial 2-byte Unicode BOM out of a file (the UTF-8 BOM is 3 bytes). See the Unicode FAQ - UTF-8, UTF-16, UTF-32 & BOM.
You can see the BOM (“FF FE”) in Emacs hexl-mode or some other binary editor.

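# .NET file APIs resolve relative paths against the process working directory, so use a full path
$file = Join-Path $PWD 'hi.txt'
'hi there' | Set-Content -Encoding Unicode $file
# Read the raw bytes and drop the 2-byte UTF-16 LE BOM (FF FE)
$bytes = [IO.File]::ReadAllBytes($file)
[IO.File]::WriteAllBytes($file, [byte[]]$bytes[2..($bytes.Length - 1)])
# Without the BOM, Get-Content no longer detects UTF-16, so a NUL follows each character
Get-Content $file | Select-String ('h' + [char]$null + 'i')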

h i   t h e r e