Hex changes after piping to findstr ???

Re: cat CPO0008.VDI | findstr HDRE > HDRfile

Hex format of input file: 48 44 52 45 31 38 31 34
Hex format of output file: FF FE 48 00 44 00 52 00 45 00 31 00 38 00 31 00 34 00

Any help with why/how this is happening is so greatly appreciated. Thank you!

findstr is not PowerShell. My suggestion would be to stay with the original, complete cmdlet names. They are easier to read and easier to understand, both for yourself and for the people willing to help you.

Hello, Danielle.

Binary data is not something PowerShell handles easily. When a string (and that’s what cat returns) is piped to an external program such as findstr, it is encoded using $OutputEncoding. PowerShell is just not the right tool in your case, so you need to find another method of finding things in a binary file. For example, you can write a C# program.

Well, maybe someone can help you do this by using .NET methods from PowerShell.

Thank you sooo much Nimms for your fantastic explanation! I understand and have an idea of how to remedy it. I really appreciate it, very much!

Well…so I changed to

select-string ./CPO0016.vdi -pattern "HDRE" | select-object Line

And my output is still padded as before…

Okay, I’m hoping I’m actually going to learn something here :smiley: I just did a little manual test in the PS console shell:

echo test > test.txt

When I view test.txt in binary mode, using TextPad, this is what I see:

FF FE 74 00 65 00 73 00 74 00 0D 00 0A 00

Yeah, that’s because it saves it in Unicode (UTF-16 Little-Endian). And when PowerShell tries to save raw binary data in Unicode, it screws it up. So please, don’t try to do this in PowerShell, it’s a pain. Just use an external tool for this. I don’t know the exact tool, but it’s quite easy to write one yourself.
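To make the byte layout above concrete, here is a minimal sketch that rebuilds the same byte sequences with the .NET encoders, without writing any file. It assumes that the redirected text is "test" followed by CR LF; the variable names are only for illustration.

```powershell
# Sketch: reproduce the bytes seen in test.txt using .NET encoders.
# Assumption: the text written by "echo test >" is "test" plus CR LF.
$text = "test`r`n"

$bom   = [System.Text.Encoding]::Unicode.GetPreamble()     # byte order mark for UTF-16 LE
$utf16 = [System.Text.Encoding]::Unicode.GetBytes($text)   # two bytes per character
$ascii = [System.Text.Encoding]::ASCII.GetBytes($text)     # one byte per character

# Format each byte as two uppercase hex digits, separated by spaces.
$utf16Hex = (($bom + $utf16) | ForEach-Object { $_.ToString('X2') }) -join ' '
$asciiHex = ($ascii | ForEach-Object { $_.ToString('X2') }) -join ' '

$utf16Hex   # FF FE 74 00 65 00 73 00 74 00 0D 00 0A 00
$asciiHex   # 74 65 73 74 0D 0A
```

The first line matches the TextPad view: a two-byte marker (FF FE) followed by every character padded with a 00 byte.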

Anyway, I think we’re going off-topic here.

You can just use Get-Content/Set-Content instead of redirection (>).
They have an -Encoding parameter.

And I think PowerShell is a good tool even for this, but Select-String is for strings, not for byte values :wink:

I’m posting to a forum of experts? Which only continues with extended talk that doesn’t directly relate.

select-string ./CPO0016.vdi -pattern "HDRE" | select-object Line > ./HDRfile

Why is output being excessively embedded? Makes no sense - anyone have a clue?

No, this is a PowerShell community; not all of us are experts :slight_smile:

@nimms told you about Unicode; I told you about Get-Content/Set-Content.
Are you looking for more expertise here?

OK, I’ll try… but I want to mention that we have no crystal ball to see from a distance what your file looks like and what you want to get from it. We also don’t know why you want to search for something literal in a binary file and have it left untouched.

[expert mode on]
You get a Unicode-encoded file because strings in .NET are stored internally as Unicode, and you are using string-based cmdlets plus redirection, which saves that representation directly to the file.
If you use Get-Content/Set-Content with the -Encoding parameter, you can control the encoding of your data directly, but you lose string-searching capabilities if you use the ‘Byte’ value.
[expert mode off]
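One way to keep string searching while still working on raw bytes is to call .NET methods from PowerShell. The following is only a sketch under stated assumptions: `Find-AsciiPattern` is a hypothetical helper name, and the sample bytes are made up for illustration. It decodes the bytes as ISO-8859-1 (code page 28591), which maps every byte value to exactly one character, so a character index in the decoded string equals a byte offset in the data.

```powershell
# Hypothetical helper: find the byte offset of an ASCII pattern in raw bytes.
function Find-AsciiPattern {
    param(
        [byte[]] $Data,
        [string] $Pattern
    )
    # ISO-8859-1 maps each byte 0..255 to one character, so no byte is lost
    # or merged and string indexes line up with byte offsets.
    $latin1 = [System.Text.Encoding]::GetEncoding(28591)
    $text   = $latin1.GetString($Data)
    # Ordinal comparison: match exact code units, no culture rules.
    return $text.IndexOf($Pattern, [System.StringComparison]::Ordinal)
}

# Made-up sample: two non-text bytes followed by "HDRE1814".
$sample = [byte[]](0x00, 0xFF, 0x48, 0x44, 0x52, 0x45, 0x31, 0x38, 0x31, 0x34)
$offset = Find-AsciiPattern -Data $sample -Pattern 'HDRE'
$offset   # 2
```

For a real file, the bytes could be loaded with `[System.IO.File]::ReadAllBytes((Resolve-Path .\CPO0008.VDI).Path)` and a slice written back with `[System.IO.File]::WriteAllBytes`, neither of which re-encodes the data. Resolving the path first matters because .NET's current directory is not necessarily the PowerShell location.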

Now you can read about Unicode and read the help for Set-Content, and finally make it work the way you like :slight_smile:

Redirection is creating a Unicode file. Try this:

"test"|Set-Content .\test.txt -Encoding Unicode
Get-HexDump .\test.txt
00000000  ff fe 74 00 65 00 73 00 74 00 0d 00 0a 00        ÿþt.e.s.t.....
"test"|Set-Content .\test.txt -Encoding Ascii
Get-HexDump .\test.txt
00000000  74 65 73 74 0d 0a                                test..

I grabbed the Get-HexDump function from poshcode if you need it.
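If you cannot find that function, a rough stand-in is easy to write. The sketch below is not the PoshCode version; `Get-HexDumpSketch` is a hypothetical name, and its text column shows only printable ASCII (everything else becomes a dot), so it will differ slightly from the output shown here. PowerShell 5.0 and later also ship a built-in Format-Hex cmdlet.

```powershell
# Hypothetical stand-in for a hex dump function (not the PoshCode original).
function Get-HexDumpSketch {
    param(
        [string] $Path,
        [int]    $Width = 16
    )
    # Resolve the path in PowerShell first; .NET may use a different current directory.
    $fullPath = (Resolve-Path -LiteralPath $Path).Path
    $bytes    = [System.IO.File]::ReadAllBytes($fullPath)

    for ($offset = 0; $offset -lt $bytes.Length; $offset += $Width) {
        $end   = [Math]::Min($offset + $Width, $bytes.Length) - 1
        $chunk = $bytes[$offset..$end]

        # Hex column: two lowercase hex digits per byte.
        $hex = ($chunk | ForEach-Object { $_.ToString('x2') }) -join ' '

        # Text column: printable ASCII as-is, anything else as '.'.
        $ascii = -join ($chunk | ForEach-Object {
            if ($_ -ge 32 -and $_ -le 126) { [char]$_ } else { '.' }
        })

        # Offset, padded hex column, then the text column.
        '{0:x8}  {1}  {2}' -f $offset, $hex.PadRight($Width * 3 - 1), $ascii
    }
}
```

Calling `Get-HexDumpSketch .\test.txt` prints one line per 16 bytes in the same offset / hex / text layout as the dumps in this post.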

To fix your most recent attempt:

Get-HexDump .\test.vdi
00000000  48 44 52 45 31 38 31 34                          HDRE1814
select-string ./test.vdi -pattern "HDRE" | select-object -expand Line|Set-Content .\test.txt -Encoding Ascii
Get-HexDump .\test.txt
00000000  48 44 52 45 31 38 31 34 0d 0a                    HDRE1814..

Awesome, Ron - thank you! That works and I learned :slight_smile: Much gratitude, Danielle