PowerShell: Change / Save encoding. How to convert several .txt files from UTF-8 to UTF-8-BOM

Good day. I have several .txt files written in Russian, encoded in UTF-8. I need to convert all of these files to UTF-8-BOM. I found this line on the internet, but it handles only one file. How can I convert all my 1000 .txt files at once?

(Get-Content -path "c:\temp\test.txt") | Set-Content -Encoding UTF8 -Path "c:\temp\test.txt"

I tried these two snippets, but they do not work.

$MyPath = "C:\Folder1"
# This variable can be reused
$utf8 = New-Object System.Text.UTF8Encoding $false

$MyFile = Get-Content $MyPath -Raw
Set-Content -Value $utf8.GetBytes($MyFile) -Encoding Byte -Path $MyPath

OR

$MyPath = "C:\Folder1"
Get-ChildItem -Path $sourcedir -Filter *.txt | ForEach-Object {

# This variable can be reused
$utf8 = New-Object System.Text.UTF8Encoding $false

$MyFile = Get-Content $MyPath -Raw
Set-Content -Value $utf8.GetBytes($MyFile) -Encoding Byte -Path $MyPath
}

Which version of PowerShell are you using?

Note that with Set-Content in Windows PowerShell (5.1), -Encoding utf8 writes UTF-8 with a BOM, but in PowerShell Core (7.1), -Encoding utf8 writes UTF-8 with no BOM.

You may find that running Get-Content and Set-Content on the same file gives an error because the file is still in use, so write the new files to a different folder.

For processing multiple files, just use a foreach loop.

# Assumes Windows PowerShell, use -Encoding utf8BOM with PowerShell Core.

$files = Get-ChildItem E:\Temp\Source\ -Filter *.txt 

foreach ($file in $files) {

    Get-Content $file.FullName | Set-Content "E:\Temp\Destination\$($file.Name)" -Encoding utf8

}

Get-Help about_foreach

I changed only the path.

$files = Get-ChildItem c:\Folder1\ -Filter *.txt 

foreach ($file in $files) {

    Get-Content $file.FullName | Set-Content "c:\Folder1\$($file.Name)" -Encoding utf8BOM

}

I tested -Encoding with both utf8 and utf8BOM. I got this error:

Set-Content : Cannot bind parameter 'Encoding'. Cannot convert value "utf8BOM" to type
"Microsoft.PowerShell.Commands.FileSystemCmdletProviderEncoding". Error: "Unable to match the identifier name utf8BOM
to a valid enumerator name. Specify one of the following enumerator names and try again:
Unknown, String, Unicode, Byte, BigEndianUnicode, UTF8, UTF7, UTF32, Ascii, Default, Oem, BigEndianUTF32"
At line:3 char:92
+ ... e | Set-Content "c:\Folder1\$($file.Name)" -Encoding utf8BOM
+                                                                   ~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Set-Content], ParameterBindingException
    + FullyQualifiedErrorId : CannotConvertArgumentNoMessage,Microsoft.PowerShell.Commands.SetContentCommand

Please do not post images of code or error messages. Instead, post the text of the error messages and format it as code as well. You can go back and edit your already existing post. You don’t have to create a new one.

Thanks in advance.

I found this solution. It is a Python script. I tested it, and it works, but only in Notepad++.
Run the code from Menu → Plugins → Python Scripts. After you run it, a pop-up window appears where you must enter the path to your files, for example c:\Folder1\. Click OK. That’s all!

# -*- coding: utf-8 -*-
from __future__ import print_function

from Npp import notepad
import os

utf8_bom = b'\xEF\xBB\xBF'  # the three-byte UTF-8 byte order mark
top_level_dir = notepad.prompt('Paste path to top-level folder to process:', '', '')
if top_level_dir is not None and len(top_level_dir) > 0:
    if not os.path.isdir(top_level_dir):
        print('bad input for top-level folder')
    else:
        # Walks every file under the folder, not only .txt files
        for (root, dirs, files) in os.walk(top_level_dir):
            for file in files:
                full_path = os.path.join(root, file)
                print(full_path)
                with open(full_path, 'rb') as f:
                    data = f.read()
                if len(data) > 0:
                    # Compare all three BOM bytes, not just the first one
                    if data[:3] != utf8_bom:
                        try:
                            with open(full_path, 'wb') as f:
                                f.write(utf8_bom + data)
                            print('added BOM:', full_path)
                        except IOError:
                            print("can't change - probably read-only?:", full_path)
                    else:
                        print('already has BOM:', full_path)
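
The script above imports the Npp module, which is why it runs only inside Notepad++. As a rough sketch, the same idea could be written for plain Python 3 without Notepad++. The function name add_bom_to_txt_files is made up for illustration. Unlike the script above, it touches only .txt files, and it assumes the files are already valid UTF-8.

```python
import os

UTF8_BOM = b'\xef\xbb\xbf'  # three-byte UTF-8 byte order mark


def add_bom_to_txt_files(top_level_dir):
    """Prepend a UTF-8 BOM to every .txt file under top_level_dir that lacks one.

    Hypothetical helper; assumes the files are already UTF-8 encoded.
    Returns the list of paths that were changed.
    """
    changed = []
    for root, _dirs, files in os.walk(top_level_dir):
        for name in files:
            if not name.lower().endswith('.txt'):
                continue
            full_path = os.path.join(root, name)
            with open(full_path, 'rb') as f:
                data = f.read()
            # Skip empty files and files that already start with the BOM
            if not data or data.startswith(UTF8_BOM):
                continue
            with open(full_path, 'wb') as f:
                f.write(UTF8_BOM + data)
            changed.append(full_path)
    return changed
```

It would be called as add_bom_to_txt_files(r'c:\Folder1'). Because it rewrites the files in place, it is worth trying on a copy of the folder first.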

I edited my post and added the error text.


You didn’t just change the path, you also changed the encoding from utf8 to utf8BOM.

You are using Windows PowerShell 5.1 or lower, which does not have the utf8BOM option, because in 5.1 and lower utf8 encodes with a BOM by default.

utf8BOM is only available in PowerShell Core. It is required because in later versions utf8 encodes without BOM.
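
To check which variant a given PowerShell version actually wrote, one option is to look at the first three bytes of an output file. Here is a small Python 3 sketch of that check. The helper name has_utf8_bom is hypothetical, and the sample string is only an illustration.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at path starts with the UTF-8 BOM (EF BB BF)."""
    with open(path, 'rb') as f:
        return f.read(3) == codecs.BOM_UTF8


# Python's 'utf-8-sig' codec writes the BOM and plain 'utf-8' does not,
# which mirrors utf8BOM versus utf8 in PowerShell Core.
with_bom = 'Привет'.encode('utf-8-sig')
without_bom = 'Привет'.encode('utf-8')
print(with_bom[:3] == codecs.BOM_UTF8)     # True
print(without_bom[:3] == codecs.BOM_UTF8)  # False
```

Calling has_utf8_bom on one of the converted files should return True if the BOM was written.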
