Modifying ACL in AD, VERY random error (trust relationship failed)

I have made a script that has to loop through thousands of AD user home directories one by one, basically doing the following steps for each one:

  • Take ownership of the folder
  • Add an access rule for Domain Admin group
  • Return the ownership of the folder
  • Loop through all child folders and files, enabling inheritance and removing all explicit permissions

After extensive testing and troubleshooting, the script works perfectly except for one problem that has left me banging my head against a wall.

The script successfully processes somewhere between 50 and 150 folders (the count varies from run to run) and then fails with the following error:

“the trust relationship between the primary domain and the trusted domain failed”

I built an additional loop that retries up to 30 times (once every 30 seconds) when this error occurs. However, this does not help, as the trust relationship remains lost for as long as the script keeps running.
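For reference, a retry loop like the one described could be sketched as below. `Invoke-WithRetry` and its parameter names are hypothetical and not taken from the original script. Since retrying inside the same run did not clear the error here, this only illustrates the pattern, not a fix.

```powershell
# Hypothetical helper: run a script block, retrying on any terminating error.
function Invoke-WithRetry {
    param(
        [Parameter(Mandatory)] [scriptblock] $Action,
        [int] $MaxAttempts = 30,   # matches the 30 retries mentioned above
        [int] $DelaySeconds = 30   # matches the 30-second pause mentioned above
    )

    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            # Success: hand back whatever the action produced
            return & $Action
        }
        catch {
            # Out of attempts: rethrow the last error to the caller
            if ($attempt -eq $MaxAttempts) { throw }
            Write-Warning "Attempt $attempt failed: $($_.Exception.Message)"
            Start-Sleep -Seconds $DelaySeconds
        }
    }
}
```

It would be called as `Invoke-WithRetry -Action { Get-Acl -LiteralPath $folder.FullName -ErrorAction Stop }`.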

The most interesting part is that once I run the script again (starting from the problematic folder), that folder is processed without error. The script never gets stuck on the same folder twice, but the same error then reappears, say, 50 folders later.

This is a HUGE inconvenience, as I need to process at least 15,000 user folders and have to compile a new list of “folders left to process” every time one fails.
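One way to avoid rebuilding that list by hand is to log each folder as it completes and skip the logged ones on the next run. A minimal sketch, assuming a plain-text log with one completed path per line; `Get-UnprocessedPath` and the log file are hypothetical names, not part of the original script:

```powershell
# Hypothetical helper: return only the paths not yet recorded in the "done" log.
function Get-UnprocessedPath {
    param(
        [string[]] $AllPaths,
        [Parameter(Mandatory)] [string] $DoneLogPath
    )

    # Case-insensitive set, since Windows paths are not case-sensitive
    $done = [System.Collections.Generic.HashSet[string]]::new(
        [System.StringComparer]::OrdinalIgnoreCase)

    if (Test-Path -LiteralPath $DoneLogPath) {
        foreach ($line in Get-Content -LiteralPath $DoneLogPath) {
            [void]$done.Add($line)
        }
    }

    # Emit every path that has no entry in the log
    $AllPaths | Where-Object { -not $done.Contains($_) }
}
```

In the main loop, something like `Add-Content -LiteralPath $doneLog -Value $folder.FullName` after the last step for a folder would keep the log current, so a rerun picks up where the previous one stopped.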

Here is the basic code functionality, where I’ve taken out all the unnecessary error handling and retry-looping etc. for better readability:

$userFolder = "\\domain.local\path\to\folders"
$homeFoldersFound = (Get-ChildItem -LiteralPath $userFolder -Directory -Force)

foreach ($folder in $homeFoldersFound) {
    $accessControl = Get-Acl -LiteralPath $folder.FullName -ErrorAction Stop

    #Current owner
    $folderOwner = $accessControl.Owner

    #Take ownership for the user running the script
    $accessControl.SetOwner([System.Security.Principal.NTAccount]$currentUser)

    #Access rule to add
    $accessRule = New-Object System.Security.AccessControl.FileSystemAccessRule($groupToAdd,"FullControl","ContainerInherit,ObjectInherit", "None", "Allow")
    $accessControl.AddAccessRule($accessRule)

    #Disable inheritance and drop inherited rules (explicit rules are kept)
    $accessControl.SetAccessRuleProtection($true, $false)

    #Apply ownership and access rules
    set-acl -AclObject $accessControl -LiteralPath $folder.FullName -ErrorAction Stop | Out-Null


    #Return the previous ownership and apply
    $accessControl.SetOwner([System.Security.Principal.NTAccount]$folderOwner)
    $accessControl.SetAccessRuleProtection($false, $false)
    set-acl -AclObject $accessControl -LiteralPath $folder.FullName -ErrorAction Stop | Out-Null


    #Loop through child items, enable inheritance & remove explicit permissions
    foreach ($item in (Get-ChildItem -LiteralPath $folder.FullName -Recurse -ErrorAction Stop)) {
        #More code
    }
}

It’s a DFS file system that is accessible through many different paths, but apparently not locally (only as a share). I’ve also tried changing the path from “\\domain.local\path\to\folders” to specific servers, such as “\\domainDC2\path\to\folders”, but the same problem keeps occurring.

Again, there shouldn’t really be anything wrong with the code, as the error happens so randomly and passes when running the script again. Any ideas on what might cause this / how to work around it?

All help is appreciated!

Just to add: every time the error occurs, the script has already successfully set me as the new owner of the directory. I therefore have to change the owner back manually, because on the next run the script would treat me as the original owner.
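A possible way around the manual fix is to write the original owner to a file before taking ownership and to prefer that record on later runs. This is only a sketch: `Get-OriginalOwner`, `Save-OriginalOwner` and the CSV log are hypothetical names, and rereading the CSV for every folder is slow but keeps the example short.

```powershell
# Hypothetical helper: look up the owner recorded for a path, if any.
function Get-OriginalOwner {
    param([string] $Path, [string] $OwnerLogPath)

    if (-not (Test-Path -LiteralPath $OwnerLogPath)) { return $null }

    $entry = Import-Csv -LiteralPath $OwnerLogPath |
        Where-Object { $_.Path -eq $Path } |
        Select-Object -First 1

    if ($entry) { $entry.Owner }
}

# Hypothetical helper: record the owner once, before ownership is changed.
function Save-OriginalOwner {
    param([string] $Path, [string] $Owner, [string] $OwnerLogPath)

    # Keep only the first owner seen, so a rerun after a failure
    # cannot overwrite it with the account running the script
    if (-not (Get-OriginalOwner -Path $Path -OwnerLogPath $OwnerLogPath)) {
        [pscustomobject]@{ Path = $Path; Owner = $Owner } |
            Export-Csv -LiteralPath $OwnerLogPath -Append -NoTypeInformation
    }
}
```

In the loop, `Save-OriginalOwner` would be called right after reading `$accessControl.Owner`, and `Get-OriginalOwner` would supply the value used when handing ownership back.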

This error message usually occurs when a machine has lost its trust relationship with AD. This typically happens either because the computer object has been removed from AD or because the machine has been off for a long period of time and has not ‘checked in’. If you are running the script against lots of domain-joined computers, it is possible that some have lost their trust with the domain, meaning the credentials you are using to connect to the share cannot be validated from the machine that has lost its trust.
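To check whether a given machine still has a working trust with the domain, Windows PowerShell has the `Test-ComputerSecureChannel` cmdlet. A small sketch, where the wrapper name `Test-MachineTrust` is hypothetical and the cmdlet is assumed to be missing on some PowerShell editions:

```powershell
# Hypothetical wrapper: $true/$false for the local machine's domain trust,
# or $null when the cmdlet is not available in this PowerShell edition.
function Test-MachineTrust {
    $cmd = Get-Command -Name Test-ComputerSecureChannel -ErrorAction SilentlyContinue
    if (-not $cmd) { return $null }

    try {
        [bool](Test-ComputerSecureChannel -ErrorAction Stop)
    }
    catch {
        # Treat any failure to verify (e.g. machine not domain-joined) as "no trust"
        $false
    }
}
```

Running it on the machine that executes the script, and on the file servers behind the share, may help narrow down which side loses the trust.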

I hope that helps, sorry I’m not the best at explaining.