Memory exhaustion using the NTFSSecurity module on a deep folder traversal

I posted this question on a different forum, but I think I stumped everyone there, so I thought I'd post here to see if I can nudge this forward a bit.

I have been tasked with reporting all of the ACLs on each folder in our shared drive structure. On top of that, I need to look up the membership of each unique group that gets returned.

I'm using the NTFSSecurity module, in conjunction with its Get-ChildItem2 cmdlet, to get past the 260-character path length limit. The paths I am traversing are many hundreds of folders deep and are long past the 260-character limit.

I have been banging on this for a couple of weeks. My first challenge was crafting the script to do the whole task at once, but now I'm thinking that's my problem… The issue at hand is resources, specifically memory exhaustion. Once the script gets into one of the deep folders, it consumes all RAM and starts swapping to disk, and I eventually run out of disk space.

Here is the script:

$csvFile = 'C:\users\user1\Documents\acl cleanup\dept2_Dir_List.csv'

foreach ($record in Import-Csv $csvFile)
{
    # Collect every group-type ACE under this top-level folder
    $Groups  = Get-ChildItem2 -Directory -Path $record.FullName -Recurse | Get-NTFSAccess | Where-Object -Property AccountType -EQ -Value Group

    # Drop the built-in/system accounts, then reduce to the unique group accounts
    $Groups2 = $Groups | Where-Object -Property Account -NotMatch -Value '^builtin|^NT AUTHORITY\\|^Creator|^AD\\Domain'
    $Groups3 = $Groups2 | Select-Object Account -Unique

    # Resolve the membership of each unique group
    $GroupMembers = foreach ($Group in $Groups3) {
        Get-ADGroup $Group.Account.Sid |
            Get-ADGroupMember |
            Select-Object Name, @{N = 'GroupName'; E = { $Group.Account }}
    }

    # One ACL report and one membership report per top-level folder
    $Groups2 | Select-Object FullName, Account, AccessControlType, AccessRights, IsInherited |
        Export-Csv "C:\Users\user1\Documents\acl cleanup\Dept2\$($record.Name).csv"
    $GroupMembers | Export-Csv "C:\Users\user1\Documents\acl cleanup\Dept2\$($record.Name)_GroupMembers.csv"
}

NOTE: The dir list it reads in contains the top-level folders, created with Get-ChildItem2 -Directory | Export-Csv filename.csv.
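Something along these lines (the share root here is just a placeholder):

# Placeholder share root; the real path is our department share
Get-ChildItem2 -Directory -Path '\\fileserver\dept2' |
    Export-Csv 'C:\users\user1\Documents\acl cleanup\dept2_Dir_List.csv' -NoTypeInformation
# The resulting CSV includes Name and FullName columns, which the loop above relies on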

During the run, it appears not to be flushing memory properly; that's just a guess from observation. At the end of each pass through the loop, I thought the variables would be overwritten, but memory usage never goes down, which made me think it isn't being released properly. Like I said, a guess… I have been reading about runspaces, but I am confused about how to implement them with this script. Is that the right direction for this?

Thanks in advance for any assistance…!

RichardX

Well, keep in mind that .NET is in charge of things like memory management and garbage collection, and if you're keeping the CPU pegged, it won't take the time to do that.
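For what it's worth, you can ask for a collection yourself between records - something like the lines below - but the collector can only reclaim objects that nothing references anymore, so it won't help while your variables are still holding all of that data:

# Explicitly request a garbage collection between CSV records.
# This only reclaims objects with no remaining references, so it won't
# free anything that $Groups, $Groups2, or $GroupMembers still point to.
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()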

You’re also forcing it to cache a butt-tonne of stuff in RAM (variables). Like, rather than forcing it to collect ALL the directories up front, why not stream them into the pipeline and ForEach them? That way, you can look at one directory at a time, get the data you want, and perhaps dump it to a CSV or XML file - to get your group list, for example. Then go through THAT file one item at a time. The current approach is very memory-wasteful, in other words.

Look at this:

$Groups  = Get-ChildItem2 -Directory -Path $record.FullName -Recurse | Get-NTFSAccess | Where-Object -Property AccountType -EQ -Value Group
$Groups2 = $Groups | Where-Object -Property Account -NotMatch -Value '^builtin|^NT AUTHORITY\\|^Creator|^AD\\Domain'
$Groups3 = $Groups2 | Select-Object Account -Unique

$Groups continues to exist, even when you’re not using it anymore. $groups2 is all but a copy of that, and continues to exist past the point you’re done using it. Instead, as an example of an alternate approach:

Get-ChildItem2 -Directory -Path $record.FullName -Recurse |
    Get-NTFSAccess |
    Where-Object -Property AccountType -EQ -Value Group |
    Where-Object -Property Account -NotMatch -Value '^builtin|^NT AUTHORITY\\|^Creator|^AD\\Domain' |
    Export-Clixml something.xml

Now you’re caching your data on disk in an XML file, not in memory. You can Import-CliXML that information for the next step.

Import-Clixml something.xml |
    Select-Object Account -Unique |
    Export-Clixml unique.xml
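
And from there, the group-membership lookup can stream the same way, one group at a time, straight out to CSV. Just a sketch - I'm reusing your SID-based lookup, the output file name is a placeholder, and whether Account still exposes a usable Sid after deserialization is an assumption you'd want to verify:

Import-Clixml unique.xml |
    ForEach-Object {
        $account = $_.Account
        # Assumes the deserialized Account still carries a usable Sid,
        # as in the Get-ADGroup call in your original script
        Get-ADGroup -Identity $account.Sid |
            Get-ADGroupMember |
            Select-Object Name, @{N = 'GroupName'; E = { $account }}
    } |
    Export-Csv groupmembers.csv -NoTypeInformation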

I’m not sure what you mean by not flushing RAM properly; the way I’m reading this, you’ve not given it anything to flush.

That’s important NOT to do in the first step, because -Unique forces Select-Object to buffer all the results in RAM - again, wasteful - but doing it as its own step lets the RAM from the previous step be recovered first.

The trick in the pipeline is to avoid blocking commands like Sort-Object or Select-Object’s -Unique. When you avoid those, the pipeline can stream one object at a time and is fairly memory-efficient. I’d need to test whether Export-CliXML is blocking or not; it might be that another format like Export-Csv, or even something custom-made, would be more efficient. And avoid caching huge amounts of data in variables three times over. .NET is particularly inefficient at repeatedly adding stuff to variables, and assigning the results of a pipeline to a variable when that pipeline is producing yooge results is something I’d expect to be slow and memory-wasteful.
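If you do end up needing the unique list without a blocking step, one alternative - just a sketch, and the string conversion of Account is an assumption - is to remember what you’ve already seen in a HashSet and only pass new accounts along:

# Keep only the account names in memory; everything else streams through
$seen = New-Object 'System.Collections.Generic.HashSet[string]' ([System.StringComparer]::OrdinalIgnoreCase)

Import-Clixml something.xml |
    ForEach-Object {
        # Assumes Account stringifies to its DOMAIN\Name form
        if ($seen.Add([string]$_.Account)) { $_ }
    } |
    Export-Clixml unique.xml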

Thanks for the help, Don. I am still really inexperienced at all of this, and this made me realize how many of the basics I am still missing! This script, as inefficient as it is, is pretty much the peak of my powers, and the lack of fundamentals, like memory management, is obvious to me now. I had never heard of the CliXml cmdlets, for example.

I tried your suggestions, but during the phase where I'm filtering down to only the unique groups, the result is incorrect - it shows only one group when there should be 7. The export from the previous step contains everything correctly, so…

Import-Clixml 'C:\Users\user1\Documents\acl cleanup\Dept2\DirPathAcls.xml' | select Account -Unique | Export-Clixml 'C:\Users\user1\Documents\acl cleanup\Dept2\DirPathAclsUnique.xml'

I've decided to scrap this and start over - maybe taking a different approach will help my understanding of what I am doing, and your suggestions have me asking different questions about how to solve this problem. And lastly, what I meant by not releasing memory: I expected the variables to be overwritten on every pass, so I expected memory usage to vary from run to run. Not knowing exactly how memory works in this situation, I assumed it would be released, then re-allocated, then released again, so that usage would fluctuate instead of just climbing and climbing.