Efficiency advice on remove files script

Hi,

Can someone provide me some advice on how to improve the efficiency of my remove-tempfiles script, specifically around the looping and removing of files?

https://gist.github.com/mortensummer/ab8116a201221bb87d2683ebaafe4d3b

Currently, I loop through all the folders and gather the list of files to remove, slap them in an array, and then I run through it again removing the files if $Commit is true. Would it be more efficient to run through the folders and, if $Commit is true, remove each file as it is found?

I’m not specifically looking for the exact answer or code snippet - more about ways to approach this so I can attempt it and learn from it.

Thanks in advance,

Tom

If you are talking about efficiencies in system resources, I can make a recommendation.
Line 80 - using the .Delete() method of the System.IO.FileInfo object is more efficient than using the Remove-Item cmdlet.
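
Roughly, the difference looks like this (just a sketch to show the idea, not the actual code from your gist - $Files here stands in for whatever your Get-ChildItem call returned):

# Cmdlet: each call goes through the provider layer
foreach ($file in $Files) {
    Remove-Item -Path $file.FullName
}

# .NET method: calls Delete() directly on the FileInfo objects
foreach ($file in $Files) {
    $file.Delete()
}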

I’m not noticing any of the big things that I generally look for. A couple of spots where you might be able to squeeze out a little extra speed, though…

Lines 47 and 62… Replace the Where-Object cmdlets with the Where() method. The method is quicker but you may not see much improvement depending on the number of files that you’re working with. I’ve got about 8k files in my Documents folder and it takes 2 seconds to run through them with Where-Object. Just shy of 1 second with the method.

The first filter would end up looking like this…

$files = (Get-ChildItem -Path $Path -Recurse -ErrorAction Stop).Where{
    ($_.Extension -in $Extensions) -or
    ($_.Name -in $TempFiles) -or
    ($_.Extension -match $RevitTempExtension)
}


As a general good practice, I’d also ditch those backticks. Every spot where you’re using them will handle a linebreak just fine, and they can sometimes cause more headache than they’re worth. Mark Kraus wrote a great article going into some detail there. It’s well worth a read.
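
For example, splatting handles long parameter lists, and hashtables and scriptblocks span lines naturally - no backticks needed. An illustrative sketch reusing the variable names from the filter above:

# Splat the Get-ChildItem parameters instead of continuing the line with backticks
$gciParams = @{
    Path        = $Path
    Recurse     = $true
    ErrorAction = 'Stop'
}
$files = (Get-ChildItem @gciParams).Where{
    $_.Extension -in $Extensions
}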

Thanks guys - these are both great comments and ideas - I’ll implement them and see how it improves.

To give it some context - the folder structure it runs through is nearly 2 TB, with thousands of project folders in the root. Each project has multiple structural engineering analysis files in it, and when the analysis application runs it generates loads of temp files in the project folder (I do wonder why the analysis package isn’t using local disk for its temp workings, though??)

Is it acceptable to run two loops to find and then remove the files? My gut says I should be able to do this in one, however this is one of the areas I’m uncertain on. If two loops is fine… then this is almost job done!
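
Something like this is roughly what I have in mind for a single pass (a rough sketch only, not my actual script - $FoundFiles is just a placeholder for the filtered Get-ChildItem results):

foreach ($file in $FoundFiles) {
    if ($Commit) {
        $file.Delete()
        Write-LogFile "Removed $($file.FullName)"
    }
    else {
        Write-LogFile "Found, but NOT removing $($file.FullName)"
    }
}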

It would be good if you are okay with the below code in the final else statement.

}else{
    Write-LogFile "Found, but NOT removing below files"
    $Files.FullName | Out-File -FilePath $LogFile -Append
}


This can reduce the file open/close activity that happens when the logging sits inside a loop.

Thanks @kvprasoon - although this outputs all the found files every time it finds one. I guess you mean it to be outside the loop?

But more interestingly, and to help my learning, can you explain what you mean about the file open/close activities? I have the same line in my Write-LogFile function.

Your line:

  $Files.FullName | Out-File -FilePath $LogFile -Append

My line from my function:

"$timestamp $logentry" | Out-File $LogFile -Append

In your final else statement, $Files will contain the info for all the files you are not going to remove; the same thing is currently being logged to a file inside the foreach.
When the Out-File call happens inside a loop, it writes the full name once per file.

Out-File opens the file, writes to it (new or append), and closes it on every invocation. Here that is happening for every file you have.
Avoiding that many open/close operations will improve performance when you are working with large numbers of files.
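
In other words (a sketch using the same names as above):

# Inside the loop: $LogFile is opened, written to, and closed once per file
foreach ($file in $Files) {
    $file.FullName | Out-File -FilePath $LogFile -Append
}

# Outside the loop: $LogFile is opened, written to, and closed once in total
$Files.FullName | Out-File -FilePath $LogFile -Append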

Great - yes - got it. I didn’t notice the lack of foreach in your code block.

All makes sense now - Thank you!