I have thousands of folders in a directory, and each time a set of files is processed, a new version of that folder is created. For example:
3-CC-TEST-v1
3-CC-TEST-v3
14061-TISB-v1
14061-TISB-v8
14061-TISB-v20
How do I look at everything before the first dash in the folder name and find duplicates? The number can be anything (no fixed length), so I can only rely on the text before the first dash. I just need to write the list to the screen and view it.
If the separator is always the dash, you can use a calculated property that splits the folder name on the dashes, and then use the first element of the resulting array for further processing.
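A minimal sketch of that approach, assuming the folders live under a hypothetical path `D:\VOD` (the path and the property name `Prefix` are placeholders, not from the original post). It adds a calculated property holding everything before the first dash, groups on it, and keeps only the groups with more than one member:

```powershell
# Hypothetical root path; replace it with the directory that holds the folders.
$root = 'D:\VOD'

# 'Prefix' is a calculated property: the folder name split on '-', first element only.
$prefix = @{ Name = 'Prefix'; Expression = { ($_.Name -split '-')[0] } }

Get-ChildItem -Path $root -Directory |
    Select-Object -Property $prefix, Name |
    Group-Object -Property Prefix |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group }
```

With the sample names from the question, both the `3` and the `14061` folders would be listed, since each prefix occurs more than once.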
BTW: When you post code, sample data, console output or error messages please format it as code using the preformatted text button ( </> ). Simply place your cursor on an empty line, click the button and paste your code.
Removing the filter returns both files and folders together, which is even more helpful.
I found more old, unnecessary duplicate files in a folder containing over 6000 items.
Just a little more background on the purpose if you are curious…
(All of the files and folders are generated by a system that creates Video On Demand files. The software has changed, so everything new is now contained within folders, but older content stays in the root as mp4 files until it is retranscoded. The retranscoding process is what adds the v and number to the end of the file or folder name.)
This may be complaining at a high level, and it depends on the number of files and folders you have to process and on the speed of your infrastructure, but if you need to run this task regularly, you can use a single query that collects the files AND the folders and process them all at once.
Usually, file system operations are the most expensive ones in terms of time, so minimizing them saves a lot of runtime.
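As a rough sketch of the single-query idea, again assuming a hypothetical root path `D:\VOD`: the file system is read once, the result is kept in a variable, and all further grouping and filtering happens in memory. Omitting `-Directory` and `-File` is what makes the query return files and folders together.

```powershell
# Hypothetical root path; replace it with your own.
$root = 'D:\VOD'

# Read the file system once; everything below works on the in-memory result.
$items = Get-ChildItem -Path $root

# Group files and folders alike by the text before the first dash.
$duplicates = $items |
    Group-Object -Property { ($_.Name -split '-')[0] } |
    Where-Object { $_.Count -gt 1 }

# Print each duplicated prefix followed by the items that share it.
foreach ($group in $duplicates) {
    '{0} ({1} items)' -f $group.Name, $group.Count
    foreach ($item in $group.Group) {
        '    {0}' -f $item.Name
    }
}
```

Because `$items` is reused, a later step (for example, separating files from folders via the `PSIsContainer` property) would not need another pass over the disk.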