File Copy - Check Hash *and* File Date? -Newbie

I’m new to PowerShell and am working on a script to copy files from one directory (and its subdirectories) to another. I found an example script and was wondering about the reason for checking the hash in addition to the file date?

In the script, the destination is checked for the existence of the same-named file. If that file exists, the hash is checked, and if it’s different the file dates are checked and THEN the file is copied over.

Thinking out loud, if the file dates are the same but the hash is different, I really wouldn’t know how to resolve which file to keep without opening both and checking them out. Otherwise, I’d just copy the latest file to the destination directory.

Can you all think of an instance where the file name and date would be the same but somehow the hash is different? I’m thinking that would be really weird, and rare, and as such I’m thinking of dumping that part of the code just to save some processing time.

Here’s the example script for reference: xcopy - Powershell Copy-Item but only copy changed files - Stack Overflow

Thank you for your help!

-dad

DADMODE1000,
Welcome to the forum. :wave:t4:

Without following the link at first … my general answer would probably be … well … it depends. :wink:

Usually, when it comes to copying a fair number of files, or even just one big file, the number-one command-line tool on Windows is robocopy. For the vast majority of cases it will provide you with the right options to do the job.

Only when comparing the metadata of two given files is not enough would you check the hash of both. A hash comparison will only tell you whether the files are exactly bit equal. It will not tell you which file is newer. And if the files are bit equal, it does not matter whether one of them is newer - they are the same.
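As a small sketch of what such a hash check looks like (the paths are just placeholders for this example):

```powershell
# Compare two files by hash. This only tells you whether they are
# bit-identical - it says nothing about which one is newer.
$hashA = Get-FileHash -Path 'C:\source\report.pdf' -Algorithm SHA256
$hashB = Get-FileHash -Path 'D:\backup\report.pdf' -Algorithm SHA256

if ($hashA.Hash -eq $hashB.Hash) {
    'Files are identical - no copy needed.'
}
else {
    # Only the metadata can tell you which one is newer.
    $newer = Get-Item 'C:\source\report.pdf', 'D:\backup\report.pdf' |
        Sort-Object -Property LastWriteTime -Descending |
        Select-Object -First 1
    "Files differ - newest is $($newer.FullName)"
}
```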

Thanks for the welcome! :smiley:

I’ve seen robocopy suggested, and I looked at the options, but I couldn’t quite determine whether what I want could be done:

What I’m trying to do with the script (because people like to make lots of empty subdirectories) is to recursively copy all files out of a directory/subdirectories and put them into just one other directory. While it might sound horrible to do that, it will help me not have to search through a bunch of empty directories to find what I might be looking for. Filenames are not always consistent and there’s an abundance of PDF files, so, it’s better to just copy them all out into one place and I can find what I need much more easily.

With robocopy, it seemed that it would always create the directory tree in the destination.

So, if the files aren’t bit equal, one of them is newer. It seems, therefore, based on your response, that the hash check really isn’t needed. If there are two files with the same name, just check the dates. Thank you for your help!

Well … that’s quite an uncommon requirement, I think. And if it has to be this way, robocopy is not an option. But …

… yes. But when you use the option “/s” instead of “/e”, robocopy will skip empty folders. :wink:
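For illustration, with placeholder paths, the call would look like this:

```powershell
# /s copies subdirectories but skips empty ones;
# /e would copy subdirectories including empty ones.
robocopy 'C:\projects' 'D:\archive' /s
```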

Yes.

Right.

The comparison with hash values is also used to determine if two files are the same even if they have different names or meta data. So you could use this information for deduplication. :wink:
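A rough sketch of that deduplication idea (the source path is a placeholder): group all files by their hash, and any group with more than one member contains bit-identical duplicates, regardless of name or date.

```powershell
# Find bit-identical duplicates by grouping files on their hash.
Get-ChildItem -Path 'D:\sample\collection' -File -Recurse |
    Get-FileHash -Algorithm SHA256 |
    Group-Object -Property Hash |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object {
        'Duplicates:'
        $_.Group.Path
    }
```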

If only I wanted that option! I really don’t want the directory tree. I have to click and click and click to look in a folder for something I may not even want. It’s a lot easier to have everything in one place. But I really do appreciate your help and responses. Thank you! I’d go into more detail and elaborate, but I wouldn’t want to bore you to death and such. :slight_smile: Thanks!

Don’t worry. I am here by choice. :wink:

When I’m looking for something particular and I don’t know where it is I use the search. Usually that’s way faster than I am. :wink:

OK, if that’s your requirement. :wink: But it does not actually need that big a script:

$Source = 'C:\sample\subfolder'
$Target = 'D:\sample\collection'
Get-ChildItem -Path $Source -Filter * -File -Recurse -Force |
    ForEach-Object{
        Copy-Item -Path $_.FullName -Destination $Target
    }

Now you just have to have a plan what to do if you have two files with the same name. :wink:
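If you’d rather not silently overwrite, one possible plan is to append a counter to colliding names. A rough sketch, using the same placeholder paths:

```powershell
$Source = 'C:\sample\subfolder'
$Target = 'D:\sample\collection'

Get-ChildItem -Path $Source -File -Recurse -Force |
    ForEach-Object {
        $destination = Join-Path -Path $Target -ChildPath $_.Name
        $counter = 1
        # If a file with that name already exists in the target,
        # append a counter instead of overwriting it.
        while (Test-Path -Path $destination) {
            $newName     = '{0}_{1}{2}' -f $_.BaseName, $counter, $_.Extension
            $destination = Join-Path -Path $Target -ChildPath $newName
            $counter++
        }
        Copy-Item -Path $_.FullName -Destination $destination
    }
```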

Well, right now what I have written up does the following:

Inputs - $source and $destination directories

From the $source, it grabs the names of the ‘root’ directories (usually about 7 or 8) and duplicates those under the destination directory. That way some of the files (these are engineering project folders/directories) stay somewhat sorted. I also make a ‘media’ folder and a ‘simulations’ folder under each of those, so that images and videos are sorted automatically, and so are any simulation files, since I’m going to do that anyway.

So then the script starts looping through each one of the subdirs and copies all the files out from those, and puts the files into the corresponding destinations.

So:
\project directory\folder 1\bunch of subfolders etc

gets copied into

\destination folder\folder 1\ -all the files from ‘folder 1’
\destination folder\folder 1\media\ -any photos, videos, etc.
\destination folder\folder 1\simulations\ -any model/simulation files that were in the subdirs

There’s an $exclude array that prevents copying things like email messages, miscellaneous log files, some drawing files and other stuff I just won’t need/use.

If the file extension corresponds to what I put in the array for $media or $simfiles, then it gets sorted into the corresponding folder. I learned about the ‘switch’ statement today, so in retrospect that might have shortened the code a bit and made it look cleaner. I still haven’t quite gotten the hang of using a switch in a function, though. I haven’t found any documentation for putting the word [switch] in the ‘param’ block of a function, so I just put a value into a variable and pass it into the function. ‘Switch’ seems like VBA’s ‘Select Case’ to me.
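To show what I mean with the extension sorting, here’s a stripped-down sketch; the function name and the extension lists are just made up for the example:

```powershell
# Route a file to a subfolder name based on its extension,
# using the switch statement (not the [switch] parameter type).
function Get-TargetSubfolder {
    param ([string]$Extension)

    switch ($Extension.ToLower()) {
        { $_ -in '.jpg', '.png', '.mp4' } { return 'media' }
        { $_ -in '.slx', '.sim' }         { return 'simulations' }
        default                           { return '' }
    }
}

Get-TargetSubfolder -Extension '.JPG'   # -> media
```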

That’s basically it. :slight_smile: At the moment the modified script (from that link) runs quite well. What I’m running into that’s weird is that when I right-click on a project folder, the number of files it lists is different from the number of files that PowerShell recursively counts. There aren’t any hidden files to my knowledge, so I’m not sure where the difference comes from. But I went through one of the folders, counted manually, and what PowerShell tabulates is correct. I don’t know what the extra files are. I did not count folders, heh, just files.

I’m not a very experienced or good programmer but I’ve messed with VBA and VBS quite a bit over the years. Powershell has been really easy to pick up, as opposed to VBS which was quite cryptic to me.

Anywho… you said you were here by choice so… you got the boring brain dump. :slight_smile:

Sorry for the late answer …

That’s actually a very complex and therefore potentially error-prone approach. It can work as expected if there are only file types you know in advance and have planned to treat.

There is a distinct difference between the switch statement and the switch parameter of a function or a script.
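A minimal sketch of the [switch] parameter side of that distinction; the function name and parameter are made up for the example:

```powershell
# A [switch] parameter is a boolean flag on a function - unrelated
# to the switch statement used for branching.
function Copy-Report {
    param (
        [string]$Path,
        [switch]$Force   # present = $true, absent = $false
    )

    if ($Force) {
        "Would copy $Path, overwriting an existing target."
    }
    else {
        "Would copy $Path only if the target does not exist."
    }
}

Copy-Report -Path 'C:\report.pdf' -Force
```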

We all started once. But PowerShell makes it easy to catch up and to be very productive with a very small amount of code.

That’s what I meant with my first comment in this reply. You may change your approach slightly to make sure you get everything from the source folder.

You could copy the original folder completely to a temporary folder and then move (instead of copy) the files you need to the destination. After you have treated all the file types you know in advance, there may be files left over in the temporary folder. You could then copy that rest to a designated folder in your destination folder structure.
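As a sketch of that staging idea, with all paths and the PDF file type as placeholders:

```powershell
$Temp = 'C:\temp\staging'
$Rest = 'D:\destination\unsorted'

# 1. Copy the whole project tree into a staging folder.
Copy-Item -Path 'C:\project\*' -Destination $Temp -Recurse -Force

# 2. Move the file types you know about to their destinations ...
Get-ChildItem -Path $Temp -Filter '*.pdf' -File -Recurse |
    Move-Item -Destination 'D:\destination\documents'

# 3. ... and whatever is still left in staging goes to a catch-all folder.
Get-ChildItem -Path $Temp -File -Recurse |
    Move-Item -Destination $Rest
```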

That’s a really interesting idea! I may give that a go. And thanks for the links on the ‘switch’ parameter. :slight_smile:

Following up after pondering for a while: what’s weird is that when I view the directories in Explorer, the number of files agrees with what PowerShell reports. But ‘right-click / Properties’ on a folder sometimes returns a different number of files. There aren’t any hidden files that I can see. Are there hidden files on my computer that won’t show even though I’ve enabled ‘Show Hidden Files’?

By default, Get-ChildItem does not list hidden or system files. To enumerate those, you’d need to provide the parameter -Force. So the count you get in PowerShell might be incomplete without it.
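You can see the difference directly by counting both ways (the path is a placeholder):

```powershell
# Count files with and without hidden/system files. A difference
# between the two numbers means such files exist in the tree.
$visible = (Get-ChildItem -Path 'C:\projects' -File -Recurse).Count
$all     = (Get-ChildItem -Path 'C:\projects' -File -Recurse -Force).Count

"Visible: $visible / Including hidden and system files: $all"
```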
