Hi PowerShell Community,
I’ve got a script that frequently creates hashtables from collections. I used to do this “by hand” until I realized that Group-Object already provides this functionality through its -AsHashTable parameter (-AsHash for short). I’ve replaced some of the “by hand” code with Group-Object calls, and the script is now noticeably slower.
My question is, am I using the cmdlet wrong and causing this performance hit? If I’m not, I’m also wondering how the call to Group-Object (a cmdlet presumably written in C#) could be slower than the “by hand” PowerShell code.
I’ve already done a bit of investigating myself by writing a script that creates hashtables using both methods (“by hand” and Group-Object) and timing each. I’ve found that Group-Object is only slightly slower when the number of keys in the hashtable is ~5000 or lower. However, once you get to something like 10,000 keys, the difference is staggering: Group-Object takes much longer.
The lowdown on the Gist script:
Just dot-source it and run Compare-HashCreation with the required params:
Compare-HashCreation -NumValues 50000 -NumKeys 1000
This will create a list of 50,000 tuples of the form (Num, “foobar”), where Num is a random number from 0 to 999. It then creates two hashtables, one with each method, using the Num property of each tuple as the hashtable key.
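
For anyone who doesn’t want to open the Gist, here is a rough sketch of the two approaches being compared. It is not the Gist code itself: the variable names, the use of [pscustomobject] for the tuples, and the exact shape of the “by hand” loop are assumptions made for illustration.

```powershell
# Illustrative sizes, matching the example call above.
$numValues = 50000
$numKeys   = 1000

# Sample data: objects with a random Num in 0..($numKeys - 1).
# Get-Random's -Maximum is exclusive.
$items = foreach ($i in 1..$numValues) {
    [pscustomobject]@{
        Num  = Get-Random -Minimum 0 -Maximum $numKeys
        Text = 'foobar'
    }
}

# Method 1, "by hand": one pass over the data, appending each item
# to the list stored under its key.
$timer  = [System.Diagnostics.Stopwatch]::StartNew()
$byHand = @{}
foreach ($item in $items) {
    if (-not $byHand.ContainsKey($item.Num)) {
        $byHand[$item.Num] = New-Object 'System.Collections.Generic.List[object]'
    }
    $byHand[$item.Num].Add($item)
}
$timer.Stop()
$byHandMs = $timer.Elapsed.TotalMilliseconds

# Method 2: let Group-Object build the hashtable.
$timer   = [System.Diagnostics.Stopwatch]::StartNew()
$byGroup = $items | Group-Object -Property Num -AsHashTable
$timer.Stop()
$byGroupMs = $timer.Elapsed.TotalMilliseconds

"By hand:      {0:N0} ms" -f $byHandMs
"Group-Object: {0:N0} ms" -f $byGroupMs
```

Both hashtables should end up with the same number of keys, so comparing $byHand.Count with $byGroup.Count is a quick sanity check before looking at the timings.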