searching one list for values from another list.

first, always good to find your own forum posts when googling for how to do stuff

that was the second part of a bigger question i was answering today. basically i’m pulling an array of $sources with a ‘path’ column whose values look like ‘share/folder/subfolder/’ (yes, with those slash directions) from one location, and and array of $allstats from a mysql query, which has a “sourcefilename” column with a similar path as $sources. i just want a count of how many times each $source.path appears in $allstats. i just have to do a little manipulation with the slash directions.

i came up with three ways to do this query. version 1 was even before the previous paragraph, i was looping through $sources and re-querying the mysql database repeatedly for each $source.path. pretty slow. then i figured it’d be faster to pull the whole mysql query into powershell once, and then loop through that PS array looking for each $source. so that array is $allstats.

version 2, i foreach through each $source, and nest another foreach going through each line in $allstats to see if it matches, then increment a counter if they do.

version 3, i do ($allstats | where-object -blah blah).count.

(hoping this gist link works)


v3 looks like a cleaner way to do the same thing as v2, but but v3 takes about 30% longer to run, according to measure-command. which made me wonder why, and made me wonder if there’s a way to do it that’s 30% faster than v2.


how about

# use below instead of the internal foreach in V1
$allstat.sourcefilename.Where({$_ -like "*$sourcepath*"}).count

@kvprasoon,i may be confused. the code in the OP was v2 and v3, there was no V1… and i’m not sure where to drop your code into my loops. i tried it one way, and it ended up running longer than either of the previous two versions, and i stopped it. the resulting $array only had two lines in it at that point.

Hey John,

I believe Kvprasoon was meaning for V2 not V1. And he is referring to this below with my interpretation.

Measure-Command {
    foreach ($source in $sources)
        $array += [pscustomobject]@{
        count = $allstat.sourcefilename.Where({$_ -like "*$sourcepath*"}).count
        path = $source.path}

so to clear something up… $allstatS is the array, $allstat is one row in a foreach.

so since your suggestions are not foreaching through allstats, i think you guys are meaning:
<p style=“padding-left: 40px;”>$allstatS.sourcefilename.Where({$_ -like “$sourcepath”}).count</p>
v2 takes about 50 seconds, v3 takes about 75 seconds. if i let either of the two below run for three minutes, $array is only about 25% complete.

one thing i found that sped up v2 even more was reducing the size of $allstats after every source with
<p style=“padding-left: 40px;”>$allstats = $allstats -notlike “$sourcepath”</p>
so the next loop didn’t have to parse through all the $allstats rows that i’d already counted. that knocked it down to 30 seconds.


so at the core of this is how we’re parsing $allstats. so i just took out those commands and measured them for one $sourcepath.

the v2 foreach($allstat in $allstats){blah blah…count++} loop takes 3 seconds

the v3 ($allstats | ? sourcefilename -like “$sourcepath”).count takes 6 seconds

($allstats.sourcefilename.Where({$_ -like “$sourcepath”}).count takes 46 seconds.