Combining data - how to also add data that is in set2 and not set1?

Hello!

I am brand new to PowerShell (started yesterday). I’m trying to combine two sets of data into a single array using a common variable (name in example below). The issue I am running into is that values may appear on one set and not the other, but I still need them to show up.

I’ve been able to get it so that values from $CSVData01 that aren’t on $CSVData02 appear, but I am struggling with also getting values on $CSVData02 that aren’t on $CSVData01 to appear. Here’s what I have with dummy data:

[pre]$CSVData01 = @’
“name”,“start_date”,“end_date”
Ashleigh,1/5/2019,4/5/2019
Hunter,5/8/2018,8/8/2018
Steve,6/25/2019,9/25/2019
Jason,11/9/2015,2/9/2016
Heather,3/22/2018,6/22/2018
Sarah,7/16/2019,10/16/2019
Jane,1/1/2017,4/1/2017
'@ | Convertfrom-Csv

$CSVData02 = @’
“name”,“rehire”,“bonus”
Ashleigh,5/8/2019,500.00
Hunter,8/25/2019,250.00
Steve,10/22/2019,100.00
Heather,7/13/2019,600.00
Sarah,10/5/2019,50.00
Alice,5/5/2019/6000.00
Andy,1/8/2015,2.00
'@ | Convertfrom-Csv

$outarray = @(
foreach ($DataSet01 in $CSVData01) {
$DataSet02 = $CSVData02 | Where-Object {$DataSet01.name -eq $_.name}
[PSCustomObject]@{
name = $DataSet01.name
start_date = $DataSet01.start_date
end_date = $DataSet01.end_date
rehire = $DataSet02.rehire
bonus = $DataSet02.bonus
}
}
)
$outarray | Write-Output[/pre]

Here is the output I’m getting

name : Ashleigh
start_date : 1/5/2019
end_date : 4/5/2019
rehire : 5/8/2019
bonus : 500.00

name : Hunter
start_date : 5/8/2018
end_date : 8/8/2018
rehire : 8/25/2019
bonus : 250.00

name : Steve
start_date : 6/25/2019
end_date : 9/25/2019
rehire : 10/22/2019
bonus : 100.00

name : Jason
start_date : 11/9/2015
end_date : 2/9/2016
rehire :
bonus :

name : Heather
start_date : 3/22/2018
end_date : 6/22/2018
rehire : 7/13/2019
bonus : 600.00

name : Sarah
start_date : 7/16/2019
end_date : 10/16/2019
rehire : 10/5/2019
bonus : 50.00

name : Jane
start_date : 1/1/2017
end_date : 4/1/2017
rehire :
bonus :

How do I get Alice and Andy also to appear on this output?

Thanks so much!

You have a typo in your $CSVData02. :wink:

$CSVData01 = @'
"name","start_date","end_date"
Ashleigh,1/5/2019,4/5/2019
Hunter,5/8/2018,8/8/2018
Steve,6/25/2019,9/25/2019
Jason,11/9/2015,2/9/2016
Heather,3/22/2018,6/22/2018
Sarah,7/16/2019,10/16/2019
Jane,1/1/2017,4/1/2017
'@ | Convertfrom-Csv

$CSVData02 = @'
"name","rehire","bonus"
Ashleigh,5/8/2019,500.00
Hunter,8/25/2019,250.00
Steve,10/22/2019,100.00
Heather,7/13/2019,600.00
Sarah,10/5/2019,50.00
Alice,5/5/2019,6000.00
Andy,1/8/2015,2.00
'@ | Convertfrom-Csv

$IntermediateResult = Compare-Object -ReferenceObject $CSVData01 -DifferenceObject $CSVData02 -Property name -IncludeEqual -PassThru |
Select-Object -Property Name, start_date, end_date

$outarray = @(
    foreach ($DataSet01 in $IntermediateResult) {
        $DataSet02 = $CSVData02 | Where-Object { $DataSet01.name -eq $_.name }
        [PSCustomObject]@{
            name       = $DataSet01.name
            start_date = $DataSet01.start_date
            end_date   = $DataSet01.end_date
            rehire     = $DataSet02.rehire
            bonus      = $DataSet02.bonus
        }
    }
)
$outarray | 
    Format-Table -AutoSize

Nice job thus far for a beginner! There isn’t any default cmdlets for joining objects, so it is a manual process unless you use a function like Join-Object. Basically, you are doing a FULL JOIN from a SQL standpoint. You’ve gotten the matches, so you need to identify the records that didn’t match and create a new object. The $results.Name is an implicit loop, so it’s creating an array of the names and the -notcontains searches that array for the names not contained in the $CSVData02 object. I’m only mentioning what it’s doing because it a shortcut to generate an array. Here is a an example (there was a small issue with CSV 2 for Alice having an extra / vs a , which I fixed in the code too):

$CSVData01 = @'
"name","start_date","end_date"
Ashleigh,1/5/2019,4/5/2019
Hunter,5/8/2018,8/8/2018
Steve,6/25/2019,9/25/2019
Jason,11/9/2015,2/9/2016
Heather,3/22/2018,6/22/2018
Sarah,7/16/2019,10/16/2019
Jane,1/1/2017,4/1/2017
'@ | Convertfrom-Csv

$CSVData02 = @'
"name","rehire","bonus"
Ashleigh,5/8/2019,500.00
Hunter,8/25/2019,250.00
Steve,10/22/2019,100.00
Heather,7/13/2019,600.00
Sarah,10/5/2019,50.00
Alice,5/5/2019,6000.00
Andy,1/8/2015,2.00
'@ | Convertfrom-Csv


$results = foreach ($DataSet01 in $CSVData01) {
    $DataSet02 = $CSVData02 | Where-Object {$DataSet01.name -eq $_.name}

    [PSCustomObject]@{
        name       = $DataSet01.name
        start_date = $DataSet01.start_date
        end_date   = $DataSet01.end_date
        rehire     = $DataSet02.rehire
        bonus      = $DataSet02.bonus
    }
}

#Get the other object and Select-Object generates a new PSObject that matches the schema returned in $results
$noMatch = $CSVData02 | 
           Where{$results.Name -notcontains $_.Name} | 
           Select name, start_date, end_date, rehire, bonus

#Create an empty array and then add both schemas
$joined = @()
$joined += $results
$joined += $noMatch

'Matched: {0}' -f $results.Count
'Not Matched: {0}' -f $noMatch.Count
'Joined: {0}' -f $joined.Count

$joined | Format-Table

Thank y’all so much! You really are community heroes!

I hope it’s alright to expand on this thread a little. I’m trying to accomplish the same thing, only at a much larger scale.

I have CSVs that I’m importing with close to 50k rows (Or in this case, unique users).

Importing the CSVs only takes about a second, but after that it gets stuck.

Normaly you would create your own new thread for your question and reference this thread with a link if needed.

It’s gonna be hard to give a reasonable advice if you do not show your code.