comparing 2 csv with different headers by 1 column value and displaying unique

Hello.

I am new to PowerShell and I need help.

I have 2 CSV files:

1.csv:

“Id”;“HrId”;“CreationTime”
“97574bd2-e887-45e7-9280-3ece8587ae57”;“REQ-20201230-03”;“2020-12-30 19:44:42.040383+07”
“f6ba914d-587e-442d-a230-faab8e899308”;“REQ-20201230-02”;“2020-12-30 17:56:40.438461+07”
“449df790-a638-4ea4-bc9c-e92af4f6cedc”;“REQ-20201230-01”;“2020-12-30 14:51:12.119974+07”
“63212824-db25-4840-ac74-48b5e0ab3c4d”;“REQ-20201228-11”;“2020-12-28 18:56:24.713882+07”
“86cbb50c-c1d5-467d-af6c-57ba2f8eb1c0”;“REQ-20201228-10”;“2020-12-28 18:49:43.722974+07”

 

2.csv:

“source_id”;“target_id”
“ed3fd4bc-5f70-4442-867e-0fac010f74fb”;“f6ba914d-587e-442d-a230-faab8e899308”
“b7eeaea3-8183-4fa3-bc5c-a6d6ec0c1655”;“97574bd2-e887-45e7-9280-3ece8587ae57”
“7caa84ea-6ad3-4842-a7c2-e576abcc929a”;“449df790-a638-4ea4-bc9c-e92af4f6cedc”

 

Task:

  • take Id from the Id column in 1.csv
  • if the Id from 1.csv is not found in the target_id column in 2.csv, then the line with this Id from 1.csv is considered unique and must be exported to a new CSV (output.csv).

 

output.csv:

“Id”;“HrId”;“CreationTime”

“63212824-db25-4840-ac74-48b5e0ab3c4d”;“REQ-20201228-11”;“2020-12-28 18:56:24.713882+07”
“86cbb50c-c1d5-467d-af6c-57ba2f8eb1c0”;“REQ-20201228-10”;“2020-12-28 18:49:43.722974+07”

 

I tried different options unsuccessfully, the best one shown below can return matches:

do{ 
 $csv1 = Read-Host Please input 1st CSV file path (example: c:\temp\1.csv)
 $exten1 = [IO.Path]::GetExtension('$csv1') -ne 'csv'
 if ($exten1 -eq $false){
  write-host "Wrong CSV" -ForegroundColor Red
 }
} while ($exten1 -eq $false)

do{
 $csv2 = Read-Host Please input 2nd CSV file path  (example: c:\temp\2.csv)
 $exten2 = [IO.Path]::GetExtension('$csv2') -ne 'csv'
 if ($exten2 -eq $false){
  write-host "Wrong CSV" -ForegroundColor Red
 }
} while ($exten2 -eq $false)

$csvfilepath = $csv1.TrimEnd(".csv") + "_dif.csv"

$csv_data1 = Import-CSV $csv1 -UseCulture
$csv_data2 = Import-CSV $csv2 -UseCulture

$matchcounter = 0
foreach ($order1 in $csv_data1){
    $matched = $false
    foreach ($order2 in $csv_data2){
        $obj = "" | select "Id","HrId","CreationTime"
        if($order1.'Id' -eq $order2.'target_id' ){
            $matchCounter++
            $matched = $true
            $obj.'Id' = $order1.'Id'
            $obj.'HrId' = $order1.'HrId'
            $obj.'CreationTime' = $order1.'CreationTime'
            $obj | Export-Csv $csvfilepath -Append -NoTypeInformation -Force -Encoding UTF8
        }
    }
}
Write-Host "File" "$csvfilepath"

1_dif.csv:

″Id″;″HrId″;″CreationTime″
″97574bd2-e887-45e7-9280-3ece8587ae57″;″REQ-20201230-03″;”2020-12-30 19:44:42.040383+07″
″f6ba914d-587e-442d-a230-faab8e899308″;″REQ-20201230-02″;”2020-12-30 17:56:40.438461+07″
″449df790-a638-4ea4-bc9c-e92af4f6cedc″;″REQ-20201230-01″;”2020-12-30 14:51:12.119974+07″

 

Thanks in advance.

Dmitriy

This forum is for scripting questions rather than script requests. We do not write customized and ready to use scripts or solutions on request.

What have you tried so far? We expect you to make an own attempt to get your task done or to solve your problem. If you have done so already please document here what exactly you have done and show your code. Then we probably might be able to help you step further.

Then … when you post code, error messages, sample data or console output format it as code, please.

Here you can read how that works: Guide to Posting Code.

<->

I tried different options unsuccessfully, the best one shown below can return matches:

do{ 
 $csv1 = Read-Host Please input 1st CSV file path (example: c:\temp\1.csv)
 $exten1 = [IO.Path]::GetExtension('$csv1') -ne 'csv'
 if ($exten1 -eq $false){
  write-host "Wrong CSV" -ForegroundColor Red
 }
} while ($exten1 -eq $false)

do{
 $csv2 = Read-Host Please input 2nd CSV file path  (example: c:\temp\2.csv)
 $exten2 = [IO.Path]::GetExtension('$csv2') -ne 'csv'
 if ($exten2 -eq $false){
  write-host "Wrong CSV" -ForegroundColor Red
 }
} while ($exten2 -eq $false)

$csvfilepath = $csv1.TrimEnd(".csv") + "_dif.csv"

$csv_data1 = Import-CSV $csv1 -UseCulture
$csv_data2 = Import-CSV $csv2 -UseCulture

$matchcounter = 0
foreach ($order1 in $csv_data1){
    $matched = $false
    foreach ($order2 in $csv_data2){
        $obj = "" | select "Id","HrId","CreationTime"
        if($order1.'Id' -eq $order2.'target_id' ){
            $matchCounter++
            $matched = $true
            $obj.'Id' = $order1.'Id'
            $obj.'HrId' = $order1.'HrId'
            $obj.'CreationTime' = $order1.'CreationTime'
            $obj | Export-Csv $csvfilepath -Append -NoTypeInformation -Force -Encoding UTF8
        }
    }
}
Write-Host "File" "$csvfilepath"

 

1_dif.csv:

″Id″;″HrId″;″CreationTime″
″97574bd2-e887-45e7-9280-3ece8587ae57″;″REQ-20201230-03″;”2020-12-30 19:44:42.040383+07″
″f6ba914d-587e-442d-a230-faab8e899308″;″REQ-20201230-02″;”2020-12-30 17:56:40.438461+07″
″449df790-a638-4ea4-bc9c-e92af4f6cedc″;″REQ-20201230-01″;”2020-12-30 14:51:12.119974+07″

done

To compare two different objects you can use Compare-Object. It just needs a property you have in both objects.

$CSV01 = Import-CSV -Path D:\sample\1.csv -Delimiter ';'
$CSV02 = Import-Csv -Path D:\sample\2.csv -Delimiter ';' | Select-Object -Property *,@{Name = 'Id';Expression={$_.target_id} }

Compare-Object -ReferenceObject $CSV01 -DifferenceObject $CSV02 -Property 'Id' -IncludeEqual -PassThru

Now you can add further steps by evaluating the property SideIndicator.

The output looks like this:

Id                                   HrId            CreationTime                  SideIndicator
--                                   ----            ------------                  -------------
97574bd2-e887-45e7-9280-3ece8587ae57 REQ-20201230-03 2020-12-30 19:44:42.040383+07 ==
f6ba914d-587e-442d-a230-faab8e899308 REQ-20201230-02 2020-12-30 17:56:40.438461+07 ==
449df790-a638-4ea4-bc9c-e92af4f6cedc REQ-20201230-01 2020-12-30 14:51:12.119974+07 ==
63212824-db25-4840-ac74-48b5e0ab3c4d REQ-20201228-11 2020-12-28 18:56:24.713882+07 <=
86cbb50c-c1d5-467d-af6c-57ba2f8eb1c0 REQ-20201228-10 2020-12-28 18:49:43.722974+07 <=

This works for me, thanks, below is the complete code

cls

#ask path for 1st file, expamle: c:\temp\2.csv
$csv_path1 = Read-Host Please input path 1st CSV

#set name and path for export
$csvfilepath = $csv1.TrimEnd(".csv") + "_dif.csv"

#import 1st file
$CSV01 = Import-CSV -Path $csv_path1 -Delimiter ';'

#ask path for 2nd file, expamle: c:\temp\2.csv
$csv_path2 = Read-Host Please input path 2nd CSV

#import and select
$CSV02 = Import-Csv -Path $csv_path2 -Delimiter ';' | Select-Object -Property *,@{Name = 'Id';Expression={$_.target_id} }

#compare and export
Compare-Object -ReferenceObject $CSV01 -DifferenceObject $CSV02 -Property 'Id' -IncludeEqual -PassThru | Where-Object SideIndicator -eq "<=" | Select Id,HrId,CreationTime | 
Export-Csv $csvfilepath -NoTypeInformation -Append -Force -Encoding UTF8

Write-Host "File:" "$csvfilepath"  -ForegroundColor Yellow