Need Help With Formatting CSV

I have two .csv files that I need to merge into one. Both files have a few columns that are similar. I need help using Powershell to code such that if the “Email Address” columns in both files are similar, then the records should be merged and the values in their corresponding “Amount” column should be summed up.

I’m very new to Powershell and will appreciate any help I can get.

Thanks.

There’s not really an easy magical way to do that, unfortunately. It’s going to be a good bit of coding. For example, you say “similar.” What is your definition of a “similar” email address?

Thanks for the response. I meant the same email address. I intend on having the email address columns as the unique identifier (primary key).

Well, I guess that makes it a bit easier. If the files aren’t truly humongous, I’d probably read one into a variable using Import-CSV.

$file2 = Import-CSV file2.csv

Then I’d read the other one a line at a time using Import-CSV and ForEach-Object.

Import-CSV file1.csv | ForEach-Object {
}

In there, I’d search for the email address.

Import-CSV file1.csv | ForEach-Object {
 $email = $_.email # email from file1
 $matching = $null
 $matching = $file2 | Where { $email -eq $_.email } # gets matching email in file2
 if ($matching) {
  # you've got a match
 } else {
  # there was no match
 }
}

For the output, I’d construct a new object, and pipe it to Export-CSV on append. The new object would contain whatever new values you want, and you’d be exporting to a new CSV file.

Something like that. Just trying to lay out the logic, you’ll need to figure the appropriate exact syntax for your situation.

I will try your approach. I appreciate your help. Thanks.

There are also some iterations of Join-Object out there that basically can join the objects like a typical SQL JOIN. Here is one of the functions:

If you ware working on small objects a for loop like Don suggested will work. If it’s really slow, try the Join-Object function. If you’re working with really large datasets, it would be better to put the data in SQL tables and do a JOIN there where the product is make to do perform JOIN operations.