Add content from .txt to a .csv file

Ahoy,

I need some direction on how to accomplish the following:
I have a CSV file which looks like this:

DisplayName    PrimarySmtpAddress    IssueWarningQuota                  WhenMailboxCreated
XXX YYY        XXX@XXX.se            7.813 GB (8,388,608,000 bytes)     2012-12-18 18:52
OOO AAA        OOO@XXX.se            3.809 GB (4,089,446,400 bytes)     2012-12-18 18:52

And I have a .txt file which looks like this:

User              Total Size (MB)
----              ---------------
XXX YYY               4437
OOO AAA               1854

I would now like to merge these two files into one CSV file which should look like this:

DisplayName    PrimarySmtpAddress    IssueWarningQuota                  WhenMailboxCreated    Total Size (MB)
XXX YYY        XXX@XXX.se            7.813 GB (8,388,608,000 bytes)     2012-12-18 18:52      4437
OOO AAA        OOO@XXX.se            3.809 GB (4,089,446,400 bytes)     2012-12-18 18:52      1854

Could anyone help me or point me to a solution?

Thanks in advance!

Why don't you open the CSV file in Excel, paste the Total Size column from the .txt file into the open Excel document, and then save the Excel document as a CSV?

Does this solve the problem for you?

I don't know of a SQL-like join function in PowerShell, which in no way means that there isn't one. :slight_smile: I'm hoping that one of the gurus here will say that this post was unnecessary and let us know of a good and easy way to do it.

I would have liked to use the LINQ Join method, but calling generic methods (such as [System.Linq.Enumerable]::Join) isn't always easy in PowerShell, at least as far as my meager understanding of the language goes. However, performing your join in plain code should be pretty simple.

I'm assuming your files are comma-separated. If they are tab-separated, just add a Delimiter parameter to the Import-Csv and Export-Csv calls, providing the tab character as the delimiter. I'm also assuming that all user names are unique.
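For example, something along these lines should work for tab-separated input and output (just a sketch; the file names are placeholders):

# Pass the tab character ("`t") as the delimiter when importing
$userData = Import-Csv userData.txt -Delimiter "`t"
# Use the same delimiter when exporting; -NoTypeInformation skips the #TYPE header line
$userData | Export-Csv output.csv -Delimiter "`t" -NoTypeInformation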

If you are certain that all users exist in both files and only once in each file, the following simple solution could probably work for your scenario:

function Join-UserData
{
    param (
        $UserData,
        $AppendData
    )
    # If you are certain the arrays are already sorted in the same way, this step is unnecessary
    $sortedUserData = $UserData | Sort DisplayName
    $sortedAppendData = $AppendData | Sort User
    # Just iterate through the array and copy the Total Size (MB) property to the user object
    for ($index = $sortedUserData.Length - 1; $index -ge 0; $index--)
    {
        $user = $sortedUserData[$index]
        $appendDataItem = $sortedAppendData[$index]
        Add-Member -InputObject $user -MemberType NoteProperty -Name 'Total Size (MB)' -Value $appendDataItem.'Total Size (MB)'
    }
    Write-Output $sortedUserData
}
$userData = Import-Csv userData.csv
$appendData = Import-Csv appendData.csv
Join-UserData -UserData $userData -AppendData $appendData | Export-Csv output.csv

If you are not certain that all users exist in the AppendData file, it needs some alteration. Just as a quick exercise, I wrote a version which copies all properties (except the key property) from the append data to the user data and also lets you provide the name of the key property as a parameter (which takes it closer to being a reusable function). That version comes next:

function Join-UserData
{
    param (
        $UserData,
        $AppendData,
        $UserDataKeyPropertyName,
        $AppendDataKeyPropertyName
    )
    # Sort the arrays so we only need to iterate through the items once in the code later
    # If you are certain your data is already sorted, this isn't necessary
    $sortedUserData = $UserData | Sort $UserDataKeyPropertyName
    $sortedAppendData = $AppendData | Sort $AppendDataKeyPropertyName
    $userDataLength = $sortedUserData.Length
    $appendDataLength = $sortedAppendData.Length
    $appendDataIndex = 0
    $appendDataItem = $sortedAppendData[$appendDataIndex]
    for ($userIndex = 0; $userIndex -lt $userDataLength; $userIndex++)
    {
        $user = $sortedUserData[$userIndex]
        while (($user."$UserDataKeyPropertyName" -gt $appendDataItem."$AppendDataKeyPropertyName") -and ($appendDataIndex -lt $appendDataLength))
        {
            # As long as the user's key property value is greater than the append data key property value
            # we can just ignore the appendDataItem, since no matching user will be found.
            # This is only valid as long as we do a left outer join, which I have assumed.
            $appendDataIndex++
            $appendDataItem = $sortedAppendData[$appendDataIndex]
        }
        if ($appendDataIndex -ge $appendDataLength)
        {
            # If we have already looked through the whole append data array we won't find any more data to append, so we're finished
            break
        }
        if ($user."$UserDataKeyPropertyName" -ceq $appendDataItem."$AppendDataKeyPropertyName")
        {
            # We have matching keys, so identify all NoteProperties other than the key property and copy those.
            # This will throw an exception if there are duplicate property names in the user object and the appendDataItem.
            $properties = $appendDataItem | Get-Member -MemberType NoteProperty
            foreach ($property in $properties)
            {
                if (-not ($property.Name -eq $AppendDataKeyPropertyName))
                {
                    Add-Member -InputObject $user -MemberType NoteProperty -Name $property.Name -Value $appendDataItem."$($property.Name)"
                }
            }
        }
    }
    Write-Output $sortedUserData
}
$userData = Import-Csv userData.csv
$appendData = Import-Csv appendData.csv
Join-UserData -UserData $userData -AppendData $appendData -UserDataKeyPropertyName DisplayName -AppendDataKeyPropertyName User | Export-Csv output.csv

[Edit: Formatting issues, even though I avoided backticks :slight_smile: Had to remove all empty lines, which unfortunately might make it a bit harder to read, in my opinion.]

On the other hand, after trying it, the LINQ way is a lot nicer. :slight_smile: Here's a sample using it (wrapping the type conversions and generic reflection parts in a function):

function Join-UserData
{
    param (
        [PSCustomObject[]]$Collection1,
        [PSCustomObject[]]$Collection2,
        [Func[PSCustomObject,String]]$KeySelector1,
        [Func[PSCustomObject,String]]$KeySelector2,
        [Func[PSCustomObject,PSCustomObject,PSCustomObject]]$DataMerger
    )

    # There's bound to be a better way to get the method, but at the moment I know of none.
    $genericJoinMethod = ([System.Linq.Enumerable].GetMethods() | Where { $_.Name -eq "Join" -And ($_.GetParameters().Count -eq 5)})
    $joinMethod = $genericJoinMethod.MakeGenericMethod([PSCustomObject], [PSCustomObject], [String], [PSCustomObject])

    $joinMethod.Invoke($null, @($Collection1, $Collection2, $KeySelector1, $KeySelector2, $DataMerger))
}

$parameters = @{
    Collection1 = Import-Csv userData.csv;
    Collection2 = Import-Csv appendData.csv;
    KeySelector1 = {param($d1) $d1.DisplayName};
    KeySelector2 = {param($d2) $d2.User};
    DataMerger = {
        param($d1, $d2) 
        Add-Member -InputObject $d1 -MemberType NoteProperty -Name 'Total Size (MB)' -Value $d2.'Total Size (MB)'
        Write-Output $d1
    };
}
Join-UserData @parameters | Export-Csv output.csv

Note, though, that there is a difference between the two methods. The LINQ version above will do the equivalent of a SQL INNER JOIN, while the loads-of-code version will do a left outer join. If you want to do a left outer join and are okay with iterating through the appendData array once for each item in the userData array, you could do it easily using just a Foreach and a Where, like the following:

$userData = Import-Csv userData.csv
$appendData = Import-Csv appendData.csv
$userData | Foreach { 
    $dataToAppend = $appendData | Where User -eq $_.DisplayName
    if ($dataToAppend)
    {
        Add-Member -InputObject $_ -MemberType NoteProperty -Name 'Total Size (MB)' -Value $dataToAppend.'Total Size (MB)'
    }
    Write-Output $_
}

@Robert Westerlund: Wow! I am very grateful for your massive answer. Thank you so very much, fellow countryman!

And no, the order of the names is not the same. But thanks to Robert's contribution I will for sure be able to create a solution for this. I will post here again when I have a working solution. I don't have access to the files at the moment.

Again, thank you so much for your commitment Robert.

Glad I could help.

Just a suggestion: I'd probably use the last version, since it's a lot easier to read (and easily adaptable if you want an inner join instead of a left outer join). There are some performance implications, of course, with iterating through the arrays n*m times, but unless you have a huge number of lines and/or you're planning on running this frequently, it really shouldn't matter.
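If it ever did matter, one way around the n*m iterations would be to index the append data in a hashtable first, so each user only costs a single lookup. A rough sketch, using the same file and column names as before ($sizeByUser is just an illustrative name):

$userData = Import-Csv userData.csv
$appendData = Import-Csv appendData.csv
# Build a lookup table keyed on the User column in one pass through the append data
$sizeByUser = @{}
foreach ($item in $appendData) { $sizeByUser[$item.User] = $item.'Total Size (MB)' }
$userData | Foreach {
    # One hashtable lookup per user instead of scanning $appendData every time;
    # users without a match simply get an empty value
    Add-Member -InputObject $_ -MemberType NoteProperty -Name 'Total Size (MB)' -Value $sizeByUser[$_.DisplayName]
    Write-Output $_
} | Export-Csv output.csv -NoTypeInformation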

Okay, I got access to the files and it's working… almost! :wink:
If I use the following code:


$userData = Import-Csv E:\tests\output.csv
$appendData = Import-Csv E:\tests\appendData.csv
$userData | Foreach { 
    $dataToAppend = $appendData | Where User -eq $_.DisplayName
    if ($dataToAppend)
    {
        Add-Member -InputObject $_ -MemberType NoteProperty -Name 'Total Size (MB)' -Value $dataToAppend.'Total Size (MB)'
    }
    Write-Output $_ | Export-Csv E:\tests\file.csv
}

It only works for the last person; I suppose the Export-Csv command overwrites the file every time it adds a new user?
So I changed the Export-Csv command to the following:


    Write-Output $_ | Export-Csv -Append E:\tests\file.csv -NoTypeInformation

But that doesn't work at all?! It doesn't add the "Total Size (MB)" column at all; file.csv ends up like a copy of output.csv.

Any suggestions?

Right, you probably want to skip the if-statement and just add the member whether there is a value for it or not, so that it gets an empty value when no data to append is found.

Also, you should place the Export-Csv call after the end of the Foreach, so that you pipe the results of the Foreach to Export-Csv. The following works fine for me:

$userData = Import-Csv userData.csv
$appendData = Import-Csv appendData.csv
$userData | Foreach { 
    $dataToAppend = $appendData | Where User -eq $_.DisplayName
    Add-Member -InputObject $_ -MemberType NoteProperty -Name 'Total Size (MB)' -Value $dataToAppend.'Total Size (MB)'
    Write-Output $_
} | Export-Csv output.csv -NoTypeInformation

Works like a charm! Thank you, once again. I have some similar tasks to complete now, but thanks to your extraordinary answers I believe I can sort it out myself.