Guru Help.... CSV manipulation....

Please can anyone help with developing a solution to my problem?

I have several CV sheets from different sources that I need to consolidate into one master sheet. I don’t need every column from every sheet.

So, for example, I have the following headings:
SOURCE

SHEET1
ComputerName,OS,Model,Colour,Location,

SHEET2
Computername,ApplicationID,ApplicationName,DeployDate,

SHEET3
Computername,ApplicationID,RetirementDate,Environment,DC,

As you can see, all sheets will have a Computername. So I think I need as much info about a computer in one row. One problem is that each computer might host multiple applications. There will be an unknown number of applications so I can’t add a column for each one (Application1, Application2 etc.) So I will want one row for each application with duplicate computer information (I think).

So the result will look something like this:
RESULT

MASTER SHEET
ComputerName,OS,Model,ApplicationID,ApplicationName,DeployDate,RetirementDate,Environment,DC,
Server1,W2K3,HP DL 380 G5,23456,DoSomething,12/03/2009,12/12/2015,Prod,Timbuktu,
Server1,W2K3,HP DL 380 G5,23987,DoSomethingDifferent,2/06/2010,12/12/2015,Prod,Timbuktu,
Server1,W2K3,HP DL 380 G5,17656,MakeitUp,23/08/2010,12/12/2015,Prod,Timbuktu,
Server2,W2K3,HP DL 380 G5,15894,MakeitUp,23/08/2010,12/12/2015,UAT,Tatooine,
Server3,W2K3,HP DL 385 G3,63636,CleverStuff,16/04/2010,12/12/2015,Prod,Tatooine,

I have started off by importing each csv and then have created a custom object with the column headings that I want. Like this:

#Import the sheet1 and modify the column headings removing all the spaces
$list1 = gc “C:\Users\thicko\Documents\Scripts\w2k3\Sheet1.csv” | Select -skip 1 | `
ConvertFrom-Csv -Header ‘ComputerName’,‘OS’,‘Model’,‘Colour’,‘Location’,

#Create the custom object
$Consol1 = @()
#Remove all non Windows hosts and add the columns required from this sheet
Foreach ($AllComp in $List1) {
If ($Allcomp.os -notmatch ‘Windows’) {
#verbose out for testing!
Write-Verbose “$($AllComp.ComputerName) is $($AllComp.OS)”
}
Else {

    $obj1 = New-Object -TypeName PSObject
    Add-Member -InputObject $obj1 -MemberType NoteProperty -Name 'Hostname' -Value $Allcomp.ComputerName
    Add-Member -InputObject $obj1 -MemberType NoteProperty -Name 'Model' -Value $Allcomp.Model
    Add-Member -InputObject $obj1 -MemberType NoteProperty -Name 'OS' -Value $Allcomp.OS
    $Consol1 += $obj1
    }
}

#Import the sheet2 and modify the column headings removing all the spaces
$list2 = gc “C:\Users\thicko\Documents\Scripts\w2k3\Sheet2.csv” | Select -skip 1 | `
ConvertFrom-Csv -Header ‘Computername’,‘ApplicationID’,‘ApplicationName’,‘OS’

#Create the custom object
$Consol2 = @()
#Remove all non Windows hosts and add the columns required from this sheet
Foreach ($AllComp in $List2) {
If ($Allcomp.OS -notmatch ‘Windows’) {

      #verbose out for testing!
    Write-Verbose "$($AllComp.Computername) is $($AllComp.OS)"
    }
Else {
    
    $obj2 = New-Object -TypeName PSObject
    Add-Member -InputObject $obj2 -MemberType NoteProperty -Name 'Computer name' -Value $Allcomp.Computername
    Add-Member -InputObject $obj2 -MemberType NoteProperty -Name 'Application ID' -Value $Allcomp.ApplicationID
    Add-Member -InputObject $obj2 -MemberType NoteProperty -Name 'Application Name' -Value $Allcomp.ApplicationName
    
    $Consol2 += $obj2
    }
}

#Now what to do with $Consol1 + $Consol2 to consolidate the information???!!!

Obviously I need to add the third sheet too. But what do I need to do now to consolidate the objects to give me a csv like my result?

Any help, greatly appreciated!

Cheers :slight_smile:
Tom

You’re basically wanting to do a SQL-like JOIN on your data. There are many blogs and articles on this: See:

http://www.powershelladmin.com/wiki/An_Advanced_Cmdlet_to_Merge_Csv_Files_in_PowerShell
http://blogs.technet.com/b/heyscriptingguy/archive/2011/10/31/use-powershell-to-append-csv-files-easily.aspx
http://blogs.msdn.com/b/powershell/archive/2012/07/13/join-object.aspx

If the files are huge, you cold also just import the CSV’s into SQL and then use standard SQL joins to create your final data.

Hi Rob,

Yes. Exactly that! :slight_smile: I’d been staring at it for so long, my google skills had vanished!
Thank you for pointing me in the right direction.

Sadly, I’m in a big cumbersome environment at the moment and don’t have access to sql servers. If I get really bored at the weekend, maybe I’ll fire one up at home just in case I need it later.

But I think I may need to multi task as there is some decent rugby starting tomorrow!

Cheers
Tom

Rob is right. This is not a PowerShell limitation. You’ve exceeded what is practical with simple CSV files. In terms of your environment, you can download SQL Express for fee from Microsoft and it will be more than adequate for what you want to do.

If you want to attempt it without a SQL server but using SQL commands have a look at Chrissy LeMaire’s blog for inspiration. While this article shows how to quickly find duplicates, you should be able to use similar techniques with ODBC drivers to treat the CSV files as database tables:

Matt,
:slight_smile: Awesome!

Thanks all for your help!
Cheers
Tom