import-csv same header twice

andreas-lorensen · March 27, 2017, 12:38pm

I have a problem. I have some CSV files that sometimes, by mistake, contain the same header field twice, and that makes Import-Csv unhappy.

So how do I get around that? I never know whether, or which, header might appear more than once in the CSV files. It would be fine if each later occurrence of a header just had a number or something appended.

sam-boutros · March 27, 2017, 12:50pm

'Same header field twice', meaning the entire header row is duplicated, or meaning more than one column has the same label?
If it's the former, I would use Get-Content, compare lines [0] and [1], and if they're the same, overwrite the source CSV using lines [1]..[last].
If it's the latter, I would use Get-Content to read the first line, use .Split(',') to break it down into an array of column labels, check for and repair any duplicates, and write the updated data back to the CSV.

andreas-lorensen · March 27, 2017, 12:59pm

It is "more than one column has the same label".

Get-Content sounds like the way to go. Can you give an example of how to use it to fix duplicate fields?

andreas-lorensen · March 27, 2017, 1:42pm

OK, I am on the right track.

My test.csv looks like this:
"Header1","Header2","Header1"
"Nummer1","Nummer2","Nummer3"

$headers = (Get-Content .\test.csv)[0].Split(',')

Then I have an array. How do I compare each element with the rest and, if it is a duplicate, add a number to it?

Something like
foreach ($header in $headers) {
}

But how do I compare and add?

rob-simmers · March 27, 2017, 1:55pm

Try this:

#test.csv
#Header1,Header2,Header3
#Header1,Header2,Header3
#Data1,Data2,Data3


$params = @{
    Path = "C:\Users\Rob\Desktop\test.csv"
}

$test = Get-Content @params

if ($test[0] -eq $test[1]) {   # -eq compares; a single = would assign
    $csv = Import-CSV @params | Select * -Skip 1
}
else {
    $csv = Import-CSV @params
}


$csv

andreas-lorensen · March 27, 2017, 2:26pm

Thanks, but my "problem" is that more than one column has the same label.
My test.csv looks like this (note that Header1 is duplicated):
"Header1","Header2","Header1"
"Nummer1","Nummer2","Nummer3"

I figured out how to read the first line and get it into an array, but how to check each element for duplicates, change it, and write it back to test.csv is my next problem.

andreas-lorensen · March 27, 2017, 3:21pm

Well, something like this works, but what should I do when a duplicate is found?

$headers = (Get-Content -Path .\test.csv)[0].Split(',')

$newheaders = @()   # array of the headers seen so far
foreach ( $header in $headers ) 
{ 
  if ( $newheaders -contains $header ) 
  {
    # I found a duplicate - what to do
  }
  $newheaders += $header
}

sam-boutros · March 27, 2017, 3:33pm

$headers = (Get-Content .\test1.csv)[0].Split(',') | Sort-Object

1..($headers.Count - 1) | ForEach-Object { # Skipping [0] since it has no prior element to compare to
    if ($headers[$_] -eq $headers[$_ - 1]) {
        # Found duplicate: increment the last character
        # (assumes an unquoted header that ends in a digit below 9)
        $last = $headers[$_][-1]
        $headers[$_] = $headers[$_].Substring(0, $headers[$_].Length - 1) + [char]([int]$last + 1)
    }
}

"Old headers: $(((Get-Content .\test1.csv)[0].Split(',') | Sort-Object) -join ', ')"
"New headers: $(($headers | Sort-Object) -join ', ')"

andreas-lorensen · March 27, 2017, 5:18pm

Oh, great, I am learning.

Now I have this. It works, but it doesn't handle a header that is duplicated more than once.

$headers = (Get-Content -Path .\test.csv)[0].Split(',')

$newheaders = @()   # must be initialized before the loop, otherwise .Contains() is called on $null

foreach ( $header in $headers ) 
{ 
  IF ( $newheaders.Contains($header)) 
  {
    $newheaders += ($header.insert(($header.Length -1),"_duplicate"))
  }
  ELSE 
  {
    $newheaders += $header
  }
}

$newheaders
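
One possible way to cope with any number of repeats is to keep a per-name counter in a hashtable and append it to every occurrence after the first, which also preserves the column order. This is only a sketch: the function name Rename-DuplicateHeader is made up for illustration (it is not a built-in cmdlet), and it assumes that no header contains an embedded comma, so a plain .Split(',') on the first line is safe.

```powershell
# Hypothetical helper: returns the header names made unique, in their original order.
# Assumes the names were produced by a simple .Split(',') (no embedded commas).
function Rename-DuplicateHeader {
    param([string[]]$Header)

    $seen = @{}   # header name -> number of times seen so far
    foreach ($name in $Header) {
        $key = $name.Trim('"')          # drop surrounding quotes, if any
        if ($seen.ContainsKey($key)) {
            $seen[$key]++
            "${key}_$($seen[$key])"     # 2nd occurrence becomes name_2, 3rd name_3, ...
        }
        else {
            $seen[$key] = 1
            $key                        # first occurrence is kept unchanged
        }
    }
}

$headers = Rename-DuplicateHeader -Header '"Header1"', '"Header2"', '"Header1"', '"Header1"'
$headers -join ','   # Header1,Header2,Header1_2,Header1_3
```

To apply it to a file, the first line from Get-Content could be split and passed in, then the returned names re-quoted, joined with commas, and written back as the first line with Set-Content. A generated name such as Header1_2 could still collide with a header that already carries that name, so this is not a complete solution.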

curtis-smith · March 27, 2017, 11:33pm

I think the problem you are going to run into with the examples so far is that they do not account for the data under the columns. They just get the headers and de-duplicate them, and in the process change the order of the columns so that they no longer match their data.

I would take a different approach. I would just add a column number to the end of each column header to ensure it is unique. Without knowing the rest of your process, I do not know if this will introduce new challenges, but it seemed like maintaining the column header name exactly was not of the utmost importance.

input.csv

"Hea,der1",Header2,"Hea,der1"
"Nummer1","Nummer2","Nummer3"

Script

$content = Get-Content .\input.csv
$headerline = $content[0]
$i = 0   # column counter
# Match each field: double-quoted, single-quoted, or bare (no commas)
$headers = (($headerline |
    Select-String "`"[^`"\r\n]*`"|'[^'\r\n]*'|[^,\r\n]*" -AllMatches).Matches |
    Where-Object {$_.Value}).Value.Trim('"') |
    ForEach-Object {
        $i++
        "$($_)_$i"   # append the column number to make the name unique
    }
$content[0] = "`"$($headers -join '","')`""
$content | Set-Content .\input.csv

Remove-Variable "i"

Results

"Hea,der1_1","Header2_2","Hea,der1_3"
"Nummer1","Nummer2","Nummer3"

andreas-lorensen · March 28, 2017, 12:48am

That is a beautiful way to do it.
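
For anyone who would rather not rewrite the file, a possible variation is to supply the unique names at import time instead. ConvertFrom-Csv (like Import-Csv) accepts a -Header parameter, so the original header line can be skipped and the remaining lines parsed with the replacement names. The sketch below uses an in-memory array standing in for Get-Content .\input.csv, and the replacement names are typed in by hand purely for illustration.

```powershell
# Sample data standing in for the file content (Get-Content .\input.csv).
$content = @(
    '"Hea,der1",Header2,"Hea,der1"'
    '"Nummer1","Nummer2","Nummer3"'
)

# Replacement names, one per column; hard-coded here for illustration only.
$headers = 'Hea,der1_1', 'Header2_2', 'Hea,der1_3'

# Skip the original header line and parse the remaining lines with the new names.
# @(...) keeps the result an array even when there is only one data row.
$rows = @($content | Select-Object -Skip 1 | ConvertFrom-Csv -Header $headers)

$rows[0].'Header2_2'   # Nummer2
```

This leaves the source file untouched, at the cost of having to generate the names on every import.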
