Removing columns and rows from CSVs

A friend of mine in finance posed an interesting question to me that I had never attempted before the other day. It quickly became evident that it would be alot of work so PowerShell came to mind. :slight_smile: Here’s the short of it.

He has some finance application that he exports some sort of report from on a regular basis. He needs to combine/merge portions of the CSV files and analyze the data. The report comes in the form of a CSV (yippee!!). The problem is the format of the CSV (booo!!). The first column and the first 12 rows is all garbage. Essentially, I have many CSV files (100s) that I need to delete the first column and the first 12 rows then I can work with the good data.

I can delete the 12 rows like this.

Get-content -path c:\my.csv | Select-object -skip 12
I haven't found a method to remove the first column, any ideas on that piece?

 

 

 

 

The best solution would be to change the report where it is created. Most of the time it’s easier and more reliable.

[quote quote=158553]I can delete the 12 rows like this.

Get-content -path c:\my.csv | Select-object -skip 12
[/quote] I suspect you already tried that ... I think a better way would be to use Import-Csv and pipe it to Select-Object Skip 12 as it deals with the CSV headers if there are some. [quote quote=158553] I haven't found a method to remove the first column, any ideas on that piece?[/quote] Depending on the amount of headers / columns you can use Select-Object for this purpose as well. You simply specify all headers / columns as properties you like to keep ... that's all.

Unfortunately, changing the source output is not an option. It’s a 3rd party app that we don’t control or have access to, we get what we get as far as reports are concerned.

For whatever reason Import-csv | Select-Object -Skip 12 doesn’t return anything.

Below is what the actual CSV looks like, they are all formatted the same way. I need to grab the data in B13:M13 down each row and discard the rest of the stuff in the CSV.

https://imgur.com/VBX5Tdv

What you have there is not a valid CSV file though - you will have to make one of it first. So your approach to skip the first 12 lines seems to be valid in this case. :wink:
You could save the result of this “operation” as CSV file with pipeing it to Out-File and treat it as such from this moment.

Regardless of that - most of the time an image of code or example data is not that helpful. If you post code or sample data you should format it as code. That’ll prevent unwanted line breaks and makes it possible to copy and to test your code or sample data.

Ok, so I’ve managed to get some viable results to the console. This code returns the correct results to the console but I can’t seem to figure out the getting them to a CSV. Export-CSV does not give the expected output and out-file outputs the correct data but it’s all comma separated in the same cell in the each row of the CSV

$csvPath = "$env:HOMEDRIVE\Reports\" $csvs = Get-ChildItem -Path $csvPath -Filter *.csv

$results = foreach ($csv in $csvs){
Get-Content -Path $csv | Select-Object -Skip 13
}
$results


I tried

$results | Export-Csv -Path $csvPath\report.csv -NoTypeInformation
and
$results | Out-File -FilePath $csvPath\report.csv
Any thoughts?

Please read this post right from the top of the list to learn how to post code: Read Me Before Posting! You’ll be Glad You Did! :wink:

I’d recommend to do what you need in 2 steps for now … it will be easier to understand, I think.

$csvPath = “$env:HOMEDRIVE\Reports”
$csvs = Get-ChildItem -Path $csvPath -Filter *.csv

foreach ($csv in $csvs) {
$NewFileName = Join-Path -Path $csv.Directory.FullName -ChildPath ($csv.BaseName + ‘_processed’ + $csv.extension)
Get-Content -Path $csv | Select-Object -Skip 13 | Out-File -FilePath $NewFileName
}

$ProcessedCsvList = Get-ChildItem -Path $csvPath -Filter ‘*_processed.csv’
foreach ($ProcessedCsv in $ProcessedCsvList) {
Import-Csv -Path $ProcessedCsv -Delimiter ‘,’
}


So you remove the first 13 lines and save the rest of the file to a new file with a new name. Then you can proceed with this newly created, hopefully valid CSV files.
You might adjust the -Delimiter to your files but it should give you the pure data part of your CSV files.

Thanks for your help Olaf. In order to drop the first column i made a slight change and I wonder if there is a more elegant method to do this?

[pre]

$ProcessedCsvList = Get-ChildItem -Path $csvPath -Filter ‘*_processed.csv’
$results = foreach ($ProcessedCsv in $ProcessedCsvList) {
Import-Csv -Path $ProcessedCsv -Delimiter ‘,’ -Header “C1”,“Date”,“ASIN”,“Name”,“Cat1”,“Cat2”,“Cat3”,“ProdType”,“TransType”,“SalesPrice”,“Currency”,"#Units",“EarningorRefunds” |
Select-Object “Date”,“ASIN”,“Name”,“Cat1”,“Cat2”,“Cat3”,“ProdType”,“TransType”,“SalesPrice”,“Currency”,"#Units",“EarningorRefunds”
}

[/pre]

That does what I need it to do, drops C1 but seems really inefficient. It appears I have to import all the columns then select just the ones that I want to maintain the data structure.

To remove a particular header (“C1” in your case) from a CSV file you can do it like this:

NewHeaders = Import-CSV -Path 'CSV-File' -OutVariable CSV | Get-Member -MemberType NoteProperty | Select-Object Name | Where-Object {_.Name -ne ‘C1’}
$CSV | Select-Object -Property $NewHeaders