parse csv files with posh

Hello, I’m looking for a way to parse csv files with posh, really big files 1gb+. import-csv works kinda slow. what would be the alternative way to do so?

Right now I’m doing this:

$newcsv = Import-Csv -Path $_.FullName -Encoding Default -Delimiter ";"`
    -Header date,0,user,1,folder,session,2,pc,3,4 | Where-Object {($_.date -like "2015*")`
    -and ($_.session -like "opensession")}

    foreach ($_ in $newcsv) {
        $newcsv[[array]::indexof($newcsv,$_)].pc = $_.pc.split(" ").trim('(m)') | select -Last 1
    }
    $newcsv | Select-Object user,pc -Unique | Export-Csv -Delimiter ";" -NoTypeInformation -Path "$folder\Unique\unique$name.csv" -Encoding Default
    $newcsv | Select-Object date,user,folder,session,pc| Export-Csv -Delimiter ";" -NoTypeInformation -Path "$folder\NotUnique\notunique$name.csv" -Encoding Default
}

And it takes 30 minutes on a really good laptop (i7\16gb\240ssd) to parse 800mb file.

I saw this thing (Microsoft.VisualBasic.FileIO.TextFieldParser), but i have no clue what to do with and, and can’t find anything relevant. I have no background in coding and cant seem to figure this one out

Have a look at Chrissy Le Maire’s blog. She demonstrates several techniques for working with large CSV files:

High-Performance Techniques for Importing CSV to SQL Server using PowerShell

Quickly Find Duplicates in Large CSV Files using PowerShell

wow, this is great, now i need to wrap my head around it! thank you!
hm, with further reading all of this seems to require sql, is there any method of doing this without sql, as i need to give out this script to other people and it should just work, without the need to install sql or configure anything…

well, if you got any ideas?

The first example does use SQL server but only because it’s demonstrating how to quickly import a CSV file into SQL. You should find the second script, for finding duplicates, works without any additional configuration which at least proves the concept could work for you.

This method is using the ODBC drivers which are (probably) already installed. It’s effectively treating your CSV file as a SQL database table.

You will need to understand how to use SQL queries and how to call the driver’s properties and methods though - that’s the hard bit.

ah, I see, I saw several sql queries in the file, so I assumed it does need sql database, thanks, I’ll try to wrap my head around it!