Import-csv text qualifier not functioning

I have a csv file with user information, and data has the default comma sign as delimeter. There is data with commas and their are quote signs between them like you see below (“Department-TH Office, MT & HRM”)

False,,Netherlands,"Department-TH Office, MT & HRM",True,John Doe,,John,2024-01-29 21:44:43Z,Doe,2024-01-29 21:32:44Z,,Enterprise Mobility + Security E5+Microsoft Power Automate Free+Microsoft 365 E5 Security for EMS E5+Microsoft 365 E3,,,objectid,Hoofdkantoor Den Haag,True,088-00000000,,,,smtp:john@domainname.nl+smtp:johnmt@domainname.nl+smtp:hdoe@domainname.mail.onmicrosoft.com+smtp:hdoe9717@domainname.onmicrosoft.com+smtp:hdoe@domainname.onmicrosoft.com+SMTP:john.doe@domainname.nl+smtp:john.doe@domainname.com,,,,,True,,NL,john.doe@domainname.com,2020-06-17 13:19:04Z

I use the import-csv function in PowerShell to get the Licenses and the Users principal name colum via below script. But each row with a comma that needs to be sorted as text, are used as delimiter. So, the text qualifier does not work per default. Also used the -Delimiter function , but still problem is there. It is because of the text qualifier that does not work.

How can I solve this via PowerShell?

# Navigate to the directory with the CSV files
Set-Location "$PSScriptRoot\exports"

# Import the data from the CSV file
$data = Import-Csv -Path .\users.csv

# Display the names of the columns
$data[0].PSObject.Properties.Name

# Select the desired columns
$selectedData = $data | Select-Object 'Licenses', 'User principal name'

# Export the selected data to a new Excel file
$selectedData | Export-Excel -Path .\users.xlsx -WorksheetName "Sheet1" -AutoSize

If I do text to columns in Excel manually, Excel is using the text qualifier in a good way, but the import-csv function is the problem here.

Back up a step, what’s your actual issue? Please clarify what your actual problem is, don’t just state ‘it doesn’t work’ or it ‘does not work per default’. State the problem, what your actual output is, and what your expected output is.

Consider the following in a text file:

Column1,Columns2
Test,"Test,Test"

Running a quick test:

$data = Import-CSV -Path '.\test.txt'
$data

OUTPUT

Column1 Columns2 
------- -------- 
Test    Test,Test

This is behaving normally. Likewise, re-exporting the data back out behaves normally (to a csv)

$data | Export-Csv .\test2.csv

image

From these you can see, the ‘text qualifer’ or delimiter is working as intended, unless i’m misunderstanding you. What specific part isn’t working?

I have a csv file with user information, coming from O365 portal. In the CSV file, in the index column 12 there is licenses information (in excel, after text to colum, it would be column 13). What I let PowerShell do, is to abbreviate the licenses information, and sort it in specific way, removing free and trial subs… etc. That info I use later on in my analysis. But that is more the background story, I do not need help on that part of the script because that works perfectly fine.

The problem is, is that the import-csv does count each comma as a column

False,,Netherlands,"Department-TH Office, MT & HRM",True,John Doe,,John,2024-01-29 21:44:43Z,Doe,2024-01-29 21:32:44Z,,Enterprise Mobility + Security E5+Microsoft Power Automate Free+Microsoft 365 E5 Security for EMS E5+Microsoft 365 E3,,,objectid,Hoofdkantoor Den Haag,True,088-00000000,,,,smtp:john@domainname.nl+smtp:johnmt@domainname.nl+smtp:hdoe@domainname.mail.onmicrosoft.com+smtp:hdoe9717@domainname.onmicrosoft.com+smtp:hdoe@domainname.onmicrosoft.com+SMTP:john.doe@domainname.nl+smtp:john.doe@domainname.com,,,,,True,,NL,john.doe@domainname.com,2020-06-17 13:19:04Z

So in this example, the script sees “Department-TH Office, MT & HRM” as 2 seperate columns so he ignores the " text qualifier. Because of that, for these rows, the Licenses column is not index column 12, but 13… So I dont get the licenses output back.

If I do text to column manually in excel, then the text qualifier works like a charm… So. what can be a solution on that?

here examples:
Input version 1 including " text qualifier

# Navigate to the directory with the CSV files
Set-Location "$PSScriptRoot\exports"

# Import the data from the CSV file
$data = Import-Csv -Path .\users.csv

# Get the names of the columns
$columnNames = $data[0].PSObject.Properties.Name

# Select the desired columns
$selectedData = $data | Select-Object $columnNames[12], $columnNames[30]

# Export the selected data to a new Excel file
$selectedData | Export-Excel -Path .\users.xlsx -WorksheetName "Sheet1" -AutoSize

If your csv data is not valid (in this case it sounds like it’s not properly quoted) then it can’t be expected to work properly. If this already in an excel sheet properly, then you may use Import-Excel to import the data and then export it out with Export-Csv which will appropriately quote the data.

no the input file is csv allready… and I use this script to get the index columns 12 and 13 in a new xlsx

# Navigate to the directory with the CSV files
Set-Location "$PSScriptRoot\exports"

# Import the data from the CSV file
$data = Import-Csv -Path .\users.csv

# Get the names of the columns
$columnNames = $data[0].PSObject.Properties.Name

# Select the desired columns
$selectedData = $data | Select-Object $columnNames[12], $columnNames[30]

# Export the selected data to a new Excel file
$selectedData | Export-Excel -Path .\users.xlsx -WorksheetName "Sheet1" -AutoSize

This is input 1 (including the text qualifiers on row 1, 2, 3, 5)

input 1 gives this output


Only gives back row 4, because that is without the " qualifier

Then I removed the text qualifier " and the , that is part of the text :

Then I get the right output:

Sorry to use multiple replies, but I cannot use more then one upload image in 1 reply (why is that!)
So, problem seems to be that the text qualifier is ignored, so the extra , sign is counted as extra column.

If I do text to column in excel, it works, also with the text qualifier, but with import-csv function, it gives this problem.

No worries, thanks for the additional information, I’ll look it over when I can to see if I can help any, might be until tomorrow til i can take a good look tho.

Regarding the inability to post multiple images - I think it’s related to this: Understanding Discourse Trust Levels. In short, it’s related to how discourse works. New users have a trust level of 0, and therefore have certain limitations.

As earlier replies suggest, and as your screenshot confirms, the CSV file is not properly quoted.

If you open it in a text editor you should see this:

"column1","column2","column3"
"some,data","more,data"

But from your screenshot, you actually have something like this:

"column1","column2","column3"
"some,data",""more,data""

If it is just 2 double quotes you may be able to fix it with -replace.

(Get-content .\broken.csv) -replace '""','"' | Set-Content .\fixed.csv
2 Likes

I have tried your steps based on some data. I have deleted some columns so it is easier to digest. First the original Powershell script:

# Import the data from the CSV file
$data = Import-Csv -Path .\broken.csv

# Display the names of the columns
$data[0].PSObject.Properties.Name

# Select the desired columns
$selectedData = $data | Select-Object 'Licenses', 'User principal name'

# Export the selected data to a new Excel file
$selectedData | Export-Excel -Path .\users.xlsx -WorksheetName "Sheet1" -AutoSize

The original broken.csv (opened in notepad)

Block credential,City,Country/Region,Department,Display name,First name,Last name,Licenses,Office,User principal name,When created
"False,,Netherlands,""Reizen-TH Office, MT & HRM"",john doe,john,doe,Enterprise Mobility + Security E5,Hoofdkantoor Den Haag,john.doe@domainname.nl,2020-06-17 13:19:04Z"
"False,,Netherlands,""Reizen-TH Office, MT & Customer Service"",alex pete,alex,pete,Microsoft Power Automate Free,Hoofdkantoor Den Haag,alex.pete@domainname.nl,2020-06-17 13:19:04Z"
"False,,Netherlands,""Reizen-TH Office, MT & HRM"",john doe,john,doe,Microsoft 365 E5 Security for EMS E5,Hoofdkantoor Den Haag,john.doe@domainname.nl,2020-06-17 13:19:04Z"
"False,,Netherlands,""Reizen-TH Office, MT & Customer Service"",alex pete,alex,pete,Microsoft 365 E3,Hoofdkantoor Den Haag,alex.pete@domainname.nl,2020-06-17 13:19:04Z"

After I run my original script: no data in the xlsx, only the 2 column names: Licenses and User Principal Name.

Then i tried this:

(Get-content .\broken.csv) -replace '""','"' | Set-Content .\fixed.csv

Outputof the fixed.csv in notepad is:

Block credential,City,Country/Region,Department,Display name,First name,Last name,Licenses,Office,User principal name,When created
"False,,Netherlands,"Reizen-TH Office, MT & HRM",john doe,john,doe,Enterprise Mobility + Security E5,Hoofdkantoor Den Haag,john.doe@domainname.nl,2020-06-17 13:19:04Z"
"False,,Netherlands,"Reizen-TH Office, MT & Customer Service",alex pete,alex,pete,Microsoft Power Automate Free,Hoofdkantoor Den Haag,alex.pete@domainname.nl,2020-06-17 13:19:04Z"
"False,,Netherlands,"Reizen-TH Office, MT & HRM",john doe,john,doe,Microsoft 365 E5 Security for EMS E5,Hoofdkantoor Den Haag,john.doe@domainname.nl,2020-06-17 13:19:04Z"
"False,,Netherlands,"Reizen-TH Office, MT & Customer Service",alex pete,alex,pete,Microsoft 365 E3,Hoofdkantoor Den Haag,alex.pete@domainname.nl,2020-06-17 13:19:04Z"

If I use the fixed.csv, still no data, only the column names.

Now I saw that that each line begins with a " and ends with " . I removed that with:

# Get the content of the CSV file
$lines = Get-Content -Path .\fixed.csv

# Process each line
foreach ($line in $lines) {
    # Remove the first and last double quotes
    $line = $line.TrimStart('"').TrimEnd('"')

    # Write the modified line to a new CSV file
    Add-Content -Path .\fixed_no_quotes.csv -Value $line
}

And now I got this:

Block credential,City,Country/Region,Department,Display name,First name,Last name,Licenses,Office,User principal name,When created
False,,Netherlands,"Reizen-TH Office, MT & HRM",john doe,john,doe,Enterprise Mobility + Security E5,Hoofdkantoor Den Haag,john.doe@domainname.nl,2020-06-17 13:19:04Z
False,,Netherlands,"Reizen-TH Office, MT & Customer Service",alex pete,alex,pete,Microsoft Power Automate Free,Hoofdkantoor Den Haag,alex.pete@domainname.nl,2020-06-17 13:19:04Z
False,,Netherlands,"Reizen-TH Office, MT & HRM",john doe,john,doe,Microsoft 365 E5 Security for EMS E5,Hoofdkantoor Den Haag,john.doe@domainname.nl,2020-06-17 13:19:04Z
False,,Netherlands,"Reizen-TH Office, MT & Customer Service",alex pete,alex,pete,Microsoft 365 E3,Hoofdkantoor Den Haag,alex.pete@domainname.nl,2020-06-17 13:19:04Z

And this csv files works! But is that a good workaround, to delete the first and lost " ? What If in some cases have no first and last " ?

Instead of trying to fix the bad data, I’d first recommend addressing the source of the issue; whatever is producing the poorly structured data in the first place. If you allow invalid CSV for errant double quotes, what happens the next time you are provided different invalid csv data? You should hold others to standards just like you do yourself.

You can also use other delimiters for csv if it’s easier for the production process to switch to something other than a comma

1 Like

It is an export of active users, from the admin.microsoft.com portal that one of my customers downloaded. It is automatically saved in csv format, nothing processed manually. There is option to download it with an other delimiter

Lines without the starting and end double quote will be unaffected so if it works, it works.

Have you considered pulling the data some other way? From a cursory look at that export, you can grab that data with PowerShell and the Graph API.

1 Like

not all information… like Licenses information is something more difficult. But thanks! it is solved now!

I’ve personally never experienced any issues with CSV reports from microsoft. I’m glad it’s solved, but it sounds more to me like someone tinkered with the export or opened it in excel and re-saved it, something like that. Could you imagine if people got invalid CSV data from microsoft? That would be a huge issue that people would sound the alarm on.

1 Like

I tested an export from that portal this morning and it’s definitely not right but I would guess most people who are relying on exporting via a GUI are dumping it straight into Excel (which works) and not using it with Import-Csv.

1 Like

The problem is getting someone that actually cares and actually can understand and get the problem to someone who cano fix the issue on MS’s end. They put you through the ringer when you find an issue. I’ve submitted several bug reports and they just don’t get attention. Even for simpler things, like… documentation updates, they literally just don’t care, so I end up doing the fix myself in a PR. They outsource their support and it’s not necessarily their fault but MS support is just rough.

4 Likes