Using -Unique with CSV file

All, for some reason I am not getting what I expected and would love to get some help if possible… Thanks in advance

Below is my test.csv file

risk_level email breach_added_date
Very High jim@abc.com 2020-07-25T20:00:44Z
low daveg@abc.com 2020-07-25T20:00:44Z
Very High daveg@abc.com 2018-07-25T20:00:44Z
Very High daveg@abc.com 2016-07-25T20:00:44Z
Very High jim@abc.com 2019-07-25T20:00:44Z
medium daveg@abc.com 2015-07-25T20:00:44Z

I run the below and all is good so far

Import-Csv c:\temp\eecpro\test.csv | sort email, breach_added_date -descending –Unique | export-csv test1.csv

then

import-csv test1.csv

risk_level email breach_added_date


Very High jim@abc.com 2020-07-25T20:00:44Z
Very High jim@abc.com 2019-07-25T20:00:44Z
low daveg@abc.com 2020-07-25T20:00:44Z
Very High daveg@abc.com 2018-07-25T20:00:44Z
Very High daveg@abc.com 2016-07-25T20:00:44Z
medium daveg@abc.com 2015-07-25T20:00:44Z

I run the below and this is what I get which is the issue…

import-csv eecpro1.csv | sort email -Unique

risk_level email breach_added_date


Very High daveg@abc.com 2018-07-25T20:00:44Z
Very High jim@abc.com 2020-07-25T20:00:44Z

I want the output to be like the below . Why is it grabbing daveg@abc.com with 2018-07-25T20:00:44Z like above and not 2020-07-25T20:00:44Z like below?
risk_level email breach_added_date


Very High daveg@abc.com 2020-07-25T20:00:44Z
Very High jim@abc.com 2020-07-25T20:00:44Z

Thanks again

David, welcome to Powershell.org.

When you post code, error messages, sample data or console output format it as code, please.
In the “Text” view you can use the code tags “PRE”, in the “Visual” view you can use the format template “Preformatted”. You can go back edit your post and fix the formatting - you don’t have to create a new one.
Thanks in advance.

$CSVData = Import-Csv -Path D:\sample\test.csv -Delimiter ',' |
    Select-Object -Property risk_level,email,@{Name='breach_added_date';Expression={Get-Date $_.breach_added_date}}

$CSVData | 
    Sort-Object -Property email, breach_added_date -descending –Unique | 
        Sort-Object -Property email -Unique

Import-CSV provides string output. Even if it looks like dates it is actually not. If you need a certain type you have to convert it.

Thanks for the info regarding posting protocol…

So are you saying that the reason why its not working is because of the date field? And I should convert it? to a yyy-MM-dd type format?

I ran the code you posted above with the same results.

Thanks,

Hmmm … the expected result you posted is actually impossible. You don’t have a line like this in your source data:
Very High daveg@abc.com 2020-07-25T20:00:44Z:wink:

Let’s say this is your source data and this is how you import them:

$CSVData = @'
risk_level, email, breach_added_date
"Very High", "jim@abc.com", "2020-07-25T20:00:44Z"
"low", "daveg@abc.com", "2020-07-25T20:00:44Z"
"Very High", "daveg@abc.com", "2018-07-25T20:00:44Z"
"Very High", "daveg@abc.com", "2016-07-25T20:00:44Z"
"Very High", "jim@abc.com", "2019-07-25T20:00:44Z"
"medium", "daveg@abc.com", "2015-07-25T20:00:44Z"
'@ | 
    ConvertFrom-Csv  |
        Select-Object -Property risk_level,email,@{Name='breach_added_date';Expression={Get-Date $_.breach_added_date}}

And you use the code you posted to get the desired results:

$CSVData | 
    Sort-Object -Property email, breach_added_date -descending –Unique | 
        Sort-Object -Property email -Unique

You have to get this as results:

risk_level email         breach_added_date
---------- -----         -----------------
low        daveg@abc.com Sa, 25.07.2020 22:00:44
Very High  jim@abc.com   Sa, 25.07.2020 22:00:44

The dates/times are shown in Germand format and time zone. :wink:

I made a new directory and new .PS1 file and this is what I have and the result

CSV File import
PS C:\temp\Eecpro> import-csv eecpro1.csv

risk_level email breach_added_date


Very High jim@abc.com 2020-07-25T20:00:44Z
Very High jim@abc.com 2019-07-25T20:00:44Z
low daveg@abc.com 2020-07-25T20:00:44Z
Very High daveg@abc.com 2018-07-25T20:00:44Z
Very High daveg@abc.com 2016-07-25T20:00:44Z
medium daveg@abc.com 2015-07-25T20:00:44Z

This is the code within ./fix.ps1

$CSVData = Import-Csv -Path c:\temp\eecpro\eecpro1.csv -Delimiter ',' | 
Select-Object -Property risk_level,email,@{Name='breach_added_date';Expression=
{Get-Date $_.breach_added_date}}

$CSVData | Sort-Object -Property email, breach_added_date -descending –Unique | 
Sort-Object -Property email -Unique

Here is the output
PS C:\temp\eecpro> ./fix

risk_level email         breach_added_date
---------- -----         -----------------
Very High  daveg@abc.com 7/25/2018 4:00:44 PM
Very High  jim@abc.com   7/25/2020 4:00:44 PM

I would like
Very High daveg@abc.com 7/25/2018 4:00:44 PM
to be
Very High daveg@abc.com 7/25/2020 4:00:44 PM

I dont understand why its skipping the fist daveg line in the .CSV file and going to the second.
Sorry, I am missing something!!

You are getting the most recent VERY HIGH, your desired output doesn’t exist in the source. It would be LOW 2020

I don’t know how you get to this output. I used your data and your code and it is different for me:

risk_level email         breach_added_date
---------- -----         -----------------
low        daveg@abc.com Sa, 25.07.2020 22:00:44
Very High  jim@abc.com   Sa, 25.07.2020 22:00:44

[quote quote=233164]I would like
Very High daveg@abc.com 7/25/2018 4:00:44 PM
to be
Very High daveg@abc.com 7/25/2020 4:00:44 PM[/quote]

But again … you dont have a row like this in your source data. Where should this come from?

No clue, very frustrating so say the least.

I am running Version : 5.1.17134.112 of Powershell…

My computer will be no match to kick boxing soon!

Here is the actual content of the eecpro1.CSV file

"risk_level","email","breach_added_date"
"Very High","jim@abc.com","2020-07-25T20:00:44Z"
"Very High","jim@abc.com","2019-07-25T20:00:44Z"
"low","daveg@abc.com","2020-07-25T20:00:44Z"
"Very High","daveg@abc.com","2018-07-25T20:00:44Z"
"Very High","daveg@abc.com","2016-07-25T20:00:44Z"
"medium","daveg@abc.com","2015-07-25T20:00:44Z"

OK. And where do you have the line you’re after? There is no line with the content: “Very High","daveg@abc.com”,“7/25/2020 4:00:44 PM”.

If you want the VERY HIGH record of Dave, that is 2018. If you want the 2020 record of Dave, that is LOW. There is no record with VERY HIGH, Dave, and 2020… it just doesn’t exist. Now if you’re wanting to take the VERY HIGH portion and the 2020 date, those are from two different lines and you’re going to have to figure out how to code for that.

Sorry, copy, paste , typo on my part. The thing I want is just 1 jim and 1 daveg with the 2020 date.

Very High jim@abc.com 2020-07-25T20:00:44Z
low daveg@abc.com 2020-07-25T20:00:44Z

Regarding format difference on date, when I run get-date, this is what I get. Is it possible you return something different on your get-date and thats the reason for the discrepancy with your results VS mine…

PS C:\temp\eecpro> get-date

Thursday, June 4, 2020 5:57:37 AM

[quote quote=233224]The thing I want is just 1 jim and 1 daveg with the 2020 date.
Very High jim@abc.com 2020-07-25T20:00:44Z
low daveg@abc.com 2020-07-25T20:00:44Z[/quote]

That’s exactly what the code provides as the result. But its the wrong order actually. Are you sure you’re using the code you posted?

PS C:\temp\eecpro> ls

Directory: C:\temp\eecpro
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 6/3/2020 8:02 PM 339 eecpro1.csv
-a---- 6/3/2020 6:22 PM 302 fix.ps1

PS C:\temp\eecpro> more eecpro1.csv

"risk_level","email","breach_added_date"
"Very High","jim@abc.com","2020-07-25T20:00:44Z"
"Very High","jim@abc.com","2019-07-25T20:00:44Z"
"low","daveg@abc.com","2020-07-25T20:00:44Z"
"Very High","daveg@abc.com","2018-07-25T20:00:44Z"
"Very High","daveg@abc.com","2016-07-25T20:00:44Z"
"medium","daveg@abc.com","2015-07-25T20:00:44Z"

PS C:\temp\eecpro> import-csv eecpro1.csv

risk_level email breach_added_date
---------- ----- -----------------
Very High jim@abc.com 2020-07-25T20:00:44Z
Very High jim@abc.com 2019-07-25T20:00:44Z
low daveg@abc.com 2020-07-25T20:00:44Z
Very High daveg@abc.com 2018-07-25T20:00:44Z
Very High daveg@abc.com 2016-07-25T20:00:44Z
medium daveg@abc.com 2015-07-25T20:00:44Z

PS C:\temp\eecpro> more fix.ps1

$CSVData = Import-Csv -Path c:\temp\eecpro\eecpro1.csv -Delimiter ',' | Select-Object -Property risk_level,email,@{Name='breach_ad
ded_date';Expression={Get-Date $_.breach_added_date}}

$CSVData | Sort-Object -Property email, breach_added_date -descending -Unique |
Sort-Object -Property email -Unique

PS C:\temp\eecpro> ./fix

risk_level email breach_added_date
---------- ----- -----------------
Very High daveg@abc.com 7/25/2018 4:00:44 PM
Very High jim@abc.com 7/25/2020 4:00:44 PM

Thats it in a nutshell. I am not getting daveg with the 2020 date…

Thanks and I am really sorry about the typo confusion…

You may start troubleshooting your computer. Did you try to turn it off and on again? :wink: Can you try another computer?

BTW: That’s the result:

risk_level email         breach_added_date
---------- -----         -----------------
Very High  jim@abc.com   Sa, 25.07.2020 22:00:44
Very High  jim@abc.com   Do, 25.07.2019 22:00:44
low        daveg@abc.com Sa, 25.07.2020 22:00:44
Very High  daveg@abc.com Mi, 25.07.2018 22:00:44
Very High  daveg@abc.com Mo, 25.07.2016 22:00:44
medium     daveg@abc.com Sa, 25.07.2015 22:00:44

when I run only the first Sort-Object:

$CSVData |
Sort-Object -Property email, breach_added_date -descending -Unique

Tried another computer, same result. Is it possible the date that I return on the get-date is different than yours?

Because we are using {Get-Date $_.breach_added_date} in the code…

Why don’t you compare them? I posted my results in my last post. I’d expect only the time to be different not the date because I’m in another time zone. But the order of the time stamps should stay the same.

I did compare and your results are for the first entry

Very High jim@abc.com Sa, 25.07.2020 22:00:44

and mine is

Very High jim@abc.com 7/25/2020 4:00:44 PM

after running the first command

Import-Csv -Path c:\temp\eecpro\eecpro1.csv -Delimiter ',' | Select-Object -Property risk_level,email,@{Name='breach_added_date';Expression={Get-Date $_.breach_added_date}}

I have /'s and you have .'s in the date and you also have Sa in yours and I dont… I also noticed you are in a 24 hour format? and I am not.

Not sure where to go from here… I do appreciate your time but something is very weird…

I have trued 3 different machines now with the same result… Did you try more than 1 machine?

The format of the date does not matter. I’ve even changed my short date format sting system wide.

I actually did but I always tested on Powershell 7.0.1. Now I tried on Windows Powershell 5.1 and I’ve got the same results like you. It seems to be a issue with Windows Powershell. Wow … I wouldn’t have expected something like this … :wink:

So either you install Powershell 7 or you can try the following approach:

$CSVData | 
    Sort-Object -Property email, breach_added_date -descending |
        Group-Object -Property email |
            ForEach-Object {
                $_.Group[0]
            }

Whew, I thought I was going nuts… Still do at times… LOL… I may just upgrade to 7.x and report back… Thanks for persistence…