Sorting Table, Results Not As Expected

by qjudge at 2012-10-16 06:56:31

Hi All,

I am trying to use the following code to import a .csv file as a table, then sort the table numerically:

$ResultsTable = Import-CSV ‘C:\Statistics.csv’
$SortedTable = $ResultsTable | Sort-Object Occurrences


The .csv file contains two columns, with the titles ‘Results’ and ‘Occurences’ in the first row. The code "functions’, but the sorting is not correct. Output of $SortedTable shows:

Results----------Occurrences
45----------------10
26----------------11
28----------------2
22----------------3

…and so on. How do I get the table to be sorted by the actual number in the ‘Occurrences’ column?

Thanks!

Quinn
by Makovec at 2012-10-16 07:45:50
Hi,
the reason is that data from your file are loaded as string and therefore sorted as text. It’s visible when you import CSV and send it to Get-Member:
PS> Import-Csv .\statistics.csv | Get-Member

TypeName: System.Management.Automation.PSCustomObject

Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
Occurences NoteProperty System.String Occurences=10
Results NoteProperty System.String Results=45


See the last two records, both are Strings. There are more ways to do what you want, first I tried is this one:
PS> Import-Csv .\statistics.csv | Select-Object @{l='Results';e={[int]$.Results}},@{l='Occurences';e={[int]$.Occurences}} |Sort Occurences

I used calculated properties and converted Strings to numbers. As you see in my definition, I am taking Occurences (and Results) columns and using [int] converting it to number. Then it’s passed to Sort-Object which then sort it as numbers. You can check it this way (if running Get-Member against result):

TypeName: Selected.System.Management.Automation.PSCustomObject

Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
Occurences NoteProperty System.Int32 Occurences=10
Results NoteProperty System.Int32 Results=45


Occurences and Results are now Int32 (numbers). There is an article about calculated properties here: http://technet.microsoft.com/en-us/library/ff730948.aspx

David
by qjudge at 2012-10-16 08:12:52
David,

That worked perfectly, thank you!

I understand what ‘[int]’ means, but I’m still a little foggy on what exactly is happening in the ‘Select-Object’ section of code in your example. Could you explain that more (or point me to a webpage)? Still trying to ‘un-noob’ myself…

Thanks a bunch either way!

Quinn
by Makovec at 2012-10-16 15:52:07
Good to hear :slight_smile: Sure, that Select-Object thing is a bit messy anyway :wink:

Let’s have the same example but just with one column to have it shorter. This is input file input.csv:

Number
1
5
2
10
3

so when you import it to PowerShell you’ll receive this:
PS> Import-Csv .\input.csv
Number
------
1
5
2
10
3


Let’s say you want to create completely new column based on that data (real world example should be converting kB to GB, file size to MB, etc.) I want to have second column showing power of given number. So I need these calculated properties for that. It has this pattern:

@{ Label='Label name'; Expression={} }
which is a hash table containing new column name (Label) and value (Expression). Label is just easy text - name of newly created property, but the whole thing happen in Expression part. In our case, I’d like to just write something like "Number*Number". If I would need this in console I’ll write:
PS> $number * $number
and it’s exactly (OK - almost :slight_smile: the same I can put to my Expression part. As I am working in pipeline I can access that Number property using $_ placeholder, so should be:
$.Number * $.Number
the last change we will do is related to string to number conversion, so final statement should be:
[int]$.Number * [int]$.Number
which we can add to Select-Object cmdlet:
Import-Csv .\input.csv | Select-Object Number, @{Label='Power'; Expression={ [int]$.Number * [int]$.Number } }

What will happen now: For every record in CSV file, Select-Object will take the Number (first property), then check Expression of second (calculated) property and will calculate power of this number. For this result creates property named Power . Then will do the same for all incoming numbers. So in general - you are taking properties from pipeline and doing some counts with them. Result is then saved to new property. Which we can see with Get-Member:
TypeName: Selected.System.Management.Automation.PSCustomObject

Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
Number NoteProperty System.String Number=1
Power NoteProperty System.Int32 Power=1

Hope it’s clearer now :slight_smile:
You can see more at following addresses:
http://technet.microsoft.com/en-us/library/ff730948.aspx
http://powershell.com/cs/blogs/ebookv2/archive/2012/02/07/chapter-4-arrays-and-hashtables.aspx#using-hash-tables-to-calculate-properties
In help for Format-Table

David