Sorting Not working as expected

Hi Guyz,

I am trying to sort objects by name, but I cannot get the output into the proper order. A sample is below.

The following are the file names from a directory. I want to sort them sequentially based on the number at the start of each file name, but unfortunately that is not happening. Any idea what to do?

I tried all the switches of Sort-Object, but no luck. I have to sort on the name only. These files are generated at random times, so I cannot rely on the creation time property. That is why the file names are created in this manner.

Name

1 incident.mp4
10 Action.mp4
11 Isolation.MP4
12 Interogation.mp4
13 Decision Making.MP4
14 Final result.mp4
2 report.mp4
3 filing.mp4
4 investigation.mp4
5 report collection.mp4
6 examine.mp4
7 report.mp4
8 Rep Analysis.mp4
9 case study.mp4

Any steps that I am missing?

Thanks,
Roy.

The numbers in front of your file names are actually not numbers; they are strings. If you want them to sort correctly, you will have to either change the names to use leading zeros, or ‘cut’ the numbers out of the names and cast them to [int] before sorting.
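
To illustrate the difference, here is a minimal sketch using a few of the names from the list above as plain strings (no files on disk are needed). The variable names are only for illustration.

```powershell
# A few sample names taken from the list in the question
$names = '1 incident.mp4', '10 Action.mp4', '2 report.mp4', '9 case study.mp4'

# Default sort: the names are compared as strings, character by character
$asStrings = $names | Sort-Object

# Sort on the leading digits cast to [int], so the comparison is numeric
$asNumbers = $names | Sort-Object -Property { [int](($_ -split ' ', 2)[0]) }

$asStrings   # 1 incident.mp4, 10 Action.mp4, 2 report.mp4, 9 case study.mp4
$asNumbers   # 1 incident.mp4, 2 report.mp4, 9 case study.mp4, 10 Action.mp4
```

The names themselves are not changed in the second case; only the sort key is numeric.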

Hi Olaf,

I agree with you about the strings. Now, my doubt is: how does the GUI show those files in the proper order? Also, could you please elaborate on the approach you mentioned? I can substring the numbers out of those strings, but after that, how do I arrange them? Please explain your concept a little more.

Thanks, n Regards,
Roy.

Now, the doubt is how GUI reflecting those files properly?
Why don't you simply try it?

You can iterate over your files and “extract” the digits, with a regex for example. Then you format the numbers smaller than 10 with leading zeros, and the sorting will be correct even in the GUI.
BTW: the sorting in the GUI depends on the Windows version and the corresponding setting. You can read a little more about it here: Numerical File Name Sorting vs. Classic Literal Sorting.
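
A rough sketch of the regex idea, working on plain strings only. The sample names and the variable names are assumptions for illustration; to change real files you would feed the new names to Rename-Item yourself (ideally with -WhatIf first).

```powershell
# Sample names; in practice these would come from Get-ChildItem
$names = '1 incident.mp4', '10 Action.mp4', '2 report.mp4'

# Length of the longest numeric prefix, so every number is padded to the same width
$width = [int]($names |
    ForEach-Object { [regex]::Match($_, '^\d+').Value.Length } |
    Measure-Object -Maximum).Maximum

$padded = $names | ForEach-Object {
    if ($_ -match '^(\d+)(.*)$') {
        # $Matches[1] holds the digits, $Matches[2] the rest of the name
        $Matches[1].PadLeft($width, '0') + $Matches[2]
    }
    else {
        $_   # names without a numeric prefix are left untouched
    }
}

$padded | Sort-Object   # 01 incident.mp4, 02 report.mp4, 10 Action.mp4
```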

You, or whoever is creating these files, need to rethink the file naming scheme and pad the numbers with leading zeros when the files are created, rather than forcing you to deal with this afterwards. It just leads to a bunch of unnecessary string gymnastics. Just as coding / scripting has standards that should be followed, file naming is part of this as well.

The sorting you are seeing is not a PS issue. If you opened this same list in MS Excel, you’d get the same thing. Strings are always sorted by character representation, and 1 and 01, etc., are of course different. So, to sort correctly, the leading part must have the same number of characters: in your case 001…014, for example, or whatever the max number span might be.

So, either these files are properly formed when they are created, or you are going to have to make a far more cumbersome effort: inline code (leading-zero padding using PadLeft on the string, or using a custom format string after converting the number string to an integer), or renaming the files on disk to get what you are after.

Also, what you posted has leading spaces. If that is really the case on the file system, rather than a bad copy / paste here, you’ll have to deal with that as well, using Trim() on the string. Again, a common naming taxonomy would make this moot.

IMHO, I’d just get the owner to name the files properly on creation, or get permission to rename them if they choose not to. It will make it far easier for you to use the normal filesystem cmdlets to act on the files.

As far as the string gymnastics go, this is the sort of stuff you are getting yourself into if you do not address this at its root cause.

I am assuming from what you posted that a list like this is what you are working with. If so, it also has leading spaces, so you’ll have to deal with those first.

# Trim leading spaces
Clear-Host
((Get-Content -Path '.\MusicData.txt').Trim()) | Sort-Object

# Results

1 incident.mp4
10 Action.mp4
11 Isolation.MP4
12 Interogation.mp4
13 Decision Making.MP4
14 Final result.mp4
2 report.mp4
3 filing.mp4
4 investigation.mp4
5 report collection.mp4
6 examine.mp4
7 report.mp4
8 Rep Analysis.mp4
9 case study.mp4

Then you can use .PadLeft() or custom formatting to add leading zeros before you sort.
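
For comparison, a small sketch of the two padding options side by side. The value '7' is just a sample; both lines should give the same four-character string.

```powershell
$number = '7'

# String method: pads the text itself, no conversion needed
$viaPadLeft = $number.PadLeft(4, '0')

# Format operator: the D4 specifier needs a real integer, hence the [int] cast
$viaFormat = '{0:D4}' -f [int]$number

$viaPadLeft   # 0007
$viaFormat    # 0007
```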

# Split on the first space and pad the number line
Clear-Host
((Get-Content -Path '.\MusicData.txt').Trim() -split " ",2) | 
%{If ($_ -match '^\d'){"{0:D4}" -f [int]$_}} | Sort-Object

# Results

0001
0002
0003
0004
0005
0006
0007
0008
0009
0010
0011
0012
0013
0014

Now for the string gymnastics, which can be avoided if a standard naming construct is used, or if you just rename the files before doing filesystem stuff with them.

# Putting it all together
#    Collect the file names. Note I am using a file name list, since of course I don't have these files and have no real reason to create them
#    Trim any leading spaces
#    Parse the number in the string and zero-pad it to 4 digits (using custom formatting vs PadLeft), so this supports 0 - 9999
#    Replace the number in the line with the padded one, keeping the remaining text on the same line.
Clear-Host
($MediaFiles = ForEach($Line in ((Get-Content -Path '.\MusicData.txt').Trim()))
{$Line -replace '^\d*',("{0:D4}" -f ("{0:D4}" -f ([int]($Line -split ' ',2)[0])))}) | Sort-Object

# Results 

0001 incident.mp4
0002 report.mp4
0003 filing.mp4
0004 investigation.mp4
0005 report collection.mp4
0006 examine.mp4
0007 report.mp4
0008 Rep Analysis.mp4
0009 case study.mp4
0010 Action.mp4
0011 Isolation.MP4
0012 Interogation.mp4
0013 Decision Making.MP4
0014 Final result.mp4

There may be far more elegant ways of dealing with this. Yet my original opinion still stands: save yourself the unnecessary headaches you are experiencing now, judging by this thread thus far.

Anyway HTH

There is another way where you do not have to change the file names, by using an expression for the sort.

Sample Code:

$files = @'
Name
1 incident.mp4
10 Action.mp4
11 Isolation.MP4
12 Interogation.mp4
13 Decision Making.MP4
14 Final result.mp4
2 report.mp4
3 filing.mp4
4 investigation.mp4
5 report collection.mp4
6 examine.mp4
7 report.mp4
8 Rep Analysis.mp4
9 case study.mp4
'@ | ConvertFrom-Csv

$files | Sort-Object -Property @{ Expression = { [int]($_.Name -split ' ')[0] } }

Result:

Name                   
----                   
1 incident.mp4         
2 report.mp4           
3 filing.mp4           
4 investigation.mp4    
5 report collection.mp4
6 examine.mp4          
7 report.mp4           
8 Rep Analysis.mp4     
9 case study.mp4       
10 Action.mp4          
11 Isolation.MP4       
12 Interogation.mp4    
13 Decision Making.MP4 
14 Final result.mp4    

Thanks, Postanote, It really helps me.
The thing is, these files are placed in a folder, and there are multiple folders holding the same type of files for different scenarios. Everything was fine; now there is a request to put all the files from those different folders into a single folder with some predefined prefix.

So I tried to achieve that with PowerShell. If there are more than 9 files in a folder, that is where the problems start.

Two things are not clear in the line below. Why do you use the formatting twice? And what is the meaning of the comma after the regular expression?
{$Line -replace '^\d*',("{0:D4}" -f ("{0:D4}" -f ([int]($Line -split ' ',2)[0])))}) | Sort-Object

Thanks again for your contribution.

Regards,
Roy.

@Christian Sandfeld

See, another elegant / simpler solution.
However, @Christian, @Sankhadip is pulling the files directly from disk, not from a text file as I did or from a here-string as you are doing.
So, @Sankhadip would have to add the header ‘Name’ dynamically to the Get-ChildItem output, which you do not show in your sample.

Either do this, and add ‘Name’ to the variable content as the first entry, ahead of the Get-ChildItem results …

($Files = (Get-ChildItem -Path D:\Temp).Name)

… or do this, and deal with the header.

($Files = Get-ChildItem -Path D:\Temp | Select Name)

So, taking the former

Clear-Host

# Start with the CSV header, then append one file name per line
$MediaFiles = "Name"

ForEach($Line in (Get-ChildItem -Path D:\Temp -File).Name)
{$MediaFiles = $MediaFiles + "`n$Line" }

$MediaFiles

# Results

Name
1 passwordchangelog.txt
10 passwordchangelog.txt
11 passwordchangelog.txt
4 passwordchangelog.txt


$MediaFiles | ConvertFrom-CSV | Sort-Object -Property @{ Expression = { [int]($_.Name -split ' ')[0] } }

# Results

Name                    
----                    
1 passwordchangelog.txt 
4 passwordchangelog.txt 
10 passwordchangelog.txt
11 passwordchangelog.txt

and taking the latter

Get-ChildItem -Path D:\Temp -File | 
Sort-Object -Property @{ Expression = { [int]($_.Name -split ' ')[0] } } | 
Select Name

# Results
Name                    
----                    
1 passwordchangelog.txt 
4 passwordchangelog.txt 
10 passwordchangelog.txt
11 passwordchangelog.txt
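
One thing to watch for: the [int] cast cannot succeed for a name that has no leading number. A possible way around that, sketched here on plain strings with a made-up 'notes.txt' entry, is to fall back to a large sort key for such names. With real file objects from Get-ChildItem you would test $_.Name instead of $_.

```powershell
# 'notes.txt' is a hypothetical name without a numeric prefix
$names = 'notes.txt', '10 Action.mp4', '2 report.mp4', '1 incident.mp4'

$sorted = $names | Sort-Object -Property @(
    # Primary key: the numeric prefix, or [int]::MaxValue when there is none
    @{ Expression = { if ($_ -match '^\s*(\d+)') { [int]$Matches[1] } else { [int]::MaxValue } } }
    # Secondary key: the name itself, to break ties
    @{ Expression = { $_ } }
)

$sorted   # 1 incident.mp4, 2 report.mp4, 10 Action.mp4, notes.txt
```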

@Sankhadip, as for…

Two things are not clear in the line below. Why do you use the formatting twice? And what is the meaning of the comma after the regular expression?
{$Line -replace '^\d*',("{0:D4}" -f ("{0:D4}" -f ([int]($Line -split ' ',2)[0])))}) | Sort-Object

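Looking at it again, the second "{0:D4}" -f is not actually needed: the inner call already returns the padded string, and formatting that string again leaves it unchanged, so one call is enough. As for the commas: the one after the regular expression separates the -replace pattern from the replacement text, and the one in the split separates the delimiter from the maximum number of substrings to return. For more on splitting, see this article.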
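
To make the pieces of that one-liner easier to see, here is a small sketch that takes it apart on a single sample line. The variable names are only for illustration. As far as I can tell, applying the D4 format to the already padded string changes nothing, since D4 only applies to integer values.

```powershell
$line = '7 report.mp4'

# -split <delimiter>, <max-substrings>: the comma separates the two arguments
$parts = $line -split ' ', 2     # '7' and 'report.mp4'

# -replace <pattern>, <replacement>: again the comma separates the two arguments
$once = $line -replace '^\d+', ('{0:D4}' -f [int]$parts[0])

# Formatting the padded string a second time returns the same string
$twice = '{0:D4}' -f ('{0:D4}' -f [int]$parts[0])

$once    # 0007 report.mp4
$twice   # 0007
```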

Using the Split Method in PowerShell

Split on an array of strings with options
The option to specify an array of strings to use for splitting a string offers a lot of possibilities. The StringSplitOptions enumeration also offers a way to control the return of empty elements. The first thing I need to do is to create a string.

See also…

PowerShell Sort-Object gotcha’s

Thanks, Postanote and Christian,

@christian, I saw your post, but it was not working as you described. After the clarification from postanote, and with the help of the steps mentioned, it’s now working. It really is the easiest one. Thanks a lot.

From all of you and from this thread, I learned a lot. Thank you guyz. Hats off to all of you. :)