command line fun – listing and sorting

Linux is not Unix, yet the two are very similar.  They are so similar that I rarely notice the difference, even when writing scripts.  I needed a list of files sorted by size, so I tried the usual command.

ls -lS

However, there is a small difference when sorting files by size, at least on Solaris.  I was able to come up with a solution, but it was a bit more convoluted.

ls -l | sort -n -k5

This actually worked just fine, despite how odd it looked to me.  However, the sort command can be used to do a lot more than just sorting file sizes.
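For comparison, here is a minimal sketch of sorting on the size column explicitly rather than on the whole line.  It assumes the size is the fifth whitespace-separated field, as in typical ls -l output.  The listing is made-up sample data, not output from a real directory.

```shell
# Made-up lines in the shape of "ls -l" output (field 5 is the size).
listing='-rw-r--r-- 1 user staff 2527 May 30 2012 notes.txt
-rw-r--r-- 1 user staff 47 Sep 27 2011 todo
-rw-r--r-- 1 user staff 13752 May 31 2012 report.log'

# -k5 starts the sort key at field 5; -n compares that key as a number.
by_size=$(printf '%s\n' "$listing" | sort -n -k5)

printf '%s\n' "$by_size"
```

With a key on the size field, the smallest file comes first.  Adding -r would put the largest first.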

Most of the time I use the sort command in the simplest way: to take a list of values and produce a sorted list, usually to be used later as input.

ls -l | nawk '{print ($6 " " $7 " " $8)}' | sort

May 30 2012
May 30 2012
May 30 2012
May 30 2012
May 30 2012
May 30 2012
May 30 2012
May 30 2012
May 31 2012
May 31 2012
May 31 2012
Sep 27 2011
Sep 27 2011
Sep 27 2011
Sep 27 2011
Sep 30 2011

The result depends on the actual input; in this case the values don’t form a unique list of dates.  It is possible to write some sort of control break logic to deal with the changing data, but that is rather heavy programming for a shell script.  It is easier to filter out the duplicate values while sorting.

This filtering can be done by passing the “-u” option to the sort command.  The sort command then performs the control break logic itself and removes the duplicates.

ls -l | nawk '{print ($6 " " $7 " " $8)}' | sort -u
  
May 30 2012
May 31 2012
Sep 27 2011
Sep 30 2011
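To illustrate the trade-off, here is a rough sketch that does the control break by hand and compares it with sort -u.  The dates are made-up sample values, and plain awk is used here on the assumption that it behaves like nawk for this one-liner.

```shell
# Made-up input standing in for the date column of "ls -l".
dates='May 30 2012
May 31 2012
May 30 2012
Sep 27 2011
May 30 2012'

# Control break by hand: sort first, then print a line only when it
# differs from the previous line.
by_hand=$(printf '%s\n' "$dates" | sort | awk 'NR == 1 || $0 != prev { print } { prev = $0 }')

# The same filtering left to sort itself.
by_sort=$(printf '%s\n' "$dates" | sort -u)

printf '%s\n' "$by_sort"
```

Both variables should end up holding the same three dates, which is why letting sort do the work is the simpler choice.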

This works great for some types of input, but a plain sort doesn’t work quite as well with numeric values.

ls -l | nawk '{print ($5 )}' | sort 

13752
1423360
2527
2918400
4096
4096
4096
4096
4096
4096
4096
47
55
577
616
686080

The sorted list doesn’t make much sense unless you consider that the list has been sorted as alphanumeric data.  The values are compared character by character, so 1423360 is listed before 2527 because ‘1’ sorts before ‘2’, even though the whole number is much larger.

This problem was obviously foreseen, as the “-n” option causes the values to be sorted as numeric instead of alphanumeric.

ls -l | nawk '{print ($5 )}' | sort -n 

47
55
577
616
2527
4096
4096
4096
4096
4096
4096
4096
13752
686080
1423360
2918400
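The options can also be combined.  As a small sketch, using made-up sizes rather than real ls output, the following contrasts the plain sort with a numeric sort that also drops the repeated values:

```shell
# Made-up file sizes, including a repeated value.
sizes='4096
47
13752
4096
2527
4096'

# Plain sort compares character by character.
plain=$(printf '%s\n' "$sizes" | sort)

# -n compares the values as numbers, -u keeps one copy of each.
numeric_unique=$(printf '%s\n' "$sizes" | sort -n -u)

printf '%s\n' "$numeric_unique"
```

The plain sort puts 13752 first, while the numeric sort puts 47 first and lists 4096 only once.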

This covers most of the situations that I have needed the sort command for, but it only scratches the surface of how flexible this command is.  It is possible to sort not only simple lists but also delimited lists.  The sorting can be done on one or more columns, and the order can be reversed if necessary.

sort argument   Description
-u              only unique values
-n              treat values as numeric
-r              sort values in reverse order
-t              field delimiter
                This is used when the input contains multiple fields.
-k<#>           when multiple fields are part of the input,
                the field “#” will be the field that is sorted.
                The key runs from field “#” to the end of the line;
                -k<#>,<#> limits it to that one field.
                It is possible to add multiple “-k” parameters to sort
                on multiple fields.  The fields are sorted in the order
                that they show up on the command line.
                (-k2 -k1 sorts first on field 2 and then, within equal
                values of field 2, on field 1)
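As a minimal sketch of the delimiter and key options working together, the following sorts some colon-delimited records.  The records and the variable names are invented for illustration.  The n and r letters attached to a key apply to that key only.

```shell
# Made-up records in the form name:age:team.
records='carol:30:ops
alice:25:dev
bob:30:dev
dave:25:ops'

# Field 2 as a number first, then field 1 to order equal ages.
by_age_then_name=$(printf '%s\n' "$records" | sort -t ':' -k 2,2n -k 1,1)

# Same keys, but with field 2 reversed so the highest age comes first.
oldest_first=$(printf '%s\n' "$records" | sort -t ':' -k 2,2nr -k 1,1)

printf '%s\n' "$by_age_then_name"
printf '%s\n' "$oldest_first"
```

Writing the keys as 2,2 and 1,1 keeps each key to a single field, so the second key only decides the order among records with the same age.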