Tag Archives: Sorting

Natural sort in Javascript

Sorting strings naturally/numerically in Javascript is very simple. You just need to remember the existence of the Intl.Collator… which I keep forgetting…

You simply create one and use its compare function:

const collator = new Intl.Collator(undefined, {
    usage: 'sort',
    numeric: true
})

const simple = ['a1', 'a12', 'a2']
simple.sort(collator.compare)
// => ['a1', 'a2', 'a12']

const objects = [{"id": "a1"}, {"id": "a12"}, {"id": "a2"}]
const by = (key) => (a, b) => collator.compare(a[key], b[key])
objects.sort(by('id'))
// => [{id: 'a1'}, {id: 'a2'}, {id: 'a12'}]

C#: Ordering by a list of ordered predicates

In a project at work we were going to merge a bunch of PDfs and Word documents into a single PDF. The ordering irrelevant except that certain files had to be before all the others. Solved it initially like this:

    var filesInOrder = GetFiles()
        .OrderByDescending(x => x.Filename.StartsWith("ProjectDescription_"))
        .ThenByDescending(x => x.Filename.StartsWith("Budget_"))
        .ThenByDescending(x => x.Filename.StartsWith("CV_"))
        .ToArray();

Found it a bit ugly though, and decided to ask a question about other ways on StackOverflow. Got several interesting answers and inspired by those and some further thinking I tried to make a generic solution myself, which I thought I could also blog here so I definitely know where to find it if I ever need it again…

My solution

public class OrderedPredicateComparer<T> : IComparer<T>
{
    private readonly Func<T, bool>[] ordinals;
    public OrderedPredicateComparer(IEnumerable<Func<T, bool>> predicates)
    {
        ordinals = predicates.ToArray();
    }

    public int Compare(T x, T y)
    {
        return GetOrdinal(x) - GetOrdinal(y);
    }

    private int GetOrdinal(T item)
    {
        for (int i = 0; i < ordinals.Length; i++)
            if (ordinals[i](item))
                return i - ordinals.Length;
        return 0;
    }
}

One issue here might be that the predicates will be called several times per item, which could be bad if working on a huge list. Haven’t really benchmarked it though so could be fine for all I know. For smaller uses it shouldn’t matter much either way. Very curious to know about ways to optimize this though, so do let me know in the comments below if you have any good ideas!

The nice thing about it being an IComparer is that you could push this into both OrderBy and ThenBy, and also use it in for example ordered dictionaries, priority queues, etc, and since it uses a list of fully generic predicates you could order things by pretty much anything with a yes/no answer 🙂

Usage

var ordering = new Func<string, bool>[]
    {
        x => x.StartsWith("ProjectDescription_"),
        x => x.StartsWith("Budget_"),
        x => x.StartsWith("CV_"),
    };

var files = GetFiles()
    .OrderBy(x => x.Filename, new OrderedPredicatesComparer<string>(ordering))
    .ToArray();

To make the final code even cleaner the ordering could be encapsulated in a sublcass like following, which is what I did in my actual code too:

public class MySpecificOrdering : OrderedPredicatesComparer<string>
{
    private static readonly Func<string, bool>[] order = new Func<string, bool>[]
        {
            x => x.StartsWith("ProjectDescription_"),
            x => x.StartsWith("Budget_"),
            x => x.StartsWith("CV_"),
        };

    public MySpecificOrdering() : base(order) {}
}

var files = GetFiles()
    .OrderBy(x => x.Filename, new MySpecificOrdering())
    .ToArray();

C#: Natural sorting

When you create an application that displays data in lists or tables, you often run into the problem of sorting. When dealing with only numbers it isn’t a big deal, but when sorting text it can be. Regular sorting is often done by alphanumerically, which means that ‘bear’ comes before ‘cat’ and ‘5’ comes before ‘7’. The problem is that this is done letter by letter, which works for most of the time, except when you get numbers in with your text. Then you end up with for example ‘2’ coming after ’10’, since ’10’ starts with a ‘1’ which comes before ‘2’. The solution to this is something called natural sorting.

I won’t write a lot about that here, but just say that it tries to sort things the way humans do. Anyways, below you find a C# class that handles this for you. I put it together by looking around and taking some bits and pieces from here and there, so I can’t really take credit for it. I only post it here so that I won’t lose it, cause it was really helpful.

The class uses some built-in sorting functions in windows and implements the IComparer interface. It can for example be used with the OrderBy extension methods and the Sort methods of List.

using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using System.Security;

namespace Geekality
{
    public sealed class NaturalStringComparer : IComparer<string>
    {
        private readonly int modifier = 1;

        public NaturalStringComparer(bool descending)
        {
            if (descending)
                modifier = -1;
        }

        public NaturalStringComparer()
            :this(false) {}

        public int Compare(string a, string b)
        {
            return SafeNativeMethods.StrCmpLogicalW(a ?? "", b ?? "") * modifier;
        }
    }

    public sealed class NaturalFileInfoComparer : IComparer<FileInfo>
    {
        public int Compare(FileInfo a, FileInfo b)
        {
            return SafeNativeMethods.StrCmpLogicalW(a.Name ?? "", b.Name ?? "");
        }
    }

    [SuppressUnmanagedCodeSecurity]
    internal static class SafeNativeMethods
    {
        [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
        public static extern int StrCmpLogicalW(string psz1, string psz2);
    }
}