Thursday, August 8, 2013

3/7: Algorithmic performance


This article is part of a series on performance guidelines.


We shall walk through the steps needed to choose the right algorithm for a particular task. These steps are ordered from the most to the least important. Although written with sorting algorithms in mind, most of these topics are relevant regardless of the algorithm’s nature.

Temporal complexity

Above all other considerations, the search for the right algorithm for a particular problem should focus on worst-case temporal computational complexity. Indeed, in real-world large-scale applications, the mere possibility that a contrived input vector may bring the whole system down to a crawl is simply not acceptable, no matter how unlikely. Every mainstream algorithm has been studied thoroughly and its worst-case complexity made known, so there is really no excuse for an uninformed choice.
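To make the “contrived input” point concrete, here is a minimal sketch (mine, not from the series) of a naive quicksort that always picks the last element as pivot. Feeding it an already sorted array drives it straight into its O(n^2) worst case; the input size is kept small here only because the degenerate recursion is also n levels deep.

    // Naive quicksort (Lomuto partition, last element as pivot).
    // An already sorted input makes every partition maximally unbalanced,
    // so the work degrades to roughly n^2/2 comparisons.
    #include <cstddef>
    #include <numeric>
    #include <utility>
    #include <vector>

    void naive_quicksort(std::vector<int>& a, std::ptrdiff_t lo, std::ptrdiff_t hi) {
        if (lo >= hi) return;
        int pivot = a[hi];
        std::ptrdiff_t i = lo;
        for (std::ptrdiff_t j = lo; j < hi; ++j)
            if (a[j] < pivot) std::swap(a[i++], a[j]);
        std::swap(a[i], a[hi]);
        naive_quicksort(a, lo, i - 1);
        naive_quicksort(a, i + 1, hi);
    }

    int main() {
        // Contrived (already sorted) input: kept small because the recursion
        // depth equals n here; scale it up and the quadratic cost explodes.
        std::vector<int> v(5000);
        std::iota(v.begin(), v.end(), 0);
        naive_quicksort(v, 0, static_cast<std::ptrdiff_t>(v.size()) - 1);
    }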

Locality

Of course, there’s a long distance from theory to practice, and one often finds that algorithms exhibiting very good asymptotic complexity turn out to perform very poorly once implemented in a specific real-world scenario. We’ll see a good practical example of this effect in our last chapter.
For example, when working with a relatively small problem size N, the choice between quicksort and heapsort isn’t exactly trivial. At first sight, heapsort should win hands-down, since its O(n log n) worst-case temporal complexity is much better than quicksort’s O(n^2). But another aspect comes into play: spatial and temporal locality. Quicksort tends to work for long periods on relatively small patches of memory, which means that a cached memory bank, once loaded, will be thoroughly exploited by the algorithm before it moves its focus elsewhere in the input array. In contrast, heapsort accesses memory all over the place, with no apparent rhyme or reason; it’s a nightmare from a caching standpoint. This has such a great impact on performance that quicksort will end up being faster. Of course, once one enters the realm of large data sets, the n-squared worst case quickly rears its ugly head and quicksort soon loses its advantage.
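Here is a rough benchmarking sketch of the comparison above; the sizes, the RNG seed and the use of the quicksort-based std::sort (typically introsort) as a stand-in for plain quicksort are my own assumptions, so treat the numbers it prints as indicative only.

    // Times a quicksort-based sort against an explicit heapsort
    // (std::make_heap + std::sort_heap) on identical random data.
    #include <algorithm>
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <random>
    #include <vector>

    static double time_ms(void (*sorter)(std::vector<int>&), std::vector<int> data) {
        auto t0 = std::chrono::steady_clock::now();
        sorter(data);                       // operates on a private copy
        auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(t1 - t0).count();
    }

    int main() {
        std::mt19937 rng(42);
        for (std::size_t n : {1u << 12, 1u << 16, 1u << 20, 1u << 24}) {
            std::vector<int> data(n);
            for (int& x : data) x = static_cast<int>(rng());

            double quick = time_ms([](std::vector<int>& v) {
                std::sort(v.begin(), v.end());
            }, data);
            double heap = time_ms([](std::vector<int>& v) {
                std::make_heap(v.begin(), v.end());
                std::sort_heap(v.begin(), v.end());
            }, data);
            std::printf("n=%zu  quicksort-based: %.2f ms  heapsort: %.2f ms\n",
                        n, quick, heap);
        }
    }
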
The same can be said of radix sort: its linear temporal complexity only pays off for significantly large data sets, due to the fairly large constant multiplicative factor hiding behind that linear complexity, coupled with the algorithm being very cache-unfriendly.
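For reference, a minimal LSD radix sort sketch for unsigned 32-bit keys (the byte-per-pass layout is an illustrative choice): the four counting passes and the auxiliary buffer of the same size as the input are exactly the constant factor and cache pressure mentioned above.

    // LSD radix sort on 32-bit unsigned keys, one byte per pass.
    #include <array>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    void radix_sort_u32(std::vector<std::uint32_t>& a) {
        std::vector<std::uint32_t> buf(a.size());           // O(n) scratch buffer
        for (int shift = 0; shift < 32; shift += 8) {       // 4 passes, 8 bits each
            std::array<std::size_t, 257> count{};           // histogram, offset by one
            for (std::uint32_t x : a) ++count[((x >> shift) & 0xFF) + 1];
            for (std::size_t i = 1; i < count.size(); ++i) count[i] += count[i - 1];
            for (std::uint32_t x : a) buf[count[(x >> shift) & 0xFF]++] = x;  // stable scatter
            a.swap(buf);                                    // output becomes next pass's input
        }
    }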

Space complexity

Once we’ve decided on a subset of algorithms, the next factor that will help us make the final decision is space complexity. When space is at a premium, we may not be able to afford the extra temporary storage that many sorting algorithms use. For example, a cache-optimized mergesort is by far the best jack-of-all-trades sorting algorithm, exhibiting superb performance together with sorting stability (a sorting algorithm is said to be stable when the original relative order of identical keys is preserved in the sorted list). Alas, mergesort needs temporary storage of the same size as the original list, which isn’t much of a problem in this day and age, when CPU RAM is cheap. But on specialized hardware, such as a GPU, memory is still at a premium, which may constrain our choice of algorithm.
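Below is a plain top-down mergesort sketch (my own illustration, not the cache-optimized variant mentioned above), which makes the space cost visible: a single auxiliary buffer of the same size as the input, allocated once and reused by every merge. The merge step also shows why the algorithm is naturally stable.

    #include <cstddef>
    #include <vector>

    // Sorts a[lo, hi) using tmp as scratch space shared by all merges.
    static void merge_sort_rec(std::vector<int>& a, std::vector<int>& tmp,
                               std::size_t lo, std::size_t hi) {
        if (hi - lo < 2) return;
        std::size_t mid = lo + (hi - lo) / 2;
        merge_sort_rec(a, tmp, lo, mid);
        merge_sort_rec(a, tmp, mid, hi);

        // Merge the two sorted halves into tmp, taking from the left half on
        // ties so equal keys keep their relative order (stability).
        std::size_t i = lo, j = mid, k = lo;
        while (i < mid && j < hi) tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
        while (i < mid) tmp[k++] = a[i++];
        while (j < hi)  tmp[k++] = a[j++];
        for (std::size_t p = lo; p < hi; ++p) a[p] = tmp[p];
    }

    void merge_sort(std::vector<int>& a) {
        std::vector<int> tmp(a.size());   // the O(n) extra storage mergesort demands
        merge_sort_rec(a, tmp, 0, a.size());
    }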

Best case performance

In many cases, the specifics of the problem at hand may give us additional guidance in our algorithm choice. For example, if we know that in most cases our lists are nearly sorted, we can gain significant improvements by choosing an algorithm with a linear-time best case for already-sorted lists, such as insertion sort (if our lists are really small) or smoothsort (if locality isn’t a big problem).
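A minimal insertion sort sketch showing where the linear-time best case comes from: on an already sorted list the inner loop exits immediately, so each element costs a single comparison.

    #include <cstddef>
    #include <vector>

    void insertion_sort(std::vector<int>& a) {
        for (std::size_t i = 1; i < a.size(); ++i) {
            int key = a[i];
            std::size_t j = i;
            // Shift larger elements one slot to the right; on sorted input this
            // condition fails at once, giving the O(n) best case.
            while (j > 0 && a[j - 1] > key) {
                a[j] = a[j - 1];
                --j;
            }
            a[j] = key;
        }
    }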

Stability

Stability is often a desirable intrinsic property of a sorting algorithm. In practice, a non-stable algorithm can be made stable by adding a secondary index that is used to break ties between identical keys. While requiring twice as much memory, this addition doesn’t increase the asymptotic temporal complexity, so in many cases it’s an acceptable trade-off.
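A short sketch of that trick (the helper name is mine): each key is paired with its original position, and the position breaks ties, turning any comparison sort, stable or not, into a stable one.

    #include <algorithm>
    #include <cstddef>
    #include <utility>
    #include <vector>

    std::vector<int> stable_by_index(std::vector<int> keys) {
        // Tag every key with its original position (the "secondary index").
        std::vector<std::pair<int, std::size_t>> tagged(keys.size());
        for (std::size_t i = 0; i < keys.size(); ++i)
            tagged[i] = {keys[i], i};

        // Any unstable O(n log n) sort would do here; pairs compare by key
        // first and fall back to the index, so equal keys keep their order.
        std::sort(tagged.begin(), tagged.end());

        for (std::size_t i = 0; i < keys.size(); ++i)
            keys[i] = tagged[i].first;
        return keys;
    }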