# QuickSort: Haskell vs C

I was playing around with some sorting algorithms and decided to compare implementations of the QuickSort algorithm in Haskell and in C.

**C in-place partition implementation**

Starting with C, I implemented an in-place partition algorithm. This means that the program uses no auxiliary memory besides the memory needed to store the array. Here is the definition of QuickSort:

```c
void quicksort(long unsigned int *array, long int min, long int max) {
    long int pivotIndex;
    if (max > min) {
        pivotIndex = partition(array, min, max);
        quicksort(array, min, pivotIndex - 1);
        quicksort(array, pivotIndex + 1, max);
    }
}
```

The basic idea is to select a pivot and move all the numbers smaller than the pivot to the left side of the array and the bigger numbers to the right side of the array. Then, proceed with the same algorithm for both sub-lists.

The partition procedure swaps the elements of the array around according to the pivot:

```c
long int partition(long unsigned int *array, long int min, long int max) {
    long int left = min, right = max;
    long unsigned int pivotValue = array[min];
    while (left < right) {
        /* bound left so it cannot run past the end when every
           element is <= the pivot */
        while (left < max && array[left] <= pivotValue) left++;
        while (array[right] > pivotValue) right--;
        if (left < right) swap(array, left, right);
    }
    array[min] = array[right];
    array[right] = pivotValue;
    return right;
}
```

Finally, a swap procedure:

```c
void swap(long unsigned int *array, long int from, long int to) {
    long unsigned int temp = array[from];
    array[from] = array[to];
    array[to] = temp;
}
```

As one can see, it is a lot of code just to implement the QuickSort algorithm. The execution time is, of course, sensitive to the choice of a good pivot. In order to test it and check its performance, I generated a file with 10,000 random integers using this script. I also include the number of integers at the top of the file to make it easier to allocate memory 🙂 Then I read this file into an array, sort it and write it back to a file. Here is the main function (please note that no error handling was done, to keep the code easier to read):

```c
#include <stdio.h>
#include <stdlib.h>

int main() {
    long int size = 0, min = 0;
    long unsigned int *array;
    long int i, j = 0;
    FILE *fpin = fopen("numbers.txt", "r");
    FILE *fpout = fopen("sorted_numbers.txt", "w");
    char buffer[256];

    // Step 0 - read the first line to get the number of integers
    if (feof(fpin) == 0) {
        fscanf(fpin, "%ld\n", &size);
    }
    array = (long unsigned int *) malloc(size * sizeof(long unsigned int));

    // Step 1 - read the input file into the array
    while (fgets(buffer, 256, fpin) != NULL) {
        array[j++] = strtoul(buffer, NULL, 0);
    }
    fclose(fpin);

    // Step 2 - sort the array
    quicksort(array, min, size - 1);

    // Step 3 - write the array to the output file
    for (i = 0; i < size; i++) {
        fprintf(fpout, "%lu\n", array[i]);
    }
    fclose(fpout);
    return 0;
}
```

Here are the results:

This is really fast! However, it didn’t work on the first try: I had to increase the stack size in order to run it. For that purpose, I used the command *ulimit -S -s unlimited*.

**Haskell implementation**

It is shocking how compact this definition can be in Haskell. Here is my implementation:

```haskell
qsort :: Ord a => [a] -> [a]
qsort [] = []
qsort (x:xs) =
    let smaller = filter (< x) xs
        bigger  = filter (>= x) xs
    in qsort smaller ++ [x] ++ qsort bigger
```

This Haskell implementation is around 85% smaller than the C version!! It was much easier to test, and I ran into almost no errors compared to the C version.
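As a quick sanity check, one can compare `qsort` against the standard library's `Data.List.sort` on a few hand-picked inputs. This is only a minimal sketch of such a test; the input lists are arbitrary choices of mine:

```haskell
import Data.List (sort)

-- The qsort definition from above.
qsort :: Ord a => [a] -> [a]
qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)

-- Sanity check: qsort should agree with the library sort on every input.
main :: IO ()
main = print (all check inputs)
  where
    check xs = qsort xs == sort xs
    inputs   = [[], [1], [3, 1, 2], [5, 5, 5], [9, 8 .. 1]] :: [[Int]]
-- prints True
```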

Here is a main function in order to run it:

```haskell
main = do
    y <- readFile "numbers.txt"
    l <- parseList y
    let s = qsort l
    writeFile "sorted_numbers.txt" (show s)
    print "done."
  where
    parseList :: String -> IO [Integer]
    parseList = readIO
```
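Note that `readIO` expects the whole file to contain a Haskell list literal such as `[3,1,2]`. If `numbers.txt` instead uses the line-based format the C program reads (the count on the first line, then one integer per line), a pure parser along these lines would work (`parseLines` is a name of my own, not from the post):

```haskell
-- Hypothetical variant: parse the line-based format used by the C
-- version (first line is the count, remaining lines are the integers).
parseLines :: String -> [Integer]
parseLines = map read . drop 1 . lines

main :: IO ()
main = print (parseLines "4\n10\n2\n7\n5\n")
-- prints [10,2,7,5]
```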

I compiled this with the command *ghc -O3 -o Sort Sort.hs* and executed it with *time ./Sort +RTS -H64k -k64k*.

Even though the definition is much smaller than the C version, the execution time was not faster:

```
real    0m0.099s
user    0m0.092s
sys     0m0.006s
```

I found it quite inefficient when compared to the C version, so I compiled it with profiling information: *ghc -prof -auto-all -O3 -o Sort Sort.hs*. Here is the result:

As one can see, *qsort* uses few resources compared to the main function, which is much better than I expected.

In conclusion, although the Haskell version is much cleaner and faster to develop and test, it is slower than the C version. I also experimented with 90,000 integers, where the difference is much more evident: the C version achieved 0m0.035s of real execution time while the Haskell version achieved 0m0.635s.

---

Hello,

(I’m French, excuse my English)

Maybe it’s a little late to post a reply (last article = September 26, 2010), but I think you can optimize the Haskell implementation:

```haskell
qSort [] = []
qSort (x:xs) = qSort inf ++ eg ++ qSort sup
  where
    (inf, eg, sup) = sep x xs ([], [x], [])
    sep _ [] acc = acc
    sep mil (y:ys) (i, e, s)
      | y < mil   = sep mil ys (y:i, e, s)
      | y == mil  = sep mil ys (i, y:e, s)
      | otherwise = sep mil ys (i, e, y:s)
```

(I recently discovered Haskell, it's amazing!)

Right. Now try to change it into a random-pivot quicksort (which has an expected run time of O(n log n), instead of the O(n^2) worst case of your implementation).

You will find that changing the C version is just a matter of adding a single line; changing the Haskell version requires changing everything, even the function interface. Again, Haskell requires you to change your function interface in order to make an **implementation** change.
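The interface change described above can be sketched as follows: a pure random-pivot version has to accept and thread a random generator through every call, unlike the pure `qsort` earlier in the post. The name `qsortR` is my own, and the sketch assumes the standard `random` package (`System.Random`) is available:

```haskell
import System.Random (StdGen, mkStdGen, randomR, split)

-- Hypothetical sketch: a random-pivot variant. The signature now takes
-- a StdGen -- this is the interface change the comment refers to.
qsortR :: Ord a => StdGen -> [a] -> [a]
qsortR _ [] = []
qsortR g xs = qsortR gl smaller ++ [p] ++ qsortR gr bigger
  where
    (i, g')  = randomR (0, length xs - 1) g  -- pick a random pivot index
    (gl, gr) = split g'                      -- independent generators for each half
    p        = xs !! i
    rest     = take i xs ++ drop (i + 1) xs
    smaller  = filter (< p) rest
    bigger   = filter (>= p) rest

main :: IO ()
main = print (qsortR (mkStdGen 42) [3, 1, 4, 1, 5, 9, 2, 6 :: Int])
-- prints [1,1,2,3,4,5,6,9]
```

The result is the same for any seed; only the pivot choices (and therefore the running time) depend on the generator.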

So, if you think twice before using Haskell, don’t. Just *never* use it.