Optimized Quick sort in Python

Question

I have read that Quick sort is one of the most efficient sorting algorithms (if not THE most), but I don't understand why it is so slow.

Before you try to close as duplicate, of course I have read this question, but it was posted almost a decade ago, and none of the answers were optimized.

I tested code from many answers and found them all inefficient, and I implemented a version of mine own with rudimentary optimization:

import random
from collections import deque

def qsort(arr):
    if len(arr) <= 1:
        return arr
    else:
        return qsort([x for x in arr[1:] if x < arr[0]]) + [arr[0]] + qsort([x for x in arr[1:] if x >= arr[0]])

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    lt = deque()
    ge = deque()
    e0 = arr.popleft()
    for e in arr:
        if e < e0:
            lt.append(e)
        else:
            ge.append(e)
    return quick_sort(lt) + deque([e0]) + quick_sort(ge)

def quick_sort_helper(arr):
    return quick_sort(deque(arr))

order = list(range(256))
chaos = random.choices(range(666), k=256)

In [2]: %timeit qsort(chaos)
480 µs ± 2.83 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [3]: %timeit qsort(order)
4.05 ms ± 45.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]: %timeit quick_sort_helper(chaos)
394 µs ± 9.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [5]: %timeit quick_sort_helper(order)
3.06 ms ± 57.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

They are all extremely slow, but my optimization did help a little. (I have done many other tests, they are not included for brevity)

For comparison, here is the performance of a version of insertion sort I wrote in under 3 minutes:

def bisect_right(a, x):
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo

def insertion_sort(arr):
    out = deque()
    for e in arr:
        out.insert(bisect_right(out, e), e)
    return out

In [12]: %timeit insertion_sort(chaos)
311 µs ± 6.28 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [13]: %timeit insertion_sort(order)
267 µs ± 7.93 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

How to optimize the algorithm so as to reduce recursion and identify sorted sub sequences to eliminate unnecessary computation?

The code from the answer linked in a comment below is surprisingly NOT faster than mine:

In [22]: %timeit qsort(chaos, 0, 255)
391 µs ± 6.95 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

It seems to perform better in the worst case, but that's about it.

In [28]: %timeit qsort(order, 0, 255)
360 µs ± 6.43 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

As for comparing to the native `sorted` or `list.sort`, you will never get close to that. The sorting there is implemented and optimized in C. — user2390182, Jul 05 '23 at 13:04
Quick sort is an *inplace* sorting algorithm, but you have written an algorithm that creates deques and new lists allover the place. That's not how to do it. — trincot, Jul 05 '23 at 13:05
Pivoting on the first element is particularly bad for already-sorted data: see https://stackoverflow.com/a/53734274/765091 — slothrop, Jul 05 '23 at 13:06
There are many variations of implementations of the Quicksort algorithm. Some will be more efficient than others. Python's built-in sort is an implementation of a TimSort which is most likely to be quicker than a pure Python implementation of Quicksort. The question here is... Why would you want to write your own pure-Python implementation when the language already offers you a highly optimised solution — DarkKnight, Jul 05 '23 at 13:07
@ΞένηΓήινος if your focus is on understanding your implementation's performance for already-sorted relative to random lists (as opposed to your implementation relative to native Python .sort()), maybe edit the title to reflect that specifically. — slothrop, Jul 05 '23 at 13:11
Coincidentally brought up by the CS Community Bot: [When does Quicksort go from O(n log n) to O(n^2)?](https://cs.stackexchange.com/q/160503) — greybeard, Jul 05 '23 at 13:33

score 0 · Answer 1 · answered Jul 05 '23 at 15:52

As @trincot said, your implementation is not optimized in any way. A short code doesn't mean an efficient one. You recursive call is:

qsort([x for x in arr[1:] if x < arr[0]]) + [arr[0]] + qsort([x for x in arr[1:] if x >= arr[0]])

in which you construct auxiliary lists and then concatenate them. This is very inefficient. Not mentioning the use of the dequeue.

Quicksort is fast if implemented in-place, that means without the use of any auxiliary structure, just by moving data inside the array.

Even your Insertion Sort is not good, it must also be in-place.

To compare sorting algorithm, at least sort huge arrays, let say with more than hundred thousand elements...

Optimized Quick sort in Python

1 Answers1