Explanation of quicksort in APL

Question

I am attempting to understand the classic quicksort in APL:

Q←{1≥≢⍵:⍵ ⋄ S←{⍺⌿⍨⍺ ⍺⍺ ⍵} ⋄ ⍵((∇<S)⍪=S⍪(∇>S))⍵⌷⍨?≢⍵}

There are some things I don't understand, and some stylistic choices that bother me, so I'm going to list all of them out. I hope someone can explain them to me.

I understand that inside a { } defn, ⍺ is the left argument, and ⍵ is the right argument. What is ⍺⍺ in S←{⍺⌿⍨⍺ ⍺⍺ ⍵}? Similarly, is there an ⍵⍵? Does the ⍺ inside the S refer to the left argument of S or the left argument of Q?

My guess is that ⍺ inside the S refers to the left argument of S. The ⍺⍺ refers to the ⍺ of the enclosing function (ie, the ⍺ of Q).

Why the copious use of commute (⍨)? Is the code not far clearer written as:

Q←{1≥≢⍵:⍵ ⋄ S←{(⍺ ⍺⍺ ⍵)⌿⍺} ⋄ ⍵((∇<S)⍪=S⍪(∇>S))⍵[?≢⍵]}

The only use I can think of using commute is to reduce the use of brackets () and [], but that hardly seems worth the loss in legibility. Am I missing something of the "APL way" here?

This is not actually performing quicksort, is it? Quicksort is defined as being in-place. However, my understanding of the APL semantics is that this piece of code in fact builds new arrays on the recursive sub-calls, and the concatenates them using ⍪. Indeed, this is the same criticism that is levelled at Haskell's quicksort. Is there something I am missing in the APL semantics that informs that this operation is done "in-place"? Do note that I am not interested in "sufficiently smart compiler" arguments, since array analysis is fundamentally challenging. If the APL compiler does actually turn this into an in-place algorithm, I would greatly value details on how it manages to perform this analysis --- that's quite an accomplishment!
Why the use of ≢⍵ to find the dimension size? Why not ⍴⍵? In general, I find that people use ≢ over ⍴ to query for the size along the outermost dimension, even when the only case where the function work is in 1D. Again, I presume there is something in the APL way that I am missing.

Thanks a lot.

score 5 · Accepted Answer · answered Mar 06 '20 at 07:56

I understand that inside a { } defn, ⍺ is the left argument, and ⍵ is the right argument. What is ⍺⍺ in S←{⍺⌿⍨⍺ ⍺⍺ ⍵}? Similarly, is there an ⍵⍵?

S←{⍺⌿⍨⍺ ⍺⍺ ⍵} is called a dop. Similar to a dfn being a user-defined function, dop is a user-defined operator which behaves like ¨, ⍨, or ∘.

Summary of its semantics:

If a dop mentions only ⍺⍺ (not ⍵⍵), it becomes a monadic operator, which you can use as x (m S) y.
If a dop mentions ⍵⍵, it becomes a dyadic operator, which you can use as x (m S n) y.
In both cases, ⍺⍺ (which will have the value of m) and ⍵⍵ (which will have that of n) can be an array or a function.
You can also decide to not use ⍺ in the dop's body, in which case you can call it as either (m S) y or (m S n) y, omitting the left argument.
m is called the left operand, and n is the right operand. These are different things from a left argument (x) and right argument (y).

In your example, S mentions only ⍺⍺, so it is called as x (m S) y. If you call S like 1 2 3 >S 2, it will evaluate to 1 2 3⌿⍨1 2 3 > 2, which will be a single 3.

Does the ⍺ inside the S refer to the left argument of S or the left argument of Q?

Inside the body of S, everything made of ⍺ and ⍵ characters refers to the argument/operand of S. The original arguments of Q are invisible (unless they are assigned to a variable first, in which case they are visible as the variable name).

Why the copious use of commute (⍨)?

I believe it is mostly a stylistic choice. I also prefer to write parenthesized code instead of using it in production code, except when the usage is easily recognizable as an APL idiom. I do write e.g. 3÷⍨whatever instead of (whatever)÷3 for division by constant.

This is not actually performing quicksort, is it?

You're correct. As you mentioned already, Quicksort is designed to run in-place to truly become Quick (TM). APL can do memory pre-allocation and array sharing to reduce some of the memory copy and allocation, but at least some copies are unavoidable when the three sub-arrays (having elements less than / equal to / greater than the pivot) are created and later concatenated.

One thing to note is that, unlike Haskell, APL does have in-place assignment which looks like x[i]←v. If one were to implement Quicksort properly in APL, one would have to code like one would code in C (passing indices to recursive calls and such).

Why the use of ≢⍵ to find the dimension size? Why not ⍴⍵?

≢ is called "tally" while ⍴ is called "shape". ≢ always returns a scalar value, while ⍴ returns a vector (it would be a length one vector if given a vector argument). While a scalar and a length-one vector behave the same in most scenarios, they are different things, e.g. 1≡,1 is false.

I believe it's a good habit to differentiate the two, favoring a scalar over a length-one vector whenever possible. One notable exception is when you need an explicitly enclosed array (a scalar can't be enclosed).

Explanation of quicksort in APL

1 Answers1