Back to my world
To see the formulae, you must download and install the free MathPlayer plugin.
Download MathPlayer

Ordering of permutations

The problem

Imagine a piece of software that maintains a list of arbitrarily rearranged items (of any nature). The ordering of the items is stored as a setting. At some point, a new version comes out and it supports few more items in that list. At upgrade time we would want to retain the existing value of the order setting, but make sure it makes sense in the newly expanded list of items.

For a large list the only viable solution is to store the order in the natural format - as an array (list, collection, string) of item positions. A design like this inevitably needs special processing at the upgrade time - the array of item positions must be expanded to match the size of the item list. However, if the list is small and not likely to grow substantially, a more elegant (if more coding intensive) solution is possible. The order can be stored as an integer number that uniquely identifies a permutation. The seamless upgrade feature dictates that the value of the integer must accommodate the increase of the number of items in such a way that the permutation of the formerly existing items remains intact and the ordering of the newly added items is trivial. The lexicographic ordering, which was used in [1] and forms the basis for the factoradic number system [2] obviously does not fit.

More about the motivating case.

The math

For the rest of the text we'll be using the term "natural numbers" and the letter `NN` for the set of all nonnegative integers, {0, 1, 2, 3, ...}. The suggested implementation is easier and clearer if indices are zero-based. The finite permutations of the said set form a group - let's call it `S_NN`. We're going to use `S_n` (where n is a number) to denote the permutation group of n elements - that is, the permutation group of the set {0, 1,...,n-1}. Note that `S_NN` is not the same as the symmetric group of natural numbers - the latter includes infinite permutations as well, ones that permute an infinite number of integers. `S_NN` can be informally thought of as the countably infinite union of all groups `S_n`.

These groups form a chain of subgroups of `S_NN` in such a way that `S_1 sub S_2 sub S_3 sub ... sub S_NN`. This fact is crucial for the further logic; let's demonstrate it on the example of `S_3` and `S_2`. `S_3` contains the following 6 permutations:
(0,1,2)
(1,0,2)
(0,2,1)
(2,0,1)
(2,1,0)
(1,2,0)

The first two permutations form a subgroup of `S_3` that is isomorphic to `S_2`. They are trivial in their rightmost element - that is, they map 2 to 2, they permute only two numbers. From this standpoint, we can introduce a sligtly sloppy notation that "smaller permutation groups belong to (or "are in") larger permutations groups". Rigorously phrased, that means that for each natural m<n, there's a subgroup in `S_n` that is isomorphic to `S_m` and maps all numbers no less than m to themselves. In the same vein, we'll be talking about "elements of `S_m` that are in `S_n`" - meaning those elements of `S_m` that belong to the subgroup of `S_m` that is isomorphic to `S_n`, and permute only up to n numbers.

Since all permutations in `S_NN` are, by definition, finite, every permutation in `S_NN` has the highest permuted number and belongs to some finite `S_n`. We'll call that number "the degree of the permutation". For example, the degree of permutation (3,4,1,0,5,6) in `S_6` is 4, since it's trivial in the 4th and 5th position. Note that the term "degree of permutation" is used in a different sense in [3].

So, the task at hand, on the algebraic level, is to devise a bijective mapping between `NN` and `S_NN` - i. e. a sequence of finite permutations - in such a way that for every `n in NN` the image of the interval [0, (n-1)!] coincides with `S_n`. The subgroup chain that forms `S_NN` is crucial to the building of that sequence - in some sense, what we're trying to do is ordering of permutations by their degree. Alternatively, we might say we're trying to come up with a bijective mapping between `NN` and `S_NN` under which the images of all integers below n! only permute numbers up to, but not including, n.

Considered mathematically, the said sequence covers the whole countably infinite set of naturals; in real-world computing we're necessarily limited by the specs of the target machine. While there are packages out there to manipulate integers of arbitrarily large size, the implementation below works with plain vanilla int datatype. On a 32-bit machine, that lets us handle permutations of up to 12 items - that's a whole lot of permutations already. A 64-bit integer can handle up to 20 elements.

The sequence

The proposed set of requirements does not yield a unique sequence. The first two elements of the sequence are pretty much set in stone - that's the identity and the only nontrivial permutation of two elements, (1,0). The further order may vary - the proposed sequence and the algorithm were driven mainly by ease of implementation. It also has some neat algebraic properties that will be explored below.

The algorithm itself does not have to be recursive, but it's convenient to describe the sequence in inductive terms. We already know the right sequence for `S_1` and `S_2`. For a given value of n, we'll assume we already have the desired sequence of permutations for the `S_(n-1)` and we'll try to build the sequence for the remainder of `S_n`.

The group `S_n` contains n! permutations. Inductively, we'll assume that the sequence of the first (n-1)! of them is already known - that's the permutations that are in `S_(n-1)`, by our requirements they occupy the first (n-1)! positions. Let's break the permutations of `S_n` into n blocks, each the size of (n-1)!, based on the number  in their rightmost position. In the very first block (which is really `S_(n-1)`), when the permutations are considered as elements of `S_n`, the rightmost position is, trivially, occupied by the very number (n-1). We'll then arrange those blocks in the order of decrease of the rightmost number. So the structure of the permutation sequence for `S_x` is like this:

This is really `S_(n-1)`: `{((cdots,n-1)), (cdots), ((cdots,n-1)):}`
This is the 2nd block out of n: `{((cdots,n-2)), (cdots), ((cdots,n-2)):}`
This is the 3rd block out of n: `{((cdots,n-3)), (cdots), ((cdots,n-3)):}`
...
This is the last (n'th) block: `{((cdots,0)), (cdots), ((cdots, 0)):}`

From now on, let's denote the zero-based number of the block as i. So, the abovementioned convention about the rightmost element spells out to: in all permutations that form the i'th block of `S_n`, at the rightmost position we have, by construction, the number n-i-1.

We assume that the sequence of permutations in the first block is already known. We still have to come up with the values of the inner positions of permutations in all blocks after the first one. To do that, we'll replicate the structure of inner positions from the first block (where it's known), amended with respect to the convention about the rightmost position. Since in the permutations that form `S_(n-1)` the number n-i-1 is already present somewhere in the middle (i. e. in position other than the rightmost), we have to replace it with the number n-1, which is never present in the middle in `S_(n-1)`.

Let's expand this logic manually for some small permutation group. The sequence for `S_2` is obvious:

0: (0)
1: (1,0)

So by our logic, `S_3` will consist of 3 blocks, with 2 permutations in each, the first of them being the copy of `S_2` extended trivially, the 2nd block having 1 in the rightmost position, the 3rd having a 0:

Block 0 (`S_2`):
0: (0,1,2)
1: (1,0,2)
Block 1:
2: ( , ,1)
3: ( , ,1)
Block 2:
4: ( , ,0)
5: ( , ,0)

Now let's replicate the first block into the second block. Since 1 is the rightmost number in the 2nd block, we have to replace 1 with 2 when copying and leave the 0 be:

0: (0,1,2)
1: (1,0,2)
2: (0,2,1)
3: (2,0,1)
4: (.,.,0)
5: (.,.,0)

By the same logic, we retain 1 in the 3rd block and replace 0 with 2:

0: (0,1,2)
1: (1,0,2)
2: (0,2,1)
3: (2,0,1)
4: (2,1,0)
5: (1,2,0)

All permutations in `S_3` were listed. Spelling out all blocks in `S_4` will take quite some space - there's 4 of them with 6 permutations in each, so let's just spell out the second block. For the second block, we take a copy of `S_3` from above, attach 2 in the rightmost position, and replace 2 with 3 in inner positions.

 6: (0,1,3,2)
 7: (1,0,3,2)
 8: (0,3,1,2)
 9: (3,0,1,2)
10: (3,1,0,2)
11: (1,3,0,2)

In this manner, we can extend the sequence as far as we desire. However, enumerating permutations is not the task at hand - the applied problem we're trying to solve is fast random access in the sequence.

By the way, another sequence that satisfies the requirement is described here.

The algorithm

Now let's think of an algorithm that would implement all of this. Enumerating permutations is more of a sophomore CS assignment than a real-life task. What we need for real life is two functions - one takes an integer and returns a permutation, the other works the other way around. The sequence is a bijection - it works both ways, and it needs to work both ways for a real life application. The UI for control of ordering is built more naturally in terms of position arrays.

To generate a permutation (as an array of numbers) from an index, the first thing we need is the degree of the permutation - that is, we need to identify the smallest group among all `S_n` that the desired permutation will fit in. This is done by finding an inverse factorial of the index, rounded up. Then we have to build the actual permutation. It's convenient to do so from right to left. Checking the index of the permutation against the block structure, we can know which block of `S_n` does the permutation go into, and on which position within that block. That gives us the number in the rightmost position - and an index within `S_(n-1)` that will determine the numbers on inner positions.

The next iteration of the loop must take into account the fact that the `S_(n-1)` it's considering permutes not numbers from 1 to n-2, but numbers from 1 to n-1 with n-i-1 excluded and n-1 on its position. The very target array serves as a state storage for this information - on each iteration of the loop the first n elements of it store the numbers the permutation is supposed to act upon; before the first iteration of the loop, the array is initialized to the identity permutation.

During the first iteration of the loop, the algorithm has the whole set of numbers from 0 to n-1 at its diposal, and they are in the array at their right (i. e. trivial) places.

During later iterations of the loop (say, when we're down to `S_m`, m<n), the logic has to take into account the fact that some of the numbers in the interval [0,m-1] might have been already "used up" by the earlier iterations. The pool of numbers that are subject to permuting is stored in the target array - on positions from 0 to m-1.

The algorithm has a nice mathematical interpretation. It's a known result that each permutation can be decomposed into a product of transpositions (i. e. permutations that exchange only two elements). Our algorithm works by reproducing this decomposition and having it act on the identity permutation, which is the initial state of the target number array.

There's a connection to the factoradic numbers as well. The sequence of "indices within a block" on each step matches the factoradic presentation of the permutation index number from left to right.

The implementation

The algorithm above were implemented in C++. The following function Seq() takes the position index in the sequence as an argument, and fills an array of integers with the permutation. It returns the value of n - that is, the degree of the permutation. Conveniently, this value coincides with the number of array positions filled. Providing the array of sufficient size is the caller's responsibility.

int Seq(int i, int a[])
{
    int n, size, block, retval, temp;

    a[0] = 0;
    //First, we have to calculate n 
    //and initialize the array to the trivial permutation.
    n = 1;
    size = 1;
    while(size <= i)
    {
        a[n] = n;
        n++;
        size *= n;
    }
    //n now is the degree of the smallest
    //Sn that has our permutation.
    //size now is the order of Sn - that is, n!.

    //Now let's build the permutation. 
    retval = n;
    //We need to cache the value of n here - 
    //we'll use n to count down, and we need that value as a return value.
    while(n > 1) //We loop all the way back to S1.     
    {
        size /= n;
        //size now has the value of (n-1)! -
        //that's the size of the block in Sn.         
        block = i / size;
        //Now block contains the index of the block
        //within Sn where the current permutation belongs.
        i -= block*size;
        //i now has the position within the block.
        n--;
        //We decrease n because the next iteration of the loop will be
        //dealing with Sn-1. Conveniently, this value of n matches
        //the zero-based index of the rightmost element in Sn.
        if(block > 0) 
        //If the permutation in question goes into a block other than the first one
        {
            //We have to exchange the element at the end
            //with an element in the middle of the array
            temp = a[n-block];
            //That's what will become the rightmost element
            a[n-block] = a[n];
            a[n] = temp;
        }
    }

    return retval;
}

The meaning of the parameters is:

i - the 0-based index of the permutation;
a[] - the integer array to fill with the numbers that make up the permutation. The existing content of the array is overwritten.

The function consists of two distinct parts. The first loop calculates n by comparing i against the subsequent values of n!. The minimal number such as n!>i is the one we're looking for. Along the way, it initializes the array to the trivial permutation - up to but not including the n'th element. Once the loop quits, n has the proper value, and the variable "size" has n! in it - we'll need it to calculate block size down the road.

The second loop implements the blocking and transposition logic. On each iteration of the loop, it places the rightmost element in the array, puts whatever was there previously inside of the array, and lets the subsequent iterations consider smaller groups.

You may test the function with the following program:

#include <iostream>
#include <iomanip>
using namespace std;

static const char Alphabet[] = "0123456789ABC";
int Seq(int, int[]);

void main(void)
{
    int i, a[20];
    for(i=0; i<120; i++)
    {
        cout << setw(6) << i << ": ";
        int j, n = Seq(i, a);
        for(j=0; j<n; j++)
            cout << Alphabet[a[j]];
        cout << endl;
    }
}

This program will type out all permutations of `S_5`. The output from it can be found here.

Further considerations

The function Seq() only initializes the part of the array that contains the nontrivial part of the permutation. If the caller needs to initialize the permutation up to a known, fixed degree, there might be a need for padding logic. This logic can be incorporated straight into Seq() or implemented over Seq(), for example:

void SeqPad(int i, int a[], int n)
{
    int m = Seq(i, a);
    while(m < n)
    {
        a[m] = m;
        m++;
    }
}

You might have noted that the function Seq() builds the permutation by applying transpositions to the identity permutation, which is the initial state of the array. We can easily modify the function to act on a preconstructed array of any type - this way, the permutation is immediately applied to that array. This can be implemented as a C++ template, or a .NET generic, with the type of elements to be permuted as a template parameter. The logic is very similar - you can find the source here. The difference (other than variable types) is that the array initialization is omitted in the first loop. In this scenario there's no need for padding logic - the function assumes that the array was prebuilt by the caller.

The fact that the algorithm works by applying transpositions illuminates a neat property of the sequence. Consider a permutation that transposes natural numbers m and n (m < n). Let's think of its index. Naturally, the degree of that transposition is n+1. Since the number in the rightmost position is m, it belongs to the block n-m of `S_(n+1)`. And the inside elements are trivial for that block - so the permutation is the first one in the block. The index of the first permutation in the i'th block of `S_(n+1)` is `i*n!`, so the index of that transposition in the sequence is `(n-m)*n!`.

The converse is true: the first permutation of each (i'th) block other than the first one (i. e. i>0) of `S_(n+1)` is a transposition of n and n-i. The first permutation in the first block is the identity permutation. This follows from the fact that there are only n transpositions of this kind in `S_(n+1)`.

The inverse problem

Now let's consider the algorithm for calculating the permutations's index from its array representation. Like the direct function, it works from right to left. However, it needs to know the degree of the permutation beforehand - with C++ dumb arrays, there's no way to figure out the rightmost nontrivial element that won't end up looking past the array's bound, and introducing a hard-coded right bound within the function is clearly unelegant. This is not a problem with .NET's arrays, or with STL vectors - but those data structures do store the size of the underlying array inside them, so it's really the same thing.

Another thing, it needs some extensive runtime state. The Seq() function was supposed to fill (or permute) a target array, so that array was fair game for state storage. The inverse function (let's call it Index() in advance) cannot treat the input array the same way - if just for the sake of const correctness. Meanwhile, it needs to store the information about the transposition sequence that in inherent to the blocking logic - and that needs up to n integer variables. So a state array will be introduced.

The algorithm's main loop, like in Seq(), implements blocking logic subsequently for groups starting from `S_n` and down to `S_1`. In the first iterarion, it looks at the rightmost element of the permutation and decides which of the (n-1)!-sized blocks does it go. The starting index of the block, multiplied by the block size, will become an addend for the permutation's index in the sequence; the position within the block is to be determined by the subsequent iterations in a similar fashion. However, the subsequent iterations need to know that the values they are working with belong not to the small permutation, but to the small permutation with some numbers replaced with larger numbers. To store that info, we're introducing a local array; at the n'th position of the array is the number that was at the rightmost position when considering `S_n`. So during the loop iterations, when the function encounters a number on the rightmost position that is no less than n, it has a place to look, what did this number stand for in the untransposed permutation.

In the beginning of the function Index(), just like in Seq(), there's a loop that calculates (n-1)!. We need that for block size calculation down the road.

The inverse function

int Index(const int p[], int n)
{
    int *a, j, size, block, i, temp;
    a = new int[n];
    size = 1;
    j=2;
    while(j < n)
    {
        size *= j;
        j++;
    }
    //Now size contains (n-1)! - that's the block size in Sn, and j contains n
    i = 0;
    while(n > 1)
    {
        n--;
        temp = p[n];
        while(temp > n)
            temp = a[temp];
        block = n - temp;
        if(block > 0)
        {
             i += size * block;
             a[n] = temp;
        }
        size /= n;
    }
    delete [] a;
    return i;
}

The meaning of the parameters is:

p - the permutation, represented as an integer array
n - the degree of the permutation, which is the same as array size.

The function's return value is the index of the permutation.

The nested loop implements the logic of the tracing back the transpositions that made up the block.

Unfortunately, compared to Seq(), this function is slightly more resource-hungry. Seq() required `O(n)` of time and O(1) of memory; this function requires `O(n^2)` of time and O(n) of memory.

Algebraic properties of the sequence

Armed with both direct and inverse algorithms for our sequence, we can find out some nice properties of the target sequence - specifically, how does it relate to the group structure of `S_NN`.

Let's start by saying that the only elements of the permutation array that contribute to the index are the ones where the permutation acts nontrivially (i. e. permutes, p[n] != n). When processing those, the function may or may not look into the state array for translating this number into a smaller number. The contribution to the index is calculated only once p[n] < n.

From this we can see that the value of the Index() function is invariant under the trivial extension of the permutation on the right (as in, (2,3,0,1) in `S_4` vs. (2,3,0,1,4,5,6) in `S_7`). This was the design goal from the very beginning, but still, important to remember.

Let's consider the action of Index() upon transpositions of two elements, n and m (m<n). First and foremost, there's no contribution to the index from the trivial elements on the right, so we can start by considering the rightmost nontrivial element - in the group of degree (n+1). During the processing of n'th element, the temp variable gets the value of m, so the block number is (n-m), and block size is n!. So the addend to the index from that iteration is `(n-m)*n!`. And the state array gets the number m in its n'th position. When the loop is down to its iteration where the m'th position is considered, the temp variable is initialized to n; by the logic of lookup in the state array that translates to m, so there's no addend to the index on this iteration. And in the rest of the iterations the permutation is trivial. So we arrive, in a roundabout way, to the previous result - the index of transposition is `(n-m)*n!`.

Let's now consider permutations that are formed by the composition of two disjoint transpositions: `(m->n)*(k->l)` (assuming k<l). Like in the single-transposition example, this permutation is nontrivial only in 4 positions - that's m, n, k and l. Like in the example above, the iteration of index n initializes the n'th element in the state array, and the iteration of index m uses that number for lookup. Similarly, the l'th iteration initializes the l'th position in the state array and the k'th iteration looks at it to find the very number k. Since the indices add up, the resultant index is `(m-n)*n! + (l-k)*l!)`.

The same logic extends to the composition of any number of disjoint transpositions. Since disjoint cycles commute in the permutation world, the order of composition does not matter. So, the index of a product of any number of pairwise disjoint transpositions is the sum of indices of the factors. This stems from the fact that the only positions in the permutation that matter to Index() are the ones where the permutation is nontrivial, and that the lookup logic follows the disjoint cycle.

Not every permutation can be decomposed into a product of disjoint transpositions. Yet every permutation can be decomposed into a product of disjoint cycles. Let's consider the action of Index() on disjoint cycles of length greater than 2.

First, we'll consider the simpler case of a cycle where the elements go in the increasing order. That is, a cycle that maps elements to those larger than their positions except for the largest one. The permutation (0, 2, 4, 3, 1)  - which can presented as the cycle `(1->2->4)` - is one of those, and the following: (0, 4, 1, 3, 2) (AKA `(1->4->2)`) is not. Any transposition can be presented as an ordered cycle. The logic of handling those in the Index() function is very similar to the logic of transpostion handling.

For the sake of notation, let's assume we're dealing with cycle `(c_1->c_2->...c_m)`. In the usual array notation, this is a permutation where `c_2` is at position `c_1`, `c_3` is at position `c_2` and so forth, up until the end, where `c_m` is at position `c_(m-1)` and `c_1` is at position `c_m`.

On the first nontrivial step, the loop encounters the value of `c_1` on the position `c_m`. So temp becomes `c_1`, and block number is  `c_m - c_1`. And the state array gets the value of `c_1` at position `c_m`. On the next step, at position `c_(m-1)`, the value of temp initially becomes `c_m`. There's a need for lookup - by construction, we assume that `c_(m-1) < c_m`. The lookup looks at the state array in the position of `c_m` and yields the value of `c_1`. The modification of the state array places the value of `c_1` at position `c_(m-1)` as well. On the next stage of the loop, temp gets the value of `c_(m-1)`, looks in the state array at position `c_(m-1)` and yields, again, the value of `c_1`. And so on and forth - on each iteration of the loop, the On the very last stage of the loop, the lookup yields, again, `c_1`, but this time, it's trivial - `c_1` at the position `c_1`. What matters is the fact that the only elements of the state array that are written or read by the loop are the ones at positions `c_2`, `c_3`, etc.

Now let's consider the more general case of cycles that are not necessarily ordered - like the abovementioned `(1->4->2)`. For these, the nice uniformity of the lookup behavior from the case above  - always looking one step up at the previous nontrivial position - cannot be assumed. Yet we can still see that the only positions in the state array that are ever filled and referenced are those at indices `c_1` to `c_m`. The only positions in the permutation array where lookup might be needed (i. e. value at position greater than the position) are the positions `c_1`..`c_m`. Again during these, the only positions in the state array that are filled are the ones at `c_1`..`c_m`. And the only possible values that go there are, again, the values of `c_1`..`c_m`. The values at other positions of the permutation are trivial and therefore don't contribute to the index, don't require a lookup and don't go into the state array.

And from this, the following, even more general property of the sequence can be seen: the indices not only are additive for products of disjoint transpositions, but also for products of any number of disjoint cycles. In fact, the following can be stated: the index of a product of any number of pairwise disjoint permutations (not just transpositions) is the sum of the indices of the factors.

The disjoint permutations commute, yet the statement above doesn't generalize to all commuting permutations. For example, every permutation commutes with self. On the other hand, the square of any transposition is the identity permutation. There's no way two positive indices would yield a zero sum.

1. Using Permutations in .NET for Improved Systems Security
2. http://en.wikipedia.org/wiki/Factoradic
3. Seress A. Permutation group algorithms
4. Knuth D. The Art of Computer Programming