5.3 Randomized algorithms

5.3-1¶

Professor Marceau objects to the loop invariant used in the proof of Lemma 5.5. He questions whether it is true prior to the first iteration. He reasons that we could just as easily declare that an empty subarray contains no $0$-permutations. Therefore, the probability that an empty subarray contains a $0$-permutation should be $0$, thus invalidating the loop invariant prior to the first iteration. Rewrite the procedure $\text{RANDOMIZE-IN-PLACE}$ so that its associated loop invariant applies to a nonempty subarray prior to the first iteration, and modify the proof of Lemma 5.5 for your procedure.

Modify the algorithm by unrolling the $i = 1$ case.

swap(A[1], A[RANDOM(1, n)])
for i = 2 to n
    swap(A[i], A[RANDOM(i, n)])

Modify the proof of Lemma 5.5 by starting with $i = 2$ instead of $i = 1$. This resolves the issue of $0$-permutations.

5.3-2¶

Professor Kelp decides to write a procedure that produces at random any permutation besides the identity permutation. He proposes the following procedure:
PERMUTE-WITHOUT-IDENTITY(A)
    n = A.length
    for i = 1 to n - 1
        swap A[i] with A[RANDOM(i + 1, n)]
Does this code do what Professor Kelp intends?

The code does not do what he intends. After executing the procedure, the original A[1] element will end up in the range A[2:n], so the procedure will exclude all the permuatations in which A[1] still ends up in A[1].

5.3-3¶

Suppose that instead of swapping element $A[i]$ with a random element from the subarray $A[i..n]$, we swapped it with a random element from anywhere in the array:
PERMUTE-WITH-ALL(A)
    n = A.length
    for i = 1 to n
        swap A[i] with A[RANDOM(1, n)]
Does this code produce a uniform random permutation? Why or why not?

Consider the case of $n = 3$ in running the algorithm, three IID choices will be made, and so you'll end up having $27$ possible end states each with equal probability. There are $3! = 6$ possible orderings, these should appear equally often, but this can't happen because $6$ does not divide $27$. For more detailed analysis, see answers on Quora and the related python code

5.3-4¶

Professor Armstrong suggests the following procedure for generating a uniform random permutation:
PERMUTE-BY-CYCLIC(A)
    n = A.length
    let B[1..n] be a new array
    offset = RANDOM(1, n)
    for i = 1 to n
        dest = i + offset
        if dest > n
            dest = dest - n
        B[dest] = A[i]
    return B
Show that each element $A[i]$ has a $1 / n$ probability of winding up in any particular position in $B$. Then show that Professor Armstrong is mistaken by showing that the resulting permutation is not uniformly random.

Fix a position $j$ and an index $i$. We'll show that the probability that $A[i]$ winds up in position $j$ is $1 / n$. The probability $B[j] = A[i]$ is the probability that $dest = j$, which is the probability that $i + offset$ or $i + offset − n$ is equal to $j$, which is $1 / n$.

This algorithm can't possibly return a random permutation because it doesn't change the relative positions of the elements; it merely cyclically permutes the whole permutation.

For instance, suppose $A = [1, 2, 3]$,

if $offset = 1$, $B = [3, 1, 2]$,
if $offset = 2$, $B = [2, 3, 1]$,
if $offset = 3$, $B = [1, 2, 3]$.

Thus, the algorithm will never produce $B = [1, 3, 2]$, so the resulting permutation cannot be uniformly random.

5.3-5 $\star$¶

Prove that in the array $P$ in procedure $\text{PERMUTE-BY-SORTING}$, the probability that all elements are unique is at least $1 - 1 / n$.

Let $\Pr\{j\}$ be the probability that the element with index $j$ is unique, then the $\Pr\{j \mid 1 \cap 2 \cap \ldots \cap (j-1)\} = 1 - \frac{j - 1}{n^3}$.

$$ \begin{aligned} \Pr\{1 \cap 2 \cap 3 \cap \ldots \cap n \} & = \Pr\{1\} \cdot \Pr\{2 \mid 1\} \cdot \Pr\{3 \mid 1 \cap 2\} \cdots \\ & = 1 (1 - \frac{1}{n^3})(1 - \frac{2}{n^3})(1 - \frac{3}{n^3}) \cdots (1 - \frac{n-1}{n^3})\\ & \ge (1 - \frac{n}{n^3})^n \\ & = (1 - \frac{1}{n^2})^n \\ & \ge 1 - \frac{1}{n}, \\ \end{aligned} $$

where the last step holds for $(1 + x)^\alpha \ge 1 + \alpha x$, for any $\alpha > 1$ and $x > -1$.

5.3-6¶

Explain how to implement the algorithm $\text{PERMUTE-BY-SORTING}$ to handle the case in which two or more priorities are identical. That is, your algorithm should produce a uniform random permutation, even if two or more priorities are identical.

PERMUTE-BY-SORTING(A)
    let P[1..n] be a new array
    for i = 1 to n
        P[i] = i
    for i = 1 to n
        swap P[i] with P[RANDOM(i, n)]

5.3-7¶

Suppose we want to create a random sample of the set $\{1, 2, 3, \ldots, n\}$, that is, an $m$-element subset $S$, where $0 \le m \le n$, such that each $m$-subset is equally likely to be created. One way would be to set $A[i] = i$ for $i = 1, 2, 3, \ldots, n$, call $\text{RANDOMIZE-IN-PLACE}(A)$, and then take just the first $m$ array elements. This method would make $n$ calls to the $\text{RANDOM}$ procedure. If $n$ is much larger than $m$, we can create a random sample with fewer calls to $\text{RANDOM}$. Show that the following recursive procedure returns a random $m$-subset $S$ of $\{1, 2, 3, \ldots, n\}$, in which each $m$-subset is equally likely, while making only $m$ calls to $\text{RANDOM}$:
RANDOM-SAMPLE(m, n)
    if m == 0
        return Ø
    else S = RANDOM-SAMPLE(m - 1, n - 1)
        i = RANDOM(1, n)
        if i ∈ S
            S = S ∪ {n}
        else S = S ∪ {i}
        return S

Solution 1: We prove that it produces a random $m$ subset by induction on $m$. It is obviously true if $m = 0$. Suppose $S$ is a uniform $m − 1$ subset of $n − 1$, that is, $\forall j \in [n - 1]$, $\Pr[j \in S] = \frac{m - 1}{n - 1}$.

If we let $S'$ denote the returned set, suppose first $j \in [n − 1]$,

$$ \begin{aligned} \Pr[j \in S'] & = \Pr[j \in S] + \Pr[j \notin S \wedge i = j] \\ & = \frac{m - 1}{n - 1} + \Pr[j \notin S]\Pr[i = j] \\ & = \frac{m - 1}{n - 1} + \left(1 - \frac{m - 1}{n - 1}\right) \frac{1}{n} \\ & = \frac{n(m - 1) + n - m}{(n - 1)n} \\ & = \frac{nm - m}{(n - 1)n} = \frac{m}{n}. \end{aligned} $$

Since the constructed subset contains each of $[n − 1]$ with the correct probability, it must also contain $n$ with the correct probability because the probabilities sum to $1$.

Solutioin 2: I think the Solution 1 is not correct. "Every element in [1, n] has a probability $\frac{m}{n}$ to appear in the set S" is a weak condition, not strong enough to ensure that the set S is a random sample. For example, a window with a capacity of m elements moves along the input array, so you can get a m-element subset of the input array A[1 + offset : m + offset] , in which offset is chosen randomly.

You need to prove the probability to get any particular set S is $\frac{1}{\binom{n}{m}}$

The corrected solution is as follows:
Use mathematic induction on both m and n. (You can not just use the induction only on m or only on n.) Suppose the result is true for any m and n such that $0 \le m \le n$, $m \lt m_0$, $n \lt n_0$

We prove that the result is true for $m = m_0$ and $n = n_0$. Denote one particular m-element subset as $S_{m_, n}$.
case 1: $n \in S_{m_, n}$

$$ \begin{aligned} \Pr[S_{m_, n}] & = \Pr[S_{m_, n} \backslash n] * \Pr[n \mid S_{m_, n} \backslash n] \\ & = \frac{1}{\binom{n-1}{m-1}} * (\frac{m-1}{n} + \frac{1}{n}) \\ & = \frac{1}{\binom{n}{m}} \end{aligned} $$

case 2: $n \notin S_{m_, n}$. Set the last element that is added to $S_{m_, n}$ as $a_i$ where $i \in [1, m]$.

$$ \begin{aligned} \Pr[S_{m_, n}] & = \Sigma_{i=1}^{m} \Pr[S_{m_, n} \backslash a_i] * \Pr[a_i \mid S_{m_, n} \backslash a_i] \\ & = \Sigma_{i=1}^{m} \frac{1}{\binom{n-1}{m-1}} * \frac{1}{n} \\ & = \frac{1}{\binom{n}{m}} \end{aligned} $$

Actually, what we need to prove is that:

the result is true for $m = m_0$ and $n = n_0 - 1$
the result is true for $m = m_0 - 1$ and $n = n_0$

but the proof will be the same as the senario "$m = m_0$ and $n = n_0$"