Categories
Technical

Collapse Sort

Collapse sort is a brand new sorting algorithm that takes full advantage of the AMAZING qualities of Javascript arrays and their dynamic nature.

While applying for a software engineering position ten years ago, I was asked to code an implementation of the quicksort algorithm. Fortunately, my interviewer was gracious enough to provide me with a refresher of how the quicksort algorithm worked. I somehow managed to muddle my way through to a reasonably acceptable “solution” to the problem. By which I mean that my algorithm was neither quick nor sorted items nor was it an empire. Discuss. Despite this, it did leave me with a lasting fascination with all sorting algorithms in general.

Years later, I had since pivoted away from C/C++ and now found that the majority of my work was in the lawless frontier world of JavaScript. In one of my first frontend jobs, I found myself debugging a piece of code that did something like this: (I’m paraphrasing a lot here)

const someArray; // defined elsewhere
const finalEntry = someArray[someArray.length];

Given that arrays in JS start at 0, this should be corrected to const finalEntry = someArray[someArray.length - 1]. To my surprise, however, finalEntry didn’t explode with a segmentation fault, out-of-bounds, index error, etc. It just continued on its merry way, with the value of finalEntry being undefined. This was one of my first experiences with dynamically sized arrays.

The Discovery

Later that same year, I was mulling over the concepts of sorting and just noodling around in the developer console window when I wrote the following:

collapseSort = A => A.reduce((B, x) => (B[x] = x, B), []).filter(x => x)

collapseSort([2, 7, 1, 8, 28, 18])
(6) [1, 2, 7, 8, 18, 28]

Now, take a careful look at this “algorithm”. Before I explain what it’s doing, try to analyze it as an exercise to the reader. Ask yourself the following questions:

  • Will this work for all numbers?
  • What are the situations in which this kind of sort will fail?
  • What is the Big O Little O What Begins with O?

How does it work?

The logic

1. For each number X in the input array A:
- Place the value X into a new array B at the index also equal to X

2. For each value Y in the output array B:
- Remove the value Y if null/undefined value

The key reason that this works is that Javascript arrays are dynamic in nature. If we have an array of 3 elements ["A", "B", "C"] and we try to set a value “X” at position 5, the array is automatically resized to a length of 6, i.e. ["A", "B", "C", null, null, "X"]. Since we are positioning our unsorted elements at indices equal to their actual element value, you end up with an almost sparse-like sorted array.

[6, 1, 2] -> [null, 1, 2, null, null, null, 6]

Then, we can collapse the array by simply stripping out the null value elements and we’re done. Unusually, Big O is influenced more by the largest element’s value rather than the number of elements in the unsorted array.

What are the limitations?

Duplicate elements

    naturalNumbersCollapseSort([1, 6, 1, 8])
    // result [1, 6, 8]

    The fix

    collapseSort = A => A.reduce((B, x) => (B[x] = [...(B[x] || []), x], B), []).filter(x => x).flat();

    Bonus points, we’re guaranteed significantly faster runtimes since the interpreter only has to evaluate a single line of javascript! However, if you’re working for a large company, this could also negatively impact your Jellyfish LOC metric, in which case here’s an expanded look at what’s going on:

    const collapseSort = (A) => {
        const B = A.reduce((acc, curNum) => {
            // if the bucket at index of the current number doesn't exist,
            // create a new array. If it does, append to it.
            acc[curNum] = [...(acc[curNum] || []), curNum];
            return acc;
        }, []);
    
        // Flatten the array and filter out empty slots
        const flattenedAndFiltered = B.filter(element => element).flat();
    
        return flattenedAndFiltered;
    };

    Negative values

    Since we are treating the values of the unsorted array as “indices”, it’s obviously going to fail pretty spectacularly if we attempt to access negative indices in the array. Astute observers would have also realized that the very first algorithm would have also failed for 0 values since it is not considered truthy.

    const integersCollapseSort = (A) => {
        // use a "max" offset to offset all the values to positive indices
        const max = Math.abs(Math.min(...A));
    
        const B = A.reduce((acc, curNum) => {
            // if bucket at the index of the current number doesn't exist,
            // create a new array. If it does, append to it.
            acc[curNum + max] = [...(acc[curNum + max] || []), curNum];
            return acc;
        }, []);
    
        // Flatten the array and filter out empty slots
        const flattenedAndFiltered = B.filter(element => element).flat();
    
        return flattenedAndFiltered;
    };

    Expansion

    As an exercise to the reader, we have left off how this could be adapted to support floating point numbers.

    Number of easter eggs in article: 2

    Leave a Reply

    Your email address will not be published. Required fields are marked *

    This site uses Akismet to reduce spam. Learn how your comment data is processed.