Mergesort in Python

And here is Mergesort, also in Python. I don’t like it as much as I like heapsort, but it is better in some situations. If your data is in a linked list, this might be the way to go, since you can’t access the elements the way you need to heapsort. Mergesort is more tape-friendly, and was in fact used with tapes back in the day.

One might argue that Mergesort is good for when you have new data arriving, but that is arguable. In that case what you might really need is a self-balancing binary tree like AVL. Threaded if possible.

Here it goes:

#!/usr/bin/python

## Implementation of the merge sort algorithm from the
## venerable John von Neumann (1945).
## Python code by Nicolau Werneck <nwerneck@gmail.com>,
## 2010.

## The algorithm has a recursive nature, but it is not a
## tail-recursion. We implement in another way here: walking
## by the list considering segments in chunks with
## successive powers of two. This is a bit like the
## tape-reader implementation described in the wikipedia
## article. More or less of a bredth-first instead of
## depth-first style...

import random

## This function merges two chunks into an output list. It
## read the values from thw two in order, appending the
## smaller ones first. When one of the tapes is empty, the
## remaining elements of the other (if any) are appended.
def merge(out,a,b):
    while len(a) and len(b):
        if a[0]<=b[0]:
            out.append(a.pop(0))
        else:
            out.append(b.pop(0))
    if len(a)==0:       ## Append remaining elements
        out.extend(b)
    elif len(b)==0:
        out.extend(a)

## Generate random (hopefuly unsorted) list of integers.
arr=[random.randint(0,100) for k in range(101)]

## Generate solution to test.
sol=arr[:]
sol.sort()  ## Timsort

## For r in the sequence of integer powers-of-two and
## smaller than the size of the list:
r=1
while r<len(arr):
    out=[] ## Create accumulator for this step
    k=0    ## Chunk index
    while k<len(arr)/r:
        a=arr[r*k:r*k+r] ## First chunk
        k+=1
        b=arr[r*k:r*k+r] ## Second chunk
        k+=1
        merge(out,a,b)   ## Merge both
    out.extend(arr[r*k:]) ## Append incomplete chunk
    arr=out
    r=r*2

assert sol==arr
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s