Problem
I would like to get my code for implementation of simple sorting algorithms in idiomatic python code reviewed. I am looking for feedback on python idioms used, better ways of doing the same thing. I am familiar with FP, and I am willing to put in effort to get myself familiar with pythonic idioms such as list comprehensions and enumerators. Hence I would also like feedback on if any of the explicit loops can be converted to these.
import doctest
Bubble Sort
def bubblesort(items):
"""
>>> bubblesort([])
[]
>>> bubblesort([1])
[1]
>>> bubblesort([1,2,3,4,5])
[1, 2, 3, 4, 5]
>>> bubblesort([4,5,3,1,2])
[1, 2, 3, 4, 5]
"""
for i in range(len(items)):
mod = False
for j in range(len(items)-1):
if items[i] < items[j]:
mod = True
items[i], items[j] = items[j], items[i]
if not mod: break
return items
Selection Sort
def selectionsort(items):
"""
>>> selectionsort([])
[]
>>> selectionsort([1])
[1]
>>> selectionsort([1,2,3,4,5])
[1, 2, 3, 4, 5]
>>> selectionsort([4,5,3,1,2])
[1, 2, 3, 4, 5]
"""
for k in range(len(items)-1,0,-1):
v,i = max((v,i) for i,v in enumerate(items[:k]))
if items[k] < v:
items[k],items[i] = items[i], items[k]
return items
Insertion Sort
def insertionsort(items):
"""
>>> insertionsort([])
[]
>>> insertionsort([1])
[1]
>>> insertionsort([1,2,3,4,5])
[1, 2, 3, 4, 5]
>>> insertionsort([4,5,3,1,2])
[1, 2, 3, 4, 5]
"""
def find_the_place(v, till):
"""
>>> items = [1,2,4]
>>> find_the_place(items, 3, 3)
2
"""
for i in range(0, till):
if v < items[i]:
return i
return till
def shift_things_from(i, till):
items[i+1:till+1] = items[i:till]
for k in range(1,len(items)):
v = items[k]
i = find_the_place(v, k)
shift_things_from(i, k)
items[i] = v
return items
Shell Sort
def shellsort(items):
"""
>>> shellsort([])
[]
>>> shellsort([1])
[1]
>>> shellsort([1,2,3,4,5])
[1, 2, 3, 4, 5]
>>> shellsort([4,5,3,1,2])
[1, 2, 3, 4, 5]
"""
def sort_gap(start, gap):
for k in range(len(items)-1, 0, -1*gap):
v,i = max((v,i) for i,v in enumerate(items[:k:gap]))
if items[k] < v:
items[k], items[i] = items[i], items[k]
gap = len(items)//2
while gap > 0:
for i in range(gap):
sort_gap(i, gap)
gap = gap//2
return items
Merge Sort
def mergesort(items):
"""
>>> mergesort([])
[]
>>> mergesort([1])
[1]
>>> mergesort([1,2,3,4,5])
[1, 2, 3, 4, 5]
>>> mergesort([4,5,3,1,2])
[1, 2, 3, 4, 5]
"""
def merge(left, right):
"""
>>> merge([2,4,6], [])
[2, 4, 6]
>>> merge([], [2,4,6])
[2, 4, 6]
>>> merge([1,3,6], [2,4,6])
[1, 2, 3, 4, 6, 6]
"""
if not left: return right
elif not right: return left
elif left[0] < right[0]:
return left[:1] + merge(left[1:], right)
else:
return right[:1] + merge(left, right[1:])
if len(items) <= 1 : return items
mid = len(items)//2
left = mergesort(items[:mid])
right = mergesort(items[mid:])
return merge(left, right)
Quick Sort
def quicksort(items):
"""
>>> quicksort([])
[]
>>> quicksort([1])
[1]
>>> quicksort([1,2,3,4,5])
[1, 2, 3, 4, 5]
>>> quicksort([4,5,3,1,2])
[1, 2, 3, 4, 5]
"""
def partition(f, l, pivot):
while f < l:
while items[f] < pivot: f+=1
while items[l] > pivot: l-=1
items[l], items[f] = items[f], items[l]
l, f = l-1, f+1
f,l = l,f # swap because while switches it at the end.
return (f, l)
def qsort(fst, lst):
if fst >= lst: return
pivot = items[fst]
(f, l) = partition(fst, lst, pivot)
qsort(fst, f)
qsort(l, lst)
if not items: return items
qsort(0, len(items)-1)
return items
Heap Sort
def heapify(items):
"""
>>> heapify([])
[]
>>> heapify([1])
[1]
>>> heapify([2,1])
[2, 1]
>>> heapify([2,1,3])
[3, 2, 1]
>>> heapify([2,1,4,3])
[4, 3, 1, 2]
>>> heapify([5,3,6,7,1,9,4,8])
[9, 8, 6, 7, 1, 3, 4, 5]
"""
for i,t in enumerate(items):
if i == len(items): break
while i > 0:
p = i//2
if items[p] < items[i]: items[p], items[i] = items[i], items[p]
i = p
return items
def siftdown(items, i, size):
l = i * 2
r = l * 2 + 1
largest = i
if l < size and items[i] < items[l]:
largest = l
if r < size and items[i] < items[r]:
largest = r
if largest != i:
items[largest], items[i] = items[i], items[largest]
items = siftdown(items, largest, size)
return items
def heapsort(items):
"""
>>> heapsort([])
[]
>>> heapsort([1])
[1]
>>> heapsort([1,2,3,4,5])
[1, 2, 3, 4, 5]
>>> heapsort([4,5,3,1,2])
[1, 2, 3, 4, 5]
"""
items = heapify(items)
for i in range(len(items)-1, 0, -1):
items[i], items[0] = items[0], items[i] #swap first and last
items = siftdown(items, 0, i)
return items
Tests
doctest.testmod()
New question after fixing is asked here.
Solution
I’ll do a proper review later, but for now, I’ll just point out that there seem to be some bugs in your implementation:
>>> from sorting import *
>>> bubblesort([1, 0])
[1, 0]
>>> quicksort([0, 0, 0, 1, 0, 0, 0])
[0, 0, 0, 0, 0, 1, 0]
>>> heapsort([1, 0, 1, 1, 0, 0])
[0, 0, 1, 0, 1, 1]
I used Hypothesis to create some randomised list of integers, throw them into your function, and compare them to the builtin sorted()
. I couldn’t get any crashers, but those results seem a bit off to me.
The lesson here: you need more than a few simple test cases. Only having a few tests means that you’re more likely to miss edge cases, and have subtle bugs in your code.
Bubble sort
As above, this doesn’t actually work. There are a couple of things wrong here:
-
This isn’t a bubble sort. In bubble sort, you compare adjacent pairs of items until the list is sorted – here, you’re doing pairwise comparisons of non-adjacent elements. If I try with a list of length 4, and print the values of
i
andj
, this is what I get:0 0 0 1 0 2 1 0 1 1 1 2 2 0 2 1 2 2 3 0 3 1 3 2
I would expect to see
0 1
,1 2
and2 3
. This is actually much more inefficient than bubble sort. -
If you want to iterate over multiple ranges like this, you’d be better off using something like the itertools module, than constructing multiple nested loops. This products the same iterations of
i
andj
as your code:for i, j in itertools.product(range(len(items)), range(len(items) - 1)): print(i, j)
Selection sort
-
This is a slightly unusual implementation of selection sort – you’re building up the sorted subset from the right (with the largest elements sorted first) rather than the left. Usually, I see selection sort as building up a sorted subset with the smallest elements first. But this seems to work, so it’s fine.
-
Personally I’m not a fan of
range()
with three parts, especially if the step is –1, i.e.range(len(items)-1,0,-1)
. I think they’re a little cramped, and prefer to usereversed(range(len(items)-1))
– I find that easier to read. -
Small thing, but make sure you add a space after commas. Good for readability.
Insertion sort
-
It would be helpful to have some docstrings on your utility functions. It’s not obvious to me what
find_the_place()
orshift_things_from()
are supposed to do, which makes it hard to tell if they’re doing that correctly. -
You can improve the for loop in
find_the_place
withenumerate()
:for idx, item in enumerate(till): if v < item: return idx