A couple of things to do:

- We want this algorithm to work with the current API
- Remove erroneous behaviour and add more tests
- Look for speed and memory improvements
- Write its GPU version (could be done in a later PR, but let's see)