# merge sort

9 messages
Open this post in threaded view
|

## merge sort

 I have written an implementation of merge sort, which I am using to sort a list of polygon vertex pairs by the distance between each pair. The problem I am having is that when the list goes over about 350 pairs, I blow up the stack. The problem isn't the recursion in the sort function, it is the merge function. It seems like there is a better way to do it, that I am missing. Here is a simplified version of my code This only sorts direct comparables, the version I am using is fancier, but this version blows up at about 5000 values. Does anybody know a good way of implementing the merge function to be less recursive? I am limited to the 2015.03 version of OpenSCAD, as my final code has to work in the Thingiverse customiser COUNT = 100; function fixStartIndex(v, index) = index < 0 ? len(v) + index : index; function fixEndIndex(v, index) = index < 0 ? len(v) + index + 1 : index; function sublist(list, start, end = -1) =   let(     startIndex = fixStartIndex(list, start),     endIndex = fixEndIndex(list, end)   )   (len(list) < startIndex) ?     undef   : (endIndex > len(list)) ?     undef   : (endIndex < startIndex) ?     undef   : startIndex == endIndex ?     []   :     [       for (i = [startIndex : endIndex - 1])         list[i]     ]; function merge(left, right, l = 0, r = 0) =   l >= len(left) && r >= len(right) ?     []   : l >= len(left) ?     sublist(right, r)   : r >= len(right) ?     sublist(left, l)   : left[l] <= right[r]  ?     concat(       [ left[l] ],       merge(left, right, l = l + 1, r = r)     )   :     concat(       [ right[r] ],       merge(left, right, l = l, r = r + 1)     ); function mergeSort(v) =   len(v) < 2 ?     v   :     let (       pivot = ceil(len(v) / 2),       left = sublist(v, 0, pivot),       right = sublist(v, pivot)     )     merge(mergeSort(left), mergeSort(right)); module test(v) {   echo("unsorted", v);   echo("sorted", mergeSort(v)); } v = rands(0, 1000, COUNT); test(v); _______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org
Open this post in threaded view
|

## Re: merge sort

 Do you have a specific reason not to just use quicksort, which is easily implemented?  Like do you need your sort to be stable?   I can't think of a nonrecursive way to write the merge function, but it seems like you could rewrite it using tail recursion, which would solve the recursion depth problem.   To get the advantage your function has to have the form function merge(...) = test ? merge(....) : alternative; or function merge(...) = test ? alternative : merge(...); So you need to accumulate the answer in a parameter, something like function merge(left,right,l=0, r=0, result=[]) = l >= len(left) && r>= len(right) ? result :      merge(left,right,l+(r>=len(right) || left[l]<=right[r]?1:0), r+(l>=len(left) || left[l]>right[r]?1:0), concat(result, [r>=len(right) || left[i]<=right[r]?left[l]:right[r])); The above code was just a stab to get you started.  Totally untested.  But the basic method should work.  You can't use "let", so everything has to be crammed into the recursive function call.   Actually it crosses my mind that I don't know when tail recursion unwrapping was added.  If it's not in the old version you need to use then I think you're out of luck and you need to change algorithms.     -- Sent from: http://forum.openscad.org/_______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org
Open this post in threaded view
|

## Re: merge sort

Open this post in threaded view
|

## Re: merge sort

 The question with stability is do you care that items equal in the input list might get permuted in the output list.  If the answer is yes, then stability matters.  If the answer is no, then it does not.  Quicksort performance on partially sorted data should be very good, because pivot selection will be reasonable and the data will be divided in half.  If you implement mergesort (with tail recursion unrolling) I'd be interested in seeing if it can compete with quicksort.   The advantage of mergesort is a bound on worst case run time instead of just average run time.  (Quicksort could fail if the data were perversely organized.)   -- Sent from: http://forum.openscad.org/_______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org
Open this post in threaded view
|

## Re: merge sort

Open this post in threaded view
|

## Re: merge sort

Open this post in threaded view
|

## Re: merge sort

 For quicksort, "perverse" is whatever order of data causes the pivot to be selected so that all the data is on one side at every iteration.  Then quicksort turns into an n^2 algorithm.   There are different ways of selecting the pivot, so the exact definition of the perverse input may vary. But if you pick the pivot to be the middle data value then already sorted data is the ideal input.  So for example, if you picked the pivot to be the first entry in the input then sorted data would in fact be the perverse input.   The call to concat will not break tail recursion as long as it happens *within* the recursive call.  That's why you have to organize the code like I showed in my example.   Consider this simpler case: function sum_bad(v,i=0) = i==len(v) ? 0 : v[i]+sum_bad(v,i+1); This function will *not* get tail recursion optimization.  How can it be fixed?  Like this: function sum_good(v,i=0,total=0) i==len(v) ? total : sum_good(v,i+1,total+v[i]); In this case, the accumulating operation happens within the function argument, so that is fine.  You can do the same thing with the mergesort, following the example code I showed to rewrite your merge() function so that the concatenation happens in the argument and the list is accumulating as an argument instead of as a return value.   It's not pretty to pack all the computation into the function arguments, and it kind of seems inefficient to have to repeat your conditions over and over again, but I think it will work.   This will work for the function that combines two lists (the merging operation).  You cannot do this for the recursion that calls mergesort once on each half of the data.  But that should be OK because you only need log(n) depth on that recursion.   -- Sent from: http://forum.openscad.org/_______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org