I have written an implementation of merge sort, which I am using to
sort a list of polygon vertex pairs by the distance between each pair. The problem I am having is that when the list goes over about 350 pairs, I blow up the stack. The problem isn't the recursion in the sort function, it is the merge function. It seems like there is a better way to do it, that I am missing. Here is a simplified version of my code This only sorts direct comparables, the version I am using is fancier, but this version blows up at about 5000 values. Does anybody know a good way of implementing the merge function to be less recursive? I am limited to the 2015.03 version of OpenSCAD, as my final code has to work in the Thingiverse customiser COUNT = 100; function fixStartIndex(v, index) = index < 0 ? len(v) + index : index; function fixEndIndex(v, index) = index < 0 ? len(v) + index + 1 : index; function sublist(list, start, end = -1) = let( startIndex = fixStartIndex(list, start), endIndex = fixEndIndex(list, end) ) (len(list) < startIndex) ? undef : (endIndex > len(list)) ? undef : (endIndex < startIndex) ? undef : startIndex == endIndex ? [] : [ for (i = [startIndex : endIndex - 1]) list[i] ]; function merge(left, right, l = 0, r = 0) = l >= len(left) && r >= len(right) ? [] : l >= len(left) ? sublist(right, r) : r >= len(right) ? sublist(left, l) : left[l] <= right[r] ? concat( [ left[l] ], merge(left, right, l = l + 1, r = r) ) : concat( [ right[r] ], merge(left, right, l = l, r = r + 1) ); function mergeSort(v) = len(v) < 2 ? v : let ( pivot = ceil(len(v) / 2), left = sublist(v, 0, pivot), right = sublist(v, pivot) ) merge(mergeSort(left), mergeSort(right)); module test(v) { echo("unsorted", v); echo("sorted", mergeSort(v)); } v = rands(0, 1000, COUNT); test(v); _______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org |
Do you have a specific reason not to just use quicksort, which is easily
implemented? Like do you need your sort to be stable? I can't think of a nonrecursive way to write the merge function, but it seems like you could rewrite it using tail recursion, which would solve the recursion depth problem. To get the advantage your function has to have the form function merge(...) = test ? merge(....) : alternative; or function merge(...) = test ? alternative : merge(...); So you need to accumulate the answer in a parameter, something like function merge(left,right,l=0, r=0, result=[]) = l >= len(left) && r>= len(right) ? result : merge(left,right,l+(r>=len(right) || left[l]<=right[r]?1:0), r+(l>=len(left) || left[l]>right[r]?1:0), concat(result, [r>=len(right) || left[i]<=right[r]?left[l]:right[r])); The above code was just a stab to get you started. Totally untested. But the basic method should work. You can't use "let", so everything has to be crammed into the recursive function call. Actually it crosses my mind that I don't know when tail recursion unwrapping was added. If it's not in the old version you need to use then I think you're out of luck and you need to change algorithms. -- Sent from: http://forum.openscad.org/ _______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org |
I tend to avoid quick sort because of the stability problem, as at
least partially sorted data is rather more common than not in the real world. I didn't think tail recursion unrolling was implemented in openSCAD 2015.03? On Sat, Apr 13, 2019 at 9:29 AM adrianv <[hidden email]> wrote: > > Do you have a specific reason not to just use quicksort, which is easily > implemented? Like do you need your sort to be stable? > > I can't think of a nonrecursive way to write the merge function, but it > seems like you could rewrite it using tail recursion, which would solve the > recursion depth problem. To get the advantage your function has to have > the form > > function merge(...) = test ? merge(....) : alternative; > > or > > function merge(...) = test ? alternative : merge(...); > > So you need to accumulate the answer in a parameter, something like > > function merge(left,right,l=0, r=0, result=[]) = l >= len(left) && r>= > len(right) ? result : > merge(left,right,l+(r>=len(right) || left[l]<=right[r]?1:0), > r+(l>=len(left) || left[l]>right[r]?1:0), concat(result, [r>=len(right) || > left[i]<=right[r]?left[l]:right[r])); > > The above code was just a stab to get you started. Totally untested. But > the basic method should work. You can't use "let", so everything has to be > crammed into the recursive function call. > > Actually it crosses my mind that I don't know when tail recursion unwrapping > was added. If it's not in the old version you need to use then I think > you're out of luck and you need to change algorithms. > > > > > > -- > Sent from: http://forum.openscad.org/ > > _______________________________________________ > OpenSCAD mailing list > [hidden email] > http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org _______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org |
The question with stability is do you care that items equal in the input list
might get permuted in the output list. If the answer is yes, then stability matters. If the answer is no, then it does not. Quicksort performance on partially sorted data should be very good, because pivot selection will be reasonable and the data will be divided in half. If you implement mergesort (with tail recursion unrolling) I'd be interested in seeing if it can compete with quicksort. The advantage of mergesort is a bound on worst case run time instead of just average run time. (Quicksort could fail if the data were perversely organized.) -- Sent from: http://forum.openscad.org/ _______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org |
The biggest problem with Quicksort is that partially sorted data is
the particular case which qualifies as "perversely organised". On the other hand, I was looking at some example timing runs, and it isn't that bad, and the quicksort implementatio is very simple.... I am working on a tail-recursive version of the merge function, bit I have the problem of the call to concat. I think that wil always break tail recursion, at least by OpenSCAD's limited ability to optimise it. On Sat, Apr 13, 2019 at 10:45 AM adrianv <[hidden email]> wrote: > > The question with stability is do you care that items equal in the input list > might get permuted in the output list. If the answer is yes, then stability > matters. If the answer is no, then it does not. Quicksort performance on > partially sorted data should be very good, because pivot selection will be > reasonable and the data will be divided in half. If you implement mergesort > (with tail recursion unrolling) I'd be interested in seeing if it can > compete with quicksort. The advantage of mergesort is a bound on worst > case run time instead of just average run time. (Quicksort could fail if > the data were perversely organized.) > > > > > -- > Sent from: http://forum.openscad.org/ > > _______________________________________________ > OpenSCAD mailing list > [hidden email] > http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org _______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org |
I appear to have been misguided, in that Quicksort performance even
with completely sorted data isn't that bad. I am still curious if there is a better way to write the merge function, though On Sat, Apr 13, 2019 at 11:04 AM A. Craig West <[hidden email]> wrote: > > The biggest problem with Quicksort is that partially sorted data is > the particular case which qualifies as "perversely organised". On the > other hand, I was looking at some example timing runs, and it isn't > that bad, and the quicksort implementatio is very simple.... > I am working on a tail-recursive version of the merge function, bit I > have the problem of the call to concat. I think that wil always break > tail recursion, at least by OpenSCAD's limited ability to optimise it. > > On Sat, Apr 13, 2019 at 10:45 AM adrianv <[hidden email]> wrote: > > > > The question with stability is do you care that items equal in the input list > > might get permuted in the output list. If the answer is yes, then stability > > matters. If the answer is no, then it does not. Quicksort performance on > > partially sorted data should be very good, because pivot selection will be > > reasonable and the data will be divided in half. If you implement mergesort > > (with tail recursion unrolling) I'd be interested in seeing if it can > > compete with quicksort. The advantage of mergesort is a bound on worst > > case run time instead of just average run time. (Quicksort could fail if > > the data were perversely organized.) > > > > > > > > > > -- > > Sent from: http://forum.openscad.org/ > > > > _______________________________________________ > > OpenSCAD mailing list > > [hidden email] > > http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org _______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org |
For quicksort, "perverse" is whatever order of data causes the pivot to be
selected so that all the data is on one side at every iteration. Then quicksort turns into an n^2 algorithm. There are different ways of selecting the pivot, so the exact definition of the perverse input may vary. But if you pick the pivot to be the middle data value then already sorted data is the ideal input. So for example, if you picked the pivot to be the first entry in the input then sorted data would in fact be the perverse input. The call to concat will not break tail recursion as long as it happens *within* the recursive call. That's why you have to organize the code like I showed in my example. Consider this simpler case: function sum_bad(v,i=0) = i==len(v) ? 0 : v[i]+sum_bad(v,i+1); This function will *not* get tail recursion optimization. How can it be fixed? Like this: function sum_good(v,i=0,total=0) i==len(v) ? total : sum_good(v,i+1,total+v[i]); In this case, the accumulating operation happens within the function argument, so that is fine. You can do the same thing with the mergesort, following the example code I showed to rewrite your merge() function so that the concatenation happens in the argument and the list is accumulating as an argument instead of as a return value. It's not pretty to pack all the computation into the function arguments, and it kind of seems inefficient to have to repeat your conditions over and over again, but I think it will work. This will work for the function that combines two lists (the merging operation). You cannot do this for the recursion that calls mergesort once on each half of the data. But that should be OK because you only need log(n) depth on that recursion. -- Sent from: http://forum.openscad.org/ _______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org |
That is the implementation of merge sort I have done some years ago. The merge function is tail recursive.
I have not used this code since then because quick_sort is much, much faster. For an array with 10000 random entries, merge_sort spends 9sec. while quick_sort nearly spends nothing. _______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org |
It seems I have picked a wrong version. The correct version has a different next_entry() function:
_______________________________________________ OpenSCAD mailing list [hidden email] http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org |
Free forum by Nabble | Edit this page |