Wednesday, November 5, 2008

Groovy Sort Follow-up

Here is another Groovy post, but after last weeks post about sorting in Groovy, I just wanted to make a couple of points on that.

First I had left with a question:

how Groovy distinguishes between the Closure version of the Comparator and the straight Closure which returns the Comparable object

My guess was that there is special code to deal with it, and a little digging into the Groovy source code showed this to be correct.

The magic happens in the file src\main\org\codehaus\groovy\runtime\DefaultGroovyMethods.java:
public static List sort(Collection self, Closure closure) { 
List list = asList(self); 
// use a comparator of one item or two 
int params = closure.getMaximumNumberOfParameters(); 
if (params == 1) { 
   Collections.sort(list, new OrderBy(closure)); 
} else { 
   Collections.sort(list, new ClosureComparator(closure)); 

return list; 

As you can see, the Groovy Java support code does in fact check what type of Closure it gets, and then builds an appropriate Comparator to pass into the regular Java sort function.

At work, a standard question that always comes up as we consider doing a production project in Groovy is: How does Groovy perform versus Java. It is true that the code is much nicer in Groovy, but how much slower will it run?

As we can see above, both Groovy and Java end up in the exact same function. The only difference is the work the Comparator will do. In Java, it is a straightforward compare. In Groovy it is either doing closure work or some other kind of comparator.

I timed a number of different possibilities in Groovy and Java. (This of course comes with the standard caveat of Beware of Microbenchmarks).

In order to avoid problems I did the following:

  • I randomly built lists with 20,000 Things to sort. (see thel last Groovy post as a reminder on what Things are). I created the same lists for all the sorting code at the same time.
  • After sorting the lists, I printed out the first element of each list to ensure that the optimizer wasn’t cutting it out or something.
  • I created and sorted lists 3 times to “warm up” the code
  • I then ran 100 times and averaged the runs.

So keeping those risks in mind, I timed the following codes (the time in parens is the average time for sorting the list using the given code):
  1. Basic Java sort (15 ms):
    Collections.sort(inList,new Comparator<Thing>() {
    public int compare(Thing o1, Thing o2) {
    return (o1.val.compareTo(o2.val));
    }
    });



  2. Basic Groovy Closure (487 ms):

    inList.sort {
    it.val
    }



  3. Groovy using the two param closure (596 ms):

    inList.sort {o1, o2 ->
    o1.val.compareTo(o2.val)
    }

  4. Groovy using a dynamically built Comparator (612 ms):

    inList.sort([compare:{a,b->
    a.val.compareTo(b.val) } ] as Comparator)



  5. Groovy using a prebuilt Comparator (444 ms):

    The Groovy Comparator:

    class StringComparator implements Comparator {
    int compare(o1, o2) {
    return o1.val.compareTo(o2.val)
    }
    }
    StringComparator sc = new StringComparator();

    And here is the sort code:

    inList.sort(sc)



Some interesting things can be seen here:


  • Java is shockingly faster – so much so that I am worried that I am falling for one of the micro-benchmark problems


  • The dynamic comparator versus the pre-built comparator is 50% slower – probably caused by the extra redirection and lookup the dynamic one requires.


  • I am also surprised that the two param closure was slower then the one param version. I would have thought that they would do similar work.

2 comments:

Anonymous said...

very instrusting

Jarek said...

In-depth,short and useful; just the way I like it!