Monday, November 15, 2010

Ehcache for JRuby and Rails: Now with more flavor and fewer calories

In my last blog post I discussed how you can achieve terabyte scale for JRuby and Rails, and judging from the response I received, this is a topic of some interest to Ruby and Rails developers. In that post I explained how to use Ehcache as a Rails caching provider, or to use its API directly from any JRuby application. Since then, I've done some further work to make Ehcache integration with JRuby and Rails more robust and production ready. In this post I'll describe what's new and improved, including complete coverage of the Ehcache Java API. In a follow-up post I will discuss how to use the new Ehcache JRuby and Rails integration for large-scale enterprise applications, including fully coherent distributed caching with Ehcache and Terracotta, and how to use BigMemory for Enterprise Ehcache to make sure your application can handle any load you can throw at it.

More Flavor

In previous iterations of jruby-ehcache we took the approach of providing Ruby wrapper classes to encapsulate the functionality of Ehcache behind a nice Ruby interface. This had the advantage of making the API more idiomatic to Ruby, but it also meant that we did not provide full API coverage, and that any time the Ehcache API changed, we also had to update our Ruby wrapper API to accommodate the changes. As you can imagine, we weren't entirely satisfied with this approach and so went in search of a better mechanism. It turns out that JRuby's Java integration, combined with Ruby's dynamic open classes, provides exactly what we need for this.

The Java integration provided by JRuby makes it incredibly easy to use any Java API from Ruby code. All that is required is adding a simple require 'java' to your Ruby code, and the whole of the Java landscape is opened up to you. We built on top of this in the latest jruby-ehcache gem by having the gem automatically set up the CLASSPATH and invoke require 'java' for you, so with one line of code you instantly have the complete Ehcache API available to your Ruby application:

  require 'ehcache'

With that one line of code in place, you can now use any part of the Ehcache API just as you would in a Java application, and JRuby even provides some extra niceties to make the Java API more Ruby-friendly:

  require 'ehcache'
  cache_manager = Java::NetSfEhcache::CacheManager.new
  cache = cache_manager.getCache("myCache")
  cache.put("answer", "42")
  answer = cache.get("answer")  # Returns Ehcache Element object
  puts "Answer: #{answer.value}"
  question = cache.get("question") # Returns nil
  if question
    puts "Question: #{question.value}"
  else
    puts "I don't know the question"
  end

I won't cover every detail of JRuby Java integration (see Calling Java From Ruby on the JRuby wiki for full details), but I do want to point out a couple of important details. First, notice how Java classes are referenced from Ruby code. The expression Java::NetSfEhcache::CacheManager is a reference to the Java class net.sf.ehcache.CacheManager. More generally, any Java class can be reached through the Java module: remove the dots from the package path and convert it to CamelCase. Second, JRuby performs some magic to convert Java method names and JavaBeans property accessors to more Ruby-like equivalents. Thus, you can call the getCache method in any of three ways: getCache, get_cache, or simply cache.
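The two naming rules above can be sketched in plain Ruby. This is only an illustration of the conventions as described here, not JRuby's actual implementation; the helper names jruby_constant_for and ruby_names_for are made up for this example, and the snippet runs on any Ruby interpreter without Ehcache installed.

```ruby
# Illustration only: mimics the naming conventions described above.
# Both helper names are hypothetical and not part of JRuby or jruby-ehcache.

# "net.sf.ehcache.CacheManager" -> "Java::NetSfEhcache::CacheManager"
def jruby_constant_for(java_class_name)
  *packages, class_name = java_class_name.split('.')
  # Capitalize the first letter of each package segment and join them.
  package_module = packages.map { |part| part[0].upcase + part[1..-1] }.join
  "Java::#{package_module}::#{class_name}"
end

# "getCache" -> ["getCache", "get_cache", "cache"]
def ruby_names_for(java_method_name)
  snake = java_method_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  names = [java_method_name, snake]
  # JavaBeans-style getters also get a short property-style name.
  names << snake.sub(/\Aget_/, '') if snake.start_with?('get_')
  names.uniq
end

puts jruby_constant_for('net.sf.ehcache.CacheManager')
puts ruby_names_for('getCache').inspect
```

The real mapping is performed by JRuby at runtime when a Java class is loaded; the sketch is just meant to make the pattern easy to remember.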

That is nice enough, but as any good Rubyist will attest, Java APIs tend to be bloated and difficult to use compared to an equivalent Ruby API. Luckily, we can take advantage of Ruby's dynamic nature and support for open classes to provide a much more Rubyesque API without sacrificing full access to the underlying Java API. For instance, it would be nice if we could use the familiar array access notation to access cache entries, and while we're at it couldn't we also do away with the Ehcache Element object and just access the cache entry value directly? Let's see how this is done.

  class Java::NetSfEhcache::Cache
    # Gets an element value from the cache.  Unlike the #get method, this method
    # returns the element value, not the Element object.
    def [](key)
      element = self.get(key)
      element ? element.value : nil
    end
  end

  # Later...
  forty_two = cache['answer']   # Returns the value, not the Element object

Here we open up the net.sf.ehcache.Cache class and add our own custom method to it to provide array access notation. Note that this is not inheritance and we are actually modifying the Cache class directly, so you can now use the [] operator on any Cache object, whether you created it yourself in Ruby code or it was created deep in the bowels of some legacy Java code. The world is yours.
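To see that reopening a class really does affect objects that already exist, here is a minimal sketch that runs on plain Ruby. LegacyElement and LegacyCache are hypothetical stand-ins for the Java classes, used only so the example needs neither JRuby nor Ehcache.

```ruby
# Stand-ins for the Java classes so the sketch runs without JRuby or Ehcache.
# LegacyElement and LegacyCache are hypothetical names used only here.
LegacyElement = Struct.new(:key, :value)

class LegacyCache
  def initialize
    @elements = {}
  end

  def put(key, value)
    @elements[key] = LegacyElement.new(key, value)
  end

  # Like the get call above, returns the element wrapper (or nil), not the value.
  def get(key)
    @elements[key]
  end
end

# An instance created before the class is reopened.
old_cache = LegacyCache.new
old_cache.put('answer', '42')

# Reopen the class and add array-style access.
class LegacyCache
  def [](key)
    element = get(key)
    element ? element.value : nil
  end
end

puts old_cache['answer']            # prints 42
puts old_cache['question'].inspect  # prints nil
```

The instance was created before [] was defined, yet it responds to it afterwards, which is the same effect the reopened Cache class has on caches created elsewhere.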

Another bit of Ruby goodness we've added in the latest version is to make the Ehcache::CacheManager and Ehcache::Cache classes include the Ruby Enumerable module. Anyone who's done a significant amount of Ruby programming knows how powerful this module is, but for those who might not be familiar with it, let's have a look at a few example usages that illustrate its power.

  # Find all cache entries with time to live greater than one minute.
  cache.find_all {|e| e.ttl > 60}

  # Which cache entry has the largest time to live value?
  cache.max {|e1, e2| e1.ttl <=> e2.ttl}

  # Are all cache entries strings?
  cache.all? {|e| e.value.is_a?(String)}

  # Does cache contain the ultimate question of life, the universe, and everything?
  cache.any? {|e| e.value == 'The Ultimate Question'}

  # Sum of all numeric cache entries.
  cache.inject(0) {|sum, e| e.value.is_a?(Numeric) ? sum + e.value : sum}

  # Find the email address of creators of inferior programming languages
  cache.reject {|e| e.value == 'Ruby'}.map {|e| e.value.creator_email}.uniq

This is just a small preview of what Ruby's Enumerable provides. For a full reference, see the documentation on the Enumerable module. Be aware that if you have a large cache, iterating over every element like this could be prohibitively expensive, but for smaller caches it provides a very powerful querying mechanism. For large caches, a new search API for Ehcache is currently in the works; it uses indexing for efficient searching and will be available in an upcoming Ehcache release.
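Enumerable needs only one thing from the including class: an each method that yields every element. The following is a minimal sketch of that mechanism on plain Ruby, assuming nothing about how the gem itself wires it up; TinyElement and TinyCache are hypothetical stand-ins for the Ehcache classes.

```ruby
# TinyElement and TinyCache are hypothetical stand-ins, not Ehcache classes.
TinyElement = Struct.new(:key, :value, :ttl)

class TinyCache
  include Enumerable   # supplies find_all, max_by, any?, inject, and more

  def initialize
    @elements = {}
  end

  def put(key, value, ttl)
    @elements[key] = TinyElement.new(key, value, ttl)
  end

  # The only method Enumerable requires: yield each element in turn.
  def each(&block)
    @elements.each_value(&block)
  end
end

cache = TinyCache.new
cache.put('answer', 42, 120)
cache.put('greeting', 'hello', 30)
cache.put('retries', 3, 600)

long_lived  = cache.find_all { |e| e.ttl > 60 }.map(&:key)
longest     = cache.max_by(&:ttl).key
numeric_sum = cache.inject(0) { |sum, e| e.value.is_a?(Numeric) ? sum + e.value : sum }

puts long_lived.inspect   # prints ["answer", "retries"]
puts longest              # prints retries
puts numeric_sum          # prints 45
```

Every one of these calls walks the whole collection through each, which is why the cost grows with the size of the cache.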

There are several other ways in which the Ruby API has been enhanced but I can't describe all of them here. If you're curious, see the RDoc API documentation that is bundled with the jruby-ehcache gem. And, of course, because the full Java API is available to you, you can also use the Ehcache Javadocs for reference.

Fewer Calories

In addition to adding the above features, we've also done some fat trimming for this latest release. First and foremost, we have deprecated the YAML configuration option in favor of using the Ehcache native XML configuration. We know that some Rubyists will be disappointed by this decision ("What? More XML?"), but we feel it was the right decision for several reasons:

  • The YAML configuration code is by far the most complicated bit of code in the Ehcache JRuby integration and we feel that it is a likely source of bugs.
  • YAML configuration is handled by pure Ruby code, and the Ehcache Java code is completely ignorant of it. Any change to the Ehcache configuration format would require an update and a new release of the Ruby YAML code, meaning that we'd be playing a continual catch-up game with Ehcache core.
  • There are subtle differences between the YAML configuration and the XML configuration that we feel can only lead to confusion in the long run.
  • Java developers who already use Ehcache will already have ehcache.xml configuration files that they can now use directly instead of translating to YAML.

While we're talking about configuration, I should mention that we've made some improvements to how your configuration files are located. Previous versions of jruby-ehcache required that you place a config.yml file in your $HOME/lib/config directory, which of course made it less than practical to have more than one application using jruby-ehcache at any given time. With version 1.0.0 you have a lot more options available to you. Now you can put your ehcache.xml either in the same directory as the Ruby file that creates the CacheManager object, or place it in your Java CLASSPATH, or you can specify any location in your call to the CacheManager constructor. If you are using Rails, then ehcache.xml will continue to reside in the canonical Rails config directory.

Finally, we've removed a limitation that prevented you from using versions of Ehcache other than the one bundled with the jruby-ehcache gem, and made it easy to drop Enterprise Ehcache JARs into your application. With the latest updates, jruby-ehcache will use your Java CLASSPATH to locate the Ehcache JARs it should use, instead of forcing the use of the bundled Ehcache. In my next blog post I will discuss how you can take advantage of this to add BigMemory to your application, or use distributed caching with Terracotta for linear scale-out.

Further Reading

If you are interested in learning more about Ehcache and any of its associated add-ons, there are numerous resources available to you. Here are a few to get you started.

During my development on jruby-ehcache, I heavily utilized Gregory T. Brown's excellent book Ruby Best Practices for tips and techniques. I highly recommend this book to anyone doing serious Ruby development.