Hessian RPC

16 Oct 2009

Over the last few days I’ve been playing with Hessian, “a compact binary protocol for connecting web services”. In my previous company we used Hessian extensively for communicating between a Java thick client and a Java Apache Tomcat HTTP server with good success. These days we talk of JSON and REST and peer our noses down at thick clients so Hessian might seem irrelevant, however around the time we were implementing our client-server communications (2004 / 2005), we were bathing in the waters of SOAP, WSDL and so-called heavyweight web services.

The beauty of Hessian was our ability to take our Plain Old Java Objects which we had already implemented on our thick client and send them down the pipe unchanged to our server. Hessian took care of the marshalling and unmarshalling of data. In fact, because we took advantage of Hessian integration with the Spring Framework, a declarative application framework which encourages defining objects and their relationships and dependencies in configuration files, all it took was a bit of code and a bit of configuration to get everything working.

So does it now make more sense to use JSON / REST? One of the advantages of JSON / REST includes the inherent decoupling of client and server. The client fires a JSON string to the server at the correct URL using an HTTP POST and the server parses what it needs from that string and happily replies. This process is platform agnostic as HTTP and JSON libraries are available for many programming languages and platforms, not least including Javascript in the web browser. This model is widely used by service providers such as Google and Amazon whereby they can provide and update REST interfaces to their services without having to deliver and maintain multiple API client libraries. A drawback of this model is the need to hand code the marshalling and unmarshalling of JSON data by both client and server, though this can also be seen as an advantage as it decouples an application’s internal representation of data from the wire format.

JSON, REST & HTTP

Hessian compares well with the JSON / REST model. Hessian is also designed around HTTP POST whereby a client connects to a URL on the server and sends data, however Hessian goes one step further and encodes an RPC call i.e. a function name and arguments. In fact the Hessian library makes this process transparent by proxying the server i.e. it provides an object on which the client makes function calls without knowing that the call will be sent to a server. Note that there is no “contract” or abstract interface which you are forced to code to – client and server ensure they’re sending and receiving the correct function arguments by “unwritten agreement” much like the JSON / REST model. Unlike JSON, Hessian is a binary protocol meaning that the data exchanged between client and server is very compact. It also encodes type information, in fact, entire object structures are maintained when unmarshalled on either client or server. Hessian is also cross platform and libraries exist for many programming languages including Javascript.

So what’s not to like? Well binary communication and the concept of RPC function calls in general seems to have gained a bad reputation, possibly due to the extra complexity and library support needed over simple JSON / REST and possibly because of the increased coupling an RPC call implies. Experience at my previous company taught us that the communication can be little brittle if the definitions of objects sent over the wire are not kept in step on both client and server. If an object sent from the client to the server has an extra unknown field, there will be an error when the Hessian library on the server tries to unmarshall that data to create an object. (The reverse, however, is not true – any fields missing from data over the wire will simply end up unset on the unmarshalled object).

Passing JSON over HTTP is much more forgiving in that the client or server will blissfully ignore any field it doesn’t know how to handle, though of course if a field that is expected is not found, the server must know handle that. Ordinarily, keeping the client and server in step shouldn’t be a problem, however we had many clients in the field with different versions of our software all connecting to the same server.

It has only recently occurred to me that the brittleness described above is peculiar to statically typed languages such as Java where an Exception is thrown at any attempt to apply a value to a field where that field not been defined in an object’s class. The same is not true of dynamically typed languages such as Python which is forgiving when applying values to arbitrary fields on an object. For many years, hessianlib.py has been the standard Python implementation of Hessian. It has been little unmaintained over that time and includes a Hessian client implementation but no Hessian server implementation. The code is also a little impenetrable. Happily, earlier this year a fork of hessianlib.py called Mustaine has appeared. It doesn’t (yet) contain a server implementation, but the code is more penetrable so I submitted a patch with an implementation of a Hessian WSGI server.

Let’s see some code based on the proposed mustaine.server module. Please note that Mustaine server support is in flux so this example is subject to change. An object can be served via WSGI by wrapping it with mustaine.server.WsgiApp. An object’s methods are only exposed if decorated with the mustaine.server.exposed decorator. For example:

from mustaine.server import exposed

class Calculator(object):
    @exposed
    def add(self, a, b):
        return a + b

    @exposed
    def subtract(self, a, b):
        return a - b

The following code will serve a Calculator() object on port 8080 using the Python reference WSGI server:

from wsgiref import simple_server
from mustaine.server import WsgiApp

s = simple_server.make_server('', 8080, WsgiApp(Calculator()))
s.serve_forever()

This object can now be accessed over the network using the Hessian client:

>>> from mustaine.client import HessianProxy
>>> h = HessianProxy('http://localhost:8080/')
>>> h.add(2, 3)
5

As a result of providing server support to Mustaine, I’ve started developing django-hessian, a library which serves Hessian objects in Django. Objects can be served using djangohessian.Dispatcher at a given URL with an entry in urls.py. The Calculator() object described above can be served at the URL http://localhost:8000/rpc/calculator/ in the Django development server as follows:

# mysite/urls.py:
from django.conf.urls.defaults import *

urlpatterns = patterns('',
    (r'^rpc/', include('mysite.myapp.urls')),
)
# mysite/myapp/urls.py:
from django.conf.urls.defaults import *
from djangohessian import Dispatcher
from server import Calculator

urlpatterns = patterns('',
    url(r'^calculator/', Dispatcher(Calculator())),
)

Full source can be found at http://bitbucket.com/safehammad/django-hessian/.

I wonder whether the Hessian protocol isn’t getting attention it deserves, particularly in environments where both client and server are delivered and maintained by a single provider.