Wednesday, 20 May 2009

Grails Scalability

On one of the projects I have recently been working on, we came up against a scalability problem with the Grails framework.

Our initial testing found that Grails performed quite well in serving requests for the different web pages. However, we found that as the number of concurrent requests being processed increased, the response time for each page increased dramatically.

A quick search of the web revealed that others had experienced this as well.

Most of the discussion around this focussed on the performance overhead of getting the "out" variable, which goes through the Groovy methodMissing infrastructure to resolve the out reference. Reducing the number of out lookups by caching the out reference in a local variable did increase the scalability of our application somewhat. However, we were still experiencing a dramatic drop-off in performance as the number of concurrent requests increased.
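To make the caching idea concrete, here is a minimal sketch in plain Java rather than in an actual Grails tag lib. The class OutLookupSketch and its resolveOut() method are hypothetical stand-ins for the dynamic resolution of "out"; the sketch only counts how often that resolution happens and does not model its real cost.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a tag lib; not Grails code.
public class OutLookupSketch {
    private final AtomicInteger lookups = new AtomicInteger();
    private final StringBuilder out = new StringBuilder();

    // Stands in for the expensive dynamic resolution of "out".
    public StringBuilder resolveOut() {
        lookups.incrementAndGet();
        return out;
    }

    // Resolves "out" again for every fragment written.
    public void renderUncached(String[] parts) {
        for (String part : parts) {
            resolveOut().append(part);
        }
    }

    // Resolves "out" once and reuses the local reference.
    public void renderCached(String[] parts) {
        StringBuilder cachedOut = resolveOut();
        for (String part : parts) {
            cachedOut.append(part);
        }
    }

    public int lookupCount() {
        return lookups.get();
    }

    public String output() {
        return out.toString();
    }
}
```

Both render methods produce the same output; the cached version simply performs one lookup per call instead of one per fragment.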

We then used the YourKit Java Profiler to profile the application and see what was going on. It seems that a lot of time is spent locking and waiting in the ExpandoMetaClass.isModified() method.

It turns out that Grails tag libs are implemented as singletons. This means that there is one instance of the tag lib for all requests. Our application was written with one custom tag lib, and this has resulted in a lot of locking whenever a dynamic method is invoked (such as request, response or out).

In Grails, controllers and the classes behind GSP pages are not singletons, so we are unsure why tag libs have been implemented as singletons.
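The shape of the problem can be sketched in plain Java, assuming a single shared object whose every dynamic lookup takes the same lock. SingletonTagLibSketch and resolve() are hypothetical names; this is not how ExpandoMetaClass is implemented, only an illustration of every request thread queuing on one monitor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a singleton tag lib; not Grails code.
public class SingletonTagLibSketch {
    private final Map<String, Object> dynamicProperties = new HashMap<>();
    private int lookups = 0;

    public SingletonTagLibSketch() {
        dynamicProperties.put("out", new StringBuilder());
    }

    // Every caller on every thread must acquire the same monitor here.
    public synchronized Object resolve(String name) {
        lookups++;
        return dynamicProperties.get(name);
    }

    public synchronized int lookupCount() {
        return lookups;
    }

    // Starts several threads that all hit the one shared instance.
    public static int runConcurrently(int threads, int lookupsPerThread) {
        final SingletonTagLibSketch shared = new SingletonTagLibSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < lookupsPerThread; j++) {
                    shared.resolve("out");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.lookupCount();
    }
}
```

The lookups are all counted correctly, but only one thread can be inside resolve() at a time, so adding request threads adds waiting rather than throughput.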

We are currently looking at ways in which we can fix this problem, our initial thought being to make the tag lib request scoped in the Spring context. This would avoid the locking problem, as only the request thread would be accessing the ExpandoMetaClass for the tag lib.

It looks like we will have to dig into the Grails code to implement this, though, and we will have to see whether the cost of creating the tag lib for each request is too large. If so, then perhaps we can look at caching the tag libs in a ThreadLocal.
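The ThreadLocal fallback could look roughly like the sketch below. It is plain Java under stated assumptions: ThreadLocalTagLibSketch and its nested TagLib class are hypothetical and nothing here touches Grails or Spring. Each thread lazily creates its own instance on first use and reuses it afterwards.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-thread cache of tag lib instances; not Grails code.
public class ThreadLocalTagLibSketch {
    // Stand-in for a tag lib instance.
    public static final class TagLib {
        final StringBuilder out = new StringBuilder();
    }

    private final AtomicInteger created = new AtomicInteger();

    // One TagLib per thread, created on that thread's first call to current().
    private final ThreadLocal<TagLib> perThread = ThreadLocal.withInitial(() -> {
        created.incrementAndGet();
        return new TagLib();
    });

    public TagLib current() {
        return perThread.get();
    }

    public int createdCount() {
        return created.get();
    }

    // Runs several threads and reports how many instances were created.
    public static int instancesCreatedBy(int threads, int callsPerThread) {
        final ThreadLocalTagLibSketch cache = new ThreadLocalTagLibSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    cache.current();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache.createdCount();
    }
}
```

One caveat with this approach: containers usually serve requests from a thread pool, so an instance cached this way outlives a single request and any request-specific state on it would need to be reset between requests.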

Any other suggestions or insights would be most welcome...

UPDATE - JIRA Case is at