At Scout Advertising we have built an ad impression buying engine written in Java and running on Apache Tomcat 7. At peak, our server handles ~10-15K external requests per second. We are in the middle of a major re-architecture to help us scale to 5-50x our current volume. As part of that effort, we decided to move our JSON deserialization away from Jersey 1.12, which uses Jettison 1.1.
Our long-term goal is to remove our dependency on Jersey so we can explore different web architectures for handling requests. I was tasked with moving the JSON deserialization step out of Jersey and into our own module.
Criteria for new deserialization library
- deserialize to POJOs (plain old Java objects) without too much custom code
- comparable speed (request latency) to Jettison 1.1
- comparable CPU usage to Jettison 1.1
After researching libraries online, I found the general consensus to be that the fastest JSON libraries are GSON and Jackson:
http://stackoverflow.com/questions/2378402/jackson-vs-gson
http://www.linkedin.com/groups/Can-anyone-recommend-good-Java-50472.S.226644043
There is also a great benchmark for JSON library performance. It gives you a good sense of the libraries available, but you should always run benchmarks for your own use case, as I did below.
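Since CPU usage matters here as much as speed, a benchmark for this use case should record both. Below is a minimal sketch of a single-threaded harness that reports wall-clock time and CPU time per call, using only the JDK. It is not the load test used for the results in this post. The class name `DeserializerBenchmark`, the sample payload, and the stand-in deserializer are all hypothetical; in practice you would pass in a call to the library under test.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.function.Function;

/** Times a deserializer on the calling thread, reporting wall-clock and CPU time. */
final class DeserializerBenchmark {

    /** Totals for one measured run. */
    static final class Result {
        final int iterations;
        final long wallNanos;
        final long cpuNanos;

        Result(int iterations, long wallNanos, long cpuNanos) {
            this.iterations = iterations;
            this.wallNanos = wallNanos;
            this.cpuNanos = cpuNanos;
        }

        double wallMicrosPerCall() { return wallNanos / 1_000.0 / iterations; }
        double cpuMicrosPerCall()  { return cpuNanos / 1_000.0 / iterations; }
    }

    // Accumulates a value derived from every result so the JIT cannot discard the calls.
    static long sink;

    static <T> Result run(Function<String, T> deserializer, String json,
                          int warmupIterations, int iterations) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException("per-thread CPU time is not supported on this JVM");
        }
        threads.setThreadCpuTimeEnabled(true);

        // Warm up so the measured loop runs mostly JIT-compiled code.
        for (int i = 0; i < warmupIterations; i++) {
            sink += System.identityHashCode(deserializer.apply(json));
        }

        long cpuStart = threads.getCurrentThreadCpuTime();
        long wallStart = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += System.identityHashCode(deserializer.apply(json));
        }
        long wallNanos = System.nanoTime() - wallStart;
        long cpuNanos = threads.getCurrentThreadCpuTime() - cpuStart;
        return new Result(iterations, wallNanos, cpuNanos);
    }

    public static void main(String[] args) {
        // Hypothetical payload; use a representative request body from your own traffic.
        String json = "{\"id\":\"abc\",\"price\":1.25}";
        // Stand-in only: replace with the deserialization call of the library under test.
        Function<String, Integer> standIn = String::length;

        Result r = run(standIn, json, 10_000, 100_000);
        System.out.printf("wall: %.3f us/call, cpu: %.3f us/call%n",
                r.wallMicrosPerCall(), r.cpuMicrosPerCall());
    }
}
```

Per-thread CPU time does not include work done on other threads, such as garbage collection, so a harness like this can only complement a load test that watches whole-process CPU.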
Try 1: Jackson is the right man for the job
I decided to go with Jackson 1.x with data-bind. It is well documented, widely used, and has many supporters. We already used the library elsewhere in our codebase, so this approach wouldn’t add any dependencies. Switching to Jackson 1.x took minimal effort; it mainly involved changing the class annotations on our POJOs.
After a good amount of testing, we released the code and everything appeared to be working fine. We host our bidding engines on AWS, and after about a week we realized our servers were running hot (CPU) and that we were using ~20% more servers on average (we employ a scaling policy based on CPU). The increase coincided with the release of the new deserialization code.
After digging through our commits, I was able to prove that the extra processing came from using Jackson 1.x deserialization instead of Jersey’s Jettison library.
I was able to reproduce the results in our load testing environment. Perfect! My load tests showed Jackson was using ~15% more CPU and more memory as well. Here are the graphs of CPU and memory from VisualVM.
Jersey 1.12 w/ Jettison 1.1
CPU is hovering just below 80%.
Jackson 1.x (with databind)
CPU is hovering between 90-95%.
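To compare builds numerically rather than by eye, the same process-level CPU signal can be sampled from inside the JVM while a load test runs. The following is a rough sketch under the assumption that the JVM exposes `com.sun.management.OperatingSystemMXBean` (OpenJDK and Oracle JDKs do). The class name `ProcessCpuSampler` and the sampling parameters are hypothetical, and this is not how the graphs above were produced.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Samples the JVM's own process CPU load at a fixed interval. */
final class ProcessCpuSampler {

    /** Collects up to {@code samples} readings, each in the range 0.0 to 1.0. */
    static List<Double> sample(int samples, long intervalMillis) {
        java.lang.management.OperatingSystemMXBean base =
                ManagementFactory.getOperatingSystemMXBean();
        if (!(base instanceof com.sun.management.OperatingSystemMXBean)) {
            throw new UnsupportedOperationException("process CPU load is not exposed by this JVM");
        }
        com.sun.management.OperatingSystemMXBean os =
                (com.sun.management.OperatingSystemMXBean) base;

        List<Double> loads = new ArrayList<>();
        for (int i = 0; i < samples; i++) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop sampling
                break;
            }
            double load = os.getProcessCpuLoad();
            // A negative value means the reading is not available yet; skip it.
            if (load >= 0.0) {
                loads.add(load);
            }
        }
        return loads;
    }

    /** Mean of the collected readings, or NaN if there were none. */
    static double average(List<Double> loads) {
        if (loads.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double load : loads) {
            sum += load;
        }
        return sum / loads.size();
    }

    public static void main(String[] args) {
        // Hypothetical settings: one reading per second for one minute of a load test.
        List<Double> loads = sample(60, 1_000);
        System.out.printf("average process CPU: %.1f%% over %d samples%n",
                average(loads) * 100.0, loads.size());
    }
}
```

Running the sampler against each build under an identical load gives one average per library, which makes a difference like the one described above easier to quantify.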
Try 2: Sequels are always better?
Now that I could reproduce the behaviour, the goal was to try other libraries that might perform better. I chose Jackson 2.x (with databind) over GSON because switching only required including a different library.
But still no luck. CPU was just as high.
Try 3: EclipseLink MOXy
I stumbled upon MOXy, which also uses the JAXB annotations to build objects. Getting the code up and running took a little bit of time. But once I got it working, I proved that MOXy used much less CPU than Jackson, and slightly less than Jettison. It also didn’t noticeably change our latencies, which was also a requirement.
I will be writing another post on how I used MOXy, since I had some trouble and none of the tutorials I found worked for me.
In conclusion, we will be trying MOXy in production. It provides the speed without blowing out our CPU. I couldn’t find anything else on the web that compared CPU performance. Most benchmarks I found compared speed.