Setting up EclipseLink MOXy

I wrote earlier about how well the EclipseLink MOXy library performed at deserializing JSON. Here I want to share a setup workaround that worked for me and that I didn’t find documented anywhere else.

1. First here is some sample code for doing manual deserialization of JSON using MOXy.

2. My root element got annotated with @XmlRootElement.

3. All my objects were annotated with @XmlAccessorType(XmlAccessType.FIELD). The other access types are PROPERTY, PUBLIC_MEMBER, and NONE.

4. I annotated every field whose name didn’t match its key on the wire with @XmlElement(name = "<json key>"). For example, if my POJO attribute is named "personId" but it comes in as "id", I would annotate it like this:

@XmlElement(name = "id")
public Integer personId;

5. I removed my annotations and my parameters from my servlets.

6. Many sites tell you to create a jaxb.properties file in the directory where your deserialization POJOs live and add the following text. This tells JAXB which implementation to use.

javax.xml.bind.context.factory=org.eclipse.persistence.jaxb.JAXBContextFactory

But this didn’t work for me. Instead of creating JAXBContext objects statically using

JAXBContext jc = JAXBContext.newInstance(Request.class);

I decided to generate them using the JAXBContextFactory in code:

JAXBContext jc = org.eclipse.persistence.jaxb.JAXBContextFactory.createContext(new Class[] { Request.class }, null);

This gave me access to the POJOs I had annotated. Once I figured that out, everything worked.
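Putting steps 2 through 6 together, a minimal sketch might look like the following. It assumes MOXy 2.4 or later on the classpath with the javax.xml.bind packages (newer releases use jakarta.xml.bind instead). The MoxyJsonExample class, the fields on Request, and the parse helper are made up for illustration and are not the original production code.

```java
import java.io.StringReader;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.transform.stream.StreamSource;

import org.eclipse.persistence.jaxb.JAXBContextFactory;
import org.eclipse.persistence.jaxb.UnmarshallerProperties;

public class MoxyJsonExample {

    // Hypothetical POJO: field access, with one field renamed on the wire.
    @XmlRootElement
    @XmlAccessorType(XmlAccessType.FIELD)
    public static class Request {
        @XmlElement(name = "id")
        public Integer personId;   // arrives as "id" in the JSON
        public String name;        // arrives under its own name
    }

    // Hypothetical helper: parses a JSON object without a root wrapper key.
    public static Request parse(String json) throws JAXBException {
        // Ask MOXy's factory directly instead of relying on jaxb.properties.
        // Real code would build the context once and reuse it, since
        // creating a context is expensive.
        JAXBContext jc = JAXBContextFactory.createContext(
                new Class[] { Request.class }, null);

        Unmarshaller unmarshaller = jc.createUnmarshaller();
        // Switch the unmarshaller from XML (the default) to JSON.
        unmarshaller.setProperty(UnmarshallerProperties.MEDIA_TYPE, "application/json");
        // The payload is {"id": ...}, not {"request": {"id": ...}}.
        unmarshaller.setProperty(UnmarshallerProperties.JSON_INCLUDE_ROOT, false);

        StreamSource source = new StreamSource(new StringReader(json));
        // With no root key in the JSON, the target class must be passed explicitly.
        return unmarshaller.unmarshal(source, Request.class).getValue();
    }
}
```

Calling MoxyJsonExample.parse("{\"id\": 7, \"name\": \"Ann\"}") should give a Request whose personId is 7. If the properties are rejected, printing jc.getClass().getName() is a quick way to check which JAXB implementation actually built the context.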

Some links that I found helpful:

JSON Deserialization and CPU

At Scout Advertising we have built an ad impression buying engine written in Java and running on Apache Tomcat 7. At peak, our server handles ~10-15K external requests / second. We are in the process of some major re-architecting to help us scale to 5-50x our current volume. As part of that effort, we decided to move our JSON deserialization away from Jersey 1.12, which uses Jettison 1.1.

Our long-term goal is to remove our dependency on Jersey so we can explore different web architectures for handling requests. I was tasked with moving the JSON deserialization step out of Jersey and into our own module.

Criteria for new deserialization library

  • deserialize to POJOs (plain old Java objects) without too much custom code
  • comparable speed to Jettison 1.1
  • comparable CPU performance to Jettison 1.1

After researching libraries online, I found the general consensus to be that the fastest JSON libraries are GSON and Jackson:

http://stackoverflow.com/questions/2378402/jackson-vs-gson

http://www.linkedin.com/groups/Can-anyone-recommend-good-Java-50472.S.226644043

There is also a great benchmark for JSON library performance.  It gives you a good sense of the libraries available.  But you should always run benchmarks for your own use case, which I did below.
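For a quick check of your own use case that captures CPU cost as well as wall-clock speed, one option is to read the thread CPU time from the JDK’s ThreadMXBean around the parsing loop. The sketch below uses only JDK classes; the CpuTimeBench class and its measure method are hypothetical names, and this is not the load test harness behind the graphs later in this post. It only counts CPU used by the calling thread, so garbage collection and any work a library does on other threads are not included.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.function.Function;

public class CpuTimeBench {

    /** Wall-clock and CPU time for one measured run, both in nanoseconds. */
    public static final class Result {
        public final long wallNanos;
        public final long cpuNanos;

        Result(long wallNanos, long cpuNanos) {
            this.wallNanos = wallNanos;
            this.cpuNanos = cpuNanos;
        }
    }

    // Written on every iteration so the JIT cannot discard the parser's result.
    private static volatile Object sink;

    /**
     * Runs the parser over the same payload repeatedly and reports how much
     * wall-clock time and how much CPU time the calling thread spent.
     */
    public static <T> Result measure(Function<String, T> parser, String payload, int iterations) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException("thread CPU time is not available on this JVM");
        }

        // Warm-up pass so the measured pass runs mostly JIT-compiled code.
        for (int i = 0; i < iterations; i++) {
            sink = parser.apply(payload);
        }

        long cpuStart = threads.getCurrentThreadCpuTime();
        long wallStart = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = parser.apply(payload);
        }
        long wallNanos = System.nanoTime() - wallStart;
        long cpuNanos = threads.getCurrentThreadCpuTime() - cpuStart;

        return new Result(wallNanos, cpuNanos);
    }
}
```

To compare libraries, pass each one’s parse call as the parser argument with the same payload and iteration count, then compare the cpuNanos values alongside wallNanos. A single-threaded loop like this is only a first filter; it does not replace a load test against the real server.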

Try 1: Jackson is the right man for the job

I decided to go with Jackson 1.x with data-bind. There is a lot of good documentation and it is widely used. We already used the library elsewhere in our codebase, so this approach wouldn’t add any more dependencies. It also has many supporters. The amount of effort to switch to Jackson 1.x was minimal; it mainly involved changing the class annotations on our POJOs. After a good amount of testing, we released the code and everything was working fine.

We host our bidding engines on AWS, and after about a week we realized our servers were running hot (CPU) and we were using ~20% more servers on average (we employ a scaling policy based on CPU). The increase coincided with the release of the new deserialization code.

After digging through our commits, I was able to trace the extra processing to using Jackson 1.x for deserialization instead of Jersey’s Jettison library.

I was able to reproduce the results in our load testing environment.  Perfect!  My load tests showed Jackson was using ~15% more CPU and more memory as well.  Here are the graphs of CPU and memory from VisualVM.

Jersey 1.12 w/ Jettison 1.1

CPU is hovering just below 80%.

[Image: VisualVM CPU and memory graphs for Jersey 1.12 with Jettison 1.1]

Jackson 1.x (with databind)

CPU is hovering between 90-95%.

[Image: VisualVM CPU and memory graphs for Jackson 1.x with databind]

Try 2: Sequels are always better?

Now that I could reproduce the behaviour, the goal was to try other libraries that might perform better. I chose Jackson 2.x (with databind) over GSON since the only thing I had to do to switch was include a different library.

But still no luck.  CPU was just as high.

Try 3: EclipseLink MOXy

I stumbled upon MOXy, which also uses the JAXB annotations to build objects. Getting the code up and running took a little bit of time. But once I got it working, my load tests showed that MOXy used much less CPU than Jackson, and slightly less than Jettison. It didn’t noticeably change our latencies either, which was also a requirement.

[Image: results for EclipseLink MOXy]

I will be writing another post on how I used MOXy, since I had some trouble and none of the tutorials I found worked for me.

In conclusion, we will be trying MOXy in production. It provides the speed without blowing out our CPU. I couldn’t find anything else on the web that compared CPU performance. Most benchmarks I found compared speed.