Why does parsing XML in Java need so much code?

I attended JAX London this week, including a workshop on Monday during which we had to write a Java application. As part of this I was called upon to load an XML file into the domain model. Sounds easy...

It's been a while since I have done any XML reading, with the exception of some XPath in Apache Camel routes, and I had forgotten how annoying the Java XML readers are. After a quick Google search the usual suspects presented themselves:

  • SAX
  • DOM
  • XML Beans
There was also a selection of other libraries, all of which were complete overkill for the small bit of XML I had to parse. I looked at some example snippets entitled "Simple examples". Do you really need that much code (most of it boilerplate) just to parse a bit of structured text?
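
To make the complaint concrete, here is a rough sketch of what a "simple" DOM read looks like using only the JDK's built-in parser. The class name, the <name> element and the sample XML are made up for illustration; they are not from the workshop.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DomBoilerplateSketch {

    // Returns the trimmed text of every <name> element in the document.
    // The element name is a hypothetical example, not a real schema.
    public static List<String> readNames(String xml) throws Exception {
        // Three objects are needed before any XML has been read.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // NodeList is not Iterable, so an index loop and a cast are required.
        NodeList nodes = document.getElementsByTagName("name");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            names.add(element.getTextContent().trim());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<customers><customer><name>Ada</name></customer>"
                + "<customer><name>Grace</name></customer></customers>";
        System.out.println(readNames(xml)); // prints [Ada, Grace]
    }
}
```

That is around ten imports and a factory chain to pull two strings out of a tiny document, and it still loads the whole tree into memory.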

I now remember why most of the codebases I have encountered include an "xml-library" or "xml-utilities" module that wraps the other XML libraries and provides a higher-level "simple" API.

I was also inspired by several of the presenters at JAX to participate in the open source community, so I have decided to start uploading all the projects currently trapped on my laptop to GitHub. As a learning exercise I have started a new project, "xml-mapper", on GitHub to investigate alternative ways to address the issue. It might not go anywhere, who knows... My idea is to cross the simple XPath syntax with the stream processing of SAX. The intention is not to provide complete support for all XML features, just those that most engineers encounter on a day-to-day basis. Let me know if you want to help out.
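
As a minimal sketch of the "XPath syntax over SAX" idea (not the actual xml-mapper API, which is not shown here), a SAX handler can keep a stack of open element names and collect the text of any element whose full path matches a simple slash-separated expression. The class and method names are hypothetical, and only plain absolute paths such as /customers/customer/name are assumed, with no predicates, wildcards or namespaces.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PathSaxSketch {

    // Streams through the XML once and returns the trimmed text of every
    // element whose absolute path equals the given path.
    // Intended for leaf elements; mixed content is not handled properly.
    public static List<String> extract(String xml, final String path) throws Exception {
        final List<String> results = new ArrayList<>();
        final Deque<String> openElements = new ArrayDeque<>();
        final StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            // Builds "/root/child/..." from the stack, outermost element first.
            private String currentPath() {
                StringBuilder sb = new StringBuilder();
                Iterator<String> it = openElements.descendingIterator();
                while (it.hasNext()) {
                    sb.append('/').append(it.next());
                }
                return sb.toString();
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                openElements.push(qName);
                text.setLength(0); // start collecting text for this element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver text in several chunks, so accumulate.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (currentPath().equals(path)) {
                    results.add(text.toString().trim());
                }
                openElements.pop();
                text.setLength(0);
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return results;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<customers><customer><name>Ada</name></customer>"
                + "<customer><name>Grace</name></customer></customers>";
        System.out.println(extract(xml, "/customers/customer/name")); // prints [Ada, Grace]
    }
}
```

The caller only writes the path expression, while the handler hides the SAX callbacks, and the document is never held in memory as a tree.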

