CommuterJoy » Logbook

Posted by mattc at May 14, 08 10:45 AM

Jeff's rant about the ubiquity of XML got me thinking about the things I like about XML, so here are some notes...

1. Standard APIs for parsing. DOM, SAX, E4X, StAX etc. These are all attempts to provide standard (i.e. language-agnostic, agreed by consensus) ways of reading and writing XML data.

I know JSON, YAML etc. all come with their own parsers, which let you munge the JSON/YAML into some internal data structure or object and then loop over it, extracting bits in whatever way you see fit. But the API-like approach of XML, which defines how you should access the data, makes more sense to me. I'd rather the developers I work with used standard ways of processing and accessing data than each did their own thing.

Having said that, DOM isn't perfect, so there are many libraries (like jQuery) that provide convenience access methods, and guess what? These differ per language. Argh!
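To make the "standard API" point concrete, here's a minimal sketch using the DOM binding that ships with Python (xml.dom.minidom). The sample document and the journey_modes name are made up for illustration; the two DOM calls are the standard ones.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this example.
XML = (
    "<journeys>"
    "<journey mode='train'>Home to work</journey>"
    "<journey mode='bike'>Work to home</journey>"
    "</journeys>"
)


def journey_modes(xml_text):
    """Return the mode attribute of every <journey> using DOM calls only."""
    doc = parseString(xml_text)
    # getElementsByTagName and getAttribute are defined by the W3C DOM,
    # so the same method names turn up in other languages' DOM bindings.
    return [node.getAttribute("mode") for node in doc.getElementsByTagName("journey")]


print(journey_modes(XML))
```

Someone who learnt DOM in JavaScript or Java should be able to read that function without knowing much Python, which is really the whole argument.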

2. Vocabularies. With XML I can define my own vocabulary and mix in parts of existing vocabularies. Ascribing meaning to the data you are working with forces you to think less about how the information should be structured and more about what it represents. I've found JSON and YAML over-literal in their representations of data, so you end up designing formats that look like data structures, which *is* great for many situations but loses something of the semantics.
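As a rough sketch of what mixing vocabularies looks like in practice, the document below borrows title and creator from Dublin Core (a real, existing vocabulary) and adds an element from a "trip" vocabulary whose namespace URI I've invented for the example. The describe function is also just an illustrative name.

```python
import xml.etree.ElementTree as ET

# The Dublin Core namespace is real; the "trip" namespace is hypothetical.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "trip": "http://example.org/ns/trip",
}

XML = """
<trip:journey xmlns:trip="http://example.org/ns/trip"
              xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Morning commute</dc:title>
  <dc:creator>mattc</dc:creator>
  <trip:mode>train</trip:mode>
</trip:journey>
"""


def describe(xml_text):
    """Pull out values by what they mean (their vocabulary), not their position."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("dc:title", namespaces=NS),
        "creator": root.findtext("dc:creator", namespaces=NS),
        "mode": root.findtext("trip:mode", namespaces=NS),
    }


print(describe(XML))
```

Any tool that already understands Dublin Core can pick out the title and creator without knowing anything about the rest of the format.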

3. Type checking. By defining vocabularies (either in RelaxNG or XML Schema) you will probably end up using XML Schema data types, making it easy to tightly define the data you are working with and enforce its integrity. There are some really helpful built-in types (like ID and IDREFS) that solve specific problems when working with XML, as well as the usual date, duration and URI types.
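To show what ID and IDREFS buy you, here's a hand-rolled version of the check. This is only a sketch: Python's standard library has no schema validator, so the code assumes (rather than reads from a schema) that station/@id is an ID and route/@stops is an IDREFS. A validating parser would do the equivalent for you from the type declarations. The timetable document and the dangling_refs name are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the last route refers to a station that doesn't exist.
XML = """
<timetable>
  <station id="s1"/>
  <station id="s2"/>
  <route stops="s1 s2"/>
  <route stops="s2 s9"/>
</timetable>
"""


def dangling_refs(xml_text):
    """Return every reference in route/@stops that matches no station/@id."""
    root = ET.fromstring(xml_text)
    # Assumption: @id plays the role of an ID-typed attribute.
    ids = {station.get("id") for station in root.iter("station")}
    missing = []
    for route in root.iter("route"):
        # An IDREFS value is a whitespace-separated list of IDs.
        for ref in route.get("stops", "").split():
            if ref not in ids:
                missing.append(ref)
    return missing


print(dangling_refs(XML))
```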

4. Same-language Schema. One useful byproduct of XML Schema (or RelaxNG) documents being written in the same terms as the data themselves (i.e. XML, a common complaint) is that they can very naturally become part of the transformation process (or, say, the unit testing process).

Say your Schema declares an optional attribute with a default value, and your XML source documents sometimes include it, sometimes not. The processing code can look up this attribute's behaviour and properties in the Schema itself rather than hard-coding the details.

# if the attribute is missing from the source and the schema declares it
# as optional, then go and fetch the schema value and output it.
IF not(foo/@bar) and doc('schema.xml')//element[@name = 'foo']/optional/attribute[@name = 'bar']
   PRINT doc('schema.xml')/...
END
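Here's the pseudocode above as a runnable sketch. It assumes a RelaxNG schema in XML syntax, with the default carried in the a:defaultValue annotation from the RelaxNG DTD compatibility namespace. The foo/bar names come from the pseudocode; the schema content, the value 42 and the bar_value function are invented for the example.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"
ANN = "http://relaxng.org/ns/compatibility/annotations/1.0"
NS = {"rng": RNG}

# Hypothetical schema: <foo> has an optional @bar that defaults to "42".
SCHEMA = f"""
<grammar xmlns="{RNG}" xmlns:a="{ANN}">
  <start>
    <element name="foo">
      <optional>
        <attribute name="bar" a:defaultValue="42"/>
      </optional>
      <text/>
    </element>
  </start>
</grammar>
"""


def bar_value(source_xml, schema_xml):
    """Return foo/@bar from the source, falling back to the schema default."""
    foo = ET.fromstring(source_xml)
    if foo.get("bar") is not None:
        return foo.get("bar")
    # The attribute is absent, so ask the schema what it should be.
    schema = ET.fromstring(schema_xml)
    attr = schema.find(
        ".//rng:element[@name='foo']/rng:optional/rng:attribute[@name='bar']", NS
    )
    if attr is None:
        return None
    return attr.get(f"{{{ANN}}}defaultValue")


print(bar_value("<foo/>", SCHEMA))
print(bar_value('<foo bar="7"/>', SCHEMA))
```

The point is that the default lives in one place. Change it in the schema and the processing code follows without being touched.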

I think JSON schemas (being valid JSON themselves) will probably benefit from the same approach.

5. XPath 2.0. With other formats I've never understood how to get the data I want out of a JSON/YAML/CSV data structure other than by writing little subroutines with temporary data structures, loops etc. to extract, join, compare and transform the info into what I want. That's sometimes OK, but XPath (particularly XPath 2.0 used with Saxon 9) eliminates this problem for me by providing a hugely expressive set of statements for selecting, sorting and querying parts of the document, combined with more mundane things like regexps and a whole variety of string functions.
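A small side-by-side of the two styles, as a sketch only. Python's ElementTree supports just a limited subset of XPath, nowhere near XPath 2.0 (no sorting, regexps or string functions), so this only shows the selection part; the rest needs a full processor such as Saxon. The journeys document, place names and function names are all made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, invented for this example.
XML = """
<journeys>
  <journey mode="train"><from>Leeds</from><mins>35</mins></journey>
  <journey mode="bike"><from>Otley</from><mins>50</mins></journey>
  <journey mode="train"><from>York</from><mins>25</mins></journey>
</journeys>
"""


def train_origins_loop(root):
    """The 'little subroutine' style: nested loops and a temporary list."""
    origins = []
    for journey in root:
        if journey.tag == "journey" and journey.get("mode") == "train":
            for child in journey:
                if child.tag == "from":
                    origins.append(child.text)
    return origins


def train_origins_path(root):
    """The path style: one expression says what to select."""
    return [el.text for el in root.findall("./journey[@mode='train']/from")]


root = ET.fromstring(XML)
print(train_origins_loop(root))
print(train_origins_path(root))
```

Both functions return the same list, but the second one reads like a description of the data I want rather than instructions for finding it.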

I know a lot of people were put off using XML as a general container format by XSLT & XPath 1.0, but I find version 2.0 feels so much more natural to author, without having to jump off to extension functions every other statement.

I'm not a complete XML zealot. The project I'm doing at the moment uses a mix of XML (for source data & communication from web services), JSON (for browser-loaded data) and CSV (for producers to edit); whatever fits, really.
