Posted by mattc at Mar 16, 09 10:44 AM
Before bomber planes came into existence, WWI aircrews used to take a sack of bombs into their biplane cockpits and lob them over the side after flying into enemy territory [1]. I like practical action in the face of technical imperfection.
[1] According to a Timewatch DVD I just watched.
Posted by mattc at May 14, 08 10:45 AM
Jeff's rant about the ubiquity of XML got me thinking about the things I like about XML, so here are some notes...
1. Standard APIs for parsing: DOM, SAX, E4X, StAX etc. These are all attempts to provide standard (i.e. language-agnostic, by consensus) ways of reading and writing XML data.
I know JSON and YAML etc. all come with their own parsers, which let you munge the data into some internal data structure or object and then loop over it, extracting bits in whatever way you see fit. But the API-like approach of defining how you should access XML data makes more sense to me. I'd rather the developers I work with use standard ways of processing and accessing data than each do their own thing.
Having said that, DOM isn't perfect, so there are many libraries (like jQuery) that provide convenience access methods, and guess what? These differ per language. Argh!
2. Vocabularies. With XML I can define my own vocabulary and mix in parts of existing vocabularies. Ascribing meaning to the data you are working with forces you to think less about how the information should be structured and more about what it represents. I've found JSON and YAML over-literal in their representations of data, so you end up designing formats that look like data structures, which *is* great for many situations but loses something of the semantics.
3. Type checking. By defining vocabularies (either in RelaxNG or XML Schema) you will probably end up using XML Schema data types, making it easy to tightly define (and enforce the integrity of) the data you are working with. There are some really helpful built-in types (like ID and IDREFS) that solve specific problems when working with XML, as well as the usual date, duration and URI types.
4. Same-language schema. One useful byproduct of XML Schema (or RelaxNG) documents being written in the same terms as the data themselves (i.e. XML, a common complaint) is that they can very naturally become part of the transformation process (or, say, the unit testing process).
Say your schema includes an implicit attribute with a default value, and your XML source documents sometimes include it, sometimes not. The knowledge of this particular attribute's behaviour and properties can be written into the XML processing language without having to be overly specific about the details.
# if the attribute is missing from the source but is declared as optional in the schema,
# then go and fetch the schema value and output it.
IF not(foo/@bar) and doc('schema.xml')//element[@name = 'foo']/optional/attribute[@name = 'bar']
    PRINT doc('schema.xml')/...
END
I think JSON schemas (being valid JSON themselves) will probably benefit from the same approach.
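As a rough sketch of how that might look for JSON, assuming a JSON-Schema-style object where each property may carry a `default` keyword. The schema itself, the `applyDefaults` helper and the `bar`/`qux` property names are all made up for illustration:

```javascript
// Hypothetical schema: 'bar' is optional and declares a default value.
var schema = {
  properties: {
    bar: { type: 'string', 'default': 'baz' },
    qux: { type: 'number' }
  }
};

// Return a copy of doc with any missing properties filled in
// from the defaults declared in the schema.
function applyDefaults(doc, schema) {
  var has = Object.prototype.hasOwnProperty;
  var out = {};
  var key;
  // Copy the document's own properties first.
  for (key in doc) {
    if (has.call(doc, key)) {
      out[key] = doc[key];
    }
  }
  // Then add schema defaults for anything the document left out.
  for (key in schema.properties) {
    if (has.call(schema.properties, key)) {
      var rule = schema.properties[key];
      if (!has.call(out, key) && has.call(rule, 'default')) {
        out[key] = rule['default'];
      }
    }
  }
  return out;
}

var foo = applyDefaults({ qux: 1 }, schema); // foo.bar is filled in from the schema
```

The point is the same as with the XML version: the processing code never hard-codes the default, it just asks the schema.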
5. XPath 2.0. When using other formats I've never understood how to get the data I want from the JSON/YAML/CSV data structure other than by writing little subroutines with temporary data structures, loops etc. to extract, join, compare and transform the info into what I want. That's sometimes OK, but XPath (particularly XPath 2.0 used with Saxon 9) eliminates this problem for me by providing a hugely expressive set of statements for selecting, sorting and querying parts of the document, combined with some more mundane things like regexps and a whole variety of string functions.
I know a lot of people were put off using XML as a general container format by XSLT & XPath 1.0, but I've found version 2.0 feels so much more natural to author, without having to jump off to extension functions every other statement.
I'm not a complete XML zealot. The project I'm doing at the moment uses a mix of XML (for source data & communication from web services), JSON (for browser-loaded data) and CSV (for producers to edit); whatever fits, really.
Posted by mattc at Dec 13, 06 04:36 PM
I was just about to write a regular expression, when suddenly ...
I stumbled on the fact that if you feed a date formatted as RFC 822 (as commonly found in RSS 2.0) into a newly instantiated JavaScript Date object, it just handles it. No ifs or buts, it just works. I didn't expect that.
var foo = 'Fri, 04 Apr 2003 05:04:39 GMT';
var bar = new Date( foo );
var woo = bar.getFullYear(); // woo holds 2003 (getYear() returns 103 in most browsers)
How very helpful. This means I could combine some getElementsByTagName construct with Date to give me an array of feed items by date without too much fuss ...
var foo = new Array();
// where 'o' is the response from some xmlHTTP request
var rss = o.responseXML.getElementsByTagName("item");
for ( var j = 0; j < rss.length; j++ ) {
    foo.push( new Date( rss[j].getElementsByTagName("pubDate")[0].textContent ) );
}
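To go from that array to feed items ordered by date, something like the following might do. It's only a sketch: the `sortByDate` helper and the sample pubDate strings are made up, and it simply drops anything the Date constructor can't parse.

```javascript
// Parse each pubDate string and return the Date objects, newest first.
// Strings the Date constructor can't parse (NaN time value) are dropped.
function sortByDate(pubDates) {
  var dates = [];
  for (var i = 0; i < pubDates.length; i++) {
    var d = new Date(pubDates[i]);
    if (!isNaN(d.getTime())) {
      dates.push(d);
    }
  }
  // Compare on the millisecond time value, largest (newest) first.
  dates.sort(function (a, b) {
    return b.getTime() - a.getTime();
  });
  return dates;
}

var sorted = sortByDate([
  'Fri, 28 Mar 2003 05:18:59 -0800',
  'Fri, 04 Apr 2003 05:04:39 GMT',
  'not a date'
]);
// sorted[0] is the 4 April item, sorted[1] the 28 March one
```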
But wait. Simon and Mark point out that RSS has many dates and times.
So I wonder how JavaScript handles these?
// load each date type in to foo
var foo = new Array(
    '2003-03-21T16:28:40',
    '2003-04-03T07:45:57-08:00',
    'Fri, 04 Apr 2003 05:04:39 GMT',
    'Fri, 28 Mar 2003 05:18:59 -0800',
    '1049379042.0',
    '2003-03-21T16:28:40',
    '2003-01-17T13:03:00+00:00',
    '2003-03-27T19:41:49-06:00'
);
// iterate foo and write the year to the screen
for ( var i = 0; i < foo.length; i++ ) {
    var bar = new Date( foo[i] );
    // print output, attempt to call getFullYear
    document.write( foo[i] + " - " + bar.getFullYear() + "\n" );
}
In Opera 9, almost perfectly ...
2003-03-21T16:28:40 - 2003
2003-04-03T07:45:57-08:00 - 2003
Fri, 04 Apr 2003 05:04:39 GMT - 2003
Fri, 28 Mar 2003 05:18:59 -0800 - 2003
1049379042.0 - NaN // bah!
2003-03-21T16:28:40 - 2003
2003-01-17T13:03:00+00:00 - 2003
2003-03-27T19:41:49-06:00 - 2003
The only failure is the obscure '1049379042.0', which I assume is the number of seconds elapsed since midnight on 1 January 1970. I'm not sure who is using that in their pubDate fields!?
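If that value really is seconds since 1970 (an assumption on my part), it could be handled separately before falling back to the Date constructor. A minimal sketch, with `parseEpochSeconds` being a made-up helper name:

```javascript
// Treat a purely numeric string as seconds since 1970-01-01T00:00:00Z.
// Returns null if the string isn't numeric.
function parseEpochSeconds(s) {
  if (!/^\d+(\.\d+)?$/.test(s)) {
    return null;
  }
  // Date works in milliseconds, so scale the seconds up.
  return new Date(Math.round(parseFloat(s) * 1000));
}

var d = parseEpochSeconds('1049379042.0');
// d.getUTCFullYear() gives 2003
```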
IE 6, Firefox 1.5 & Safari 2.0.4 do much worse, only managing to parse and return valid Date objects from 2 out of the 7 dates.
2003-03-21T16:28:40 - NaN
2003-04-03T07:45:57-08:00 - NaN
Fri, 04 Apr 2003 05:04:39 GMT - 2003
Fri, 28 Mar 2003 05:18:59 -0800 - 2003
1049379042.0 - NaN
2003-03-21T16:28:40 - NaN
2003-01-17T13:03:00+00:00 - NaN
2003-03-27T19:41:49-06:00 - NaN
So, to recap: of the browsers tested, only Opera's Date object supports ISO 8601 date parsing upon construction.
I find the ECMA standard terse at the best of times, and it's unclear to me whether the Date constructor is meant to be doing this or not.
I also see MochiKit provides extensions for this sort of thing via its DateTime library.
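For browsers that return NaN here, the regular expression I was about to write might look something like this. It's a minimal sketch and not MochiKit's implementation: `parseIso8601` is a made-up name, it only covers the date-time shapes in the list above, and treating a string with no offset as UTC is an assumption.

```javascript
// Parse 'YYYY-MM-DDThh:mm:ss' with an optional 'Z' or '+hh:mm' / '-hh:mm' offset.
// Assumption: a string with no offset is treated as UTC.
// Returns null if the string doesn't match.
function parseIso8601(s) {
  var re = /^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(Z|([+-])(\d{2}):(\d{2}))?$/;
  var m = re.exec(s);
  if (!m) {
    return null;
  }
  // Interpret the date and time fields as if they were already UTC.
  var utc = Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]);
  var offsetMinutes = 0;
  if (m[8]) {
    offsetMinutes = (+m[9]) * 60 + (+m[10]);
    if (m[8] === '-') {
      offsetMinutes = -offsetMinutes;
    }
  }
  // Local time = UTC + offset, so subtract the offset to get back to UTC.
  return new Date(utc - offsetMinutes * 60000);
}

var parsed = parseIso8601('2003-04-03T07:45:57-08:00');
// parsed.getUTCFullYear() gives 2003
```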
Posted by mattc at Nov 26, 06 06:47 PM
Here are my notes from the very useful Testing Computer Software by Cem Kaner.
There are some quite concisely expressed profundities in the early chapters, my two favourites being ...
A great programmer is less likely than an incompetent tester. (chapter 2)
... and ...
The 'quality' of software is fundamentally measured in human terms. Therefore, in testing for bugs we are looking to determine the *degree* of usefulness of the system to the user. (chapter 4)