CommuterJoy » Logbook


Posted by mattc at May 31, 08 05:05 AM

Posted by mattc at May 30, 08 05:05 AM

Posted by mattc at May 28, 08 01:13 PM ... Comments (0)

If you want to make a tarball but have a bunch of .svn directories cluttering up the place, this will make the tar.gz while excluding hidden files and directories:

tar -c --exclude '.*' -f - ~root/ | gzip > /tmp/foo.tar.gz
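
For comparison, here's a rough sketch of the same idea using Python's standard tarfile module. The names skip_hidden and make_tarball are made up for illustration, and the filter is only an approximation of tar's '.*' pattern: it drops any entry with a path component that starts with a dot.

```python
import os
import tarfile


def skip_hidden(tarinfo):
    """Return None (meaning 'leave it out') for any hidden path component."""
    parts = tarinfo.name.split("/")
    if any(part.startswith(".") and part not in (".", "..") for part in parts):
        return None
    return tarinfo


def make_tarball(source_dir, out_path):
    """Write source_dir to a gzipped tarball, minus hidden files and directories."""
    arcname = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(out_path, "w:gz") as tar:
        # a directory rejected by the filter is not descended into
        tar.add(source_dir, arcname=arcname, filter=skip_hidden)
```

Calling make_tarball("/path/to/project", "/tmp/foo.tar.gz") would then be the rough equivalent of the one-liner above.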

Posted by mattc at May 28, 08 05:05 AM

Posted by mattc at May 27, 08 01:00 PM ... Comments (0)

There's something quite pleasing about the pace of this short film, like watching synchronised swimmers.


(HD) A More Perfect Union from Andrew Sloat on Vimeo.

Posted by mattc at May 27, 08 05:05 AM

Posted by mattc at May 25, 08 05:05 AM

Posted by mattc at May 23, 08 05:05 AM

Posted by mattc at May 22, 08 05:05 AM

Posted by mattc at May 21, 08 09:37 AM ... Comments (0)

I feel a bit dumb for only having just discovered the patch command.

My local file system is full of copies of bits of code that have gradually morphed from what I set out to do to what I ended up with, a path strewn with fruitless (but brave, nonetheless!) diversions. Using Subversion does solve the problem of managing ever-changing files over time, but it's pretty bothersome to keep jumping around the revision history in your local working copy, or to attempt to compare and run multiple versions of the same file at the same time. Even remembering which revision does what is a bit of a struggle unless you've got a good commit message convention.

I think patch can make this easier by allowing you to store your experiments as a little library of diff snippets against the main trunk code.

Eg.

Let's say you have a JavaScript file called 'original',

-- original --
// returns a charArray of a string
String.prototype.toCharArray = function(){
        var a = this.split("");
        return a;
}

If you copied the above file and added an experiment to it you might end up with this,

-- new --
// returns a charArray of a string
String.prototype.toCharArray = function(){
        if ( this.length == 0 ) // don't want empty arrays
                return false;
        var a = this.split("");
        return a;
}

You can now diff the two files and store the result as a patch file ...

diff original new > lengthcheck.patch

... the contents of which looks something like this,

-- lengthcheck.patch --
2a3,4
>       if ( this.length == 0 ) // don't want empty arrays
>               return false;

At some later date you can patch your code with the following command.

patch -b original lengthcheck.patch

The -b switch makes a backup of the original file. Patch will prompt you if it finds a problem and store any rejected hunks in a separate file (with a .rej suffix) for you to inspect.

The idea being that in the course of, say, a 2 day hacking session, you can keep the core code in your SVN trunk directory while the deviations, for better or worse, can be stored in a library of diffs that you can periodically merge in and out of your mainline development.
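
If you'd rather build the library of diffs from a script, Python's standard difflib module can produce the patch text. This is only a sketch: make_patch is a name I've made up, and it emits a unified diff rather than the "normal" format shown above (patch generally accepts both).

```python
import difflib

original = '''// returns a charArray of a string
String.prototype.toCharArray = function(){
        var a = this.split("");
        return a;
}
'''

new = '''// returns a charArray of a string
String.prototype.toCharArray = function(){
        if ( this.length == 0 ) // don't want empty arrays
                return false;
        var a = this.split("");
        return a;
}
'''


def make_patch(old_text, new_text, old_name="original", new_name="new"):
    """Return a unified diff between two versions of a file as one string."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(diff)


patch_text = make_patch(original, new)
print(patch_text)
```

Writing patch_text out to lengthcheck.patch gives you something to feed to the patch command later.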


Posted by mattc at May 21, 08 05:05 AM

Posted by mattc at May 20, 08 05:05 AM

Posted by mattc at May 19, 08 06:52 AM ... Comments (0)





Posted by mattc at May 19, 08 05:05 AM

Posted by mattc at May 18, 08 05:05 AM

Posted by mattc at May 17, 08 05:05 AM

Posted by mattc at May 16, 08 05:05 AM

Posted by mattc at May 15, 08 12:48 PM ... Comments (0)

I think I might learn how to do things in Python, sysadmin tasks, mini web apps and the like.

I've come to know a healthy amount about Perl in the past few years, mainly due to it being the only language officially supported at work, but it has some things I've not really got on with.

Its error handling is a bit rubbish if you are used to the try/catch/throw style of some other languages. Errors in Perl are mostly handled by adding conditionals around (or on the end of) a bunch of statements.

# if there's a problem opening 'foo.txt' then exit with the error
open( my $fh, '<', 'foo.txt' ) or die $!;
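
For contrast, this is roughly how I'd expect the same thing to look in the try/catch style, sketched in Python. The function name read_first_line is made up for the example.

```python
def read_first_line(path):
    """Return the first line of a file, or None if it can't be opened."""
    try:
        with open(path) as handle:
            return handle.readline()
    except OSError as err:
        # err describes what went wrong, much as $! does in Perl
        print(f"could not open {path}: {err}")
        return None
```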

The OO stuff in Perl feels a bit contrived and it's easy to cheat or pick up bad habits. Some of Perl's basic functions remain resolute in their non-OOness...

# adding an item to an array, passing the array as an argument to the push function
my @foo = ("a", "b", "c");
push(@foo, "d");

The feeling of tacked-on OO also manifests itself in calling a class's methods. You have to remember to pluck the object out from the implied first argument before operating on it. Normally you'd expect to be able to use a 'this'-like reference without having to manage this sort of low-level stuff yourself.

# if foo was a method of some class, $self would hold a reference to the calling object.
sub foo{
 my ($self) = @_;   # list assignment; 'my $self = @_;' would store the argument count
}
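
From what I can tell, Python sits somewhere in between: the object reference is still declared as the first parameter of each method, but the language fills it in for you on the call, and things like adding to a list are methods on the list itself. A small sketch (the Greeter class is invented purely for illustration):

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    # 'self' is declared explicitly, but Python passes it in on the call
    def greet(self):
        return "hello " + self.name


foo = ["a", "b", "c"]
foo.append("d")  # append is a method of the list, not a free function

g = Greeter("world")
print(g.greet())
```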

Perhaps the main reason I don't want to keep using Perl is that it doesn't seem to have introduced much of interest to the language in the 4 or 5 years I've known it. Most other languages I know have had pretty significant upgrades and improvements in that time (XSLT, JavaScript ...). In that time Perl has had a few minor version number bumps, but I can't see anything to motivate a casual user like myself to upgrade, so I just stick with whatever is on the box I'm using.

Maybe Python won't do these things any better, but I won't know until I try.

Updated

I forgot one other thing. Because I don't write Perl every day I find it a real struggle to remember the specifics of the often dense syntax. For example, to get the length of an array you need to remember to evaluate it in scalar context (and not to confuse that with the $# convention, which gives the last index). You eventually remember after the first few times, but something like '[array].length' would be more obvious. There are lots of little tics like this, such as $_ (implied variable), $! (error message) and @_ (arguments to a subroutine), that you don't use so often as a casual developer and have to scout around to jog your memory...

# assign the length of array 'foo' to $len ($#foo would give the last index instead)
my $len = scalar @foo;
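
The Python spelling of the same thing, as far as I can tell, is a single built-in function. In Perl, $#foo is the last index and scalar(@foo) is the item count; the sketch below shows both values.

```python
foo = ["a", "b", "c"]

length = len(foo)           # number of items in the list
last_index = len(foo) - 1   # the value Perl's $#foo would report

print(length, last_index)
```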

Posted by mattc at May 15, 08 05:05 AM

Posted by mattc at May 14, 08 10:45 AM ... Comments (0)

Jeff's rant about the ubiquity of XML got me thinking about the things I like about XML, so here are some notes...

1. Standard APIs for parsing. DOM, SAX, E4X, StAX etc. These are all attempts to provide standard (i.e. language-agnostic, by consensus) ways of reading/writing the XML data.

I know JSON and YAML etc. all come with their own parsers, from which you can munge the YAML/JSON into some internal data structure or object and set about looping over it and extracting bits in whatever way you see fit. But the API-like approach of defining how you should access XML data makes more sense to me. I'd rather the developers I work with use standard ways of processing and accessing data than each doing their own thing.

Having said that, DOM isn't perfect, so there are many libraries (like jQuery) that provide convenience access methods, and guess what? These differ per language. Argh!

2. Vocabularies. With XML I can define my own vocabulary and mix in parts of existing vocabularies. Ascribing meaning to the data you are working with forces you to think less about how the information should be structured and more about what it represents. I've found JSON and YAML over-literal in their representations of data, so you end up designing formats that look like data structures, which *is* great for many situations but loses something of the semantics.

3. Type checking. By defining vocabularies (either in RelaxNG or XML Schema) you will probably end up using XML Schema data types, making it easy to tightly define the data you are working with and enforce its integrity. There are some really helpful built-in types (like ID and IDREFS) that solve specific problems when working with XML, as well as the usual date, duration and URI types.

4. Same-language Schema. One useful byproduct of XML Schema (or RelaxNG) documents being defined in the same terms as the data themselves (i.e. XML, a common complaint) is that they can very naturally become part of the transformation process (or, say, unit testing process).

Say your Schema includes an implicit attribute with a default value and your XML source documents sometimes include it, sometimes not. The knowledge of this particular attribute's behaviour and properties can be written in to the XML processing language without having to be overly specific about the details.

# if the attribute doesn't exist in the source but is declared as optional in the schema,
# then go and fetch the schema value and output it.
IF not(foo/@bar) and doc('schema.xml')//element[@name = 'foo']/optional/attribute[@name = 'bar']
   PRINT doc('schema.xml')/...
END

I think JSON schemas (being valid JSON themselves) will probably benefit from the same approach.
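
To make the idea a bit more concrete, here's a small runnable sketch using Python's standard ElementTree rather than XSLT. Everything in it is made up for illustration: the tiny XML Schema, the foo element, its bar attribute and the two helper functions. It reads the schema as ordinary XML and falls back to the declared default when the source document leaves the attribute out.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# hypothetical schema: element 'foo' has an attribute 'bar' defaulting to 'baz'
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="foo">
    <xs:complexType>
      <xs:attribute name="bar" type="xs:string" default="baz"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def schema_default(schema_root, element_name, attribute_name):
    """Look up the default value the schema declares for an attribute, if any."""
    for element in schema_root.iter(XS + "element"):
        if element.get("name") != element_name:
            continue
        for attribute in element.iter(XS + "attribute"):
            if attribute.get("name") == attribute_name:
                return attribute.get("default")
    return None


def attribute_or_default(node, attribute_name, schema_root):
    """Use the document's own value when present, else the schema default."""
    value = node.get(attribute_name)
    if value is None:
        value = schema_default(schema_root, node.tag, attribute_name)
    return value


schema_root = ET.fromstring(SCHEMA)
with_bar = ET.fromstring('<foo bar="explicit"/>')
without_bar = ET.fromstring("<foo/>")

print(attribute_or_default(with_bar, "bar", schema_root))
print(attribute_or_default(without_bar, "bar", schema_root))
```

The point is that the processing code never hard-codes the default; it just asks the schema.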

5. XPath 2.0. When using other formats I've never understood how to get the data I want from the JSON/YAML/CSV data structure other than by writing little subroutines with temporary data structures, loops etc. to extract, join, compare and transform the info into what I want. That's sometimes OK, but XPath (particularly XPath 2.0 used with Saxon 9) eliminates this problem for me by providing a hugely expressive set of statements for selecting, sorting and querying parts of the document, combined with some more mundane things like regexps and a whole variety of string functions.

I know a lot of people were put off using XML as a general container format by XSLT & XPath 1.0, but I've found version 2.0 feels much more natural to author, without having to jump off to extension functions every other statement.
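
A small illustration of the loops-versus-path-expression difference, sketched with Python's standard ElementTree. Note that ElementTree only implements a limited subset of XPath 1.0, nothing like what Saxon offers with 2.0, and the library/book document is invented for the example.

```python
import xml.etree.ElementTree as ET

# made-up sample data
DOC = """
<library>
  <book year="2007" lang="en"><title>Alpha</title></book>
  <book year="2008" lang="fr"><title>Beta</title></book>
  <book year="2008" lang="en"><title>Gamma</title></book>
</library>
"""

root = ET.fromstring(DOC)

# hand-rolled: loop, test, collect into a temporary list
titles_loop = []
for book in root:
    if book.tag == "book" and book.get("year") == "2008":
        title = book.find("title")
        if title is not None:
            titles_loop.append(title.text)

# path expression: the selection logic lives in one string
titles_path = [t.text for t in root.findall("./book[@year='2008']/title")]

print(titles_loop)
print(titles_path)
```

Both lists hold the same titles; the difference is how much plumbing you had to write to get them.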

I'm not a complete XML zealot. The project I'm doing at the moment uses a variety of XML (for source data & communication from web services), JSON (for browser loaded data) and CSV (for producers to edit), whatever fits really.

Posted by mattc at May 14, 08 05:05 AM

Posted by mattc at May 13, 08 05:05 AM

Posted by mattc at May 11, 08 05:05 AM

Posted by mattc at May 10, 08 05:05 AM

Posted by mattc at May 8, 08 05:05 AM

Posted by mattc at May 2, 08 05:05 AM

Posted by mattc at May 1, 08 03:36 PM ... Comments (0)

"Cannot write an implicit result document if an explicit result document has been written to the same URI: file:/path/to/my/file.xml" at net.sf.saxon.Controller.checkImplicitResultTree

Odd error of the week. Ant (or Saxon) seems to run over everything in the basedir twice, and on the second pass produces the above error message. I'm using the following to run the transforms over a directory of XML documents:

<xslt basedir="${project.trunk}/xml/"
 destdir="${project.home}www/"
 extension=".xml"
 style="${project.home}/xsl/foo.xsl"
 classpath="${ant.lib}/saxon9.jar;${ant.lib}/ant-trax.jar"
 processor="trax"
 force="true"
 >
 <param name="foo" expression="hello world"/>
</xslt>

I fixed it by adding an include directive (xslt supports implicit filesets) as a child of the xslt task.

<include name="**/*.xml"/>
