This is My Data, This is My Code. Pt. 1.
A wonderful feature of Lisp, or Scheme, is the way that the code is written as a data structure… or is it that data structures are written as code? Either way, there is not only the opportunity for fancy procedural generation and modification of program code via data structure manipulation, there is also an attendant sense of peace in the grammar of the language. I have been playing with a Ruby version of this concept on-and-off recently, yielding quite hack-ish results of dubious utility, but this idea is also relevant to last week’s discussion of JSON vs. XML.
Dare made a series of posts on the topic that everyone interested should probably read since they, and the comments they generated, cover so much of the relevant territory, but I’d like to focus on this post in particular since I think it provides the best overview of the issue.
Dare links to a post by David Megginson that compares the data exchange syntax of JSON, XML, and Lisp, but dismisses it as being irrelevant to why some people are choosing JSON over XML. I think he’s right, but I also think the syntax comparison is helpful. The key insight into the syntax is that JSON is indeed simpler to both write and visually parse for simple data structures, but that XML’s explicitness, particularly with close tags, is actually helpful when composing documents. Trying to write Lisp code without an editor that helps match up closing parentheses is an exercise in needless frustration, and I’ve seen countless C-style programs that include comments next to each closing brace to indicate the block that they are closing. Such helpers indicate a possible deficiency of clarity in the syntax of the language, and XML addresses this by naming close tags. The conclusion of the post is that all three languages are equally expressive in terms of functionality, and that syntax differences shouldn’t be a significant issue in deciding which to use for data exchange.
I think the comparison of the syntaxes is actually fairly interesting, but let’s return to the issue of JSON’s popularity. With the reach of UTF-8 and UTF-16, I don’t think string encoding issues are paramount in many people’s minds (both JSON and XML can happily be used on an internal project without thinking about this at all). So-called on-demand usage (i.e. hacking the DOM to get around browser security) is almost a wash, since it has nothing to do with the data exchange syntax. The issue really seems to boil down to parsing and usage. For many people, JSON parsing consists of var data = eval('(' + myJSON + ')'); while XML parsing is automatic as long as one doesn’t mind DOM as the usage mechanism. That’s really it. If you’re using a library that parses XML and returns a JavaScript object, do you care what the wire format was?
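To make the eval route concrete, here is a minimal sketch. The payload and its field names are made up for illustration, and the wrapping parentheses are there because a bare leading brace would otherwise be read as a block statement rather than an object literal.

```javascript
// Hypothetical payload; the field names are invented for this example.
var myJSON = '{"title": "This is My Data", "part": 1, "tags": ["json", "xml"]}';

// Parentheses force the text to be evaluated as an expression,
// so the braces are read as an object literal.
var data = eval("(" + myJSON + ")");

// The result is an ordinary JavaScript object: plain property access,
// no parsing API to learn.
var summary = data.title + " (part " + data.part + "), first tag: " + data.tags[0];
console.log(summary);
```

The appeal is that there is nothing between the wire format and the usable object, which is exactly the convenience (and the risk) under discussion.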
If we agree that evaluating executable code from a third party is a risky activity, then, even with JSON, we are forced to use a parsing library. At that point, what have we gained by using JSON over XML?
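A small sketch of that risk, using the JSON.parse function built into modern JavaScript engines as a stand-in for "a parsing library" (an assumption on my part; any strict JSON parser would behave similarly). The hostile string and the sideEffects array are hypothetical, there only to show that eval runs embedded code while a real parser refuses it.

```javascript
// Records anything the "data" manages to do while being evaluated.
var sideEffects = [];

// A hostile response: valid JavaScript, but not valid JSON.
var hostile = '(sideEffects.push("code ran"), {"user": "mallory"})';

// eval executes whatever it is handed, so the embedded call runs.
var viaEval = eval("(" + hostile + ")");

// A strict parser accepts only the JSON grammar and throws otherwise.
var rejected = false;
try {
  JSON.parse(hostile);
} catch (e) {
  rejected = e instanceof SyntaxError;
}

// Well-formed JSON still parses normally.
var safe = JSON.parse('{"user": "alice"}');

console.log(sideEffects.length, rejected, safe.user);
```

Once the parser is in the picture, the one-line eval advantage is gone, and what remains is the shape of the object you get back.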
XML arguably has a lot of design warts: its sometimes confusing way of handling namespaces, potential string encoding complexity, and prototypical verbosity. However, one can, perhaps improperly, use XML in a way that avoids these issues. But the DOM APIs remain, glowering down on the lowly programmer who is trying to send a data structure that maps to 3 lines of text but must be accessed through DOM API calls that nobody unfamiliar with DOM could guess. I think that is really the entire issue here.
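For a rough sense of the difference in usage, here is the same small record read both ways. The payload, element names, and helper function names are all hypothetical. DOMParser is a browser API, so the sketch skips the XML half when it is not available.

```javascript
// Hypothetical payload, shown in both wire formats.
var xmlText = "<post><title>My Data</title><part>1</part></post>";
var jsonText = '{"title": "My Data", "part": 1}';

// Through the DOM: find the element, step down to its text node,
// then read that node's value.
function titleFromDom(doc) {
  return doc.getElementsByTagName("title")[0].firstChild.nodeValue;
}

// Through a parsed object: a property access.
function titleFromObject(obj) {
  return obj.title;
}

var fromJson = titleFromObject(JSON.parse(jsonText));

// Only attempt the XML half where a DOM parser exists (i.e. in a browser).
var fromXml = typeof DOMParser !== "undefined"
  ? titleFromDom(new DOMParser().parseFromString(xmlText, "text/xml"))
  : null;

console.log(fromJson, fromXml);
```

Neither function is long, but only one of them could be guessed by someone who has never seen the API.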
Footnote: E4X may revive web developer enthusiasm around XML due to its support in ActionScript (and thereby Tamarin, thereby Firefox). The holdup here, if there is one, will be caused by IE’s lack of support for E4X.