Meetups + JavaScript grammar
Tuesday February 26th 2008, 5:57 am
Filed under: Hack/mash/DIY, JavaScript, Yours Truly

It’s amazing what gets accomplished when in the vicinity of multiple competent programmers. Again, the basic tenet of agile programming and open source: the more eyeballs are focused on a problem, the clearer it becomes.

So I knew I had to keep the date for tonight’s JavaScript meetup. And I wasn’t disappointed either. Some of the best talent in the area, including John Resig of Mozilla, was in attendance. We talked about a little bit of everything, had live demos, made in-jokes. And although I didn’t come with any questions in mind, already someone’s managed to challenge one of the major assumptions upon which I base judgments of the feasibility of various tasks in JavaScript: that you can’t have custom parser objects.

If you’d told me it was impossible to support some other unimplemented feature in JavaScript–say, classes with public and private variables–I would have scoffed, of course. Then, with the naive, persistence-conquers-all smugness of a freshly-graduated computer scientist I’d have told you that the necessary groundwork was in place for you to write your own. Douglas Crockford says so. Nyah-nyah-nyah-nyah-nyah-nyah (thumbs nose).

Of course I’d be omitting that it’s kind of friggin hard, because it requires you to wrestle with basic lambda calculus; and that even once you conquer the basic problems, you run into the fact that with no low-level implementation, all the classing systems people dream up are going to work completely differently, resulting in interoperability nightmares to no end.

Nevertheless I would be correct. The capacity of the JavaScript language to emulate programming constructs from other languages is an exemplary case of the CS concept of Turing-completeness, which in layman’s terms says that, once you’ve got the whole kit’n’caboodle, honey you’ve got it.

This applies equally to writing parser functions in JavaScript. A parser is the software implementation of a grammar for some computer-readable text language. Web browsers are, of necessity, loaded with parsers–they need them in order to understand the various web standard languages. What they lack is a public implementation of a parser generator, the yet-stranger object that allows programmers to roll their own. Parser generators are the stuff of compilers, not web browsers.

And given that Microsoft and Mozilla employees put a lot of effort into streamlining their own parser functions to run as fast as possible, they have ample reason to encourage us to make the best possible use of what they provide to us–JavaScript eval, regular expressions, and XML/SGML Document Object Model. But when your needs extend beyond these, the above still applies: because it’s a Turing-complete language, you can roll your own parsers and even parser generators.

I was previously distracted from this glaringly obvious fact by the limitations of the next-best-thing. Regular expressions are a slick mechanism for generating lexers, which are slightly dumber than parsers but good enough to handle the bulk of text processing in languages like Perl and JavaScript. Tasks that they cannot handle are generally avoided like the plague for simplicity’s sake. But their limitations start to become maddening when you are trying to:

  1. Identify and introspect blocks of JavaScript or CSS source code
  2. Consistently handle HTML and XML entities in text
  3. Comprehend any language for which the browser does not implement a grammar
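
To make the first item concrete, here is a minimal sketch of where a lone regular expression gives out on nested blocks. The helper name matchBlock is made up for illustration, and it naively ignores braces that sit inside strings or comments, which a real tool could not get away with.

```javascript
// Naive attempt: one regular expression that grabs "a block".
// [^}]* stops at the FIRST closing brace, so nesting breaks it.
var source = "function f() { if (x) { y(); } return 1; }";
var naive = source.match(/\{[^}]*\}/)[0];
// naive is "{ if (x) { y(); }" : the outer block is cut off early.

// Hypothetical helper: return the balanced block that opens at `start`
// by counting brace depth, something a plain regex cannot do.
function matchBlock(text, start) {
  var depth = 0;
  for (var i = start; i < text.length; i++) {
    var ch = text.charAt(i);
    if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null; // unbalanced input
}

var block = matchBlock(source, source.indexOf("{"));
// block is "{ if (x) { y(); } return 1; }"
```

The counter is the whole trick: it is a tiny bit of memory that a pure pattern match doesn’t have, and it is the first step from lexer toward parser.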

I learned this the hard way when I tried to implement bastardized Java classes in JavaScript. Crockford and others have demonstrated the coding style that makes public and private variables possible, but it’s difficult to master and not very human-readable. My solution was simply to convert between the two forms at runtime*. But it turns out to be a task one notch too hard for any lexer to handle, and requires at least a basic grammar.

What’s missing from this picture? Well, as every CS major knows, there is another class of language processor, to which lexers and grammars are both inferior. That is the Turing Machine–Turing as in Alan Turing, and also as in Turing-completeness–the standard theoretical definition of a computer or software environment. Because JavaScript has that level of power, it’s a foregone conclusion that it can not only create parsers, it can also create a parser generator, the component that lets programmers specify their grammar in some formal syntax like BNF, drastically expediting the process of code generation.
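
For the skeptical, here is a rough sketch of a hand-rolled parser in plain JavaScript: a recursive-descent evaluator for arithmetic, with the grammar written BNF-style in the comments. Everything in it (the function name parseExpression, the grammar itself) is made up for illustration; it is the kind of code a parser generator would write for you, not a parser generator itself.

```javascript
// Grammar, roughly in BNF:
//   expr   ::= term   (("+" | "-") term)*
//   term   ::= factor (("*" | "/") factor)*
//   factor ::= NUMBER | "(" expr ")"
function parseExpression(input) {
  // Lexer: a regular expression is enough to split the text into tokens.
  // Note that it silently skips any character it does not recognize.
  var tokens = input.match(/\d+(?:\.\d+)?|[-+*\/()]/g) || [];
  var pos = 0;

  function peek() { return tokens[pos]; }
  function next() { return tokens[pos++]; }

  function expr() {
    var value = term();
    while (peek() === "+" || peek() === "-") {
      var op = next();
      var rhs = term();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  function term() {
    var value = factor();
    while (peek() === "*" || peek() === "/") {
      var op = next();
      var rhs = factor();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }

  function factor() {
    var tok = next();
    if (tok === "(") {
      var inner = expr(); // recursion handles arbitrary nesting
      if (next() !== ")") throw new SyntaxError("expected )");
      return inner;
    }
    if (tok === undefined || !/^\d/.test(tok)) {
      throw new SyntaxError("unexpected token: " + tok);
    }
    return parseFloat(tok);
  }

  var result = expr();
  if (pos < tokens.length) throw new SyntaxError("unexpected token: " + peek());
  return result;
}

parseExpression("2 * (3 + 4) - 5"); // 9
```

The regular expression still does the lexing; the three mutually recursive functions are the grammar. Each function corresponds to one rule, which is why a formal notation like BNF can be turned into code mechanically.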

As to the all-important question of whether anybody’s bothered to do so, I’m currently investigating. It appears that possibly one and a half implementations exist, of unknown quality and specificity. I do know that it would not be a particularly pleasant thing to implement; but after a little thought I’m also convinced that it not only can be done, it can be done with decent performance and (for limited cases like mine) decently unobtrusive notation. That it should be done, if only for the likes of me and John Resig.

UPDATE 02/26/08: with a bit more searching, I found a pretty solid library called JS/CC (JavaScript Compiler Compiler) that’s based directly on the Lex/YACC toolchain popularized by Unix and GNU. Its lineage implies much of the parser spec for a given programming language (e.g. Java or JavaScript) could be copied, almost verbatim, from an existing compiler or interpreter (e.g. Rhino). It remains to be seen whether its limitations are deal-breakers: creating the parser at runtime is unreasonably slow; storing it compiled makes for unreasonably large and opaque JavaScript code. I imagine it’s for this reason that the other implementations I saw weren’t Lex/YACC workalikes; other kinds of parser generator may be more effective at leveraging the JS interpreter.

*You do this by extracting and modifying the individual declarations based on Java-esque keywords, then encapsulating it all in a constructor function. Thus, the declaration $public x = 0 becomes this.x = 0, while $private x = 0 becomes var x = 0. The former is an instance property; the latter is a constructor-scoped variable, and hence can only be seen by internal class logic.
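
A toy version of that rewrite might look like the following. This is only a sketch of the idea, not my actual converter: the helper name compileClass and the two regular expressions are invented for the example, and they only cope with flat, one-per-line declarations, which is exactly the limitation described above.

```javascript
// Hypothetical helper: rewrite $public/$private declarations, then wrap the
// result in a constructor function. Regex-only, so it would not survive
// nested blocks or keywords that appear inside strings.
function compileClass(body) {
  var rewritten = body
    .replace(/\$public\s+(\w+)/g, "this.$1")   // instance property
    .replace(/\$private\s+(\w+)/g, "var $1");  // constructor-scoped variable
  return new Function(rewritten);
}

var Counter = compileClass(
  "$private count = 0;\n" +
  "$public step = 1;\n" +
  "$public increment = function () { count += this.step; return count; };"
);

var c = new Counter();
c.increment(); // 1: the method closes over the hidden count
c.count;       // undefined: the var never leaves the constructor
c.step;        // 1: an ordinary instance property
```

Each call to new Counter() runs the constructor body again, so every instance gets its own private count.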



Innovation, Complacency, Relevance
Thursday February 21st 2008, 4:50 am
Filed under: MATLAB, Ranting and Raving

Clarification: after mentioning this at work, I had an edifying discussion with another MathWorker, which pointed out some flaws in my arguments. Here’s a layman’s summary:

  1. Wolfram Demonstrations isn’t the usage model I took it for. They do provide every Mathematica user with some kind of authoring tool, probably similar to MATLAB’s GUIDE editor, but with the “Manipulate” thingy instead of a Handle Graphic. But the tool does not produce a redistributable program, just a notebook and some metadata.
  2. To make a demo, Mathematica users must log in and submit code to the website, which is vaguely like the MATLAB Central File Exchange on steroids. It’s a moderated queue to an in-house team armed with–you guessed it!–a compiler. I’m just filling in blanks here. Mathematica “notebooks” go in one side, Mathematica player applets come out the other.
  3. I’ve mentioned that MathWorks also sells a compiler toolchain and accompanying runtime libraries. Note that Wolfram Demonstrations offers its “blessing” only to a specific kind of graphical application fit to be “played back”. I cannot speak to what they have up their sleeves, except to say it includes this. I do know that developers who purchase MCC/MCR can use the bulk of the feature set with no additional charges for reproduction and redistribution. So if it was just one-upmanship we cared about, then yeah, we probably could. But the stakes are not what I imagined them to be.

My current job, which ends next week, has given me an interesting angle on the schism between commercial and non-commercial attitudes toward open source and toward the software development process. It’s roughly analogous to the schism some of us students and alumni see in Olin and other learning institutions–more on that in a minute.

First of all, we don’t make free or open source software (“FOSS”). Insofar as I am aware, MathWorks-owned code is ~0% free (for MATLAB aficionados: that’s ~ as in circa, not ~ as in logical negation). In terms of the Web, a place dominated by noncommercial projects like Dojo and pro-FOSS shops like Google, this dates us to approximately the Bronze Age–but we don’t live on Web terms. By Desktop terms, we are exceedingly typical, and while our licensing model has been rightly criticized we are probably not the evilest people out there.

What we do do is build, brick-by-brick, on a broad range of lawyer-approved FOSS and proprietary libraries to bring you MATLAB in all its featureful glory. To name a few examples, the product as sold includes, wholesale: the not-quite-free Java SE platform, the free-but-permissive Apache XML toolchain, a free-under-the-covers web browser, and optional support for Maple, the non-free commercial algebra platform. For ourselves, of course, we have prettier and wilder things, including the usual glut of free software programming tools (FSF philosophy still dominates in that area, thanks in large part to savvy licensing), the usual Overpriced Graphical Cxx Debugger Whom Shall Not Be Named, and some spiffy intranet productivity applications.

But for the product, we’re always cautious, and I am coming to understand that the GPL and its brethren are anathema to corporate lawyers, just as SCO and the old System V license agreements are anathema to the hacking world.

We clearly owe a lot to FOSS and to the underlying philosophy, and that influence should only grow louder as the scale of our product ups the demand for agile methods and heavy peer review. MATLAB simply dates to a period in the history of computer science before mortal programmers fully understood these things; a dark age when all corporations licensed the bejeesus out of their software, and violating trade secrets to obtain a more thorough peer review was the way things got done. I think we reflect the distinction in subtle ways–for instance when organizing our user community.

Perl has CPAN; PHP has PEAR; even Java has its community projects page; in such company, MATLAB Central stands out as a different sort of response to the SourceForge imperative. I’m not complaining that the file exchange uses e-mail and ZIP files in place of newfangled source control tools, or that the link exchange doesn’t segregate offsite FOSS projects from MATLAB Toolbox product pages. In light of the largely research-oriented user base, our design decisions are generally well supported.

No, what irks me, just a little bit, is the dearth of copyright and licensing information.

One can safely assume “Copyright 200x [author name], all rights reserved”, but in the FOSS community that’s often not good enough. Users want to know, “What if I want to deploy this in my company? What are my rights?” The answer seems to be, download the files and look for a licensing notice, or if you’re real particular you could e-mail the author and ask. Again, a reasonable design decision? Well, probably… for now. If we ever want to be competitive with generic scripting languages on their home turf, niceties like project licensing are going to start to matter.

Should we be concerned? Well, we’re a growing company, and are constantly expanding the capabilities (and price tag) of our product. We can’t count on the steady growth of scientific research to support us by itself. Similarly, FOSS enthusiasts are always pushing the boundaries of scripting languages, adding more and better math and graphics support, encroaching on us. If we want to convince more people to capitalize on our newly expanded capabilities, and pay us money for them, then we need to speak the lingo, and we need to offer arguments more compelling than FUD.

(As an aside, there’s something surreal and very 1998 about finally hearing salespeople use the term FUD in context.)

True story: 15 years ago now, the Internet came through and made everything different. Used to be your team could satisfy itself with Doing One Thing Well. Now people expect all kinds of silly things from their software. It was at least a hundred years ago, by the Internet’s fickle stopwatch, that Maple began selling as a client-server package, with all the number-crunching taking place across the network from the shiny Java UI. The other heavyweights in the business didn’t immediately follow suit–but they did something equally interesting. They added native support for distributed computing, a trick that makes certain kinds of calculations very scalable. And again, this displaces the act of computation away from the user, over the network, to some other machine.

Now, fast-forward to the present date; the situation gets more interesting. Wolfram’s demos aim for a newly expanded audience, with the common denominator set by installation requirements of the Mathematica Player, available at no cost from the site. MATLAB has a similar component, accessible from programs compiled by MATLAB with one of the Builder toolboxes. Both are basically the same thing–a hardened, locked-down version of the full application that can only be utilized by specialized code (as a consequence both are also very large installs; Mathematica Player is ~90MB). In theory, their potential applications should be similar too.

But it was Wolfram who produced the demonstrations project.

Why them? Or rather, why not us first? After all, MCR is a mature technology with plenty of tricks of its own. The reason is simple: licensing. When we came up with the MATLAB Builder product set, we made a crucial decision to productize rather than commoditize. For the sort of company we have grown up as, it seems like a pretty obvious choice, but it was not without collateral cost. Mathematica users now have an ability most MATLAB users do not: they can produce standalone interactive programs from their code. By giving something away for free, Wolfram’s product gains legs.

The phenomenon isn’t limited to the software industry, either. It’s this same sort of risqué, but worthwhile, thinking that some Olin students feel their institution has chosen to eschew in favor of a safer path. So when I see stuff like this on Slashdot, I’m thinking “why not us?” on both counts. The world is moving way too fast now to make allowances for half-assed commitments to revolutionary ideas. I see people like Negroponte and O’Reilly talk the same talk we do, then walk it, because they can. It’s maddening.



And… They’re Off
Tuesday February 05th 2008, 10:11 am
Filed under: Ranting and Raving

Giga Tuesday begins not with a bang, but a whisper, or in households like this one, a mild thrum. We’re all (I think) registered D here and, with varying degrees of enthusiasm, generally in favor of an Obama-McCain race. I won’t waste my breath on berating those who’ve stated their intention *not* to vote, but I would point out to the geek who does not see significant differences between the Dem frontrunners, that there’s more to policy than can be seen in the congressional record. To that end, Barack Obama’s 2007 address to Google was telling.

Whatever your preference, do remember–our votes now determine the kind of contest we’ll see later. Whatever their faults (there are plenty), the primaries are in some ways a lot closer to the level of granularity people are beginning to wish they had in general elections. Take advantage.



How I Spent the Part of the Weekend I Didn’t Sleep Through
Monday February 04th 2008, 12:25 am
Filed under: Ranting and Raving, Yours Truly

Seth and I got out early in the day to stock up at Trader Joe’s–true to his prediction, it was less of a disaster there than I expected, and probably a lot less than Stop’n’Shop would have been. To our chagrin, we managed to forget soda and nacho cheese for the party. I made up for it by plying his friends with beer, chips and salsa, an acceptable penance.

It was a surprising amount of fun, and I finally got to meet more of Seth’s crew, who seem never to be willing to make the long trek out here to MetroWest (they disparagingly refer to us as “the burbs”). Can’t help feeling a tinge of disappointment though, after investing my evening in the game (a very rare thing for me), only to have both of my “home states” suck it big time. Consensus was, both teams equally deserved to lose. The Pats just managed to pull a little more lose out of their hats in the end.

For Bostonians, at least, it won’t change much. Win or lose, we always riot. Gotta hand it to us for consistency.

On the bright side, life feels less chaotic with the room tidied up a bit now. I’m restoring it piecemeal to some semblance of control. I also hung up some of the better sketchwork from my Wellesley drawing class, which has me in a mood to get started back into it (I never did produce a finished version of the “Coupling Laptops” sketches)… now I just have to get my ass back to Cambridge for a couple more sheaves of good paper. Tsk tsk.

And I’m slowly but surely redeveloping the calluses I’ll need to play bass competently and without bleeding, ’cause, you know, that’s no fun. Get my writing voice back in shape, and the trifecta would be complete–but don’t hold your breath just yet. These things take time and painstaking effort, and usually an external impetus of some kind. Meetup says there are two writers’ circles in Newton. Hm.