Tuesday, January 29, 2008

Arc is out!

Arc has been released! Let the hype commence.

As of now, it requires MzScheme version 352. There is also a patch for later versions.

This first release has generated allot of controversy; Check Ethan's blog for a hearty dose of perspective.

Monday, January 28, 2008

Re: "Borrowing at 100% Interest"

Ethan Vizitei has a post up about about the perils of being interrupted whilst hard at work. He makes some good points, so go read it. Go ahead, I'll wait...
...
So as I was saying, he makes good points but he also invokes math. Whenever someone on a blog invokes math to make a point about computer programming, it is healthy to be a tad skeptical. (Even if it's Steve Yegge. Even if it's Paul Graham.... hell, especially if it's Paul Graham.) In that spirit, here is some nitpicking about Ethan's assumptions.

If they don't interrupt you now, then the question they need answered doesn't get addressed in a timely manner. That might also mean lost productivity.

If you aren't on break, you are in "the Zone" — all the time? Are you really in the Zone 6 hours a day? Aren't you ever compiling?

If you are interrupted for N minutes, Ethan assumes it takes exactly N minutes to get back in, regardless of N. That's seems arbitrary.

When you are spending the N minutes getting back into the Zone, is that all pure waste? I would imagine it's just normal work with a bit less momentum.

Still, it is something to think about. Cheers!

P.S. And by "something to think about" I mean to say that Ethan's conclusion is more or less correct. It is only his reasoning that is a bit flawed. There's no doubt that interruptions do waste valuable time. I also think the best reason to avoid interrupting people is they don't like it. Particularly not Ethan (or Dan). However, "Borrowing at 100% interest" is a much more compelling title than "Hey, that's kind of annoying!"

Monday, January 14, 2008

Fluent Interface Grammars: Part III (Nesting)

Last time, we saw the properties of fluent interfaces that use method-chaining only. Namely, they can express any regular language not containing the empty string. We also saw an example of a non-regular language. The language accepts any string with a run of a's followed by an equal number of b's, i.e. ab, aabb, aaabbb, etc...

S ::= a B
S ::= a S B
B ::= b
Today, in Part III of our saga, we examine fluent interfaces that use nesting exclusively (Steve Asher discusses nested interfaces here). Just as one cannot implement this language as chaining interface,

a().b()
a().a().b().b()
a().a().a().b().b().b() // (Can't do this.)
the most obvious nested interface for this language is also impossible, for similar reasons.

a(b())
a(a(b(b())))
a(a(a(b(b(b()))))) // (Can't do this either.)

However, it is quite easy to express this (slightly uglier) version:

a(b())
a(a(b()),b())
a(a(a(b()),b()),b()) // (Yucky, but possible.)

This is done with the following nested interface.

public static class S {}
public static class B {}
public static S a(B arg1) { return new S(); }
public static S a(S arg1, B arg2) { return new S(); }
public static B b() { return new B(); }
Thus, we find that there exists at least one non-regular language expressible with nested interfaces. What else can we do?

All context-free grammars can be reduced to Greibach normal form, which only allows production rules of two types:

1. S ::= $
2. A ::= a X

... where A is any non-terminal, a is any terminal, and X is a list of zero or more non-terminals. Rules of type 1 cannot be expressed because our grammars cannot accept the empty string (as noted in Part II). The rules of type 2 become top-level methods in this fashion:

// A ::= a
public static A a() {return new A();}
// A ::= a B
public static A a(B arg1) {return new A();}
// A ::= a BCD
public static A a(B arg1, C arg2, D arg3) {return new A();}

Does this mean that any context-free grammar (that doesn't accept the empty string) can be mapped to a nested fluent interface? Not quite—there is one more caveat. No two rules may have the same right-hand side, because they would yield two top-level methods with competing signatures. The question then becomes: Can any context-free grammar be reduced to a Greibach normal form without duplicate right-hand sides? After afew minutes on a whiteboard, I'm leaning toward no.

Let us now extend our grammar notation to indicate parameter lists.

S ::= a(B)
S ::= a(S B)
B ::= b()

This will make the transition from grammar to code more obvious. Additionally, it will prepare us for the next chapter: mixed interfaces.

Saturday, January 12, 2008

Fluent Interface Grammars: Part II (Chaining)

Today's episode focuses on the grammars achievable using only method-chaining (as opposed to nesting). For brevity, we shall refer to this type of fluent interface as a chaining interface. See Ordered Fluency by David Laribee for an interesting discussion of chaining interfaces and their grammars.

Before we get started, here are some Wikipedia links that may come in handy.

Recall the Chomsky hierarchy:
  • Type-0, unrestricted grammars
  • Type-1, context-sensitive grammars
  • Type-2, context-free grammars
  • Type-3, regular grammars

There is more or less a one-to-one correspondence between chained interfaces and regular grammars. A regular grammar is expressed using only rules of these forms:
A ::= a
A ::= a B
A ::= $

These are actually called "right regular grammars", but left regular grammars are equivalent in terms of the languages they can express. This regular grammar (with start symbol S) expresses the language consisting of all strings with a run of a's followed by a run of b's.
S ::= a S | a B
B ::= b | b B

The grammar accepts strings such as "aaab" and "abbbbb". Thus, a corresponding chaining interface would accept these chains.
a().a().a().b()
a().b().b().b().b().b()

What does it mean to say an interface accepts a particular chain? Two conditions must hold: the statement must be a valid Java expression, and it must evaluate to the type of grammar's start symbol. In other words, these statements must compile:
S statement = a().a().a().b();
S statement = a().b().b().b().b().b();
This grammar is easily implemented in few classes:

public class Foo {
public static void main() {
S s1 = a().a().b().b().b();
S s2 = a().a().a().a().b();
}
public static class S {}
public static A a() { return new A(); }
public static class A {
public A a() { return new A(); }
public B b() { return new B(); }
}
public static class B extends S {
public B b() { return new B(); }
}
}

Ok, so we've seen that a painfully simple regular grammar does indeed have a chaining interface counterpart. The real question is: How can we get from the grammar to the interface? After all, a richer grammar would not be so intuively simple to translate! With that in mind, here are the steps that will translate an arbitrary regular grammar into a chaining interface. As usual, S is assumed to be the start symbol.
  1. Left factor the grammar.
  2. Make a class for each non-terminal.
  3. Rules with left-hand side S become static methods.
  4. Rule with empty string ($) as their right-hand side make the class of their left-hand extend S.
  5. All other rules are methods of their respective classes.
Rules can become methods in three ways:
  1. Rules of form "A ::= a" become a method named a() in class A returning S.
  2. Rules of form "A ::= a B" become a method named a() in class A returning B.
  3. Rules with left-hand side S follow the same rules, but yield static methods.
Yikes, that sounds pretty complicated. Well, here goes nothing! First let us left factor our grammar, taking care to preserve its regular status.
S ::= a S | a B
B ::= b | b B
becomes
S ::= a A
A ::= a A | b B
B ::= $ | b B
Good. Now, all the non-terminals (S, A, and B) will become classes. S will be empty as always, and we can also see that B will extend S. Next, we translate each rule into a method.
// Top-level:
public static A a() { return new A(); } // S ::= a A

// In class A:
public A a() { return new A(); } // A ::= a A
public B b() { return new B(); } // A ::= b B

// In class B:
public B b() { return new B(); } // B ::= b B

This yields code identical to the implementation above. The same can be done for any regular grammar - with one little tiny exception. Grammars that accept the empty string cannot be encoded in this way because "S statement = ;" would never be legal in Java. This is just a technicality however, because there isn't much of an outcry for fluent interfaces that accept nothing as a statement.

We have shown that (nearly) any regular language can be implemented as a fluent interface using only method chaining. Conversely, non-regular languages cannot be; not proven here, but happens to be true.

Consider the prototypical example of a non-regular language. The language accepts any string with a run of a's followed by an equal number of b's, i.e. ab, aabb, aaabbb, etc...
S ::= a B
S ::= a S B
B ::= b
It is not possible to write an interface that would accept only such chains as these:
a().b()
a().a().b().b()
a().a().a().b().b().b() // and so on...
If you aren't convinced, I would invite you to try it.

Update: That last statement technically wrong. Using generics (fair game, I said Java 5), tkr accepted my invitation and indeed constructed this language. The parameterization trick he discovered doesn't seem applicable to context-free grammars in general, but is worth checking out nevertheless.

Monday, January 7, 2008

Fluent Interface Grammars: Part I

A spectre is looming over the world of object-oriented design patterns—the spectre of fluent interfaces.

This new and unorthodox style has been espoused by such giants as Martin Fowler, such average-heighted people as Steve Asher, and such midgets as Mike Finney. Knowing my place in the order of things, I suppose it is my turn. I'll start out by reiterating some of points being made.
  • Fluent interfaces are a rockin' way to approximate the flexibility of domain specific languages (DSLs) using OO constructs.
  • Fluent interfaces are easy to use, but relatively difficult to design.
  • Fluent interfaces violate several "best practices" of interface design (such as command query separation), but are often a Good Thing nevertheless.
  • If Sun finally gives us closures in Java 7, the skys will part, the rapture will be upon us, and we will all join hands singing "Camptown Races". Also, fluent interfaces will become cooler.
As everyone keeps pointing out, a fluent interface resembles a sort of domain specific language. I would like to explore how far that analogy goes. In other words, what are the limitations on the languages that fluent interfaces can provide? Michael Feathers asks essentially the same question in The Cardinality of a Fluent Interface.

What qualities does a grammar need to have to create an embedded DSL which doesn’t allow nonsensical constructs?

Before addressing this problem, let us set some ground rules. First, we live in Java 5. Second, it is not sufficient for our interface to accept some superset of the desired language; our grammar must be an exact fit. Ungrammatical statements must not compile. They should also be underlined by our spiffy IDE :)

Note: In formal language theory, a grammar is a set of transformation rules and a language is the set of all strings that the grammar accepts. Some background with these concepts is assumed.

Fluent interfaces generally use two ways of combining tokens to form statements: Call-chaining as().seen().here(), and nesting which(works(like(so))). Some interfaces stick to one or the other, while Steve's customer creator uses both:
Customer customer = 
createCustomer.named("Bob").that(bought.item(CompactDisc).on("02/17/2006"));
He explains,

The first thing to notice is that almost half of this code is fully enclosed in parenthesis. The result of 'bought.item(CompactDisc).on("02/17/2006")' is being passed to a method named 'that'.

By contrast, this is what that statement would look like using call-chaining all the way:
Customer customer = 
createCustomer.named("Bob").that().bought().item(CompactDisc).on("02/17/2006");
And here is one of the possible translations using only nesting:
Customer customer = 
createCustomer(named("Bob",that(bought(item(CompactDisc,on("02/17/2006"))))));

The customer creator's grammar would look similar to this.
S ::= createCustomer C
C ::= $
C ::= namedSomething C
C ::= that T C
C ::= and T C
T ::= bought someItem
T ::= bought someItem onSomeDate
Examples of strings accepted by this grammar:
createCustomer
createCustomer namedSomething
createCustomer namedSomething that bought someItem
createCustomer that bought someItem and bought someItem onSomeDate
So far I have introduced the idea of formal grammars for fluent interfaces, and posed a question about their limitations. In future posts I will answer this question for each variant of fluent interfaces: call-chaining, nesting, and mixed. Have faith! I am going somewhere with this.

P.S. Steve and Mike: I was just kidding, you are both well above average height.