Saturday, January 12, 2008

Fluent Interface Grammars: Part II (Chaining)

Today's episode focuses on the grammars achievable using only method-chaining (as opposed to nesting). For brevity, we shall refer to this type of fluent interface as a chaining interface. See Ordered Fluency by David Laribee for an interesting discussion of chaining interfaces and their grammars.

Before we get started, here are some Wikipedia links that may come in handy.

Recall the Chomsky hierarchy:
  • Type-0, unrestricted grammars
  • Type-1, context-sensitive grammars
  • Type-2, context-free grammars
  • Type-3, regular grammars

There is more or less a one-to-one correspondence between chained interfaces and regular grammars. A regular grammar is expressed using only rules of these forms:
A ::= a
A ::= a B
A ::= $

These are actually called "right regular grammars", but left regular grammars are equivalent in terms of the languages they can express. This regular grammar (with start symbol S) expresses the language consisting of all strings with a run of a's followed by a run of b's.
S ::= a S | a B
B ::= b | b B

The grammar accepts strings such as "aaab" and "abbbbb". Thus, a corresponding chaining interface would accept these chains.

What does it mean to say an interface accepts a particular chain? Two conditions must hold: the statement must be a valid Java expression, and it must evaluate to the type of grammar's start symbol. In other words, these statements must compile:
S statement = a().a().a().b();
S statement = a().b().b().b().b().b();
This grammar is easily implemented in few classes:

public class Foo {
public static void main() {
S s1 = a().a().b().b().b();
S s2 = a().a().a().a().b();
public static class S {}
public static A a() { return new A(); }
public static class A {
public A a() { return new A(); }
public B b() { return new B(); }
public static class B extends S {
public B b() { return new B(); }

Ok, so we've seen that a painfully simple regular grammar does indeed have a chaining interface counterpart. The real question is: How can we get from the grammar to the interface? After all, a richer grammar would not be so intuively simple to translate! With that in mind, here are the steps that will translate an arbitrary regular grammar into a chaining interface. As usual, S is assumed to be the start symbol.
  1. Left factor the grammar.
  2. Make a class for each non-terminal.
  3. Rules with left-hand side S become static methods.
  4. Rule with empty string ($) as their right-hand side make the class of their left-hand extend S.
  5. All other rules are methods of their respective classes.
Rules can become methods in three ways:
  1. Rules of form "A ::= a" become a method named a() in class A returning S.
  2. Rules of form "A ::= a B" become a method named a() in class A returning B.
  3. Rules with left-hand side S follow the same rules, but yield static methods.
Yikes, that sounds pretty complicated. Well, here goes nothing! First let us left factor our grammar, taking care to preserve its regular status.
S ::= a S | a B
B ::= b | b B
S ::= a A
A ::= a A | b B
B ::= $ | b B
Good. Now, all the non-terminals (S, A, and B) will become classes. S will be empty as always, and we can also see that B will extend S. Next, we translate each rule into a method.
// Top-level:
public static A a() { return new A(); } // S ::= a A

// In class A:
public A a() { return new A(); } // A ::= a A
public B b() { return new B(); } // A ::= b B

// In class B:
public B b() { return new B(); } // B ::= b B

This yields code identical to the implementation above. The same can be done for any regular grammar - with one little tiny exception. Grammars that accept the empty string cannot be encoded in this way because "S statement = ;" would never be legal in Java. This is just a technicality however, because there isn't much of an outcry for fluent interfaces that accept nothing as a statement.

We have shown that (nearly) any regular language can be implemented as a fluent interface using only method chaining. Conversely, non-regular languages cannot be; not proven here, but happens to be true.

Consider the prototypical example of a non-regular language. The language accepts any string with a run of a's followed by an equal number of b's, i.e. ab, aabb, aaabbb, etc...
S ::= a B
S ::= a S B
B ::= b
It is not possible to write an interface that would accept only such chains as these:
a().a().a().b().b().b() // and so on...
If you aren't convinced, I would invite you to try it.

Update: That last statement technically wrong. Using generics (fair game, I said Java 5), tkr accepted my invitation and indeed constructed this language. The parameterization trick he discovered doesn't seem applicable to context-free grammars in general, but is worth checking out nevertheless.


Ethan Vizitei said...

Good post, Ray. I've been hearing a lot about fluent APIs from Steve, and it looks like the bug is spreading. Very cool.

Michael Finney said...

"...outcry for fluent interfaces that accept nothing as a statement..."

That's called a code comment. ;)

Michael Finney said...

Oh. Very nice, BTW. I appreciate the reposting of the Chomsky hierarchy right into the blog. The post is bite-sized/right-sized too.

Your writing style has attributes that are worthy to imitate, IMHO.

tkr said...

It is possible to express the "a-b-bracket" language with method chaining with the aid of generics.

If you open a bracket with "a()", you have to return a class "OpenA<T>". If you close a bracket with "b()" you return the parametrized type T. If a user chains multiple methods a(), he gets a type "OpenA<OpenA<OpenA<T>>>".

tkr said...

I posted some code to show how to build the language with generic types in Java:
more than regular

Ray Myers said...

Great contribution by tkr. He shows that it is indeed possible to express the non-regular aabb language with method-chaining alone (using generics). The Java code he posted has a bug, so check out his fixed Scala version.

Erick Hagstrom said...

Thanks for a useful and informative post.

I tried to find tkr's post, but it seems that his blog has been taken down. Pity. Does anyone have an updated link, a scraped copy, printout, anything?

Erick Hagstrom said...

Ok, found it. tkr's post can be found at More than regular