Sunday, October 26, 2008

Ladies and Gentlemen, The Editor War

Scene opens. The Lakota coffee shop in Columbia, MO.

Jane: ... and the other has been really interesting. It's Anthropology, with Jim Metcher.

Ray: (lights up) Yes! I actually got to take one of Metcher's classes during my brief stint in the evening program.

Jane: Are you a Mizzou student?

Ray: Close. I graduated from MST in '06. I'm workin' now.

Jane: What do you do?

Ray: I help machines think. I'm a programmer.

Jane: Spiffy! So what's new in the fast-paced world of "programming"?

Ray: Well, lots of things, I guess. Let's see... new languages coming out as usual. PHP making some controversial design decisions regarding namespaces. The 50th anniversary of Lisp was just celebrated at OOPSLA; I bet there were some great talks at that. Oh yeah, and I just read about a new development in The Editor War.

Jane: ... The editor what?

Ray: War. The Editor War... it's this infamous feud. (waves hand dismissively)

Jane: Feud? (interest grows at the scent of scandal) Over what?

Ray: Which text editor to use.

Jane: (Smirk) No, really. What is it about? I'm actually interested this time, you don't have to make things up to amuse me.

Ray: No... (Slowly raises hand to forehead), that's actually what it's about.

Jane: (Confused) That doesn't sound like a very political decision. Is there some, like, protocol or something that has to... (pauses) I thought text was all sort of... compatible. (glances downward) I guess I don't know much about this stuff.

Ray: Actually, you've got a decent grasp of it. While there have been a few new standards in recent years to accommodate different languages, text is overall very compatible from one editor to the next.

Jane: In that case, I don't get it. What other technical hurdle would force all programmers to use the same editor?

Ray: Ahh! I see the source of your confusion. That's not the issue. We're not debating the adoption of an official industry-wide standard editor. Nothing along those lines. It's a matter of personal preference.

Jane: (Disappointed) So it's just different people choosing a different tool? Aren't you exaggerating a bit by calling it a feud?

Ray: I see where you might think that, but believe it or not, I've been to parties where shouting matches broke out spontaneously at the mention of either of these editors.

Jane: Really?

Ray: Really.

Jane: Wow. So this has been going on for what, weeks now?

Ray: A bit longer than that.

Jane: Months?

Ray: It has been going on for over 25 years.

Wednesday, July 16, 2008

Clone Wizards of Time and Space

The talk, Code Generation: The Safety Scissors Of Metaprogramming by Giles Bowkett.

The quote, "This is how Lisp guys think of Lisp and the Lisp community: intergalactic voluntarily bald clone wizards of time and space who the womens are be lusting fors."

He literally said that. I'm not paraphrasing. One more time in case you missed it.



Intergalactic Voluntarily Bald Clone Wizards of Time and Space Who The Womens Are Be Lusting Fors.


I for one am quite offended! I resent the continuing perception that Lispers are smug and arrogant. Nothing could be further from the truth! We do not see ourselves as some sort of secret society with magic powers beyond the comprehension of you mere mortals.

Signed,
Ray, Squire of the Grand Recursive Order of the Knights of the Lambda Calculus

Saturday, July 12, 2008

Learn Recursion

Today's message is short and sweet.

I was just reading chapter seven of Beautiful Code, "Beautiful Tests". As the chapter begins, the reader is challenged to attempt writing a binary search. Here's mine.
int search(int[] arr, int target) {
    return arr.length == 0 ? -1 : search(arr, target, 0, arr.length-1);
}

int search(int[] arr, int target, int min, int max) {
    if (min >= max) return target == arr[min] ? min : -1;
    int middle = (max + min) >>> 1;
    if (arr[middle] > target) return search(arr, target, min, middle-1);
    if (arr[middle] < target) return search(arr, target, middle+1, max);
    return middle;
}
Yes, it's in Java. I'm not going to rewrite it in eight other languages because that's not the point, for once.
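If you want to convince yourself it actually works, here is the sort of quick spot check I'd suggest (a hypothetical harness, not from the book; run with java -ea so the assertions fire):
int[] arr = {2, 3, 5, 7, 11};
assert search(arr, 7) == 3;           // present: found at index 3
assert search(arr, 4) == -1;          // absent: reports -1
assert search(new int[0], 4) == -1;   // empty array: also -1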

Here's the point: If you call yourself a programmer and don't grok this recursion business, fix it. Now.

This has been a public service announcement brought to you by the "People Who Will Not Take You Seriously If You Don't Understand Recursion" Foundation.

Tuesday, July 1, 2008

The arrogance of demanding evidence

"No one has the right to destroy another person's belief by
demanding empirical evidence." — Ann Landers

This is widely quoted but I haven't found a full citation of the original source. Thus, it's remotely possible that this is a misquote, out of context, etc...

Still, I'm not too worried about misquoting a shared pen name. My main concern is that in 2008, it is still possible to read a quote like this without flinching. One might even agree!

The statement implies two things. The first is that freedom of speech should be abolished. (We'll look past that one for now.) The second is that it is unreasonable, inhumane, and simply impolite to expect people to base their worldviews on ... the world. In intellectual discourse, once someone says the magic words "I believe," it is inappropriate for the other parties to press any further.

In fact, a more honest phrasing of the Ann Landers quote might be:

"Don't speak up; you might inadvertently cause someone to change their mind."

I've had my beliefs turned upside-down by empirical evidence dozens of times in my life, and I'm probably not done. I used to believe that aliens crashed in Roswell, New Mexico in 1947. I used to believe that Uri Geller was psychic. I used to believe that homeopathic preparations had medicinal properties. I used to believe that computers couldn't think — and would never think.

I am a better person for being able to change my mind.

"It is far better to grasp the universe as it really is than to persist in delusion, however satisfying and reassuring." — Carl Sagan

"Isn't it enough to see that a garden is beautiful without having to believe that there are fairies at the bottom of it too?" — Douglas Adams

Thursday, April 10, 2008

Subsequences With BGGA Closures

Just a quick followup on Returning '(()) Considered Difficult. I just rewrote the "subsequences of length n" exercise yet again (surprise, surprise), this time using the Java BGGA prototype by Neal Gafter et al. I really like being able to abstract away the accumulation loop.
import static com.cadrlife.ListUtils.*;
...
private static <T> List<List<T>> subn(int n, List<T> list) {
    if (n == 0) {
        return Collections.<List<T>>singletonList(new ArrayList<T>());
    }
    if (list.isEmpty()) {
        return new ArrayList<List<T>>();
    }
    T first = list.get(0);
    List<T> rest = list.subList(1, list.size());
    return addAll(collect(subn(n-1, rest), {List<T> sub => push(first, sub)}),
                  subn(n, rest));
}

A word on syntax: Some people have come down on the {=>} closure syntax but I wonder if they have actually used it. This was my first time using BGGA; everything compiled and worked on the first try. I can't say the same about trying to simulate closures with anonymous Transformer classes using the Apache Commons CollectionUtils version of collect.
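For contrast, here is roughly what that map step looks like with the anonymous-class approach. The types are my reconstruction of the Commons Collections 3.x API from memory, so treat this as a sketch:
// No closures: an anonymous Transformer, unchecked casts (the library
// predates generics), and 'first' must be declared final to be captured.
final T first = list.get(0);
List<List<T>> mapped = (List<List<T>>) CollectionUtils.collect(
    subn(n - 1, rest),
    new Transformer() {
        public Object transform(Object sub) {
            return push(first, (List<T>) sub);
        }
    });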

Just in case you don't have com.cadrlife.ListUtils in your classpath, here are the obvious definitions of collect, addAll, and push.
package com.cadrlife;
...
public static <A,B> List<B> collect(List<A> list, {A => B} function) {
    List<B> result = new ArrayList<B>();
    for (A element : list) {
        result.add(function.invoke(element));
    }
    return result;
}

public static <T> List<T> push(T el, List<T> list) {
    list.add(0, el);
    return list;
}

public static <T> List<T> addAll(List<T> a, List<T> b) {
    a.addAll(b);
    return a;
}

FunctionalJ: Steve points out that the last bit could be done this way, using the Java library FunctionalJ.
DynReflect reflect = new JdkDynReflect();
Function1 pushFirstElem = reflect.staticFunction(
    AnnotatedSequencer.class, "push").f2().bind(first(list));
return addAll(map(pushFirstElem, subn(n - 1, tail(list))),
              subn(n, tail(list)));
Be forewarned that using reflection to simulate first-class functions in this way will sacrifice compile-time verification.

Saturday, March 29, 2008

Returning '(()) Considered Difficult

Suppose you want a function to find all subsequences of a list with a certain length. For instance, the subsequences of (1...5) with length 2 are as follows.
(1 2) (1 3) (1 4) (1 5)
(2 3) (2 4) (2 5)
(3 4) (3 5)
(4 5)
This is not a red-herring example cooked up just to make Java look bad. This actually came up at work! I did something along these lines. Yuck.
private static <T> List<List<T>> subn(int n, List<T> li) {
    List<List<T>> ret = new ArrayList<List<T>>();
    if (n == 0) {
        ret.add(new ArrayList<T>());
        return ret;
    }
    if (li.isEmpty()) {
        return ret;
    }
    T x = li.get(0);
    List<T> xs = li.subList(1, li.size());
    for (List<T> sub : subn(n-1, xs)) {
        sub.add(0, x);
        ret.add(sub);
    }
    ret.addAll(subn(n, xs));
    return ret;
}
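To be fair, it does work. A quick check against the table above (hypothetical usage, not from the original post):
List<List<Integer>> pairs = subn(2, Arrays.asList(1, 2, 3, 4, 5));
// pairs.size() == 10
// pairs == [[1, 2], [1, 3], [1, 4], [1, 5], [2, 3], [2, 4], [2, 5], [3, 4], [3, 5], [4, 5]]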
However, that was not the way my thoughts originally took shape. I think in Lisp.
(def subn (n li)
  (if (is 0 n) '(())
      (no li) '()
      (join (map [cons (car li) _] (subn (- n 1) (cdr li)))
            (subn n (cdr li)))))
Then I moved it to Haskell for greater clarity.
subn 0 _      = [[]]
subn _ []     = []
subn n (x:xs) = map (x:) (subn (n-1) xs) ++ subn n xs
With great pain, I converted those three lines of Haskell into the ugly splatter of Java you saw above. Later on, I started playing around with Scala and gradually compacted the Java into something strikingly similar to its Haskell ancestor.
def subn[T](n: int, li : List[T]) : List[List[T]] = (n,li) match {
  case (0,_)     => List(Nil)
  case (_,Nil)   => Nil
  case (_,x::xs) => (subn(n-1, xs) map (x :: _)) ::: subn(n, xs)
}
The title of today's post refers to the inordinate amount of trouble it is to simply return "The list containing the empty list" in Java. I tried such monstrosities as these:
Collections.singletonList(Collections.EMPTY_LIST) // List<List>
Arrays.asList(Collections.emptyList()) // List<List<Object>>
Collections.singletonList(new ArrayList<T>()) // List<ArrayList<T>>
Collections.singletonList((List<T>)new ArrayList<T>()) // List<List<T>>
The last option does give the correct type, but it causes a compiler warning and is not much less cumbersome than the naive approach, i.e.:
List<List<T>> ret = new ArrayList<List<T>>();
ret.add(new ArrayList<T>());
return ret;
So here's a challenge for you Java developers out there. What would you change about the admittedly sloppy Java code at the beginning of this post? I'm issuing an all points bulletin to the refactoring mavericks (Mike, James), the simple-design visionaries (Steve), and the speed-freak optimizers (Dan). I wouldn't even mind seeing a C# version (Ethan), or an improvement on the Haskell (Eric).

Incidentally, I know some people think recursion is hard to read. I'd love to see an iterative solution as well.

P.S. I realize that the comment boxes don't hold code blocks very well. Feel free to link to a solution.


Updates:
Based on your input, I am now happy to give three different ways of returning '(()) in one line. I am only including ways that would work in the context I gave (i.e. the inner list must be mutable). The last is my favorite because the others cause compiler warnings.
return Arrays.<List<T>>asList(new ArrayList<T>());
// Type safety : A generic array of ArrayList is created for a varargs parameter

return new ArrayList<List<T>>() {{add(new ArrayList<T>());}};
// The serializable class does not declare a static final serialVersionUID...
// Ricky points out that this is an "Eclipse warning".

return Collections.<List<T>>singletonList(new ArrayList<T>());
// No warnings :)
Eric has blessed us with a version in Factor (not to be confused with my aversion to Factor). He also added a Haskell version using list comprehensions to achieve better laziness properties.

Meanwhile, back in Javaland, Dan suggested using SortedSet, particularly the subSet method. This seems to be a dead end in this case, but is certainly a useful corner of the Collection hierarchy.

After briefly attaining List Enlightenment, Thomas produced an implementation in Java with no recursion. Aside from being much longer and totally opaque, it also has worse performance characteristics. It essentially produces every subsequence and then filters by length, similar to the Haskell one-liner suggested in the comments.
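For the curious, the shape of that generate-then-filter approach is something like this (my own sketch, not Thomas's actual code):
// Iteratively build all 2^n subsequences, then keep only those of length n.
private static <T> List<List<T>> subnByFilter(int n, List<T> li) {
    List<List<T>> all = new ArrayList<List<T>>();
    all.add(new ArrayList<T>());
    for (T x : li) {
        int size = all.size();
        for (int i = 0; i < size; i++) {
            List<T> withX = new ArrayList<T>(all.get(i)); // copy, then extend with x
            withX.add(x);
            all.add(withX);
        }
    }
    List<List<T>> result = new ArrayList<List<T>>();
    for (List<T> sub : all) {
        if (sub.size() == n) result.add(sub);
    }
    return result;
}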

Wednesday, March 26, 2008

Speaker Spotlight - Simon Peyton-Jones

Simon Peyton-Jones was one of the designers of the Haskell programming language. Here are links to some of his presentations.

A Taste of Haskell
Introduction to Haskell with XMonad as an example
(video 1) (video 2) (slides)

Composing Contracts - An Adventure in Financial Engineering
Very interesting case study on domain specific languages
(video) (slides)

Nested Data Parallelism in Haskell
Beating FORTRAN is the name of the game!
(video) (slides)

Taming Effects - The Next Big Challenge
The virtues of purity in functional languages
(video) (slides)

Transactional Memory for Concurrency
OSCON 2007 Keynote
(video) (slides)

Programming in the Age of Concurrency: Software Transactional Memory
Interview with Simon and Tim Harris about Transactional Memory
(video)

Type-driven Testing
Demonstration of the Haskell library QuickCheck
(video) (slides)

Haskell and Erlang: Growing Up Together
At Erlang Factory 2009
(video)

Interview at QCon 2008
(video)

Saturday, March 1, 2008

Haskell Style Pattern Matching In Arc

Since I did the Fluent Interface saga, it's only fair that I try to push the boundaries of metaprogramming in a language that actually has metaprogramming.

Consider the following Haskell code. We will be striving to imitate it as closely as possible using an Arc macro.
union xs [] = xs
union [] ys = ys
union xs@(x:xt) ys@(y:yt) | x < y     = x : union xt ys
                          | y < x     = y : union xs yt
                          | otherwise = x : union xt yt
First we will see how this function might traditionally be written.
(def union (< xs ys)
  (if (no xs) ys
      (no ys) xs
      (with (x (car xs) xt (cdr xs)
             y (car ys) yt (cdr ys))
        (if (< x y) (cons x (union < xt ys))
            (< y x) (cons y (union < xs yt))
            (cons x (union < xt yt))))))
As a useful side-note, Arc's let and with forms come with destructuring. This can save us a whole line here, as well as make the code more readable (no more cryptic cars and cdrs).
(def union (< xs ys)
  (if (no xs) ys
      (no ys) xs
      (with ((x . xt) xs (y . yt) ys)
        (if (< x y) (cons x (union < xt ys))
            (< y x) (cons y (union < xs yt))
            (cons x (union < xt yt))))))
This isn't bad. If we were programming in the wild, we would leave well enough alone at this point. However, this is an experiment; we want it to look Haskellian, not Lispy.

We'll start with pcase, a 10-line macro that uses almkglor's pattern-matching library to yield something similar to Scala's match-case statement.
(def union (< xs ys)
  (pcase `(,xs ,ys)
    (xs ()) xs
    (() ys) ys
    ((x . xt) (y . yt))
      (if (< x y) (cons x (union < xt ys))
          (< y x) (cons y (union < xs yt))
          (cons x (union < xt yt)))))
Switching from pcase to hcase, the outer parens in the patterns go away and equals signs separate the patterns from their corresponding expressions.
(def union (< xs ys)
  (hcase `(,xs ,ys)
    xs () = xs
    () ys = ys
    (x . xt) (y . yt) =
      (if (< x y) (cons x (union < xt ys))
          (< y x) (cons y (union < xs yt))
          (cons x (union < xt yt)))))
To bring more Haskell syntax into the mix, we want to be able to substitute [] for () and (x:xt) for (x . xt). The colon and the brackets have special meanings in Arc, but our macro can intercept them without too much difficulty.
(def union (< xs ys)
  (hcase `(,xs ,ys)
    xs [] = xs
    [] ys = ys
    (x:xt) (y:yt) =
      (if (< x y) (cons x (union < xt ys))
          (< y x) (cons y (union < xs yt))
          (cons x (union < xt yt)))))
Next, we can replace that pesky if statement with guard clauses.
(def union (< xs ys)
  (hcase `(,xs ,ys)
    xs [] = xs
    [] ys = ys
    (x:xt) (y:yt) / (< x y) = (cons x (union < xt ys))
                  / (< y x) = (cons y (union < xs yt))
                  / otherwise = (cons x (union < xt yt))))
A single pipe | can't be used here for technical reasons, so we'll have to make do with a slash /. A double pipe || could be used just as easily.

Since the guard conditions are enclosed on either side by / and =, we can make the outer parens optional. The xs@(x:xt) constructs, known as "as-patterns", are the next addition. With as-patterns in hand, we can define a simple macro to bring hcase up to the level of function arguments.
(mac hdef (name . body)
  (let args (uniq)
    `(def ,name ,args (hcase ,args ,@body))))

(hdef union
  _ xs [] = xs
  _ [] ys = ys
  < xs@(x:xt) ys@(y:yt) / < x y = (cons x (union < xt ys))
                        / < y x = (cons y (union < xs yt))
                        / otherwise = (cons x (union < xt yt)))
The result: Lo! In a dazzling display of engineering hubris, a Lisp dialect is dragged kicking and screaming into Haskell-like syntax. If you dare, you can take a look at the definition of hcase in all its convoluted glory.

Implementation note: Like most Lisps, Arc would read '(x : xs) as a list of three symbols, but '(x:xs) has only one. In order to behave properly, hcase must split apart the symbols in patterns: '(xs@(x:y:yt)) becomes '(xs @ (x : y : yt)) and so on.

Saturday, February 16, 2008

Doing Eratosthenes Justice

Yesterday I showed how to use some closures and macros to implement (very inefficient) lazy lists in Lisp. One of my examples produced an infinite list of prime numbers.

BlueStorm quite correctly pointed out that it was O(n^2), while the classic sieve of Eratosthenes is O(n log n). Always up for a challenge, I went ahead and wrote a proper sieve.
(def merge_ (xs ys)
  (with (x (car_ xs) y (car_ ys))
    (if (< x y) (cons_ x (merge_ (cdr_ xs) ys))
        (> x y) (cons_ y (merge_ xs (cdr_ ys)))
        (cons_ x (merge_ (cdr_ xs) (cdr_ ys))))))

(def diff_ (xs ys)
  (with (x (car_ xs) y (car_ ys))
    (if (< x y) (cons_ x (diff_ (cdr_ xs) ys))
        (> x y) (diff_ xs (cdr_ ys))
        (diff_ (cdr_ xs) (cdr_ ys)))))

(def merge-multiples (li)
  (with (xs (car_ li) rest (cdr_ li))
    (cons_ (car_ xs)
           (merge_ (cdr_ xs) (merge-multiples rest)))))

(def multiples (p) (map_ [* _ p] (count-from p)))

(= primes
   (cons_ 2 (cons_ 3 (cons_ 5 (diff_ (count-from 7) non-primes)))))

(= non-primes (merge-multiples (map_ multiples primes)))
A typical optimization is to ignore multiples of 2, as follows.
(def count-from-by (n by) (cons_ n (count-from-by (+ n by) by)))

(def multiples (p) (map_ [* _ p] (count-from-by p 2)))

(= primes
   (cons_ 2 (cons_ 3 (cons_ 5 (diff_ (count-from-by 7 2) non-primes)))))

(= non-primes (merge-multiples (map_ multiples (cdr_ primes))))

Friday, February 15, 2008

Lazy Lists in Arc

No time for inflammatory ranting today. We're going to define lazy lists in Arc. (See also CL version.)

The idea of adding lazy lists to Lisp dates back to TR44:
CONS Should Not Evaluate its Arguments
(Daniel P. Friedman and David S. Wise, 1976). Guy Steele discusses TR44 and other cool ideas in this talk.

In order to delay evaluation, the lazy cons will suspend its arguments in closures. These closures will be executed on first access, and the results stored.
(mac wrap (a) `(fn () ,a))
(def force (f) (wrap (f)))
(mac cons_ (a b) `(cons (wrap ,a) (wrap ,b)))
(def car_ (li) ((zap force (car li))))
(def cdr_ (li) ((zap force (cdr li))))
Presto, we're done! Those are all the primitives we need. Now we'll just define some typical list utilities using our new car_, cdr_, and cons_.
(def filter (p li)
  (if (no li) li
      (p (car_ li)) (cons_ (car_ li) (filter p (cdr_ li)))
      (filter p (cdr_ li))))

(def map_ (f li)
  (if (no li) li
      (cons_ (f (car_ li)) (map_ f (cdr_ li)))))

(def take (n li)
  (if (<= n 0) nil
      (cons (car_ li) (take (- n 1) (cdr_ li)))))
With those, we can start playing with infinite lists. For starters, the natural numbers.
(def count-from (n) (cons_ n (count-from (+ n 1))))

(= naturals (count-from 1))

(take 5 (filter odd naturals)) ; => (1 3 5 7 9)
(take 5 (map_ [* _ 3] naturals)) ; => (3 6 9 12 15)
Now, the primes:
(def sieve (li)
  (let p (car_ li)
    (cons_ p (sieve (filter [> (mod _ p) 0] (cdr_ li))))))

(= primes (sieve (count-from 2)))
Holy 4-line prime number sieve, Batman!
(take 10 primes)  ; => (2 3 5 7 11 13 17 19 23 29)
Here is the equivalent in Haskell (see also HaskellWiki.)
primes = sieve [2..] where
  sieve (p:xs) = p : sieve (filter (\x -> x `mod` p > 0) xs)

P.S. A multi-arg version of map_. I did the single argument version above because it is easier to follow.
(def map_ (f . lists)
  (if (some no lists) nil
      (cons_ (apply f (map car_ lists))
             (apply map_ f (map cdr_ lists)))))
P.P.S. Bonus: Fibonacci sequence.
(= fibs (cons_ 1 (cons_ 1 (map_ + fibs (cdr_ fibs)))))
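P.P.P.S. For the Java folks: the same suspend-and-memoize trick translates to Java 5 anonymous classes, with considerably more ceremony. This is my own back-of-the-envelope sketch, not part of the recipe above:
// A Thunk suspends an expression; force() evaluates it once and caches the result.
abstract class Thunk<T> {
    private T value;
    private boolean forced;
    protected abstract T compute();
    public T force() {
        if (!forced) { value = compute(); forced = true; }
        return value;
    }
}

// A lazy cons cell: both head and tail are suspended.
class LazyList<T> {
    final Thunk<T> head;
    final Thunk<LazyList<T>> tail;
    LazyList(Thunk<T> head, Thunk<LazyList<T>> tail) { this.head = head; this.tail = tail; }
    T car() { return head.force(); }
    LazyList<T> cdr() { return tail.force(); }

    // The natural numbers, by analogy with count-from.
    static LazyList<Integer> countFrom(final int n) {
        return new LazyList<Integer>(
            new Thunk<Integer>() { protected Integer compute() { return n; } },
            new Thunk<LazyList<Integer>>() {
                protected LazyList<Integer> compute() { return countFrom(n + 1); }
            });
    }
}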

Wednesday, February 6, 2008

Is Java the new COBOL? No.

I'll bash Java as much as the next guy, but when people start to throw around the 'C' word, that's where I draw the line. Being compared to COBOL is kind of like being compared to Hitler. As Dijkstra said in 1975:
The use of COBOL cripples the mind; its teaching should, therefore, be regarded as a criminal offence.
Would anyone say the same about Java with a straight face?

The Perils of Java Schools offers some of the harshest criticism of Java to receive widespread attention. What's the chief complaint? Java is too easy! I'm serious. Joel actually asserts that Java is not difficult enough for a college curriculum. He argues that C++, with its pointer-juggling headaches, is much better at weeding out lesser minds.

This is such a bizarre thing to say that I hardly know where to start.

First off, I went to MST (formerly UMR) where our CS department uses C++ and we're damn proud of it. Therefore, it is with a heavy heart that I admit the following: Yes, Java is easier, but that's not a bug, it's a feature! Making programming easier is a Good Thing.

Second, there actually exist C++ programmers who don't understand recursion. Sad but true. It's even possible (albeit rare) to make it through my school's pointer minefield without being able to recurse your way out of a wet paper bag. Joel is just plain wrong to assume that there is a single "part of the brain" whose purpose is to grok pointers and recursion. Pointers and recursion are not the same. At all.

I'll grant that someone who only knows Java is unlikely to understand pointers, but how bad is that really? Pointers don't help you think in higher levels of abstraction. They exist for the sole purpose of getting in your way. Other respectable languages without pointers include Lisp, Python, Haskell, OCaml, and Scala.

Recursion is a completely different matter. Java and C++ both support recursion equally well (or equally poorly). If you don't understand recursion, you don't really understand functions/methods. Period.

Let's assume for the moment that I'm done picking on Joel Spolsky. His real point is that schools should not water down their curriculum, and I absolutely agree. Self-respecting CS departments should fight to keep data structures, operating systems, language design, and graph theory. They should also continue to require calculus. This all has nothing to do with our petty programming language feuds. Despite what we C++ elitists would have you believe, C++ is not an academic language. It is a vocational one. Universities teach C++ because employers want that skill, not because they think it builds character.

Still, in all of his anti-Java rhetoric, Joel never (AFAIK) compared Java to COBOL. That would be hitting below the belt.

What makes me rush to Java's defense all of a sudden? Mostly it's because I'm at home sick with a cold... but here is an equally good reason.
IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO-WORLD.
PROCEDURE DIVISION.
MAIN.
    DISPLAY 'Hello World'.
    STOP RUN.
I can not and will not compare Java to that. Let's look at another example, just to drive it home.
x = y * 2;       // C, C++, Java
$x = $y * 2;     # Perl
X = Y * 2        ! Fortran
(set! x (* y 2)) ; Scheme
(= x (* y 2))    ; Arc
These are all fairly reasonable ways to multiply y by 2 and store the result in x. How does it look in COBOL?
MULTIPLY 2 BY Y GIVING X.
I can't even look at that without cringing. COBOL was a horrifically misguided attempt to blend pseudo-code with pseudo-English. Being a "business-oriented language", COBOL had syntax designed to appeal to the managers of 1960. Java was an effort to simplify and enhance C++, designed to appeal to the programmers of 1995.

So remember: Java just kinda sucks. COBOL really really really really sucks. If COBOL were an ice cream flavor, it would be pralines and arsenic. If COBOL were the only programming language I could use, I would leave the field.

Sunday, February 3, 2008

Nicknames for Noam Chomsky

[NOTE: If you have something better to be doing, you might want to move along. This is not one of my useful or insightful posts.]

This is part of an ongoing battle between myself and my linguistic superior, Hannah.
  • Choam Nomsky
  • Chommers in Charge
  • Chomalicious
  • Noam-ero Uno
  • The Roaming Noam
  • Chomfoolery
  • Chernoambyl
  • Chomp Suey
  • Hong-Chom Phooey
  • Long-Chom Silver's
  • International House of Chomcakes
  • GNU Gnoam 2.20
  • Chomskeroony
  • ThelNoamious Monk
  • Everlasting Chomstopper
  • The Blue-Collar Linguistics Tour featuring Jeff Foxchomsky and Chomsky the Cable Guy
  • ENoamuel Chomstein
  • Rikki-Tikki-Chomsky
  • Chomery Clinton
  • Barrack OChoma
  • Chom Edwards
  • Chom McCain
  • Mitt Chomney
  • Chom Knuth
  • Necronoamicon
  • H.P. Chomscraft
  • The Call of Cthulhomsky
  • Noamus Opperandi
  • Chubby Chomsker
  • Fats Chomino
  • The Human Genoam Project
  • Istanbul (Not ChomstantiNoample)
  • Chompernicus
Current score: Ray 32, Hannah 45

Hannah: As a wise man once said: "It is not enough to succeed, one's friends must also fail."
Ray: Who said that? Chomfucious?

P.S. I can see now that this will be a battle of truly astronoamical proportions!

Friday, February 1, 2008

The Functional Programming Apologist

There do exist legitimate criticisms of functional programming (FP) concepts, just as there are reasonable criticisms of imperative style, procedural style, Prolog-esque logic/constraint programming, and even our precious object-orientation. Still, as you might expect, the vast majority of the tomatoes hurled in the general direction of the FP community are totally absurd. Here are a couple of actual comments from various blogs. I am not giving the sources for obvious reasons, but you could probably find them in Google.
Imperative languages represent the straight line from point A to point B. They reflect the true nature of the beast. If you had to slay a dragon, would you try to convince another dragon to fight for you, or would you grab a damn weapon?
Yes, as a matter of fact, I WOULD get another dragon to fight for me. It's called abstraction! This is why we have programming languages in the first place, even though machine code "reflects the true nature of the beast". Does your CPU know what object orientation is? Unless you do all your structure and flow control using GOTO, you are NOT reflecting the true nature of your beloved beast. Sorry to break the news.

Here's another real gem. I see this sort of thing all the time.
Closures in today's world are a "language geek" feature. Unless done extremely carefully and in a way that supports the various skill levels of developers, they end up being unusable and unsupportable by anything less than computer language savants. In their inherent obscurity and complexity, "language geek" style closures are about as anti-Java as you can get.
First off, thanks for calling me a geek; I really didn't get enough of that growing up interested in computers. Imagine a time when Java didn't have generics. In other words, imagine Java 1 through 4. Java 5 generics evolved from the parametric polymorphism of functional languages like Haskell — a fringe fancy-pants FP academic language if there ever was one. Some people (possibly the same ones) were saying:
Generics in today's world are a "language geek" feature. Unless done extremely carefully and in a way that supports the various skill levels of developers, they end up being unusable and unsupportable by anything less than computer language savants. In their inherent obscurity and complexity, "language geek" style generics are about as anti-Java as you can get.
I think I've made my point.

But wait! Maybe I've been wrong all these years! Maybe generics and even anonymous inner classes are perfect additions to Java, but closures would be stepping way over the line. After all, they are an FP concept and Java is OO. That's like the Hatfields and the McCoys: the two would never mix. Could that really be the case?

Let's see what James Gosling, the inventor of Java, said about it just yesterday:
There has been a lot of chatter about the closures proposal penned by Neal Gafter. And, in particular, whether or not I support it. I absolutely do. Java feels alive, not stuck in some chisel marks on stone tablets. Closures were left out of Java initially more because of time pressures than anything else. Closures, as a concept, are tried and true - well past the days of being PhD topics.
And don't even ask me what Guy Steele would say about it!

Tuesday, January 29, 2008

Arc is out!

Arc has been released! Let the hype commence.

As of now, it requires MzScheme version 352. There is also a patch for later versions.

This first release has generated a lot of controversy; check Ethan's blog for a hearty dose of perspective.

Monday, January 28, 2008

Re: "Borrowing at 100% Interest"

Ethan Vizitei has a post up about the perils of being interrupted whilst hard at work. He makes some good points, so go read it. Go ahead, I'll wait...
...
So as I was saying, he makes good points but he also invokes math. Whenever someone on a blog invokes math to make a point about computer programming, it is healthy to be a tad skeptical. (Even if it's Steve Yegge. Even if it's Paul Graham.... hell, especially if it's Paul Graham.) In that spirit, here is some nitpicking about Ethan's assumptions.

If they don't interrupt you now, then the question they need answered doesn't get addressed in a timely manner. That might also mean lost productivity.

If you aren't on break, you are in "the Zone" — all the time? Are you really in the Zone 6 hours a day? Aren't you ever compiling?

If you are interrupted for N minutes, Ethan assumes it takes exactly N minutes to get back in, regardless of N. That seems arbitrary.

When you are spending the N minutes getting back into the Zone, is that all pure waste? I would imagine it's just normal work with a bit less momentum.

Still, it is something to think about. Cheers!

P.S. And by "something to think about" I mean to say that Ethan's conclusion is more or less correct. It is only his reasoning that is a bit flawed. There's no doubt that interruptions do waste valuable time. I also think the best reason to avoid interrupting people is they don't like it. Particularly not Ethan (or Dan). However, "Borrowing at 100% interest" is a much more compelling title than "Hey, that's kind of annoying!"

Monday, January 14, 2008

Fluent Interface Grammars: Part III (Nesting)

Last time, we saw the properties of fluent interfaces that use method-chaining only. Namely, they can express any regular language not containing the empty string. We also saw an example of a non-regular language. The language accepts any string with a run of a's followed by an equal number of b's, i.e. ab, aabb, aaabbb, etc...

S ::= a B
S ::= a S B
B ::= b
Today, in Part III of our saga, we examine fluent interfaces that use nesting exclusively (Steve Asher discusses nested interfaces here). Just as one cannot implement this language as a chaining interface,

a().b()
a().a().b().b()
a().a().a().b().b().b() // (Can't do this.)
the most obvious nested interface for this language is also impossible, for similar reasons.

a(b())
a(a(b(b())))
a(a(a(b(b(b()))))) // (Can't do this either.)

However, it is quite easy to express this (slightly uglier) version:

a(b())
a(a(b()),b())
a(a(a(b()),b()),b()) // (Yucky, but possible.)

This is done with the following nested interface.

public static class S {}
public static class B {}
public static S a(B arg1) { return new S(); }
public static S a(S arg1, B arg2) { return new S(); }
public static B b() { return new B(); }
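A quick check that this accepts exactly the right shapes (and nothing else):
S one   = a(b());
S two   = a(a(b()), b());
S three = a(a(a(b()), b()), b());
// Ungrammatical forms are rejected at compile time:
// a(a(b())) and b(b()) match no method signature.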
Thus, we find that there exists at least one non-regular language expressible with nested interfaces. What else can we do?

All context-free grammars can be reduced to Greibach normal form, which only allows production rules of two types:

1. S ::= $
2. A ::= a X

... where A is any non-terminal, a is any terminal, and X is a list of zero or more non-terminals. Rules of type 1 cannot be expressed because our grammars cannot accept the empty string (as noted in Part II). The rules of type 2 become top-level methods in this fashion:

// A ::= a
public static A a() {return new A();}
// A ::= a B
public static A a(B arg1) {return new A();}
// A ::= a BCD
public static A a(B arg1, C arg2, D arg3) {return new A();}

Does this mean that any context-free grammar (that doesn't accept the empty string) can be mapped to a nested fluent interface? Not quite; there is one more caveat. No two rules may have the same right-hand side, because they would yield two top-level methods with competing signatures. The question then becomes: Can any context-free grammar be reduced to a Greibach normal form without duplicate right-hand sides? After a few minutes on a whiteboard, I'm leaning toward no.
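To see why duplicate right-hand sides are fatal, suppose a grammar contained both A ::= a B and C ::= a B. The translation would demand two top-level methods differing only in return type, which Java forbids:

public static A a(B arg1) { return new A(); }
public static C a(B arg1) { return new C(); } // won't compile: duplicate method a(B)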

Let us now extend our grammar notation to indicate parameter lists.

S ::= a(B)
S ::= a(S B)
B ::= b()

This will make the transition from grammar to code more obvious. Additionally, it will prepare us for the next chapter: mixed interfaces.

Saturday, January 12, 2008

Fluent Interface Grammars: Part II (Chaining)

Today's episode focuses on the grammars achievable using only method-chaining (as opposed to nesting). For brevity, we shall refer to this type of fluent interface as a chaining interface. See Ordered Fluency by David Laribee for an interesting discussion of chaining interfaces and their grammars.

Before we get started, here are some Wikipedia links that may come in handy.

Recall the Chomsky hierarchy:
  • Type-0, unrestricted grammars
  • Type-1, context-sensitive grammars
  • Type-2, context-free grammars
  • Type-3, regular grammars

There is more or less a one-to-one correspondence between chained interfaces and regular grammars. A regular grammar is expressed using only rules of these forms:
A ::= a
A ::= a B
A ::= $

These are actually called "right regular grammars", but left regular grammars are equivalent in terms of the languages they can express. This regular grammar (with start symbol S) expresses the language consisting of all strings with a run of a's followed by a run of b's.
S ::= a S | a B
B ::= b | b B

The grammar accepts strings such as "aaab" and "abbbbb". Thus, a corresponding chaining interface would accept these chains.
a().a().a().b()
a().b().b().b().b().b()

What does it mean to say an interface accepts a particular chain? Two conditions must hold: the statement must be a valid Java expression, and it must evaluate to the type of the grammar's start symbol. In other words, these statements must compile:
S statement = a().a().a().b();
S statement = a().b().b().b().b().b();
This grammar is easily implemented in a few classes:

public class Foo {
    public static void main() {
        S s1 = a().a().b().b().b();
        S s2 = a().a().a().a().b();
    }
    public static class S {}
    public static A a() { return new A(); }
    public static class A {
        public A a() { return new A(); }
        public B b() { return new B(); }
    }
    public static class B extends S {
        public B b() { return new B(); }
    }
}
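Equally important is what the interface rejects. Chains outside the language simply fail to compile:
// S bad1 = b();          // no top-level b(): strings must start with a
// S bad2 = a();          // type A is not an S: the chain can't stop here
// S bad3 = a().b().a();  // class B has no method a(): no a's after b's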

Ok, so we've seen that a painfully simple regular grammar does indeed have a chaining interface counterpart. The real question is: How can we get from the grammar to the interface? After all, a richer grammar would not be so intuitively simple to translate! With that in mind, here are the steps that will translate an arbitrary regular grammar into a chaining interface. As usual, S is assumed to be the start symbol.
  1. Left factor the grammar.
  2. Make a class for each non-terminal.
  3. Rules with left-hand side S become static methods.
  4. Rules with the empty string ($) as their right-hand side make the class of their left-hand side extend S.
  5. All other rules are methods of their respective classes.
Rules can become methods in three ways:
  1. Rules of form "A ::= a" become a method named a() in class A returning S.
  2. Rules of form "A ::= a B" become a method named a() in class A returning B.
  3. Rules with left-hand side S follow the same rules, but yield static methods.
Yikes, that sounds pretty complicated. Well, here goes nothing! First let us left factor our grammar, taking care to preserve its regular status.
S ::= a S | a B
B ::= b | b B
becomes
S ::= a A
A ::= a A | b B
B ::= $ | b B
Good. Now, all the non-terminals (S, A, and B) will become classes. S will be empty as always, and we can also see that B will extend S. Next, we translate each rule into a method.
// Top-level:
public static A a() { return new A(); } // S ::= a A

// In class A:
public A a() { return new A(); } // A ::= a A
public B b() { return new B(); } // A ::= b B

// In class B:
public B b() { return new B(); } // B ::= b B

This yields code identical to the implementation above. The same can be done for any regular grammar, with one little tiny exception: grammars that accept the empty string cannot be encoded in this way, because "S statement = ;" would never be legal in Java. This is just a technicality, however, because there isn't much of an outcry for fluent interfaces that accept nothing as a statement.

We have shown that (nearly) any regular language can be implemented as a fluent interface using only method chaining. Conversely, non-regular languages cannot be; that isn't proven here, but it happens to be true.

Consider the prototypical example of a non-regular language. The language accepts any string with a run of a's followed by an equal number of b's, i.e. ab, aabb, aaabbb, etc...
S ::= a B
S ::= a S B
B ::= b
It is not possible to write an interface that would accept only such chains as these:
a().b()
a().a().b().b()
a().a().a().b().b().b() // and so on...
If you aren't convinced, I would invite you to try it.

Update: That last statement is technically wrong. Using generics (fair game, I said Java 5), tkr accepted my invitation and indeed constructed this language. The parameterization trick he discovered doesn't seem applicable to context-free grammars in general, but is worth checking out nevertheless.

Monday, January 7, 2008

Fluent Interface Grammars: Part I

A spectre is looming over the world of object-oriented design patterns—the spectre of fluent interfaces.

This new and unorthodox style has been espoused by such giants as Martin Fowler, such average-heighted people as Steve Asher, and such midgets as Mike Finney. Knowing my place in the order of things, I suppose it is my turn. I'll start out by reiterating some of the points being made.
  • Fluent interfaces are a rockin' way to approximate the flexibility of domain specific languages (DSLs) using OO constructs.
  • Fluent interfaces are easy to use, but relatively difficult to design.
  • Fluent interfaces violate several "best practices" of interface design (such as command query separation), but are often a Good Thing nevertheless.
  • If Sun finally gives us closures in Java 7, the skies will part, the rapture will be upon us, and we will all join hands singing "Camptown Races". Also, fluent interfaces will become cooler.
As everyone keeps pointing out, a fluent interface resembles a sort of domain specific language. I would like to explore how far that analogy goes. In other words, what are the limitations on the languages that fluent interfaces can provide? Michael Feathers asks essentially the same question in The Cardinality of a Fluent Interface.

What qualities does a grammar need to have to create an embedded DSL which doesn’t allow nonsensical constructs?

Before addressing this problem, let us set some ground rules. First, we live in Java 5. Second, it is not sufficient for our interface to accept some superset of the desired language; our grammar must be an exact fit. Ungrammatical statements must not compile. They should also be underlined by our spiffy IDE :)

Note: In formal language theory, a grammar is a set of transformation rules and a language is the set of all strings that the grammar accepts. Some background with these concepts is assumed.

Fluent interfaces generally use two ways of combining tokens to form statements: Call-chaining as().seen().here(), and nesting which(works(like(so))). Some interfaces stick to one or the other, while Steve's customer creator uses both:
Customer customer = 
createCustomer.named("Bob").that(bought.item(CompactDisc).on("02/17/2006"));
He explains,

The first thing to notice is that almost half of this code is fully enclosed in parenthesis. The result of 'bought.item(CompactDisc).on("02/17/2006")' is being passed to a method named 'that'.

By contrast, this is what that statement would look like using call-chaining all the way:
Customer customer = 
createCustomer.named("Bob").that().bought().item(CompactDisc).on("02/17/2006");
And here is one of the possible translations using only nesting:
Customer customer = 
createCustomer(named("Bob",that(bought(item(CompactDisc,on("02/17/2006"))))));

The customer creator's grammar would look similar to this.
S ::= createCustomer C
C ::= $
C ::= namedSomething C
C ::= that T C
C ::= and T C
T ::= bought someItem
T ::= bought someItem onSomeDate
Examples of strings accepted by this grammar:
createCustomer
createCustomer namedSomething
createCustomer namedSomething that bought someItem
createCustomer that bought someItem and bought someItem onSomeDate
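To make the correspondence concrete, here is how a couple of those strings line up with the call-chaining style shown earlier (the token-to-method mapping is my own guess, not Steve's actual API):
createCustomer namedSomething
    ~ createCustomer.named("Bob")
createCustomer namedSomething that bought someItem onSomeDate
    ~ createCustomer.named("Bob").that().bought().item(CompactDisc).on("02/17/2006")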
So far I have introduced the idea of formal grammars for fluent interfaces, and posed a question about their limitations. In future posts I will answer this question for each variant of fluent interfaces: call-chaining, nesting, and mixed. Have faith! I am going somewhere with this.

P.S. Steve and Mike: I was just kidding, you are both well above average height.