Language grammar, mental models and fluidity

May 27th, 2012

Select file
Invoke the copy operation
Select destination folder
Invoke the paste operation

The sequence of steps to copy a file to a different folder is now so ingrained in my motor memory that I don’t question why the steps have to be in this specific order. You select something, and then you “say” what you want to do with it. It only “feels” natural because we’ve been doing it for so long.

In English you would say “Copy A to B” when you’re asked to describe what you’re doing. But when you actually do it, it’s “A copy B to”, a sort of reverse Polish notation. Following the rules of English grammar only makes sense if you insist that the sequence of actual steps must conform to those rules no matter what language the user speaks. In English, the only way to say this is “Copy A to B”. But in Russian, you can use any one of the following:

  • Copy A to B
  • A copy to B
  • To B copy A
  • Copy to B A
  • A to B copy
  • To B A copy

Here the only thing that stays together is “to B”. That leaves three parts (“copy”, “A” and “to B”) and 3! = 6 ways to order them. To a native speaker the meaning is the same, but the order conveys emphasis: the part that comes first is the most important, and the part that comes last is the least important.

But I wouldn’t really expect a localized version of Windows, OS X or Linux to let me use all six possible sequences to copy a file to a different folder. Ignoring the vast complexity of supporting something like this for all possible languages and dialects, it’s quite counter-productive to expose radically different ways to achieve the same result depending on the natural language of the user. Or is it?

I had a little obsession with calculators growing up. I actually had only two, but I spent an inordinate amount of time doing various tricks with them. The first one was of the usual variety. Every math operator was an implicit “equals”. If you did 2+3×4, you got 20, because 2+3 was evaluated as soon as you pressed ×. If you wanted the sine of 30, you did 30 sin. These differences only look weird if you don’t think about them. They are perfectly logical once you grasp at least the basic complexities of implementing a simple calculator in your starter programming language of choice.

And then there was the Sharp EL-531G with its D.A.L. (Direct Algebraic Logic). The way to compute something was to type the exact same sequence as it appears in your problem. Sine of 30 is sin 30. 2+3×4 is evaluated only when you press =, and then you get 14. The small downside is that if you’re typing quickly and don’t check every single operand, you won’t discover “obviously” wrong intermediate results. The much bigger upside is that you don’t start mentally regrouping parts of a complex formula just to fit the implementation model. The computation follows the rules of precedence, and you also have the brackets to control it if necessary.
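To make the difference between the two calculators concrete, here is a minimal sketch of both evaluation models in Java. The class and method names are made up for this illustration, and it only handles the four basic operators without brackets; it is not how either calculator is actually implemented.

```java
// Illustrative sketch; the class and method names are made up for this example.
final class CalculatorModels {

    // Applies a single binary operator to two operands.
    static double apply(double left, char op, double right) {
        switch (op) {
            case '+': return left + right;
            case '-': return left - right;
            case '*': return left * right;
            case '/': return left / right;
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    // Immediate execution: every operator key acts as an implicit "equals",
    // so the operations are applied strictly left to right.
    static double immediate(double[] operands, char[] operators) {
        double result = operands[0];
        for (int i = 0; i < operators.length; i++) {
            result = apply(result, operators[i], operands[i + 1]);
        }
        return result;
    }

    // Algebraic entry: nothing is final until "=", and * and / bind tighter
    // than + and -. Brackets are left out to keep the sketch short.
    static double algebraic(double[] operands, char[] operators) {
        double total = 0;          // sum of the finished additive terms
        double term = operands[0]; // the multiplicative term being built
        for (int i = 0; i < operators.length; i++) {
            char op = operators[i];
            double next = operands[i + 1];
            if (op == '*' || op == '/') {
                term = apply(term, op, next);
            } else {
                total += term;
                term = (op == '+') ? next : -next;
            }
        }
        return total + term;
    }

    public static void main(String[] args) {
        double[] operands = {2, 3, 4};
        char[] operators = {'+', '*'};
        System.out.println(immediate(operands, operators)); // 20.0
        System.out.println(algebraic(operands, operators)); // 14.0
    }
}
```

The same key presses go into both methods; only the point at which each operator is applied changes.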

The notion of fluent interfaces became quite popular a few years ago. In the “regular” object-oriented approach, copying a file would look something like this:

FileRef reference = new File("path/to/A").copy();
new File("path/to/B").paste(reference);

With a fluent interface, it would become something like this:

FileUtils.copy("path/to/A").to("path/to/B");

Here FileUtils.copy would return an object with a to method that does the actual copying of the bits. Fluent interfaces are often tweaked and judged on how close they get to the rules of English grammar, with various techniques to encapsulate the complexities of the chain links that arise because of that.
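A minimal sketch of how such a two-link chain might be put together is below. FileUtils and CopySource are hypothetical names used only for this illustration; the only real library calls are Paths.get and Files.copy from java.nio.file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical fluent wrapper; only Paths.get and Files.copy are real JDK calls.
final class FileUtils {
    private FileUtils() {}

    // First link of the chain: remember the source, copy nothing yet.
    static CopySource copy(String sourcePath) {
        return new CopySource(Paths.get(sourcePath));
    }

    // The intermediate object returned by copy(); it exists only to carry "to".
    static final class CopySource {
        private final Path source;

        private CopySource(Path source) {
            this.source = source;
        }

        // Last link: this is where the bytes actually get copied.
        // Files.copy throws FileAlreadyExistsException if the target exists.
        Path to(String targetPath) throws IOException {
            return Files.copy(source, Paths.get(targetPath));
        }
    }
}
```

Nothing happens until the last link is called, which is also why the order of the links is fixed: to only exists on the object that copy returns.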

But what if I’m not an English speaker? A fluent interface based on the rules of English grammar is no more understandable (or fluent, if you will) to me if it doesn’t follow the rules of my native grammar. You’re just trading one convention for another. Long chains of fluent calls also make failures harder to deal with. If you have more than one link, how do you handle and debug a failure? Should a failure at a middle link roll back the results of the previous links? This brings up another interesting point, which goes back to the N! variations allowed by the rules of Russian grammar. If a native Russian programmer exposes a fluent API that allows all possible variations, does the implicit importance of the order affect how you handle intermediate failures?
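For the sake of argument, here is one way an order-free chain might look. Everything in it (the Sentence class and its methods) is hypothetical; it only assembles the three parts in whatever order they are “spoken” and remembers that order, without touching any files or answering the question of what the order should mean for failures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical order-free chain; none of this is a real library API.
final class Sentence {
    private String verb;
    private String subject;
    private String target;
    // The parts in the order they were "spoken".
    private final List<String> order = new ArrayList<>();

    private Sentence() {}

    static Sentence start() {
        return new Sentence();
    }

    Sentence copy() {
        verb = "copy";
        order.add("copy");
        return this;
    }

    Sentence file(String path) {
        subject = path;
        order.add(path);
        return this;
    }

    Sentence to(String path) {
        target = path;
        order.add("to " + path);
        return this;
    }

    boolean isComplete() {
        return verb != null && subject != null && target != null;
    }

    // The meaning is the same no matter which link came first.
    String meaning() {
        if (!isComplete()) {
            throw new IllegalStateException("incomplete sentence: " + order);
        }
        return verb + " " + subject + " to " + target;
    }

    // The spoken order, which is where the emphasis would live.
    List<String> spokenOrder() {
        return new ArrayList<>(order);
    }
}
```

Under these assumptions, Sentence.start().to("B").file("A").copy() and Sentence.start().copy().file("A").to("B") produce the same meaning() but a different spokenOrder(). Whether a failure handler should care about that difference is exactly the open question.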

It would be weird if I had to first press Cmd+C and only then select a file to start the copy process. It would be equally weird to press Cmd+V and only then select the target folder. How would you even “tell” the computer that you are at the target folder and not in the middle of navigating to it by clicking around? But it’s only weird because of the 20-year reign of the WIMP (windows, icons, menus, pointer) interaction model.

If you move away from files organized in folders towards files tagged by labels, things become different. If you move away from action “verbs” denoted by commands on the selected object toward a more fluent voice-controlled interaction, things become different. If you move away from the very notion of files as boundary units delineating your data, things become different. How different? I don’t know. But the next 20 years will be quite interesting.