Room 101

Gilad Bracha's blog. A place to be (re)educated in Newspeak

Originally posted on Blogger on January 13th, 2007

Representation Independent Code

In most object-oriented languages, replacing a field with a method requires you to track down every use of that field and change it from a field access to a method invocation. The canonical example is a class of points. You decide to change your representation from Cartesian to polar coordinates, and now every place you wrote ‘x’ has to be rewritten as ‘x()’.
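To make the breakage concrete, here is a minimal sketch in Python, chosen purely for illustration. All class and function names (CartesianPoint, PolarPoint, shifted_x, client_breaks_on) are hypothetical. The client function reads the field directly, so it works with the first representation and fails with the second.

```python
import math


class CartesianPoint:
    """First representation: x and y are stored fields."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PolarPoint:
    """Second representation: x and y are now computed by methods."""

    def __init__(self, r, theta):
        self.r = r
        self.theta = theta

    def x(self):
        return self.r * math.cos(self.theta)

    def y(self):
        return self.r * math.sin(self.theta)


def shifted_x(p):
    # Client code written against the field: it reads p.x, not p.x().
    return p.x + 1


def client_breaks_on(p):
    """Return True if the field-based client fails on p."""
    try:
        shifted_x(p)
    except TypeError:
        # p.x is a bound method here, and a method cannot be added to 1.
        return True
    return False
```

Every such client has to be found and edited by hand, which is exactly the chore described above.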

This example isn’t so bad, because the odds are you already had an x() method, and you probably had the sense to avoid making the x field public. But maybe you made it protected (perhaps your language is smart enough to disallow public fields, but simple-minded enough to force them to always be protected, like Smalltalk). If x is protected, you’ll need to find all the subclasses. Maybe you don’t have access to all of them, and so you can never get rid of the field x.

Or maybe you made a stupid mistake and made x public, perhaps in the mad rush toward a release. Won’t happen to you? Take a look at Java’s System.out and ask yourself how it got to be there. Now go find all the uses of x and change them. Even if you can, it’s pretty tiresome.

The fact is, given the ability to publicize a field, programmers will do so. Once that’s happened, tracking down the uses may be impossible, and in any case is a huge amount of work.

It would be nice if you didn’t have to worry about this sort of thing. If everyone using your object went through a procedural interface, for example. Smalltalk makes all uses outside of an object do that - but uses within the object, in its class and subclasses, are exempt. As for mainstream languages like Java and C# - they allow you to declare public fields; it’s your funeral.

About 20 years ago, Dave Ungar and Randy Smith introduced Self, which fixed this problem. All communication is through method calls (or synchronous message sends, if you will) - even the object’s own code works exclusively by sending messages to itself and other objects. Fields (slots in Selfish) are defined declaratively, and automatically define access methods. The only way to get or set a field is by invoking a method. So if you get rid of the field and replace it with a method that computes the value instead, no source code anywhere can tell the difference. The code is representation independent. Self’s syntax makes it very easy and natural to send a message/call a method - there is no overhead compared to accessing a field in other languages.
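The discipline can be imitated, though not enforced, in other languages. The following is a rough sketch in Python rather than Self, with hypothetical names throughout. The accessors are written by hand here, whereas Self derives them from the slot declarations. The shared code only ever sends x() and y() to self, so it cannot tell which representation it is running against.

```python
import math


class PointBehavior:
    """Shared code that only ever sends x() and y() to self."""

    def distance_from_origin(self):
        return math.hypot(self.x(), self.y())


class CartesianPoint(PointBehavior):
    """Stores x and y; the accessors simply return them."""

    def __init__(self, x, y):
        self._x = x
        self._y = y

    def x(self):
        return self._x

    def y(self):
        return self._y


class PolarPoint(PointBehavior):
    """Stores r and theta; the accessors compute x and y."""

    def __init__(self, r, theta):
        self._r = r
        self._theta = theta

    def x(self):
        return self._r * math.cos(self._theta)

    def y(self):
        return self._r * math.sin(self._theta)
```

Nothing in Python stops a subclass from reaching for _x directly, which is the difference between a convention and a language guarantee.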

In C# they have a thing called properties, which is similar. Except that C# also has fields, and so it requires careful attention by the programmer to ensure representation independence. In other words, it cannot be relied upon to happen. I don’t know why the designers of C# chose to support both fields and properties. I should ask my friends at Microsoft (yes, I have a few; I’m very non-judgmental). In complex languages, there are always all kinds of strange gotchas and constraints.
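As an illustration of the idea, here is a small sketch using Python's property mechanism, which plays a similar role. It is not C#, and the C# rules for fields versus properties are not modeled here. The names Point and shifted_x are hypothetical.

```python
import math


class Point:
    """Polar representation behind a property named x."""

    def __init__(self, r, theta):
        # Plain attributes: nothing routes access to these through a method.
        self.r = r
        self.theta = theta

    @property
    def x(self):
        # Computed on each access; callers still write p.x.
        return self.r * math.cos(self.theta)


def shifted_x(p):
    # Identical to code written against a stored field named x.
    return p.x + 1
```

The call site p.x looks the same whether x is stored or computed. Whether r and theta get the same treatment is left to the programmer, which is the kind of care the paragraph above is complaining about.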

There are of course other ways that languages can undermine representation independence. In particular, the type system can support class types that make code dependent on which class you use, rather than on what interface is supported. I don’t want to dive into that right now.

The point of this post is to draw attention to the importance of representation independence. If you are using C# or something else with such a construct, I’d suggest you make the best of it and use properties or their equivalent religiously. And future languages should follow Self’s lead and ensure representation independence.