Bruce Horn interviewed by TidBITS

Bruce Horn, one of the co-founders of Marketocracy (the company I work for), was interviewed by Adam Engst, editor of TidBITS.

The Mac at 20: An Interview with Bruce Horn
-------------------------------------------
by Adam C. Engst

Twenty years of Macintosh. At this year's Macworld Expo, Steve
Jobs played a version of the famous “1984” ad that launched the
Mac, and Alan Oppenheimer, who was responsible in large part for
AppleTalk, gave a fabulous talk about the history of networking on
the Mac. What I found most interesting was that although twenty
years have passed, many of the original people from those days are
not only still around, they're still producing great work. The
history of the Macintosh is not only still being written, some
of the same people are still doing the writing.

Let me introduce you to another member of the original Macintosh
team, Bruce Horn, who was responsible for a number of the key
aspects of the Mac and who has continued to write innovative code.
At Apple, Bruce was responsible for the design and implementation
of the Finder (oh, that!), the type/creator metadata mechanism for
files and applications, and the Resource Manager (which handled
reading and writing of the resource fork in files; a note in
Apple's technical documentation at one point exclaimed, “The
Resource Manager is not a database!”). The Dialog Manager and the
multi-type aspect of the clipboard also appeared thanks to Bruce's
ingenuity.

So, to commemorate this 20th anniversary of the Macintosh, I
wanted to talk with Bruce about not just what he did at Apple, but
also what he's up to now, since in many ways, his current work is
both a return to his roots and a glimpse at what might be possible
with the Macintosh in the future.

* Adam: Bruce, many of the aspects of the original Mac that you
worked on revolve around accessing structured data. The Finder was
a front end to the filesystem; the Resource Manager, despite that
note in the documentation, was a bit like a flat-file database;
and type/creator codes were metadata that were just screaming to
be used by a database. To what extent was all that planned, or did
you just come to these solutions as you were working?

Bruce: Several different goals drove me to these solutions. Having
had most of my programming experience in Xerox's Smalltalk
environment, where you could change anything you wanted at runtime
(that is, while the program was running), I was looking for
a dynamic way to handle objects in the system so data such as
localizable strings, menus, images, etc. could be modified by
non-programmers without recompiling the source code. At the same
time, I was realizing that the kind of data that I needed to
manage with the Finder – icons for applications and documents,
and bindings to those icons – needed the same sort of mechanism,
and I wanted a unified solution. So the Finder's Desktop Database
was the driver for much of what the Resource Manager ended up
providing.

The file metadata also was driven by Finder needs. Early on I
realized that to provide a double-click-to-open mechanism for
documents, I'd need a simple way to link a document to a default
application that would open it. Similarly, since multiple
applications could open multiple file types, I couldn't just have
a single mapping from a type to an application that would handle
all files of that type. Thus the separation of the type code
(the actual format of the file) and the creator code (the default
application, which could be easily changed). Independent type and
creator codes stored in the filesystem also enabled us to avoid
polluting the filename with type information, which I felt was
a significant advantage of our approach over others.
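
As an illustration only (this is not Apple's code), the separation
Bruce describes can be sketched in a few lines of Python. The
four-character codes, application names, and function names below
are invented for the example; the point is simply that the type
says what the file is, the creator says which application opens it
on a double-click, and neither lives in the filename.

```python
from dataclasses import dataclass

@dataclass
class FileEntry:
    name: str          # free-form name, no extension required
    type_code: str     # four-character code for the file's format
    creator_code: str  # four-character code for the default application

# Hypothetical registry: creator code -> (application, types it can open)
APPLICATIONS = {
    "WRIT": ("WriterApp", {"WDOC", "TEXT"}),
    "EDIT": ("EditorApp", {"TEXT"}),
}

def default_application(entry):
    """Double-click: the creator code alone picks the application."""
    app = APPLICATIONS.get(entry.creator_code)
    return app[0] if app else None

def capable_applications(entry):
    """Every registered application that can read this file's type."""
    return sorted(name for name, types in APPLICATIONS.values()
                  if entry.type_code in types)

memo = FileEntry("Letter to Mom", "TEXT", "EDIT")
print(default_application(memo))   # EditorApp
print(capable_applications(memo))  # ['EditorApp', 'WriterApp']

memo.creator_code = "WRIT"         # rebind the default; the name is untouched
print(default_application(memo))   # WriterApp
```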

The Desktop Database was a cache of the bindings between types and
creators and the icons representing them, stored as resources.
Since application bundles – groups of resources tied together
describing document type and icon information – were stored in
application resource forks, installing an application simply
involved copying the appropriate resources from the application
into the Desktop. The redundant information – type and creator
information in the directory, and bundle information in
application resource forks – made it possible to rebuild the
database at any time without losing anything. It turns out
that this was important in the early days.
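
A rough sketch of why that redundancy makes the cache disposable,
assuming a much simplified model: each application carries a bundle
mapping type codes to icon names, and the Desktop Database is just
a dictionary derived from those bundles. The data layout, codes,
and icon names here are hypothetical, not the real on-disk format.

```python
def rebuild_desktop_db(applications):
    """Derive the (type, creator) -> icon cache from application bundles."""
    db = {}
    for app in applications:
        for type_code, icon in app["bundle"].items():
            db[(type_code, app["creator"])] = icon
    return db

# Hypothetical applications, each carrying its own bundle information.
apps = [
    {"creator": "WRIT",
     "bundle": {"APPL": "writer-app-icon", "WDOC": "writer-doc-icon"}},
    {"creator": "PANT",
     "bundle": {"APPL": "paint-app-icon", "PICT": "paint-doc-icon"}},
]

desktop_db = rebuild_desktop_db(apps)
desktop_db.clear()                      # simulate a lost or damaged cache
desktop_db = rebuild_desktop_db(apps)   # nothing is lost: rebuild it
print(desktop_db[("PICT", "PANT")])     # paint-doc-icon
```

Because every entry in the cache can be recomputed from data that
lives elsewhere, throwing the cache away costs only the time needed
to rescan.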

Resources were, of course, heavily used in factoring out non-
program data (like menus and text strings) that could be localized
to different languages. With ResEdit, this allowed language
experts to quickly create versions of an application without
needing access to the source code.
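
To make the idea concrete, here is a minimal sketch in Python, not
the Resource Manager's actual interface. It assumes resources are
looked up by a (type, ID) pair; the ID numbers and strings are made
up for the example. The program asks for strings by ID, so a
localizer only has to swap the data.

```python
# Resources keyed by (type, id). The ids and strings are illustrative.
ENGLISH = {("STR ", 128): "File", ("STR ", 129): "Quit"}

def localize(resources, translations):
    """Return a copy of `resources` with translated entries replaced."""
    localized = dict(resources)
    for key, text in translations.items():
        if key not in resources:
            raise KeyError(f"no such resource: {key}")
        localized[key] = text
    return localized

FRENCH = localize(ENGLISH, {("STR ", 128): "Fichier",
                            ("STR ", 129): "Quitter"})

def menu_titles(resources):
    """Program code asks for strings by id and never embeds the text."""
    return [resources[("STR ", 128)], resources[("STR ", 129)]]

print(menu_titles(ENGLISH))  # ['File', 'Quit']
print(menu_titles(FRENCH))   # ['Fichier', 'Quitter']
```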

Once I was able to convince Andy Hertzfeld of the utility of the
Resource Manager, he rewrote most of the Toolbox to take advantage
of it, which saved significant space in the ROM and gave us the
ability to easily localize applications in a general way.

* Adam: So Mac OS X's reliance on Unix-style filename extensions
for mapping documents to applications is something of a step
backward, then?

Bruce: Yes and no. The original rationalization behind this
was that Mac OS X needed to be compatible with Windows filename
conventions, and to do so we'd need to force filename extensions
to be provided. Because there are so many places that a file might
leave the sanctity of the Mac OS and go out into the cruel world
where extensions are required, it was deemed impossible to
translate names from the Mac convention (with types and creators)
to the outside world's convention. As far as compatibility is
concerned, this did the trick.

But over time it has become apparent that it is difficult to do
this right, and the original mechanism of having redundant type
information, and allowing the user to name the files whatever she
wants, was more flexible and less prone to error. It turns out
that Mac OS X still needed a creator mechanism by which individual
documents could be opened by specific applications, so this
information is stored in the resource fork of the file (of all
places, since Apple is discouraging use of the resource fork),
rather than simply in a creator code.

So the filename extension approach has worked, but with a little
less elegance than the original.

* Adam: Why didn't you go all out and create a system-level
database to handle all this data in the original Mac? Was it
a horsepower issue, or were the software problems too tricky
at the time?

Bruce: It would have been nice. I had some ideas in mind, but when
it came down to fitting it in the 64K ROM, the Resource Manager
was all we could fit. It was a real effort on everyone's part to
make code as small as possible. The Resource Manager was 3K, and
the Finder 46K – amazing considering the size of applications
these days!

* Adam: When did you leave Apple, and what caused your departure?

Bruce: I left Apple in the spring of 1984, after doing a “final”
version of the Finder. I guess I was just looking for something
new to do: having spent several years working intensively on the
Mac, I was ready for a break. Being on the Mac team, working with
absolutely tremendous people, was one of the most significant
things I've done, and it still gives me wonderful feelings when
I think about those times.

* Adam: Can you give us a quick rundown of where you worked after
Apple? Were there any common threads among the various projects?

Bruce: After Apple I went to Adobe and worked a bit on a variety
of small projects, including a LaserWriter spooler. When I was
there I met a couple of Carnegie Mellon grad students, and, to
make a long story short, they convinced me that I should go to CMU
for graduate school (Chuck Geschke, one of the founders of Adobe,
was also a CMU Ph.D.). Grad school was a great experience. I spent
some time at the University of Oslo, Norway as a research
assistant, did some consulting at Apple now and then, and had
a chance to work with some intriguing startups while I was a
student. My Ph.D. thesis described the design of a constraint-
based object-oriented programming language called Siri, which
I'd love to re-implement someday.

After graduating I went back to Apple as a consultant in the
Advanced Technology Group and worked on a project called LiveDoc
with Tom Bonura and Jim Miller, among others. LiveDoc was an
experiment in automatically structuring documents so that various
recognizers could determine that, for example, 555-1212 was a
phone number and 124 Main Street was an address, and provide
contextual actions on those items. It was a lot of fun, and I
wish I had LiveDoc today in Mac OS X. Simson Garfinkel's SBook
provides some of these features as a PIM application.
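
The recognizer idea can be approximated with regular expressions.
The sketch below is not LiveDoc's design, which the interview does
not detail; the patterns, the action names, and the function names
are assumptions, and the patterns are deliberately narrow so that
they cover only the two examples mentioned above.

```python
import re

# Hypothetical recognizers: one pattern per kind of item.
RECOGNIZERS = {
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
    "address": re.compile(r"\b\d+ [A-Z][a-z]+ (?:Street|Avenue|Road)\b"),
}

# Hypothetical contextual actions offered for each kind of item.
ACTIONS = {
    "phone": ["Dial", "Add to contacts"],
    "address": ["Show on map"],
}

def recognize(text):
    """Return (kind, matched text, start offset), ordered by position."""
    found = []
    for kind, pattern in RECOGNIZERS.items():
        for match in pattern.finditer(text):
            found.append((kind, match.group(), match.start()))
    return sorted(found, key=lambda item: item[2])

for kind, item, _ in recognize("Call 555-1212 or visit 124 Main Street."):
    print(kind, repr(item), ACTIONS[kind])
```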

But none of these projects really addressed the problem I wanted
to solve, which was: how can I design an information browser that
works with all types of data, from email messages to images to
music files to documents, and provide a unified mechanism for
organizing, searching, and viewing this information?

I began the iFile project in 1997 to do this, and worked on it for
a couple of years before putting it on the back burner to start my
other company, Marketocracy, where I've been since the middle of
1999.

Marketocracy is a mutual fund company that I co-founded with my
business partner Ken Kam. Our team built a Macintosh-based Web
site running WebObjects and a FrontBase database to allow over
50,000 people worldwide to buy and sell stocks in real time (but
with fake money) to create a model stock portfolio. We provide a
wide variety of tools to help our users to become better portfolio
managers, and by watching their performance over time and ranking
them, we can find the best people in the world to run our funds.
Our Masters 100 Fund, based on the top 100 in our community, has
been running for over two years now and has surprised even us with
its impressive performance and low risk. It has returned over 39
percent since inception when the market has been essentially flat,
and with a beta of 0.47 – half as risky as the market!

* Adam: What are you working on now?

Bruce: Recently I've picked up where I left off in 1999 with iFile
(just a codename for now). iFile is a unified desktop information
browser, like the Finder, but with significant architectural
improvements. It is based on an object-oriented database of my own
design that provides a general way for linking together and
organizing objects of all types. The basic unit of organization is
called a “collection,” which is distinct from a folder in that an
object may exist in many collections but in only a single folder.
Collections are like iPhoto albums or iTunes playlists, but they
can contain anything: text files, images, email messages, music
files, contacts, notes, appointments, and so on. While this sounds
a bit like BFS (the Be File System) and the BeOS Tracker combined,
it is much more general and can be used on any filesystem with the
appropriate drivers.
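
The folder/collection distinction amounts to a one-to-one versus a
many-to-many relationship. A minimal sketch, assuming nothing about
iFile's actual database and using invented class and method names:

```python
class Library:
    def __init__(self):
        self.folder_of = {}    # object -> its single folder
        self.collections = {}  # collection name -> set of objects

    def file_in_folder(self, obj, folder):
        # Assigning again overwrites: an object lives in one folder.
        self.folder_of[obj] = folder

    def add_to_collection(self, obj, name):
        # An object can be added to any number of collections.
        self.collections.setdefault(name, set()).add(obj)

    def collections_of(self, obj):
        return sorted(name for name, members in self.collections.items()
                      if obj in members)

lib = Library()
lib.file_in_folder("beach.jpg", "Pictures")
lib.add_to_collection("beach.jpg", "Vacation 2003")
lib.add_to_collection("beach.jpg", "Favorites")
print(lib.folder_of["beach.jpg"])       # Pictures
print(lib.collections_of("beach.jpg"))  # ['Favorites', 'Vacation 2003']
```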

The obvious first application for the iFile technology was
in photo organization, an area in which iPhoto does quite well
already. However, iFile provides more capability in organization
by image metadata (it currently keeps track of 46 different
pieces of metadata for each image), and it should scale much
more smoothly for large collections than iPhoto. But iFile is
not simply a photo manager: it is a general purpose information
browser that can be used in a variety of ways, and can easily
integrate different information sources, such as PIM, email, and
music, among other data types. I think the version of iFile that
I will release publicly will provide much more capability in
those domains.

* Adam: Is it fair to describe iFile as the Finder you'd write
today?

Bruce: Possibly. I think it is much more ambitious than I had
originally intended. If I can eventually get it scaled down to
a level where new users can understand it quickly, it might be
a nice alternative to the Finder.

* Adam: Have you shown it to people at Apple? What did they think?

Bruce: Back in 1999 I showed it first to the Finder group, then
to Avie Tevanian, and finally to Steve Jobs. I think that Apple
was strongly focused on solving the problems of getting Mac OS X
out the door as soon as possible, and looking at an alternative
Finder was low on their priority list. I believe they were
intrigued but had already committed to a different direction,
and couldn't turn the ship in time to take advantage of the
iFile technology. Given the history of Mac OS X, I think they
made the right decision.

* Adam: Let's look at iFile more deeply. There are two aspects to
any filing system: getting data in and displaying that data to
the user. How would someone get data into iFile?

Bruce: The current version of iFile requires the user to specify
the folders that the user would like iFile to track; this is done
by dragging the folders into the iFile workspace window. Once this
is done, iFile tracks any changes to the contents of the folders
and automatically updates the database as required. For example,
the user can drag in the Pictures folder and be able to browse all
the images, create collections, etc., without actually copying any
files or moving any data. iFile respects your directory structures
and never modifies anything directly, in contrast to iPhoto, which
copies images into its own directory hierarchy.
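
One simple way to track a folder without copying or moving anything
is to snapshot it and compare snapshots. The sketch below shows that
general technique with Python's standard library; it is an
assumption for illustration, not a description of how iFile detects
changes, and the function names are invented.

```python
import os

def scan(folder):
    """Map each file path under `folder` to (size, modification time)."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)   # read-only: the file is never touched
            snapshot[path] = (info.st_size, info.st_mtime_ns)
    return snapshot

def diff(old, new):
    """Return sorted lists of added, removed, and changed paths."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(path for path in new.keys() & old.keys()
                     if new[path] != old[path])
    return added, removed, changed
```

A tracker would keep the previous snapshot in its database, call
`scan` again later, and apply the three lists from `diff` as
updates, leaving the user's directory structure exactly as it was.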

The release version of iFile will not require the user to request
that certain folders be scanned. Instead, iFile will initially
provide a view on the user's home directory, and will scan the
files and folders in the background automatically.

* Adam: Good! The less work users must do, the better. In fact,
one of the main problems with any filing system is that few people
put enough effort into categorizing and managing their data to be
able to find things later reliably. Can iFile automatically
categorize files based on metadata and content?

Bruce: Yes, it can. Collections are a way to automatically
categorize files by their properties. Because iFile maintains
file metadata in the object database, it can search and sort
through the metadata very quickly to return the appropriate
files. Collections are also “live”: specifically, if files
appear on the disk that match a collection's specification,
they will be automatically added to that collection, regardless
of whether the collection is currently being viewed. One can
imagine all sorts of interesting AppleScript scripts that could
be triggered based on these events.
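
A "live" collection can be modeled as a stored predicate that the
database re-applies whenever a file appears. The following is a
hedged sketch with hypothetical class names, not iFile's code; the
`on_add` callback stands in for the kind of script trigger Bruce
alludes to.

```python
class LiveCollection:
    def __init__(self, name, matches, on_add=None):
        self.name = name
        self.matches = matches  # predicate over a file's metadata record
        self.on_add = on_add    # optional hook, e.g. to run a script
        self.members = []

    def offer(self, record):
        if self.matches(record):
            self.members.append(record)
            if self.on_add:
                self.on_add(self, record)

class Database:
    def __init__(self):
        self.files = []
        self.collections = []

    def add_collection(self, collection):
        self.collections.append(collection)
        for record in self.files:        # pick up files already known
            collection.offer(record)

    def file_appeared(self, record):
        self.files.append(record)
        for collection in self.collections:  # viewed or not, all update
            collection.offer(record)

db = Database()
photos = LiveCollection("Photos", lambda r: r["kind"] == "image",
                        on_add=lambda c, r: print(c.name, "+", r["name"]))
db.add_collection(photos)
db.file_appeared({"name": "note.txt", "kind": "text"})
db.file_appeared({"name": "beach.jpg", "kind": "image"})  # Photos + beach.jpg
```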

Collections also collect files based on their content. Rather than
searching for individual words as Google does, collections search
for key phrases: a word or a sentence. Files that contain any of
the key phrases specified in the collection are automatically
gathered into that collection.
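
The difference between matching a phrase and matching individual
words is easy to show. This sketch assumes a very simple notion of
a phrase (the same words, in the same order, ignoring case and
punctuation); how iFile actually matches key phrases is not
described in the interview, and the function names are invented.

```python
import re

def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def contains_phrase(text, phrase):
    """True if the words of `phrase` occur consecutively in `text`."""
    haystack, target = words(text), words(phrase)
    if not target:
        return False
    n = len(target)
    return any(haystack[i:i + n] == target
               for i in range(len(haystack) - n + 1))

def gather(files, key_phrases):
    """Names of files whose text contains any of the key phrases."""
    return sorted(name for name, text in files.items()
                  if any(contains_phrase(text, p) for p in key_phrases))

files = {
    "doc1.txt": "The Resource Manager is not a database!",
    "doc2.txt": "The manager lost a resource.",
}
print(gather(files, ["resource manager"]))  # ['doc1.txt']
```

A word-by-word search for "resource" and "manager" would return
both files; the phrase match returns only the first.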

So, what collections do is provide a new way to slice and dice the
information you already have, without requiring you to import your
data or commit to a completely new organization.

* Adam: What do you think about adding a capability along the
lines of a Bayesian classifier that would evaluate the contents
of a file statistically, much the way some spam filters or the
email classifying program POPFile work? That could reduce the
user's effort even further.

Bruce: That is a great idea and has been discussed for quite some
time. In fact, Apple had worked on a project that was based on
this idea. Piles were automatic groupings of files based on their
content.

One of the challenges here is to determine an appropriate
similarity function: how do you decide what the collections should
be a priori, to avoid the problems of hundreds of collections,
each with one file, or a small number of collections with
thousands of files? That will take some work.
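
The trade-off Bruce describes shows up even in a toy model. The
sketch below is neither a Bayesian classifier nor Apple's Piles
algorithm; it swaps in a much simpler technique, Jaccard similarity
over word sets with a greedy grouping rule, purely to show how one
threshold moves the result between "one huge pile" and "one file
per pile". The sample documents and function names are made up.

```python
def jaccard(a, b):
    """Shared words divided by total distinct words, from 0.0 to 1.0."""
    a, b = set(a.split()), set(b.split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def pile(docs, threshold):
    """Greedy grouping: join the first pile whose first member is
    similar enough, otherwise start a new pile."""
    piles = []
    for doc in docs:
        for members in piles:
            if jaccard(doc, members[0]) >= threshold:
                members.append(doc)
                break
        else:
            piles.append([doc])
    return piles

DOCS = ["mac finder icons", "mac finder files",
        "stock fund beta", "stock fund risk"]

print(len(pile(DOCS, 0.0)))  # 1: everything lands in a single pile
print(len(pile(DOCS, 0.5)))  # 2: one pile per topic
print(len(pile(DOCS, 0.9)))  # 4: every file ends up alone
```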

* Adam: What does iFile do on the display side? Can users create
their own “smart folders” (a bit like smart playlists in iTunes)
that automatically show files that match a specific query?

Bruce: Absolutely. A collection is essentially a smart folder,
with a query specification. For example, it is easy to create
a collection that groups together all the images taken by a
particular model camera by specifying “ is '2500' and
is 'Nikon'”, since that data is available in the EXIF
metadata for the image. Similarly, metadata such as ID3 tags for
music; image data such as resolution, width, and height; file data
such as filenames, creation and modification dates, and sizes; and
so on are all stored in the database for object retrieval and
organization.
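
A query specification of the "field is 'value' and ..." form quoted
above can be parsed and evaluated in a few lines. This is only a
sketch: the field names `model` and `make` are stand-ins chosen for
the example, values are compared as plain strings, and iFile's real
query syntax and metadata keys are not given in the interview.

```python
import re

CLAUSE = re.compile(r"(\w+) is '([^']*)'")

def parse_query(spec):
    """Turn "field is 'value' and ..." into (field, value) pairs."""
    return CLAUSE.findall(spec)

def matches(metadata, query):
    """True if every clause in the query holds for this file's metadata."""
    return all(metadata.get(field) == value for field, value in query)

query = parse_query("model is '2500' and make is 'Nikon'")
photo = {"make": "Nikon", "model": "2500", "width": "1600"}
scan_image = {"make": "Canon", "model": "2500"}

print(query)                       # [('model', '2500'), ('make', 'Nikon')]
print(matches(photo, query))       # True
print(matches(scan_image, query))  # False
```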

So collections actually have three mechanisms for grouping:
manually via drag-and-drop; automatically via metadata query
specification; and automatically via key phrase match.

* Adam: iFile's architecture sounds tremendously appealing, but
I suspect the devil is in the details, and thus in the interface.
Does iFile stick with the current file/folder metaphor (despite
the terminology shift to collections), or does it offer a
rethinking of how we interact with our data?

Bruce: You are right that the devil is in the details. I'm
currently working on how to present all this information in an
appropriately intuitive fashion, and I think I'm getting closer,
but there is still clearly work to do.

iFile begins with the traditional, icon-based file and container
organization (containers being either folders or collections),
but goes further with a variety of different views and layouts.
Many of the layouts provide preview views of the contents of
the files, and in the case of text files, iFile automatically
creates hyperlinks to related collections from within the text.
It's difficult to explain, but once you use iFile you'll find
that some of the views do in fact provide you ways to view your
data from different perspectives.

The more you provide iFile with information regarding how you want
to see your data, via defining collections, the more it can help
you by cross-indexing and showing relationships where they were
not clear before.

* Adam: Are some of the things you're attempting in iFile beyond
what many users can understand? Lots of people just want to be
told what to do, and something with iFile's flexibility might
be lost on them unless it was able to watch their actions and
automatically build collections.

Bruce: I agree that iFile can be somewhat intimidating to new
users: there are a lot of different things that iFile can do,
and there needs to be more immediate gratification when using it.
Creating collections automatically is a good approach, and by
creating useful collections based on not only images but documents
and email, I think that the power of the technology will become
more apparent. I'm planning on implementing some of this in the
next few months, so stay tuned! For anyone interested in this
technology who would like to be contacted when there is a public
version available, sign up at the site below, and I'll keep you
up to date. I'd be happy to go into detail about the release
version in a future issue of TidBITS.

* Adam: Bruce, thanks for taking the time to chat with me, and
we're all looking forward to seeing what you come up with for iFile.
Who knows, perhaps now that Apple has stabilized Mac OS X, they'll
be interested in looking at what you've done again.
