Tom Chi  

What’s Hiding Behind Search-Based Navigation?

October 19th, 2004 by Tom Chi

When one does a Google search for the ‘War of 1812’, it doesn’t really matter much which site the answer comes from. Users poke around the first couple of results, and if they find what they need, they are done. Excellent. The pages and pages of other results may not get visited, but they still serve a valuable purpose, since they provide information about the size and shape of the data space for the subject.

Now what happens when we try to move search to the desktop? Searching your own files turns out to be a considerably different task because most of the time there really is only *one* correct result. An extensive result list actually puts you farther from the desired information and requires a lot of post-search list scanning.

There are other important differences which make searching a desktop machine a harder problem. Let’s examine a few:

Recognition vs. Recall

One major difference between browsing folders and navigating with Search is the difference between recognition and recall. When I browse through a folder structure, each step is a recognition task. While it seems clumsy to poke around folders with a mouse or command line, the cognitive load at each step of the process is very low. This is good. When I get to folder X and see that my subfolder choices are “DCM100500”, “NYC_Pictures”, and “Audiotrax”, it takes very little effort to recognize which is the correct branch.

Compare this to a search task where I begin my work at a search bar. Kicking off the search is a pure recall task. What did I call my new music directory? Was it ‘music tracks’? Since it had mixes in it, was it ‘mix tracks’? All these queries would return quite a few results which I would subsequently need to parse through before determining that the correct result was not present. It turns out the folder I am looking for is ‘audiotrax’, but having created it a month ago, I had forgotten that I had used ‘trax’ instead of ‘tracks’.

It seems like a silly distinction, but at least for me, this is an incredibly common scenario. I cannot recall the exact filename that I used or the exact name of the containing folder. Similar difficulties arise when there are many duplicates of the same file, e.g. ‘prototype04’ appears 10 times because it was iterated on by several people and saved in many folders. Ultimately, the problem originates from the fact that a lot of my filenames are neither meaningful nor unique, especially looking back across years of data.

This was my major difficulty with AppRocket, which I stopped using after about two months. I would find myself sitting at the search line, trying query after query and sifting through numerous erroneous result sets before eventually finding my data, or simply breaking down and navigating through folders to find it the ‘old-fashioned’ way.
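
The ‘audiotrax’ scenario can be sketched in a few lines of Python. This is only an illustration, not how AppRocket or any real desktop search tool works: the function names are made up for the example, the folder list is the one from the browsing example, and the similarity ranking uses the standard library’s difflib as one possible way to soften the recall problem.

```python
from difflib import SequenceMatcher

# Subfolder names taken from the browsing example above.
FOLDERS = ["DCM100500", "NYC_Pictures", "Audiotrax"]


def exact_search(query, names):
    """Return names that contain the query as a substring (case-insensitive)."""
    q = query.lower()
    return [name for name in names if q in name.lower()]


def rank_by_similarity(query, names):
    """Order names from most to least similar to the query (hypothetical helper)."""
    q = query.lower()
    return sorted(
        names,
        key=lambda name: SequenceMatcher(None, q, name.lower()).ratio(),
        reverse=True,
    )


if __name__ == "__main__":
    # Recall task: the remembered names do not match the stored name.
    for query in ("music tracks", "mix tracks", "trax"):
        print(query, "->", exact_search(query, FOLDERS))
    # Similarity ranking returns every candidate, best guess first.
    print(rank_by_similarity("music tracks", FOLDERS))
```

Both remembered names come back empty under exact matching, and only the query ‘trax’, the one thing I had forgotten, finds the folder. On this three-item list the similarity ranking does put ‘Audiotrax’ first for ‘music tracks’, but only by a modest margin. Across years of files with names that are neither meaningful nor unique, that kind of ranking would probably just produce another long list to scan.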

Data Privacy

Another rarely discussed hurdle for Search to overcome is the problem of finding information that would be better off left alone. Imagine this scenario: you lend your laptop to a friend to check some email. If your friend is something of a snoop, he/she could try the following queries against your data: ‘bank’, ‘accounts’, ‘financial’, ‘passwords’, ‘pass’, ‘sex’, ‘pants’. Within about a minute they could find out more about you than you would probably want anyone to know. The same goes for any coworker who happens across your desktop machine at work, or for children ‘accidentally’ finding out too much about their parents.

This is a legitimate privacy concern which is reasonably hard to mitigate. One way would be to designate ‘safe’ areas on your machine which will not be indexed… but a snoop could just check those settings to find exactly where the juicy stuff is. Another way is to use obscure names like ‘83dhx’, but one could see how this would be annoying to remember every time you needed to update your financial records.
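
To make the ‘safe area’ idea concrete, here is a minimal sketch of an indexer that skips excluded folders. Everything in it is assumed for illustration: the file paths and contents are invented, EXCLUDED stands in for whatever settings a real tool would expose, and the word index is far simpler than a real full-text index.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Invented file contents standing in for a real disk.
FILES = {
    "/home/me/notes/todo.txt": "call the bank about the mortgage",
    "/home/me/private/accounts.txt": "bank passwords and account numbers",
    "/home/me/photos/readme.txt": "pictures from the NYC trip",
}

# Assumed 'safe area' setting: folders the indexer must not read.
EXCLUDED = ["/home/me/private"]


def is_excluded(path, excluded):
    """True if the path lies inside any excluded folder."""
    parents = PurePosixPath(path).parents
    return any(PurePosixPath(folder) in parents for folder in excluded)


def build_index(files, excluded):
    """Map each word to the set of non-excluded files that contain it."""
    index = defaultdict(set)
    for path, text in files.items():
        if is_excluded(path, excluded):
            continue  # never read or index files in a safe area
        for word in text.lower().split():
            index[word].add(path)
    return index


def search(index, word):
    """Return the sorted list of indexed files containing the word."""
    return sorted(index.get(word.lower(), set()))


if __name__ == "__main__":
    index = build_index(FILES, EXCLUDED)
    print(search(index, "bank"))       # only the file outside the safe area
    print(search(index, "passwords"))  # nothing: that file was never indexed
    print(EXCLUDED)                    # but the setting itself names the folder
```

The snoop’s query for ‘passwords’ comes back empty, which is the point of the exclusion. The weakness described above is visible in the last line: anyone who can read the EXCLUDED setting learns exactly which folder to open by hand.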

What it all means

While I don’t think that these are unsolvable problems, solving them might have interesting ramifications for how we use computers. For one, to tackle the recognition vs. recall problem, we might need to start being much more careful about creating meaningful file names. It’s somewhat ironic that we might need to become more diligent filers (or, as Kevin mentioned, nab data from a metatagging community) in order to have search work properly. After all, search is supposed to excel at rationalizing an unstructured collection of data.

As for the privacy concerns, these seem a lot more difficult to me. The ability to have full-text indexing of all files will make it nearly impossible for private data to be safe from prying eyes. Ultimately, it might mean a shift in the way that we use our machines — offloading activities which need privacy or security onto non-indexed machines. Or perhaps creating special software and protocols to manage these activities and keep them secure. Are we ready for this?

3 Responses to “What’s Hiding Behind Search-Based Navigation?”
Eric Wood wrote:

Tom,
Excellent comments on the Data Privacy issue.

On the topic of Recognition vs. Recall, one of the things that previous technologies have failed to capture effectively is content and metadata, both of which solutions from Apple and Microsoft claim to capture. This turns the task of “recall” into what I would call goal-based navigation.

Example scenario: “I don’t know what the file was called but it was a PowerPoint with a slide mentioning Search as Navigation last year”

I can enter Search, Navigation, 2003 and I would probably come up with the right file. What is great about Apple’s Spotlight and hopefully Microsoft’s search is that real-time query updates help me see how detailed I need to be before “the right one” is in the top results.

Another underexploited, or at least not clearly understood, method is the use of dynamically filtered collections. In this case we aren’t looking for the “right” answer, but a wide variety of answers. Google’s desktop search comes close to that when I type in a term like “Joe Smith HCI”, returning all the mail from Joe Smith, all the presentations from Joe that contain HCI, etc… It will be interesting to see if people will actually start tagging their information with metadata (doubtful).

Finally, the thing I’m really enthusiastic about is the potential to search for computer-related goals. Again I’ll refer to Spotlight. The Preferences application in Mac OS X 10.4 has a search window. I can type in something like “Change Network”, or better yet “set proxy” and Spotlight will open to appropriate transactions. Extending this into the future
-Eric Wood

Tom Chi wrote:

Wow. I guess we are seeing some of the privacy issues faster than I anticipated:

http://it.slashdot.org/it/04/10/21/1934234.shtml?tid=172&tid=217

It is serious.

The Slashdot comments focus on how stupid it is to assume security on any sort of public computer, but I think those comments miss the point. It needn’t be a public terminal; the same sort of issues could come up with your home machine when your friends and/or family are accessing the data.

Perhaps this will be the technology that drives widescale adoption of biometric security measures. On the other hand, it might drive the market to create ‘one-user-only’ machines… since the cost of some biometric verification systems outstrips the cost of just getting a new machine.

Jay Zipursky wrote:

Excellent point regarding recall and searches. I’ve run into the same problem where I thought Google Desktop would help me, but instead I wasted time in a fruitless search.

In a recent search attempt, the real problem was that GD has no idea about relationships. I was searching for an email I had sent about bugs in a particular product. I searched for the product name and “bugs” but was unsuccessful. When I found the email through other means, it turned out I had never mentioned the product name, but had instead used the name of one of the UI components. If GD could understand that relationship, it might have been able to return some decent search results.



OK/Cancel is a comic strip collaboration co-written and co-illustrated by Kevin Cheng and Tom Chi. Our subject matter focuses on interfaces, good and bad and the people behind the industry of building interfaces - usability specialists, interaction designers, human-computer interaction (HCI) experts, industrial designers, etc.