Recent watercooler discussion at Seattlest HQ has centered on the fact that if you do a Google search for "Seattlest," the search engine no longer "helpfully" suggests that you might have meant "Seattle St." In the grand scheme of things this isn't huge (we're already aware that we and our -ist/-est brethren are taking over the world one city at a time), but it does warm our cold little blogger hearts to know that some complicated, impersonal algorithm is willing to give us this nod to legitimacy. So in our neck of the woods, this is of much greater importance than the YouTube acquisition.
So what does this illustrate in the bigger scheme of things? We know what we mean when we search for "Seattlest," but until now, Google didn't. It's a matter of context, and context is something that search currently lacks. Last week's discussion by Information Architecture gurus Peter Morville and UW Prof. Joe Janes touched on this and other limitations in the realm of cyberspace, not so much presenting solutions to these problems ("they're very hard"), but providing a good overview of where we are and how we got there. It was enough to make Seattlest both amazed at what a bunch of nerds have been able to accomplish and annoyed that they can't make faster progress.
More wearing our nerd on our sleeve after the jump.
Joe Janes started the talk with a presentation entitled "Searching for Information: The Last 50 Years." Moving from various library systems to the current state of the web, he covered how search has moved from tedious manual work (oh card catalog, we loved you so) to being online, and the additional freedom and ease that's come with that transition. From a linguaphilic standpoint, we appreciated Janes' talk because of his use of the terms "polysemy" and "synonymy," which we are just dying to use in everyday conversation. Both terms describe limitations with even the most sophisticated systems today, as automated systems can't necessarily parse what you mean without clarifying terms, forcing users to "be the search engine" in order to get relevant results.
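For the curious, here is a minimal sketch of those two problems in Python. It assumes a toy exact-match keyword index of our own invention; the documents and the `search` helper are hypothetical illustrations, not anything Janes presented or how any real search engine works.

```python
# Toy corpus: hypothetical documents, invented purely for illustration.
DOCS = {
    1: "jaguar spotted in the rainforest",
    2: "jaguar unveils new sports car",
    3: "used automobile for sale",
}


def search(query):
    """Return sorted ids of documents containing every query word (exact match)."""
    words = query.lower().split()
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if all(word in text.split() for word in words)
    )


# Polysemy: one word, several meanings. The cat and the car both come back.
polysemy_hits = search("jaguar")

# Synonymy: several words, one meaning. "car" never finds the "automobile" listing.
synonymy_hits = search("car")

# "Being the search engine": the user adds a clarifying term by hand.
clarified_hits = search("jaguar car")
```

In this sketch the only way to separate the cat from the car is for the person typing to supply the extra word, which is roughly the burden that the "be the search engine" line describes.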
A noticeably jet-lagged Peter Morville continued on the topic of "findability." It's one thing to have a site online, but it's another matter entirely to ensure that the site (and the information it contains) is as readily available as possible for users. Not just a matter of good design, findability implies a certain amount of user-centricity that is all too often ignored. Morville spent the bulk of his talk on the cultural implications of where this could all head, presenting both a future where you could Google Map search your apartment for your missing sock and a present where parents can stalk their kids.
One of the most interesting aspects of these two speakers was the sheer reverence they hold for information and its repositories. In what was described as the "Revenge of the Librarian," they both seem to revel in the fact that the limitations of current search will force a bit of a move toward the more organized structure of past library systems, where metadata is king and someone (or something) has to be able to organize what's out there. Hardly Luddites, they know that information is worthless if you can't find it, and they're doing their parts to aid the cause of those who help us find what we're looking for. Janes and Morville, Seattlest salutes your cause, and we subtly suggest that as many of you as possible choose to show your support for librarians this Halloween.

the fashion in my graduate program was to enumerate the myriad ways in which google stinks. some of those were decent but ultimately insufficient reasons. there were more reasons for why it was good, certainly. after a couple of years down the road building text mining systems, and after a long talk with a senior engineer at google, i can see they are heading the game theory-predictable way of the market leader: choosing product diversification over innovation. the windows search beta is more innovative than the google search. the market leader's best strategy is to stay the winning course.
ultimately google is very good for a user who has wide-open, loosely defined needs with little background knowledge or a user with background knowledge of a subject who has a ton of time to waste. you can see google's glaring shortcomings when you work for a company in a highly specialized field that uses a google box to provide their search. it's all in how google judges quality (via PageRank).
morville is right. you need to add faceted classification menus for searches just to get started. clusty already does this and microsoft is going to bring theirs forward. meanwhile google is gobbling up competitors rather than innovating.
i'm not sure the answer to your problem is with disambiguation. if you add contextual information you might, even in your case, send 'seattlest' farther down the relevance list and push 'seattle st' higher. adding context is crucial, but managing possibilities is always better than eliminating them. after all, IR systems and their descendants can do nothing more than make suggestions to users. in this respect google has made the right choice by showing the possibilities in different groupings. maybe you could argue that google is doing the faceted search *better* than vivisimo/clusty or microsoft....