Software, simplicity, and the librarian's corner case conundrum

I am a librarian and a programmer. I develop software supporting people who use libraries and their library-like activities.

Which is where things get complicated.

Horribly, inextricably, inevitably complicated.

If you track software development punditry, such as it is, you'll have seen a lot of hammering of late on the theme of keeping things simple. It's an admirable goal, and when it works, it can work wonderfully.

The theme manifests itself in claims of Pareto principle-driven efforts to deal first with simple problems or to build a business around delivering simple, streamlined applications. Which are wonderful and provably useful and commercially successful and all that.

Which, as the proponents of these efforts readily acknowledge, indicates primarily that a centuries-old economic model still holds: build something everybody needs, do it well, do it simply, and do it cheaply, then profit. Good stuff. "It worked for UNIX, it'll work for us." Awesome. Sign me up. No, really, sign me up, I love the stuff.

It's just that there are problems with that model if you're a librarian and you develop software.

First, my goal is not to profit. My goal is to help people build their own libraries. If my goal were to profit, I'd be an unlucky, poor librarian/hacker driving a crappy old car to job interviews in the valley right now, or I'd be a lucky, rich librarian/hacker driving a slick new car interviewing unlucky job applicants in the valley right now (with roughly equal probability... enough colleagues are in both camps that, like them, I could have just as easily landed clearly in either by now). But that's not my goal. Instead I'm lingering on soft money fumes in an office cube near an ivy-covered institution because I'm a librarian who writes software to help people build their own libraries. Which, honestly, is pretty much exactly where I want to be.

Second, like the simplicity gurus will tell you, the simple things really only work for the simple things. Sounds simple, right? Mostly, it is. What are the simplest things people need to do? Track to-do lists, events on a calendar, share notes and documents. We all do that. Build an application that does that, and makes it simple, and does it cheaply, and you'll profit. Add complicated features that not everybody needs and you'll profit if you're Microsoft but otherwise you might not. And the people who just want the simple things will ultimately leave you for something simpler.

One thing that doesn't get mentioned in there is that simple != easy to build. The people at 37signals, say, are exceedingly gifted and capable in fulfilling their vision of simple applications everybody can use to do simple things. It's why they got there first, it's why their apps are so good, it's why I and more and more people I know use them.

So, it seems, perhaps only a small percentage of people can do the truly best job at the really hard work of figuring out how to make something simple work simply. Hmm... interesting.

If you're a librarian like me and you take this example and turn it toward your own work to help people build their own libraries, it hits you... it is not simple to build a library of one's own. And if you're a librarian like me, you have a ready list of why not:

  • Metadata is complicated
  • People in libraries don't all use the same items the same way
  • Maybe 20% of the collection is responsible for 80% of the use but that other 80% includes some really important stuff
  • Attempts to use new tools work great for new data but can be exceedingly hard for old stuff. Like, anything predating 1960. Which we have a *lot* of, and which is often *really* important.
  • Did I mention metadata being complicated?

See? If you're a librarian like me I bet you're already adding to that list in your head. You are, aren't you? I can hear you. Stop that!

The trick is to honestly refactor out which bits aren't complicated and which bits everybody does pretty much the same way, and to design the new tools so that you can get the old stuff into them somehow. Even if that shoehorning is difficult, once it's done all the other easy stuff becomes possible with the old stuff, too.

Which is also hard.

(And, not always useful or important to anybody. But that's another story.)

And, which probably only a very few extremely talented people can do well. But, to the best of my knowledge, most of those people already figured out that it's easier to focus on the hard parts of solving simple problems and they're busy profiting on their productivity in accomplishing those tasks.

To make refactoring choices well requires ruthless analysis. There are some principles involved, but it's still hard to get right. Even if you can decide what to refactor, you still have to implement the details, which, as I noted already, very few people are exceptionally good at.

A while back a few of us stumbled onto a good place to start the refactoring for library software:

"...our ability to meet these users' needs will be limited by our inability to allow users to create and connect information sources and services as they see fit."

This broke down into needing to support something as simple as copying data across web applications. Fortunately somebody smarter than us figured out that what we really wanted was a web clipboard. We started working on the copy part; the Atom Publishing Protocol probably already suffices for paste. And fortunately for us the biggest software company in the world now wants to give the world web clipboard integration.

Cool. This will all probably work, probably soon, and we'll all probably be doing it in all our apps by 2007. It's nice to stumble onto a problem that the right smartest people in the world are already thinking about once in a while, because occasionally they'll give away their solution.

And the solution will probably look something like this: their clipboard specification will work well for stuff that's "in front of your face" like addressbook entries or blog entries or event listings that also fit nicely into a microformat. That solves the 80% usage pattern with the 20% spec; unAPI or something like it will probably get tied in so that more discrete macrocontent with various dissemination formats can be copied or referenced in, too, because it solves the 20% usage pattern with... um, well, another 20% spec.

But that's just a piece of the library refactoring puzzle.

It is an easy criticism of library standards (Z39.50, ISO ILL, NCIP, SRU, OpenURL, OAI, MARC+AACR+ISBD... need I go on?) that they make complicated things possible, and simple things complicated. It's easy because it's true. Many of these standards initiatives started from a complicated world where complicated stuff had to be possible, so solving the complicated problems anchored the activity. Many of these specs are actually quite beautiful when viewed from this perspective: you can catalog *anything* in MARC. That's pretty cool. You can wire up *any* combination of services in OpenURL. That's slick.

The problem, as Casey Bisson summarized visually and eloquently, is that we make it hard for the simple bits to be done simply.

But, again, this isn't easy.

The key is for the people who only see the complexity to recognize that there's an enormous demand for simplicity, and for people who only want the simplicity to recognize that some things require complexity. Whenever both sides are willing to acknowledge the viability of each other's viewpoints, we can work together to refactor all of it appropriately.

Let's walk through a few cases. First, the magical, mystical, "single search box". We'll just ping-pong dialogue-style between two abstract players representing the parallel alternate viewpoints of not-library/library and simple/not-simple.

  • Simple: Making a single search box is easy. Just use google.
  • Not: But it doesn't search this licensed stuff people depend upon in the library.
  • Simple: Okay, so, now, google scholar does.
  • Not: No, it doesn't search these other weird things our community cares about still.
  • Simple: Well, expose your data better.
  • Not: Okay, here's a Z39.50 interface.
  • Simple: Bah, we'll just crawl it.
  • Not: Wait, use OAI-PMH instead.
  • Simple: Okay, but we still have to crawl those other things X, Y, and Z you care about so much.
  • Not: Yeah.
  • Simple: Hmm, well, while we were working on that, we came up with OpenSearch.
  • Not: Funny, at the same time, we came up with SRU.
  • Simple: But this is easier.
  • Not: But this is more robust.
  • Simple-ish: Ah, you've got a few good points there.
  • Not: See?
  • Simple: Yeah, well, but you could still add OpenSearch or SRU to your big weird data silos.
  • Not: Hmm, good point.

See? Now, that's progress. And from my admittedly strange perspective, that's pretty much what's happened over the past two years.
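
For the non-librarians wondering how far apart "easier" and "more robust" really are, here is a minimal sketch of what a client does in each case: fill in an OpenSearch URL template, or build an SRU searchRetrieve request around a CQL query. The host catalog.example.org, the template string, and both function names are made up for illustration; only the template parameters (searchTerms, startPage) and the SRU parameter names (version, operation, query, maximumRecords) come from the respective specs.

```python
import re
from urllib.parse import quote, urlencode


def fill_opensearch_template(template, terms, start_page=1):
    """Fill an OpenSearch-style URL template such as '...?q={searchTerms}'."""
    values = {
        "searchTerms": quote(terms, safe=""),
        "startPage": str(start_page),
    }

    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return values[name]
        if optional:
            # Optional parameters we cannot fill ("{name?}") become empty.
            return ""
        raise KeyError("required template parameter not supported: " + name)

    return re.sub(r"\{([A-Za-z:]+)(\??)\}", substitute, template)


def build_sru_url(base_url, cql_query, maximum_records=10):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoints, for illustration only.
opensearch_url = fill_opensearch_template(
    "http://catalog.example.org/search?q={searchTerms}&page={startPage?}",
    "moby dick",
)
sru_url = build_sru_url("http://catalog.example.org/sru", 'dc.title = "moby dick"')

print(opensearch_url)
print(sru_url)
```

The OpenSearch side is string substitution on a keyword query. The SRU side carries a structured query (here, a title index), which is roughly the trade each camp was arguing for.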

Let's try another case that hasn't come along quite so far yet: linking to fulltext of articles (as a common case of the general deferred-context-sensitive-service-interconnection model, which we'll just not refer to as such for now, for obvious reasons).

  • Simple: Here's a link to the full text of a cited reference from my web page.
  • Not: But I can't follow that link because it has a session id tied to your account in it.
  • Simple: Oh, sorry, here's a link to the table of contents for the title instead.
  • Not: But we don't subscribe to that resource here, we get it from another vendor.
  • Not-so-simple: Oh.
  • Not: Here, try this OpenURL link.
  • Simple: Huh???
  • Not: It's a context-sensitive link to our library's service resolver.
  • Simple: Okay, whatever, here it is.
  • Not-so-not: It works!
  • Simple: But wait, now it points to your resolver, and I can't get to it through your resolver.
  • Not: Oh.
  • Simple: Hell, I'll just post a copy of--
  • Not: COPYRIGHT!
  • Simple: Oh.
  • Not-so-not: Okay, try this OpenURL link instead. It will redirect to any resolver in North America by checking your IP address.
  • Simple: Cool. It works!
  • Not: Wait a sec... what about people that are off-campus?
  • Not-so-simple: Yeah, and, what about all my collaborators, who mostly work in Spain and Sierra Leone?
  • Not: Good point.
  • Simple: Hrm.
  • Not-entirely-not: Try this instead, it's a COinS, an OpenURL in a microformat-like html pattern which you can automatically redirect to your own institution's resolver.
  • Not-so-simple: Okay, but don't I need a little thingy installed?
  • Not: Yeah, but here's a list you can grab one from.
  • Not-so-simple: But that's only the first list of North American places.
  • Not: Oh.
  • Simple: What about an actual microformat for a citation instead of that complicated OpenURL?
  • Not: Hmm... interesting.

...which is about where we are on that just now, too.
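
If the COinS step in that exchange went by too fast, here is a rough sketch of both halves: the publishing side embeds an OpenURL ContextObject in an empty span with class Z3988, and the reader's "little thingy" finds those spans and points them at the reader's own resolver. The citation values, the resolver address resolver.example.edu, and the function names are all invented for illustration. A real tool would use a proper HTML parser rather than the regular expression used here, which assumes the exact attribute order this sketch produces.

```python
import re
from html import escape, unescape
from urllib.parse import urlencode


def make_coins(citation):
    """Return a COinS span for a journal-article citation (dict of rft keys)."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ]
    pairs += [("rft." + key, value) for key, value in citation.items()]
    context_object = urlencode(pairs)
    # The ContextObject lives in the title attribute, so it must be HTML-escaped.
    return '<span class="Z3988" title="%s"></span>' % escape(context_object, quote=True)


# Simplification: matches only spans written exactly as make_coins writes them.
COINS_PATTERN = re.compile(r'<span class="Z3988" title="([^"]*)"')


def resolver_links(html, resolver_base):
    """Turn every COinS span found in html into a link to one resolver."""
    return [resolver_base + "?" + unescape(title) for title in COINS_PATTERN.findall(html)]


# Made-up citation and resolver, for illustration only.
span = make_coins({"atitle": "A Title", "jtitle": "Journal", "date": "2006"})
page = "<p>Some cited article " + span + "</p>"
links = resolver_links(page, "http://resolver.example.edu/openurl")

print(span)
print(links[0])
```

The page author never has to know which resolver the reader uses. That knowledge stays on the reader's side, which is exactly the part that still needs the list of places to grab one from.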

It's easy to list other common case vs. corner case conundrums. User tagging (folksonomies) vs. administered taxonomies. Commodity-yet-proprietary document formats (.doc) vs. not-so-well-used-yet-open document formats (opendocument). "Everything on the web" vs. "incunables and microfiche and manuscripts, oh my".

The work of refactoring library services so that users can tag an archival finding aid or a single search box can search incunabula and anybody can copy and paste references between search results in webapps and personal library tools is pretty much the meat of the work at the intersection of librarianship and software development I try to hang out in these days. It's a cool place to hang out... 15th century rarities of great value and beautiful illumination over here, shiny AJAX whizbang for moving commodity information around over there.

But it would sure be a nicer place to live if the simple-is-all crowd stopped throwing stones and the complexity-is-everything crowd stopped dropping 900lb. catalogs on everybody.

"Make everything as simple as possible, but not simpler."

Thanks for all the hard work you are doing Dan, and keeping your finger on the pulse of things that are happening in the library software community.

For what it's worth, I really *dig* your use of the dialectic between the simplicity_is_everything and completeness_is_everything camps. I think it does a great job of showing where and who we are as library technologists. Looking forward to that book :-)

I guess the great irony in the quote of Einstein is that it sounds like such a simple thing to do.

This site is Copyright (c) 2005-2008 by Daniel Chudnov. All rights reserved.

All opinions stated here are my own, and do not reflect those of my employer.