A Clean, Well-Linked 'Base (or, Solving the "Appropriate Resolver" Problem with the OCLC Resolver Registry)

Along with several colleagues, I've come to the opinion that the current OpenURL implementation workflow and user experience are wholly insufficient. There's too much maintenance overhead on both sides of the library/publisher equation. There's too much work for small publishers to be able to participate. There's almost zero possibility that a tiny publisher (like a single weblogger somewhere) will be able to put useful OpenURL links on their own little sites.

What we need next if we really want to spread OpenURL-based services more widely is a no-configuration, no-overhead, inexpensive solution that works for the widest possible range of libraries, publisher/vendors, and users. (The usability of the prevailing OpenURL resolution click-flow is a wholly separate matter with its own insufficiencies, but we can't solve everything at once.)

COinS, as cool as it is, doesn't meet this need because it requires every user to twiddle bits on their desktop. It does push us a step closer, though, by *allowing* users to benefit from twiddling bits, which wasn't possible before.

What to do? Use registries.

The OCLC OpenURL Resolver Registry comprises records for roughly 1000 OpenURL resolvers at various institutions, mostly but not solely in North America. It also provides a simple web service that takes an IP address as a parameter and returns zero or more resolver records, one for each resolver that serves users coming from that IP address.

What does that mean? If you're like me, and you work for a small service like the Canary Database, you used to be essentially unable to provide user-appropriate OpenURL linking without having to configure many many ranges of IP addresses after many many conversations with librarians. "Used to be," that is.

Are you on a campus (or using a campus proxy) for an institution with an OpenURL resolver? If yes, visit the Canary here and tell me what you see.

What you *should* see (only just turned this on, mind you, so please report broken stuff!) is working links to your own institution's OpenURL resolver. Easy, right?

Here's how to implement it on your site:

  • read about the OCLC OpenURL Registry Gateway service and find the details of the query service in the Word doc on that page.
  • implement code in your database that queries the gateway with the IP address of your webapp's incoming users (REMOTE_ADDR in CGI-land)
  • parse the response and if there's a resolver in there, formulate and render links and link buttons to that user for all the references on your site
  • watch users' eyes light up when they see links
  • get excited
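
The parse-and-render step above might look something like this minimal Python sketch. It skips the gateway query itself (the request and response details live in the OCLC doc and aren't reproduced here) and assumes you've already parsed each resolver record into a dict with `base_url` and `link_text` keys. The function names and the OpenURL 0.1-style citation keys are illustrative choices, not anything OCLC defines.

```python
from html import escape
from urllib.parse import urlencode


def openurl_link(base_url, link_text, citation):
    """Return an HTML anchor pointing one citation at one resolver.

    `citation` is a dict of OpenURL 0.1-style key/value pairs, for
    example genre, issn, volume, spage, date, atitle (assumed keys).
    """
    # Drop empty values so the query string stays short.
    params = {k: v for k, v in citation.items() if v}
    # Respect a base URL that already carries its own query string.
    sep = "&" if "?" in base_url else "?"
    href = base_url + sep + urlencode(params)
    return '<a href="{}">{}</a>'.format(escape(href), escape(link_text))


def render_links(resolvers, citation):
    """One link per resolver record; no resolvers means no links."""
    return [
        openurl_link(r["base_url"], r["link_text"], citation)
        for r in resolvers
    ]
```

Call `render_links` once per reference on the page, passing whatever resolver records came back for the visitor's address. An empty list simply renders nothing.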

Well, it's a bit more complicated than that. You're not quite done:

  • since you don't want to hit OCLC's service multiple times for the same user, build a little caching system into your application. It could be as simple as a single table with a UNIQUE constraint on users' IP addresses that maybe stores the raw XML Registry responses and parsed values for at least base_url, icon_url, and link_text
  • instead of hitting OCLC every time, first check your db, and only query OCLC if you don't already have a record.
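
Here's one way the check-the-db-first idea could be sketched, using Python's built-in `sqlite3` so it runs anywhere. The table and function names are made up for illustration, and `fetch` is a hypothetical callable standing in for whatever code you write to query the Registry and parse its response; it's assumed to return a dict with `raw_xml`, `base_url`, `icon_url`, and `link_text` keys.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (and create if needed) the one-table cache."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS resolver_cache (
               ip TEXT NOT NULL UNIQUE,
               raw_xml TEXT,
               base_url TEXT,
               icon_url TEXT,
               link_text TEXT
           )"""
    )
    return conn


def lookup_resolver(conn, ip, fetch):
    """Return (base_url, icon_url, link_text) for ip, or None.

    `fetch` (hypothetical) is only called when ip isn't cached yet.
    """
    row = conn.execute(
        "SELECT base_url, icon_url, link_text "
        "FROM resolver_cache WHERE ip = ?",
        (ip,),
    ).fetchone()
    if row is None:
        rec = fetch(ip)
        row = (rec.get("base_url"), rec.get("icon_url"), rec.get("link_text"))
        # Store "no resolver" answers too, so they aren't re-queried.
        with conn:
            conn.execute(
                "INSERT OR IGNORE INTO resolver_cache VALUES (?, ?, ?, ?, ?)",
                (ip, rec.get("raw_xml")) + row,
            )
    return row if row[0] else None
```

Note that this sketch never expires anything. A real deployment would probably want a timestamp column so stale records get refreshed now and then.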

Better, right? Actually, the best thing to do would be to also parse out the per-institution IP address block information and do local queries against the *ranges*, not specific IP addresses. PostgreSQL has a built-in type that supports this really well.
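
If you'd rather not lean on the database for this (PostgreSQL's `inet` and `cidr` types can do the containment test in SQL), the same range matching can be sketched in application code with Python's standard `ipaddress` module. This assumes you've already pulled each institution's blocks out of the Registry response and normalized them to CIDR notation; the function names and input shape are illustrative only.

```python
import ipaddress


def parse_blocks(institutions):
    """Turn {name: ["192.0.2.0/24", ...]} into (network, name) pairs."""
    table = []
    for name, blocks in institutions.items():
        for block in blocks:
            table.append((ipaddress.ip_network(block), name))
    # Longest prefix first, so the most specific block wins.
    table.sort(key=lambda pair: pair[0].prefixlen, reverse=True)
    return table


def find_institution(table, ip):
    """Return the name owning the block that contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for network, name in table:
        if addr.version == network.version and addr in network:
            return name
    return None
```

With ranges cached this way, one Registry answer covers every later visitor from the same block, not just the one address that triggered the query.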

Any competent web geek should be able to implement this in a few hours. Call me if you have questions.

So, to review, here's what happens:

  • You implement a function in your webapp that queries OCLC and caches the resolver information locally and then renders appropriate resolver links to all your users
  • Your webapp users follow the links as if they were always there because it looks just like what they're used to seeing in Fancy Expensive Resources and they'll Just Know (tm) what to do

Pretty cool, eh? It's not without some important problems, though.

  • It's still not good enough. It doesn't solve the "but I'm off-campus" problem. The good news is, though, that if your campus, like ours, uses a web proxy for remote access, your remote users will probably be using the proxy anyway if they're already doing research, so this should work for them in a lot of cases.
  • There's no good fallback if there's no resolver for you. Ideally something like OCLC's Find in a Library function could work here too, but that's fraught with difficulties. Trust me, if you've ever been to New Haven, you'll know... if you live on the wrong side of the street across from Ivy U., you won't necessarily get Ivy U. access even if you beg and plead. But, that's a much bigger problem of which this is just yet another instance.
  • This simple function isn't as smart as the work Google Scholar does to attempt to check against each institution's holdings before showing links. But then again, Google Scholar isn't a last mile service, so they don't want to have to deal with the untidy problem of finding something that might not be online. But we librarians do!
  • It isn't clear whether the OCLC Registry coordinates queries across the pond to the UK Registry, or any other registry, or whether OCLC's registry already includes their data, but it would be good to be able to fire off just one query and be reasonably assured that users *anywhere* will find their resolvers.
  • There's a fundamental flaw in the whole approach of using a web service to query IP- and DNS-related information.

The flaw with the IP/DNS query bit is that a massive, distributed caching system for queries about membership of IP addresses in IP blocks already exists, and it's all connected to user- and application-queryable layers through DNS, too. A protocol that uses these existing layers for just these kinds of purposes already exists as well, along with a related DNS Service Discovery piece. Check the name of that first protocol: Zero Configuration Networking.

That's exactly what we're doing here - providing a zero configuration experience to users. But we're not doing it with ZeroConf, though we probably should be. Or at least we need to make a concerted effort to try it before we dismiss it.

Still, this seems to work. And in multiple, repeated usability tests on the Canary, the first thing everybody says has *still* been "I want full-text links." Now they can have 'em.

A quick aside: the folks behind the Registry and Gateway at OCLC are supportive of this approach and want to see more people using it, so don't be shy (but cache your queries so as not to be rude, either :). If your institution's resolver isn't in the Registry, or your resolver's record there is out of date, use the form they provide to enter or update information about your service.

Have at it!
