openurl
The other end of the mic: OpenURL, Crossing Over
In case you didn't catch the updated text at the top of the last post here, Jon Udell kindly followed up on a comment he left here recently and interviewed me for his Friday podcast about OpenURL and other topics persistent and linkish.
...in which I:
- try to explain OpenURL linking
- fail to recognize or usefully distinguish the differences between archive.org and arXiv.org (I just heard "archive.org" and went with that even though Jon meant "arxiv.org", at least early on, I think)
- reference the 50+-year research and publishing history of Eugene Garfield
- dredge up the old DSpace identifier wars
- discuss some issues with the Handle System and the centralized/distributed governance spectrum
- describe COinS and how to use COinS browser extensions at your library
- highlight the COinS Generator, the OCLC OpenURL Resolver Registry, and LibraryFind projects
- crib Jon's notion of a publishing surface area and reference FRBR as a framework for considering the surface area of a distinct intellectual or artistic creation
- say "Mess is Lore" and reference the Book Arts Press valentines the *week* of valentine's day without saying they're valentines
- mention the Computer History Museum
- try to convince Jon that he should find an archives to work with to start collecting his papers (online and otherwise) and correspondence without actually saying that so succinctly or usefully
- give Jon the scoop about my upcoming new job
It was a real pleasure to get to talk with someone whose writing I've read for many years and to appear on a podcast I've been listening to since its inception. Jon's long-time crossover interest in libraries (cf. Library Lookup) is one of the strongest bridges we library hackers have to the broader software and systems communities. It's also a privilege to follow great folks like Tony Hammond, both John Blyberg and Ed Vielmetti, John Wilkin, and Lou Rosenfeld (both of whom I was lucky enough to interact with while in grad school at umich; check the podcast feed for these older links, which predate Jon's new blog) as a library-interested interviewee.
Unfortunately I've been too busy to put together more Library Geeks interviews, but I plan to start that back up again once I settle into the new gig, I promise. It's easier to be the one asking the questions!
Rethinking OpenURL
Recently I was interviewed for a podcast episode that might (hoping it indeed is published! [Updated 2/16, 5:30pm: yep, and here it is, and there goes the news about my new job...]) soon be heard by a lot of people. The topic was OpenURL, and why it's both so exciting and so frustrating to everyone who dives down into it.
I've been particularly frustrated with OpenURL lately. Just when efforts like COinS are starting to pay off (cf. new Wikipedia cite templates, worldcat.org, new WordPress plugin from Zotero, Zotero itself, and LibX), it feels like the door to wider use and acceptance of COinS specifically and OpenURL in general remains welded shut.
Don't take this the wrong way - I remain a huge OpenURL fan, including the whole vision that led to the ANSI/NISO Z39.88-2004 standard, even if that standard turned out to be way too complicated. I've been a fan of this way of doing things since we added multiple linking paths to the now-defunct jake project in early 1999, so we could link users from bibliographic references on the web into the books and journal articles the way your library has best set you up to do so, in pretty much an OpenURL-0.1-compatible way, even before we ever heard about OpenURL. (It was, after all, obvious to have a bunch of HTTP GET fields with names like [title, spage, volume, issue, issn, etc.].)
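To make "a bunch of HTTP GET fields" concrete, here is a minimal sketch of that kind of link in Python. The field names come from the list above; the resolver address, the citation values, and the build_openurl helper are all made up for illustration and are not taken from jake or from any real resolver.

```python
from urllib.parse import urlencode

def build_openurl(base_url, citation):
    """Return a resolver link carrying the citation as plain GET fields."""
    # Drop empty fields so the query string only carries what we know.
    fields = {key: value for key, value in citation.items() if value}
    return base_url + "?" + urlencode(fields)

# Placeholder citation; none of these values describe a real journal.
citation = {
    "title": "Journal of Example Studies",
    "issn": "1234-5678",
    "volume": "12",
    "issue": "3",
    "spage": "45",
}

# Hypothetical resolver address.
link = build_openurl("http://resolver.example.edu/openurl", citation)
print(link)
```

Swapping in a different base_url is all it takes to send the same reference to a different library's resolver, which is the whole point of describing the item rather than hard-coding its location.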
And I'm not saying we need to do away with OpenURL. I think instead that we need to recontextualize what it is we think OpenURL is for, and to do so in a way that focuses on the common use case of service linking on the web today. And I think we need to avoid dogmatic adherence to the Z39.88 standard, since we all mostly think it's overwrought anyway.
Let me try to explain it this way. Have you noticed those little clusters of links next to stories at NYTimes.com, or at the end of blog posts, or in reference databases, or on journal sites? These little blocks of links are popping up all over the place. What they are, in general, is "a list of the functions you can do from here, given the thing you're looking at now, depending on who you are."
New York Times service links.
Blyberg.net weblog service links.
Pubmed.gov service links.
JAMA service links.
Citeseer service links.
Google Scholar service links.
This, my friends, is dynamic service linking, in a nutshell.
(And, yes, I do mean "in a nutshell" as in "ooh, ooh, help, I'm trapped in a nutshell, get me out of here.")
I believe that this analysis maps *exactly* to the vision for OpenURL specified in the documents that led up to the ANSI/NISO Z39.88-2004 standard that nobody can actually read. They nailed it - they knew this was coming, and they were right, and here it is, all over the place, on the most popular sites on the web.
But there are two major problems with this analysis.
First, all of these diverse sites publishing useful service links for their own materials have *no* idea that I'm a Yale University person and that *my* library's OpenURL resolver is over here, yes, please, thank you very much. This means that there's no public, cross-site workflow for getting out from most of these sites' own concepts of what "context-sensitive links for me for this content object" would mean back into what Yale's library thinks should be useful links for me. Nor is there a clear workflow for me to get back out from a Yale resource to these online, not-at-Yale items. This is a fatal flaw in the current state of how we both conceive of and implement OpenURL solutions in libraries today (which is to say, not a fatal flaw in the model, just in how we use it now).
Until we address this flaw in our current thinking, and bring this wider world of web resources and their own notions of service links and our traditional, more narrowly defined research library workflows into some kind of common system, we'll never see the full vision of OpenURL realized.
Secondly, and this is perhaps a bit less practical but no less important, I think the notion of "context-sensitive" linking does itself have a basic flaw. If you've studied finite automata or linguistics and know what language parsing is all about, "context-sensitive" might ring an alarm for you. "Context-sensitive" grammars tend to be significantly computationally harder to work with than "context-free" grammars. (If you're not familiar with this distinction, see the Wikipedia entry for Chomsky hierarchy.) Do you know how the "natural language query" problem remains mostly unsolved? Well, think about the difference this way - most human languages are context-sensitive, whereas most computer programming languages are context-free.
What we've done with the OpenURL model and how we use it in libraries today is to design a standard around our unusual, more complex cases (dealing with weird bibliographic references). The workflow we have applied to OpenURL is context-sensitive in that the publisher needs to recognize my network address, and needs to then redirect to my institutional resolver, and that our institutional resolver needs to be pre-configured with our holdings information, and the outgoing links from our resolver need to be aware of my network address again as I go out there to get the thing I wanted in the first place. Without all of this preceding context-setting, the workflow breaks, so our context-sensitive worldview simply requires all of it.
In the meantime, sites applying the more common, simpler cases (dealing with web links) have essentially adopted the entire OpenURL model but without so much as saying so. That blog engine doesn't care what my network address is, and doesn't know whether I have accounts at the endpoints of the digg and del.icio.us links at the bottom of the blog post. It'll send me there just in case I do, and at the moment of need, just in time, if I need to identify myself to digg, or to whatever the target link might be, I will, if I have an account. But not before. Digg doesn't need to know that I'm a TimesSelect subscriber. The New York Times doesn't care that I use unalog, not del.icio.us. This whole scenario does not depend upon - and thus performs none of - the context-setting of our library research and licensed resource workflows.
The computational aspect of it doesn't bail out if you don't have a del.icio.us account - del.icio.us just doesn't let you in, but I can still get there if I want. In the heavier-weight scenario, I can't even figure out where I might want to go through my OpenURL resolver if my institutional context isn't sniffed out by the original remote resource, because I never even get to the OpenURL resolver in the first place.
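The difference between the two workflows can be sketched in a few lines of Python. This is only an illustration of the contrast described above, not how any publisher or resolver actually works: the network table, the service targets, and both function names are invented for the example.

```python
import ipaddress
from urllib.parse import urlencode

# Hypothetical table a publisher would have to maintain:
# recognized network -> that institution's resolver.
RESOLVERS_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "http://resolver.example.edu/openurl",
}

# Hypothetical service targets that need to know nothing about the reader.
SERVICE_TARGETS = {
    "bookmark": "http://bookmarks.example.com/post",
    "share": "http://share.example.org/submit",
}

def context_sensitive_link(citation, client_ip):
    """Return a resolver link only if the reader's network is recognized."""
    address = ipaddress.ip_address(client_ip)
    for network, resolver in RESOLVERS_BY_NETWORK.items():
        if address in network:
            return resolver + "?" + urlencode(citation)
    return None  # unrecognized reader: the workflow stops before the resolver

def context_free_links(citation):
    """Return every service link, regardless of who is asking."""
    query = urlencode(citation)
    return {name: base + "?" + query for name, base in SERVICE_TARGETS.items()}
```

In the first function, a reader coming from an address the publisher doesn't know gets nothing at all. In the second, everybody gets the links, and any question of identity is left to the target at the moment the reader arrives there.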
I think I'm starting to see a better way to address all of this. This post is already pretty long, though, so I'll just state the objective for now.
The goal of rethinking OpenURL, I think, should be to reimplement the same original OpenURL model of dynamic service linking as - in the typical case - a context-free workflow. The instant we do that conceptually, we see how unified all of this really could become. It gives us librarians something to offer the nytimes.coms and diggs of the world: a singular framework for mixing and mashing service links for diverse sites independent of who we all are and what we know about each other.
Then, to address the more complicated context-sensitive research workflows we hold so dear, we simply define a context-free way to hook into that workflow, so if some arbitrary transaction needs it, it can hand us over into it, but *not* until we have to have it.
I'm not sure who said it first, but there's a librarianly spin on the old Perl paradigm I think I heard at code4libcon or Access in the past year: instead of "making simple things simple, and complex things possible," we librarians and those of us librarians who write standards tend, in writing our standards, to "make complex things possible, and make simple things complex."
That approach just won't cut it anymore.
Coming soon: a concrete way forward.
OpenURL over OpenSearch using Parameter extension
I will get back to the ZeroConfMetaOpenSearch series soon. But before I could go much further, I had to try this first.
Here's what the OpenURL KEV format for journal and journal article reference queries might look like if it were to follow OpenSearch 1.1 Draft 3. More to the point, this is what the block specifying the journal KEV params might look like inside the OpenSearch description response if it were to use the Parameter extension.
<Url xmlns:parameters="http://a9.com/-/spec/opensearch/extensions/parameters/1.0/"
type="text/html"
template="http://example.com/search">
<parameters:Parameter name="q" value="{searchTerms}"/>
<parameters:Parameter name="count" value="{itemsPerPage?}" minimum="0"/>
<parameters:Parameter name="start" value="{startIndex?}" minimum="0"/>
<parameters:Parameter name="aulast" value="{aulast?}" minimum="0" maximum="1"/>
<parameters:Parameter name="aufirst" value="{aufirst?}" minimum="0" maximum="1"/>
<parameters:Parameter name="auinit" value="{auinit?}" minimum="0" maximum="1"/>
<parameters:Parameter name="auinit1" value="{auinit1?}" minimum="0" maximum="1"/>
<parameters:Parameter name="auinitm" value="{auinitm?}" minimum="0" maximum="1"/>
<parameters:Parameter name="ausuffix" value="{ausuffix?}" minimum="0" maximum="1"/>
<parameters:Parameter name="au" value="{au?}" minimum="0" maximum="*"/>
<parameters:Parameter name="aucorp" value="{aucorp?}" minimum="0" maximum="1"/>
<parameters:Parameter name="atitle" value="{atitle?}" minimum="0" maximum="1"/>
<parameters:Parameter name="title" value="{title?}" minimum="0" maximum="1"/>
<parameters:Parameter name="jtitle" value="{jtitle?}" minimum="0" maximum="1"/>
<parameters:Parameter name="stitle" value="{stitle?}" minimum="0" maximum="1"/>
<parameters:Parameter name="date" value="{date?}" minimum="0" maximum="1"/>
<parameters:Parameter name="chron" value="{chron?}" minimum="0" maximum="1"/>
<parameters:Parameter name="ssn" value="{ssn?}" minimum="0" maximum="1"/>
<parameters:Parameter name="quarter" value="{quarter?}" minimum="0" maximum="1"/>
<parameters:Parameter name="volume" value="{volume?}" minimum="0" maximum="1"/>
<parameters:Parameter name="part" value="{part?}" minimum="0" maximum="1"/>
<parameters:Parameter name="issue" value="{issue?}" minimum="0" maximum="1"/>
<parameters:Parameter name="spage" value="{spage?}" minimum="0" maximum="1"/>
<parameters:Parameter name="epage" value="{epage?}" minimum="0" maximum="1"/>
<parameters:Parameter name="pages" value="{pages?}" minimum="0" maximum="1"/>
<parameters:Parameter name="artnum" value="{artnum?}" minimum="0" maximum="1"/>
<parameters:Parameter name="issn" value="{issn?}" minimum="0" maximum="1"/>
<parameters:Parameter name="eissn" value="{eissn?}" minimum="0" maximum="1"/>
<parameters:Parameter name="isbn" value="{isbn?}" minimum="0" maximum="1"/>
<parameters:Parameter name="coden" value="{coden?}" minimum="0" maximum="1"/>
<parameters:Parameter name="sici" value="{sici?}" minimum="0" maximum="1"/>
<parameters:Parameter name="genre" value="{genre?}" minimum="0" maximum="1"/>
</Url>
I *think* that's right. It isn't clear to me whether the q/searchTerms param is required in OpenSearch or not. If it is, then we could use it to pass through a URL-encoded text representation of the reference, I suppose. If not, well, it could just be optional, or dropped entirely.
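As a rough check on how a client might consume a block like that, here is a minimal Python sketch that reads a shortened copy of the Url element and builds a query from it. It assumes each Parameter value is a single {name} or {name?} placeholder, treats a placeholder without the trailing "?" as required, and ignores the minimum/maximum attributes entirely, so repeatable fields like au are not handled. The build_query helper is my own invention, not part of OpenSearch.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

PARAM_NS = "http://a9.com/-/spec/opensearch/extensions/parameters/1.0/"

# Shortened copy of the Url block above, for illustration only.
DESCRIPTION = """\
<Url xmlns:parameters="http://a9.com/-/spec/opensearch/extensions/parameters/1.0/"
     type="text/html"
     template="http://example.com/search">
  <parameters:Parameter name="q" value="{searchTerms}"/>
  <parameters:Parameter name="atitle" value="{atitle?}" minimum="0" maximum="1"/>
  <parameters:Parameter name="issn" value="{issn?}" minimum="0" maximum="1"/>
  <parameters:Parameter name="volume" value="{volume?}" minimum="0" maximum="1"/>
</Url>
"""

def build_query(url_xml, values):
    """Fill the declared parameters from `values` and return the request URL."""
    url = ET.fromstring(url_xml)
    pairs = []
    for param in url.iter("{%s}Parameter" % PARAM_NS):
        placeholder = param.get("value").strip("{}")
        optional = placeholder.endswith("?")
        key = placeholder.rstrip("?")
        if key in values:
            pairs.append((param.get("name"), values[key]))
        elif not optional:
            # This sketch's reading of a placeholder without "?".
            raise ValueError("missing required value: " + key)
    return url.get("template") + "?" + urlencode(pairs)

# Placeholder reference values, not a real citation.
print(build_query(DESCRIPTION, {"searchTerms": "udell", "issn": "1234-5678", "volume": "12"}))
```

Under this reading, leaving out searchTerms raises an error, which is exactly the q/searchTerms question raised above: if q turns out to be optional, writing it as {searchTerms?} would make the sketch skip it quietly.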
We'd still need to do work on the response record format - perhaps something might be borrowed from the IESR Metadata specs and inserted into RSS2.0 or Atom responses.
What's amazing to me about OpenURL is that it's been implemented so successfully (in the sense that we had nothing before, and now we have something). That said, I repeatedly hear arguments that by defining and using more OpenURL application profiles you gain benefits through the shared OpenURL infrastructure. I don't buy that because I've never seen such benefits. If you swap out OpenURL and swap in OpenSearch and use the same application profiles, like in this example, then I could see the benefits of shared OpenSearch infrastructure.
Talk: COinS, unAPI, and a Plan for Zero Configuration Service Discovery
Today I gave that talk at the NISO D2D meeting. I think it went pretty well despite having gone way beyond what I'd originally thought it would cover.
The first part basically introduces and gives background for why COinS and unAPI are useful steps forward. The second part argues that:
- We should merge our metasearch and openurl resolver interfaces
- We should layer an opensearch interface on top of that
- We should register the opensearch interface as a DNS-based wide-area zeroconf discoverable service
Why? If we did all of that, then everybody visiting our domain could find our base search interface, and remote services visited by people from our domain could find our resolver interface.
The slides are attached as pdf. Sadly I botched my own audio recording, but tomorrow I'll plead for a copy from the soundboard guys, and will add that too if I can get it.
[Updated 2007-01-08] The audio is a little blotchy, but here it is.
Library Geeks 001 - Fun with OpenURL (updated)
Hear ye, hear ye, we have met the Library Geeks and we are us: Listen to Library Geeks 001 - Fun with OpenURL.
Ross Singer joins me for the first episode, wherein we discuss OpenURL, the state of the OpenURL resolver marketplace, and the innovative work Ross is leading at the Georgia Tech library to implement a next-generation resolver.
[Updated.] Get it through the feed at geeks.onebiglibrary.net/feed.xml.
Notes:
- Learn more about the umlaut at the umlaut trac
- The JCDL paper on statistics is available through the umlaut itself





