Published on One Big Library (http://onebiglibrary.net)
MPEG21 DIDL bastardization in JSON
By dchud
Created 2006-02-23 12:34

I've written before [1] about taking some of the ideas from MPEG21 DIDL and LANL's xmltape [2] model and implementing these using brain-dead simple techniques. I've started working on this and progress so far is very good.

Here's the plan: first, I'm not building a long-term preservation archive today, so I can play fast and loose with the full-on requirements of the OAIS Reference Model [3]. Instead, I'm building a quick-n-dirty standardized repository structure for use in prototyping search interface usability. If we get this funded, then I'll worry about the longer-term issues and stuff like updates, but they aren't a requirement today.

Second, instead of MPEG21 DIDLs in XML a la aDORe [4], I'm using a simplified version of the DID model in JSON [5]. Here's the object model:

[Figure: Suki AIP object model]

(Note: I don't know UML; rather, I have Visio. :)

The SukiAIP id is a package identifier; the SukiItem id is a content identifier; the SukiResource id is a resource (datastream) identifier. Oh, and this codebase is codenamed "suki", as in Japanese for "like", "love" or even "adore". :) It's pronounced much more like "ski" than "sue-key".
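As a rough illustration of the three identifier levels, here is a minimal sketch of one package serialized to JSON. Only the three `id` fields come from the description above; the nesting (package holds items, items hold resources), the other key names (`items`, `resources`, `mimetype`, `ref`), the helper functions, and the identifier values are all assumptions made up for this example, not the actual suki model.

```python
import json

# Hypothetical layout: only the "id" fields are taken from the post.
# The nesting and every other key name are illustrative assumptions.

def make_resource(resource_id, mimetype, ref):
    """A resource (datastream) entry, identified by its resource id."""
    return {"id": resource_id, "mimetype": mimetype, "ref": ref}

def make_item(content_id, resources):
    """An item entry, identified by its content id."""
    return {"id": content_id, "resources": resources}

def make_aip(package_id, items):
    """A package entry, identified by its package id."""
    return {"id": package_id, "items": items}

aip = make_aip(
    "suki-aip-0001",
    [
        make_item(
            "suki-item-0001",
            [make_resource("suki-res-0001", "text/xml", "records/0001.xml")],
        )
    ],
)

# Serialize to JSON text and parse it back to confirm nothing is lost.
aip_json = json.dumps(aip, indent=2, sort_keys=True)
roundtrip = json.loads(aip_json)
```

The point of keeping it this flat is that the whole package is plain dicts, lists and strings, so any JSON library can read it without a schema or a custom parser.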

Third, instead of XMLTapes, I'm using plain old zipfiles, as supported in the Python standard library [6]. *Zero* additional coding required, no separate indexing step, nothing. Just stuff the JSON SukiAIPs into the zipfile using their package identifiers as filenames, and both instantaneous retrieval and compression come for free, fully debugged.
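The zipfile idea can be sketched in a few lines with the standard library's `zipfile` and `json` modules. The helper names `store_aip` and `load_aip`, the package contents, and the identifier values are hypothetical, and an in-memory buffer stands in for a real file on disk so the example is self-contained.

```python
import io
import json
import zipfile

def store_aip(archive, aip):
    """Write one package into the zipfile, named by its package identifier."""
    archive.writestr(aip["id"], json.dumps(aip))

def load_aip(archive, package_id):
    """Read one package back by name; the zip directory acts as the index."""
    return json.loads(archive.read(package_id))

# In-memory stand-in for a zipfile on disk (an assumption for this sketch).
buffer = io.BytesIO()

with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    store_aip(archive, {"id": "suki-aip-0001", "items": []})
    store_aip(archive, {"id": "suki-aip-0002", "items": []})

with zipfile.ZipFile(buffer, "r") as archive:
    names = archive.namelist()
    loaded = load_aip(archive, "suki-aip-0002")
```

Lookup by name works because a zipfile carries its own directory of members, which is what makes a separate indexing step unnecessary here. One caveat worth checking before relying on this: identifiers containing characters such as `/` or `:` may need escaping to be safe as member names.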

More on the application architecture soon... suffice it to say for now that I'll be managing SIP handlers and ID cross-referencing in an RDBMS with a Django admin front-end and Django templates for OAI-PMH and unAPI responses.

Don't get me wrong, I *love* aDORe, and think MPEG21 DIDLs are a fine idea, as is XMLTape, and I'd advise anyone to go that route. But I don't need all that just now; instead, I just need this to be up, and running, and ingesting millions of items, like, yesterday, because I have, like, no time and no funding to build this repository. Hence the oversimplification. I think I can get this whole thing done and sucking up mass quantities of data in a matter of days.

Source URL (retrieved on 2008-10-13 15:27): http://onebiglibrary.net/project/mpeg21-didl-in-json-sorta

Links:
[1] http://curtis.med.yale.edu/dchud/log/idea/why-not-json
[2] http://arxiv.org/abs/cs.DL/0503016
[3] http://nost.gsfc.nasa.gov/isoas/ref_model.html
[4] http://african.lanl.gov/aDORe/projects/adoreArchive/
[5] http://www.json.org/
[6] http://www.python.org/doc/current/lib/zipfile-objects.html