
A simple, old design for widespread blog mirroring

(Remembered this after last week's interview, and thought I would re-post it in case it makes any sense to anyone, assuming a few more interested people might be watching just now.)

Back at the Access 2004 Hackfest I worked with a few folks to sketch out a design for widespread copying and mirroring of blog content, for distributed-copy and "preservation" purposes. I think we came up with something that could definitely work, at low cost, on a wide scale. Looking at it again today, I'd substitute Atom for RSS, and maybe reconsider using METS (perhaps just using Atom for both purposes instead).

It's a pretty simple idea: you extend an aggregator system to "archive" the entries posted each day into files distributed over BitTorrent, and then build a secondary system to turn the data from those torrents back into browseable "blog" mirrors if/when you need to. The best part is that you don't really need any new technology to do it.
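To make the two halves a little more concrete, here is a rough sketch in Python, not the Hackfest design itself. It assumes Atom input (per the note above about swapping Atom for RSS), groups entries by the date in their `updated` timestamp, writes each day out as a small standalone Atom document, and renders such a document back into a bare HTML page. The function names, the sample feed, and the one-file-per-day layout are all made up for illustration; building the actual .torrent around each daily file is left out and would be handled by whatever BitTorrent tooling you already use.

```python
import html
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM_URI = "http://www.w3.org/2005/Atom"
ATOM = "{" + ATOM_URI + "}"
ET.register_namespace("", ATOM_URI)  # serialize Atom as the default namespace

# Hypothetical aggregated feed, made up for this example.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Aggregated blogs</title>
  <entry><title>First post</title><updated>2004-10-13T09:00:00Z</updated>
    <content>Hello.</content></entry>
  <entry><title>Second post</title><updated>2004-10-13T17:30:00Z</updated>
    <content>More &amp; more.</content></entry>
  <entry><title>Third post</title><updated>2004-10-14T08:15:00Z</updated>
    <content>Next day.</content></entry>
</feed>"""


def group_entries_by_day(atom_xml):
    """Map 'YYYY-MM-DD' to the list of <entry> elements updated that day."""
    root = ET.fromstring(atom_xml)
    by_day = defaultdict(list)
    for entry in root.findall(ATOM + "entry"):
        updated = entry.findtext(ATOM + "updated", default="")
        day = updated[:10]  # RFC 3339 timestamps begin with YYYY-MM-DD
        if day:
            by_day[day].append(entry)
    return dict(by_day)


def build_daily_archive(day, entries):
    """Serialize one day's entries as a standalone Atom document (bytes).

    This is the file you would wrap in a torrent (not shown here).
    """
    feed = ET.Element(ATOM + "feed")
    ET.SubElement(feed, ATOM + "title").text = "Archive for " + day
    feed.extend(entries)
    return ET.tostring(feed, encoding="utf-8")


def render_mirror_page(archive_bytes):
    """Turn a daily archive back into a very plain HTML fragment."""
    feed = ET.fromstring(archive_bytes)
    parts = ["<h1>%s</h1>" % html.escape(feed.findtext(ATOM + "title", default=""))]
    for entry in feed.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        content = entry.findtext(ATOM + "content", default="")
        parts.append("<article><h2>%s</h2><p>%s</p></article>"
                     % (html.escape(title), html.escape(content)))
    return "\n".join(parts)


if __name__ == "__main__":
    for day, entries in sorted(group_entries_by_day(SAMPLE_FEED).items()):
        archive = build_daily_archive(day, entries)
        print(day, len(entries), "entries,", len(archive), "bytes")
```

Note that the day is taken from the timestamp as written, so entries from feeds in different time zones are bucketed by their own local dates; a real system would probably normalize to UTC first.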

[Diagram: weblog mirroring system]

The main drawback is that you're dependent on the quality and completeness of what the source feeds give you to begin with, and that isn't always good enough (a feed that carries only excerpts, say, can't be mirrored in full). But I still think it could work.


This site is Copyright (c) 2005-2008 by Daniel Chudnov. All rights reserved.

All opinions stated here are my own, and do not reflect those of my employer.