Digitize it all: from law to code and standard, for public justice
Thursday August 09th 2012, 12:51 pm
If you haven’t visited recently, do so now. I’ll wait… you are in for a treat.

Carl Malamud and Friends (soon to be a show on CNN) have kept up the momentum of their early work to digitize and publish technical and other standards, many of which are now online in all their glory.

And there’s a lovely collection of introductions, from the 5-minute summary of why and how to free building codes, to a 20-minute showcase of what the team does. (via boingboing)

This is still rather top-down for my tastes — there’s no obvious way for me to help out, fund the digitization of a particular code, or run a digitizing party in my neighborhood library or FabLab. But I am inspired by the persistent work and vision of the people making this dream a reality.

They also have a lovely site devoted to a national project for scanning all the archives: YesWeScan. It gave rise to this excellent blog post and commentary from the Archivist of the US, David Ferriero*.

* Recently seen at Wikimania DC saying, in his beautiful closing speech, “If you have any trouble using Wikipedia… tell them, if it’s good enough for the Archivist of the US…”

Dilettantism? No, it’s intellectual vulgarization. -Philippe Charlier
Sunday July 08th 2012, 11:23 am
Dr. Philippe Charlier, forensic historical sleuth, tries to recreate the life and death of figures throughout history, from his office in Paris. He spends much of his time popularizing his findings. Some in his field criticize this hypervisibility.

Charlier replies: “I want to share everything I know with the greatest number of people. What I do is not dilettantism; it’s intellectual vulgarization.”

(HT to Elaine Sciolino & the Grey Lady)

A brief, awkward tale of abandoned policy: Old English 5P
Thursday June 21st 2012, 10:00 pm
This history from Ænglisc Ƿikipǣdia has it all.  Vandalism, pasta, unfinished translation, capricious bots (and bot edit wars)… typical low-points of pages on small wikipedias.  There must be better ways to do all of these steps.

Google’s Knowledge Graph: connecting structured knowledge from diverse sources
Tuesday May 22nd 2012, 2:12 pm
Stefano Mazzocchi and other former MetaWebbers now at Google have turned out another beautiful structure in the garden of human knowledge: the Knowledge Graph.

This helps visualize one key aspect of information meshes, though it has many limitations still. (It is only a graph, as the name suggests; as defined within Google it is only the part of the universal knowledge graph that they choose to bless as ‘clean’; it doesn’t include any data that they choose not to make publicly visible; and there is no higher level of structure to support a metric, or a multi-dimensional space).


How will YOU use 12M bibliographic records?
Thursday April 26th 2012, 7:57 am
Harvard Libraries recently released bibliodata from their collections – 12 million works in all – under a CC-0 license, which lets other sites and researchers reuse that data in any way possible.

This is the biggest release of bibliographic data of its kind, four times the size of a similar release by the British Library in late 2010.  (Without an explicit release under a free license, such collections of metadata are covered by ‘database rights’).

How would you reuse these records in your own work and dreams?  Some quick ideas:

  • WP or Wikisource could create 12 million stubs with those records
  • Open Library will improve and update its own metadata collection, which was built from scraped subsets of such data
  • We can write scripts that autogenerate “lists of works” for authors, and authors or categories for works
  • We can automatically find mismatches between our person-data and title-data and those in MARC
  • We can publicly clean up mistakes in the MARC catalog and suggest updates
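
As a rough illustration of the script ideas above, here is a minimal sketch in Python. It assumes the records have already been parsed into plain dictionaries; the field names (author, title, year), the sample data, and both function names are hypothetical, and real MARC records would need a proper parser first.

```python
from collections import defaultdict

# Hypothetical, simplified records. Real MARC data has many more fields
# and would need to be parsed into this shape before use.
records = [
    {"author": "Austen, Jane", "title": "Persuasion", "year": 1817},
    {"author": "Austen, Jane", "title": "Emma", "year": 1815},
    {"author": "Melville, Herman", "title": "Moby-Dick", "year": 1851},
]


def works_by_author(records):
    """Group titles under each author, ordered by year and then title."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["author"]].append((rec["year"], rec["title"]))
    return {
        author: [title for _, title in sorted(works)]
        for author, works in grouped.items()
    }


def find_mismatches(ours, theirs):
    """Compare two author -> titles mappings.

    Returns, for each author whose title sets differ, a pair of sorted
    lists: (titles only in ours, titles only in theirs).
    """
    mismatches = {}
    for author in sorted(set(ours) | set(theirs)):
        a = set(ours.get(author, []))
        b = set(theirs.get(author, []))
        if a != b:
            mismatches[author] = (sorted(a - b), sorted(b - a))
    return mismatches


catalog_lists = works_by_author(records)
our_lists = {"Austen, Jane": ["Emma"]}  # stand-in for a wiki's own data
for author, (only_ours, only_catalog) in find_mismatches(our_lists, catalog_lists).items():
    print(author, "| only ours:", only_ours, "| only in catalog:", only_catalog)
```

The same grouping would work for categories instead of authors, and the mismatch report is the kind of thing that could feed a public cleanup queue rather than overwrite either side automatically.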

Decentralized smarts, twenty-four eyes, crystal power: the amazing Cubozoa (box jellyfish)
Monday April 23rd 2012, 10:36 pm
Cubozoa, or Box Jellyfish, are remarkable creatures. Among jellyfish – the oldest multi-organ creatures on the planet – they are some of the most highly developed in terms of nervous response, memory, and sensory organs. Some Cubozoa species are among the most venomous creatures per weight on the planet, using a very effective poison for hunting.

They have a ‘neural ring’ which helps coordinate their nervous system, the closest thing to a brain that jellyfish have been observed to have. They have some capacity for memory and for learning from experience.

They live largely in mangrove lagoons, where as many as 25 different species of Cubozoa may occupy different ecological niches, and forage at different times of day.

And they have 24 eyes, 4 of which are ‘true eyes’ with corneas and retinas – two of which can see color! They have been observed to navigate by visual cues out of the water, such as trees on shore. The 20 lesser eyes sense light more simply, and some point straight up at all times, thanks to a keen adaptation: they grow small gypsum crystals within their bodies at the base of their ‘eye-stems’, which act as a plumb bob to keep the eye pointing skyward.

In general I am no great fan of jellyfish – and can’t quite believe I am writing about them – but in this case the eyes (and angels) have it. Cubozoa are amazing.

Taking the Steel Blogger Challenge
Thursday April 19th 2012, 12:29 pm
Much of my work recently has been about community creation, capacity, identity, energy, and relation to partnership-building and vision-setting. And how to listen carefully and plan well for that hindsight-enabled possibility-space we call the future. This affects everything from how we define the future we want to live, how we chart our own course in groups of all sizes, to how we raise funds, forge volunteer or sponsor relationships, and enable those around us to do work we’d like to see done.

I got excellent feedback on these ideas at last week’s OER meeting, where most projects wanted to be community-driven or maintained. A few people asked if I was writing a book, with varying levels of arm-twisting. So I’d like to get into better writing shape. Inspired by Cool Cat Teacher’s tireless blog and ideastream, I have been thinking about ways to publish thoughts and essays dozens of times a day. I enjoy writing short essays, and linking them to practical works or implications. I do this in some format – often on email lists or in response to private requests – every day. But I don’t currently do it methodically, publicly, in an archived or editable way. And there is a backlog of practical thoughts in my unintentionally-private tomboy notes about how my current communities could work better, both internally and together.

So in the spirit of doing ten thousand times whatever you want to eventually do well, I am taking on a personal Steel Blogger Challenge – publishing a post for every day this year. Retroactively. That gives me a bit of catching up to do. I don’t know quite how to coordinate this with tweeting and writing essays of various lengths – the ideal length here would be 100-200 words to capture the idea, with a few links, but editable. And I’m not sure how to make my writing editable, though I would like to let you all make revisions and post updates and links and cross-references.

And for the first time recently I feel let down by my blogging platform. I want a better way to publish many times a day, in many formats at once. Including quick personal notes, 140-char summaries, blog posts, longer monographs. Preferably with wiki-style versioned editable backend for every format. If you have toolchain suggestions, please let me know.

Primary sources matter. How do we convince journalists to cite them?
Wednesday March 14th 2012, 3:18 am
Some legal and political bloggers have written recently about an Arizona bill which reportedly “legalizes firing employees because they use contraceptives”. That’s the sort of claim which I always read with an invisible “citation needed” tag floating in the air next to it. But it took a frustrating few minutes to track the bill down; no posts linked to it, and few bothered to mention it by name.

Even the page about the bill on VoteSmart (a lovely site which focuses on tracking the progress of a bill and its changes/votes over time) has only a tiny, obfuscated link to the actual bill text. (I know that the raw bill isn’t the primary focus of that site, but I still expect it to be clearly linked from the top of the page about it.)

At any rate, here is the text-with-diffs of Arizona House bill 2625, “an act amending sections 20-826, 20-1057.08, 20-1402, 20-1404 and 20-2329, Arizona revised statutes; relating to health insurance.” The changes start at page 8.

It does not in fact legalize “firing employees” or other discrimination; but it does strike a special clause expressly prohibiting discrimination on the grounds of using contraceptives, which former lawmakers saw fit to include next to every description of how religious institutions should be allowed to opt out of providing coverage for them. I am certain that the history of the inclusion of that clause would be interesting; I also find it unlikely that every state has a similarly explicit reminder embedded in its healthcare laws.

So: deep linking to primary sources is important; not just to better inform readers, but also to find out if what you are writing is true. When most modern journalists were cutting their teeth in their first newsroom, deeplinks to source material within an article were impossible. Now they are a matter of a few minutes’ research. How do we reemphasize the value of this work?

