The Longest Now


Scoping human knowledge
Sunday March 14th 2010, 3:06 pm

As Wikimedia moves through its movement-wide strategy process, I’ve been thinking a lot about the scope of human knowledge, and about how far we’ve come since we started passing on oral, then written, then digital knowledge.

A progression of awareness

Wiki project logos

Some changes have come naturally with the development of our understanding of the universe.  As we understood more about the Earth and space, we gained a framework within which to fill out detailed maps, charts, and cross-sections.  As we learned about the component parts of the body, we could come up with a layered anatomy.  As we improved our understanding of mathematics, music, and language, we could identify different types of each, with classes of similarity and building blocks; this made more advanced knowledge possible, and possible to describe succinctly.

Some knowledge (say, 3D models of the center of the Earth) consists of so many small individual pieces of information that it was hard to dream of holding it in one place, before we had computers for automation and digital databases for collation.  Other valuable knowledge has yet to be developed (say, models for the efficiency of groups of millions of people; models for a full society of physical production and industry for 15 billion people where inputs match outputs and there are no ‘externalities’ as fudge factors).

Some things which we do not currently know are hard to describe, because language about them has not yet been developed.  Some things that we know imperfectly (say, a comprehensive species survey of the planet) require millions of individual observations, but are only pursued by thousands of expert individuals.  And some things we ‘know’ in some sense (say, the full text of every book in a nationally funded library in some country on the planet), but we are unable to access that knowledge readily.  If your life depended on finding a sentence in any one of those books in a week, you probably could; but you could not [yet] discover how many of those books contain a specific string.

Divide, specialize, automate, and conquer

Now we have the power to draw on input from billions of people with little more than publicity about how to contribute.  Connection and creation are easy and enjoyable, and some aspects of organizing knowledge (search, tracking) are easy and cheap.  So: what aspects of knowledge should we improve first?

Wikipedia has done a reasonable job of capturing and providing ready access to editable summaries of notable topics.  Wiktionary, OmegaWiki, and now Wordnik in a different fashion, have done something similar for access to editable information about words (though of these, only Wiktionary is readily editable by anyone).  OpenStreetMap has organized a few percent of the world’s road segments and features and inspired similar communities of practice.  Wikimedia Commons has organized some of the world’s best freely-licensed images, while smaller projects such as Fotopedia focus more keenly on beauty.

Freebase makes an effort to organize metadata about and links to over 10 million topics, on Wikipedia and in other publicly-readable databases, regardless of their underlying copyright.  And AboutUs has done a fair job of capturing information about internet domains. The Encyclopedia of Life and Wikispecies each aim to gather images and information about the almost 2 million known species, though neither comes close to comprehensive coverage yet.

Moving away from projects with comprehensive targets: Wikiquote offers an editable compendium of quotes, though it is less intent on being complete than some of the aforementioned projects.  Yelp efficiently covers services and businesses in a limited number of cities, and has inspired a new wave of amateur reviewers and critics.

Then there are public services that are not particularly editable or distributed, but that use bots and scripts to draw from the work of millions of others; they could be the seeds of truly great collaborative efforts.  The Internet Archive specializes in these, from the Wayback Machine, which has a comprehensively amazing repository of webpages, to OpenLibrary, with metadata and basic information about a few percent of the world’s published books.  And one could say that Google itself is based on the power of distributed organization of knowledge.

What comes next?

These projects cover only a small selection of the world’s knowledge, but a significant portion of the collaborative knowledge-gathering sites in the world. Shouldn’t there be hundreds or thousands of these projects?


6 Comments so far

“So: what aspects of knowledge should we improve first?”

I’d say going back to Socrates and Kant wouldn’t be a bad thing: “know thyself (γνῶθι σεαυτόν)” and “Dare to know! (Sapere aude!)”

We not only need information in the open, but also ways to relate it to our individual lives (as souls/consciousnesses/whatever).

We not only need information, but we also need to strengthen the will to enrich our private and public knowledge, to consider what “truth” is and why it is important.

My 2 philosophical cents 🙂

Comment by Bastien 03.18.10 @ 5:34 am

There are dozens of other projects but they are much smaller. Many of them are hosted on Flickr or Youtube or Wikia. Others are independent – Project Gutenberg, Wikitravel, TV Tropes, Appropedia, Citizendium, Conservapedia.

The problem all the smaller projects have is the network effect: if their main interest is in getting the information out rather than in running a site, then they will probably be more effective if they work as part of an existing project like Wikipedia.

Most of the independent projects are designed to complement the existing projects, not to compete with them; aiming to fulfill a function the big projects are not covering.

Comment by Joe 03.18.10 @ 6:34 am

Regarding “passing on […] knowledge”: ‘passing on’ is the easy part. Crafting and breeding the memes, the story fragments to be passed on, is much harder.

Wikipedia seems like journalism – good at transmission, but not at analysis. Where the source community is confused, the content is confused. Worse, when there is confusion, outstanding journalism becomes more selective of sources, but WP becomes less attractive to expert contributors, who are deterred by the mess and rot.

WP seems like the collective equivalent of rote memorization – regurgitating memes which have been seen elsewhere. Extensive research and analysis, to create and improve memes, is not the mission.

Unfortunately for society, it’s a neglected mission. Confusion abounds. It’s at least arguably a major cause of our science and engineering education failure. WP has trouble keeping something as basic as “What color is the Sun?” correct, in part because there’s confusion even among professional astronomers. And it’s not that spectroscopists are unaware, but that the incentives and infrastructure are poor for fixing even colleagues and students, let alone anyone else.

WP could be improved, and perhaps even play a role here. Imagine WP with a “notebook” tab in between “article” and “discussion”, separating wordsmithing from analysis, and discussion from its conclusions. Article text is a poor place for a persistent analytic framework – it’s too fragile, and has incompatible goals. A feature article is a very different thing from the writer’s notebook, file, outlines and sketches, which were used to create it. The current collective WP writer is an amnesiac with a misplaced notebook.

Dramatically improving science education content seems one task for a new and dedicated wiki. There appears to be interest – grad students are anecdotally enthusiastic about the idea of collectively crafting introductory material to reflect their theses and the current literature. There’s just been too much of an impedance mismatch between that task and WP. Creating such a wiki has long been on my todo list – if anyone is interested, let’s talk.

Comment by Mitchell Charity 03.22.10 @ 1:54 am

first, the grammar nazi remark: s/decsribe/describe 😉
now, the real comment: I really enjoyed your post. I would also point out that you could have mentioned Freebase. And several Wikia wikis are filling great spots as well. As for whether there should be dozens more of such projects, I generally agree, as long as there’s no overlapping/duplication of effort. The main feature that attracted me to Wikipedia was that there is only one version of each article, which everyone can work on and incrementally improve.

Comment by waldir 03.24.10 @ 7:47 am

Thanks for all of the comments. Waldir, you’re right that Freebase deserves a mention; so do the all-species projects. It’s interesting to me that many of the other projects exist as willing repositories for certain types of knowledge, but without a drive to be a comprehensive resource for all useful knowledge in that domain.

One of the strengths of Wiktionary and Wikipedia is that they do have a clear mission to be comprehensive, so that when a contributor finds out about some other source of encyclopedic or linguistic knowledge, they go out of their way to make it possible to integrate that new material.

I should probably include Yelp as well; they approach ‘businesses and services’ from a comprehensive perspective. Upcoming, Meetup and the like, in contrast, are more of the ‘whatever people happen to submit’ school of thought.

Comment by metasj 03.31.10 @ 6:07 am

And Bastien, you have it just right. We need to understand why knowledge matters, what sorts of knowledge matter most, what kinds lead to new knowledge, and even when knowledge (out of balance, in the wrong order, or otherwise) is counterproductive.

Comment by metasj 04.19.10 @ 5:07 pm


