I dunno why the New York Times appeared on my doorstep this morning, along with our usual Boston Globe (Sox lost, plus other news), while our Wall Street Journal did not. (Was it a promo? There was no response envelope or anything. And none of the neighbors gets a paper at all, so it wasn’t a stray, I’m pretty sure.) Anyway, while I was paging through the Times over breakfast, I was thinking, “It’s good, but I’m not missing much here...” when I hit “Hot Story to Has-Been: Tracking News via Cyberspace,” by Patricia Cohen, on the front page of the Arts section. It’s about MediaCloud, a Berkman Center project, and features quotage from Ethan Zuckerman and Yochai Benkler…
(pictured above at last year’s Berkman@10).
The home page of MediaCloud explains,
The Internet is fundamentally altering the way that news is produced and distributed, but there are few comprehensive approaches to understanding the nature of these changes. Media Cloud automatically builds an archive of news stories and blog posts from the web, applies language processing, and gives you ways to analyze and visualize the data.
This is a cool thing. It also raises the same question that is asked far too often in other contexts: Why doesn’t Google do that? Here’s the short answer: Because the money’s not there. For Google, the money is in advertising.
Plain enough, but let’s go deeper.
It’s an interesting fact that Google’s index covers the present, but not the past. When somebody updates their home page, Google doesn’t remember the old one, except in cache, which gets wiped out after a period of time. It doesn’t remember the one before that, or the one before that. If it did it might look, at least conceptually, like Apple’s Time Machine:
If Google were a time machine, you could not only see what happened in the past, but do research against it. You could search for what’s changed. Not on Google’s terms, as you can, say, with Google Trends, but on your own, with an infinite variety of queries.
I don’t know if Google archives everything. I suspect not. I think they archive search and traffic histories (or they wouldn’t be able to do stuff like this), and other metadata. (Maybe a Googler can fill us in here.)
I do know that Technorati keeps (or used to keep) an archive of all blogs (or everything with an RSS feed). This was made possible by the nature of blogging, which is part of the Live Web. It comes time-stamped, and with the assumption that past posts will accumulate in a self-archiving way. Every blog has a virtual directory path that goes domainname/year/month/day/post. Stuff on the Static Web of sites (a real estate term) was self-replacing and didn’t keep archives on the Web. Not by design, anyway.
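To make the time-stamping point concrete, here is a minimal sketch in Python of how an archive might read a post's date straight out of a permalink shaped like domainname/year/month/day/post. The pattern, the function name `permalink_date`, and the example.com URLs are all hypothetical, made up for illustration. This is not how Technorati or anybody else actually did it, and it assumes the blog really does put the full date in its URLs.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical pattern for permalinks shaped like
# domainname/year/month/day/post (an assumption; real blogs vary).
PERMALINK = re.compile(
    r"^(?:https?://)?[^/]+/(\d{4})/(\d{1,2})/(\d{1,2})/([^/?#]+)/?$"
)


def permalink_date(url: str) -> Optional[date]:
    """Return the date embedded in a date-stamped permalink, or None."""
    match = PERMALINK.match(url)
    if match is None:
        return None  # no year/month/day path, e.g. a static page
    year, month, day = (int(part) for part in match.groups()[:3])
    try:
        return date(year, month, day)
    except ValueError:
        return None  # digits were present but are not a real calendar date


if __name__ == "__main__":
    print(permalink_date("example.com/2009/08/13/some-post"))
    print(permalink_date("example.com/about"))
```

A URL without the full year/month/day path simply comes back as `None`, so a real archive would need some other source for the date in that case, such as the timestamp in the feed.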
I used to be on the Technorati advisory board and talked with the company quite a bit about what to do with those archives. I thought there should be money to be found through making them searchable in some way, but I never got anywhere with that.
If there isn’t an advertising play, or a traffic-attraction play (same thing in most cases), what’s the point? So goes the common thinking about site monetization. And Google is in the middle of that.
So this got me to thinking about research vs. advertising.
If research wants to look back through time (and usually it does), it needs data from the past. That means the past has to be kept as a source. This is what MediaCloud does. For research on news topics, it does one of the many things I had hoped Technorati would do.
Advertising cares only about the future. It wants you to buy something, or to know about something so you can act on it at some future time.
So, while research’s time scope tends to start in the present and look back, advertising’s time scope tends to start in the present and look forward.
To be fair, I commend Google for all the stuff it does that is not advertising-related or -supported, and it’s plenty. And I commend Technorati for keeping archives, just in case some business model does finally show up.
But in the meantime I’m also wondering if advertising doesn’t have some influence on our sense of how much the past matters. And my preliminary response is, Yes, it does. It’s an accessory to forgetfulness. (Except, of course, to the degree it drives us to remember — through “branding” and other techniques — the name of a company or product.)
Just something to think about. And maybe research as well. If you can find the data.
I’d argue that the money is there in certain contexts. Amazon, Pandora and similar engines use technology to profile past user behaviour (research) to ultimately generate sales, advertising’s frequent objective. Granted, this may play a bit loose with your definition of advertising. But still, a third way, perhaps?
Always enjoy your posts.
Isn’t keeping such historical records traditionally the function of research libraries? I know that when I want to look at old newspapers, I go to a public library, and look at either paper copies or microfilm copies of the newspapers, but they don’t keep everything. I feel sure there are at least some libraries (perhaps associated with universities) that try to keep comprehensive news records. I hope they have stepped up to preserving the news from online sources, but I don’t specifically know of any that do.
I suppose archive.org doesn’t have the capacity to try to archive all of the news. I hope there are other places that are keeping archives of the online news — several places, at least. MediaCloud might well be one of them, but certainly shouldn’t be the only one.
Doesn’t the Internet Archive provide the ability to look back at (some of) the former home pages for sites? So once you know what site you’d like to use for research, you can then jump to the Internet Archive to look for past information.
And not “every blog has an actual directory path that goes domainname/year/month/day/post.” Some bloggers don’t like putting dates anywhere in the article or in URLs. And other bloggers only use year and month in URLs.
Early morning, pre-coffee, mid thunderstorm thoughts
Agree on all points regarding hard “news” but maybe the “foot in the door” is in soft “news”
The world of celebrity news, fan-stuff, and, of course to top it off all the retro fads, music, fashion, etc.
Seems that there is a world out there wrapped up in fluff, but maybe that is the first place to “monetize” an effort.
While I can remember poring over microfilm rolls to do “research” for papers/projects for high school/college/later, and used to have piles of trade publications in the office, with most I can now do contextual searches (but, granted, within the silo of the publisher).
Now where’s that coffee
There is no question that advertising requires us to be in the here and now, and not in the there and then, because it seeks to influence our desires and actions. Active repression of time, history, the past is basic to most commerce and commercial speech.
But I’d go further, because this is a large and important topic. Broadcast itself as a medium tends to put the past at a distance, even when it is about the past, because it makes it into spectacle. Something we watch from our NOW, the big now of advertising and current media.
And yet further: no media are more dis-attuned to the past than news media. It is all about the next story. That one last week that was entirely wrong? Ancient history. To be current, in news-speak, is to develop a sort of targeted Alzheimer’s in a certain direction.
Here’s a quote to help put in context my remarks above: http://interimtom.blogspot.com/2009/08/so-does-social-media-augment-or-inhibit.html
Pingback from Doc Searls Weblog · Geology vs. Weather on August 13, 2009 at 9:57 pm