Months ago I was trying to decide whether it was worth it for me to purchase a Kindle reader (not an early adopter, I know). Since I was not sure, I decided to try first with the Kindle app on my laptop. I downloaded it, played around with it for a bit, and forgot about it.

A few days ago I decided to try again and started reading my first Kindle book on my computer: Eli Pariser’s “The Filter Bubble”. I was actually enjoying the reading (both the content and the experience). Then, as I turned a page, I saw a passage that was lightly underlined and had a tag next to it that said “16 highlights”. Amazon was telling me how many readers of the book had considered that passage worth highlighting. That made me remember why I had not continued using the app when I first downloaded it: I somehow felt that someone was reading over my shoulder and tracking what I was doing.

The mild discomfort I felt when I saw the highlighted passage was again forgotten when I started thinking about a research question: How does the information provided by Kindle on other readers’ highlighting affect the way I highlight when I read? It’d be nice to run an experiment: some readers get to read the book with information on other readers’ highlights, and other readers get to read the book without that information. How different is the way they highlight the book? I started wondering whether Amazon has already run that experiment, and about the possible results.

I continued reading “The Filter Bubble”, and then, on page 29, I came across the following passage: “When you read books on your Kindle, the data about which phrases you highlight, which pages you turn, and whether you read straight through or skip around are all fed back into Amazon’s servers …” I could not help feeling the mild dizziness of self-reference.

At that point, I immediately remembered Julie Cohen’s article “A Right to Read Anonymously”. I realized I had it on my hard drive, as a PDF file I downloaded from SSRN long ago. And I decided to read it again using Adobe Reader.  In the opening passage, she already sets the stage: “the new information age is turning out to be as much an age of information about readers as an age of information for readers. The same technologies that have made vast amounts of information accessible in digital form are enabling information providers to amass an unprecedented wealth of data about who their customers are and what they like to read. […] [I]ncluding, quite possibly, information that the reader would prefer not to share”.

I got happily immersed in this act of anonymous reading. Halfway through the piece, I realized my eyes were getting tired (too much on-screen reading recently) and I decided to print out the remaining pages. On the first one coming out of the printer Julie calls for readers’ right to decide: “Reader profiles are valuable to marketers precisely because they disclose information about the reader’s tastes, preferences, interests, and beliefs. That information is content that the reader should have a constitutionally protected interest in refusing to share”. At that point I wondered whether Kindle offers readers the option of refusing to share certain information. And so I went to Kindle’s terms of use. There I read that “the Software will provide Amazon with data about your Kindle and its interaction with the Service (such as available memory, up-time, log files, and signal strength). The Software will also provide Amazon with information related to the Digital Content on your Kindle and Other Devices and your use of it (such as last page read and content archiving). Annotations, bookmarks, notes, highlights, or similar markings you make…” That was it: “the software will” and the software’s will.

For more information, I was directed to Amazon’s privacy page, where I read that Amazon, in addition to the usual pieces of information (IP address, login, email, password, cookie, browser, computer, and connection-related information) and the purchase-related data (purchase history, the full URL clickstream to, through, and from the site, products viewed or searched for), routinely collects the phone number used to call its 800 number, uses Flash cookies, and “may use software tools such as JavaScript to measure and collect session information, including page response times, download errors, length of visits to certain pages, page interaction information (such as scrolling, clicks, and mouse-overs), and methods used to browse away from the page”.

On that page there is also a whole section on the information Amazon receives about us from third parties, but nothing on what (if anything) Amazon shares with third parties. But as I went back to Julie’s piece, I read that “the chilling effect on individual freedom to read and react to a work arises not only because information about one’s reading habits might be shared with others, but also because it is collected at all…” And I started to wonder whether this chilling effect was going to keep me away from e-books (will the enjoyment of reading electronic books be too tarnished by the “reading over my shoulder” effect?), or influence my choice of some books over others (and not just because Amazon would shape my interaction with the store to reflect the profile of me they’ve created), or even affect the way I highlight what I read. And, furthermore, wasn’t this another manifestation of the filter bubble Pariser was talking about in the very book whose reading started this reflection?

Coincidentally, as of today, amazon.es, Spain’s version of Amazon, is up and running. Since I currently live in Madrid, I got excited about reduced shipping costs and delivery times. For the time being, however, amazon.es will only be selling physical products (no e-books), and I am not sure whether I should feel glad or disappointed.

Google has become such a huge and hyperactive company that it is often difficult to keep up with all its initiatives, and even its conflicts. Among the latter, the one that has been drawing the most attention recently is its trouble in Europe over the collection of data from private wireless networks by its Street View cars. But I have been following another conflict that, even though it is not getting much publicity, I find quite interesting.

In late April, a coalition of abortion clinics in Spain asked Google to lift its ban on abortion service ads in Adwords Spain. In 2008, Google updated its advertising policies to forbid abortion clinic ads in 15 countries: Germany, Poland, Hong Kong, Taiwan, Singapore, Malaysia, Philippines, Indonesia, Argentina, Brazil, Mexico, Peru, France, Italy and Spain. [You can read here an interesting exchange of emails between Planned Parenthood Oklahoma and a Google representative regarding this decision.] The recent petition in Spain was triggered by the upcoming legislative change that will lift some restrictions on abortion practices in the country (abortion was until now legal, but subject to wider restrictions). What ensued was an exchange, both in private and through the media, between Google and the petitioners, with Google showing its willingness to discuss the issue and review its norms if necessary, and the clinics enlisting the Government’s support and threatening legal action.

This whole exchange led me to Adwords’ policy page, in which Google lists the types of ads that are not allowed and specifies the country variations to the general policy. To my surprise, I could not find any reference to abortion clinics in it. And I wonder why. Looking at all these forbidden topics and their local variations also made me wonder how exactly these decisions are made inside Google. Who decides what is acceptable advertising, and on what grounds? Regardless of the answer, the page is a great read for understanding the local contours of Google’s version of “…evil”.

In the hope of bringing some clarity to a bunch of ideas I’ve been struggling with over the past few months, I went back to read some classic texts about the internet. Among them, I got a hold of J.C.R. Licklider and R.W. Taylor’s paper “The Computer as a Communication Device”, describing the structure, features, and impact of a computer network devoted to communicating (soon to exist in the form of ARPANET). The paper, published in 1968, has a few awkward passages, especially its view of communication as “cooperative modelling.” For Licklider and Taylor, communication as cooperative modelling can be summarized in this way (the drawings are Rowland B. Wilson’s):

Other than this, it is full of prescient ideas and sharp forecasting. They talk about “communities not of common location, but of common interest”, they anticipate voice over IP (“You will seldom make a telephone call; you will ask the network to link your consoles together”), they foresee how the network will collect and use information about our personal relations (“It will know who your friends are, your mere acquaintances. It will know your value structure, who is prestigious in your eyes…”), they anticipate debates about digital divides and rights of access (“Will ‘to be on line’ be a privilege or a right? If only a favored segment of the population gets a chance to enjoy the advantage of ‘intelligence amplification,’ the network may exaggerate the discontinuity in the spectrum of intellectual opportunity”) and they even offer a nice visualization of what could be described as a denial of service attack by brute force:

But there were two paragraphs that especially caught my attention and led me to write this post. The first one talks about the cost of the network and is within a section entitled “Who can afford it?”. It says:  “In the field of transmission, the difficulty may be lack of competition. At any rate, the cost of transmission is not falling nearly as fast as the cost of processing and storage. Nor is it falling nearly as fast as we think it should fall. […] it will be the dominant cost. That prospect concerns us greatly and is the strongest damper to our hopes for near-term realization of operationally significant interactive networks and significant on-line communities”. Ring a bell?

The second one is the last paragraph of the paper, and formulates their final prediction on the impact of the network: “Unemployment would disappear from the face of the earth forever, for consider the magnitude of the task of adapting the network’s software to all the new generations of computer, coming closer and closer upon the heels of their predecessors until the entire population of the world is caught up in an infinite crescendo of on-line interactive debugging”. What an image!

Last September Adobe purchased Omniture for $1.8 billion. In my view, it was a surprising mix, and I went around trying to understand what was behind it. While for some people the deal seemed to make business sense, others were more skeptical. And I left it at that. But a couple of weeks after the operation was announced, I came across a post in a rather laconic blog that pointed to Flash cookies as a key element for understanding the deal. That post referred to a piece in Wired magazine that discussed Flash cookies, a piece which, in turn, referred to a paper written by a group of young researchers at UC Berkeley on the use of Flash cookies in 100 of the most visited sites. By the end of the day I had a pretty clear idea of what Flash cookies were, and of how they were being used. And it was unsettling (if you retrace my steps you can probably understand why). However, I was not finding many references to Flash cookies anywhere, and I was starting to wonder whether I was simply reading too much into the issue.

Last week, in the talk I gave at the Berkman Center, I mentioned the Berkeley study to illustrate the different mechanisms of data collection in use by the advertising industry. Doc Searls, who was acting as informal host for the talk, decided to ask the people attending whether they had heard of Flash cookies and, to my surprise, most of them hadn’t. Considering the audience, this response made me wonder again whether I was attaching too much importance to the issue. On the other hand, and this was the alternative interpretation of the response, if the issue was relevant and not even the people in that room were aware of it, then we really had something to be concerned about.

Interestingly, just three days after the talk, I got news of a report on the use of Flash cookies for tracking purposes, commissioned by BPA Worldwide from Web Analytics Demystified (you can get it here in exchange, of course, for some personal information). This white paper does not add much factual information to what is already known (it actually borrows quite a bit from the Berkeley study and relies to a large extent on Google searches), but it clearly points to Flash cookies as a concern, both for the advertising industry (fear of regulation) and for internet users (covert collection of information). In terms of the former, the white paper concludes “that companies making inappropriate or irresponsible use of the Flash technology are very likely asking for trouble (and potentially putting the rest of the online industry at risk of additional government regulation)”. As for the latter, Flash cookies are characterized as “super-cookies which are dramatically more resilient than cookies due to their implementation and a general lack of knowledge about their existence among consumer”.

What better way to start the year (and the decade?) than by starting a new blog? It is actually my first personal blog ever, and I feel I should probably explain in this (its first) post why now (not before, not later), why the name, and what this is about.

Let’s start with the timing. I’ve always felt that having a blog (or a Facebook profile, or a Twitter account) gives you the opportunity to do things (communicate, get in touch, express yourself, you name it) that are not possible without it. But I also feel that with this opportunity comes a certain responsibility. Namely, I feel I need to keep the “conversation” alive and I feel that whatever it is I write should have some value to people other than me. The first issue, keeping the conversation alive, can be seen as planting a seed and acquiring the responsibility of watering and nurturing it so that it turns into a healthy plant. I’ve always been terrible with plants and I have always thought I would be equally terrible at nurturing a blog (and who needs another dried-out blog?). However, this academic year I have been released from certain pressing and time-consuming obligations, which allows me more time to take care of plants like this. Regarding the second issue, the value of what is written: this is always difficult to assess from the point of view of whoever is writing it. But I try to follow the advice of Guatemalan writer Augusto Monterroso when he said that “only those words that are better than silence deserve to live”. To follow this advice in the age of digital communication is particularly hard (and I am not sure I will be able to), but after spending a few months as a fellow at the Berkman Center, surrounded by smart and interesting people, discussing and enacting very relevant ideas, I believe my chances are higher than ever before.

And this takes me to the second question, the name. If a blog is like a plant, and the point is to write only those words that are better than silence (which coming from me would probably be few), I guess moss is a good candidate. Thus, what I hope I will be doing here is mossing a tiny portion of the blogosphere. (Besides, I always wanted to use that line from a Leonard Cohen song based on a poem by Federico García Lorca that goes “And I’ll bury my soul in a scrapbook / With the photographs there, and the moss”, and with this blog name I feel I can.)

Finally, what will this blog be about? It’ll be about the issues I am thinking about these days, which have to do with the Internet, communication and commercialization, advertising and control. But that doesn’t mean that other issues will not come up (I am sure they will). I just hope you’ll like the moss.