
Being a student again…


One of the sessions of our freshman seminar that I’ve grown to particularly like is the last one, where the students are responsible for coming up with a topic, putting together the reading list, and leading the discussion. It’s always interesting to find out what they think we missed during the semester, and it is fun to be a student again, getting a chance to read up on and think about something new.

This time around, the students decided to run a class session on bitcoin and blockchain. I was delighted; I’d been planning to take some time to learn about these technologies, and now I had the chance to do so with others finding the right readings and with study time built in. The students put together a great selection of readings and videos, so I got to dive in.

I want to look into the technology more deeply, but what I’ve seen so far doesn’t really impress me. There are some interesting aspects to it; I like the idea of provably unchangeable ledgers (even though it isn’t all that new) and find the incentive system clever. In theory, it should all work fine. But in practice, I don’t see how it is going to scale.

In a distributed system, scale is the hard part. Getting something that works for 2 machines is a lot harder than getting something to work on a single machine. Getting something to work on thousands of machines is a lot harder still. And where things really get hard is when you try to deal with all of the possible faults that can occur in the system– from machines going down (the simplest form of fault) to networks losing packets to partitions to software bugs.

Blockchain seems to start with the idea that you can get agreement on a distributed database; the base paper points out that this requires agreement by 51% of the machines. The first thing to notice is that this means the system is only worried about simple faults like machines crashing, since a well-known result in distributed systems shows that if you are worried about Byzantine faults (like those caused by software bugs, or by someone trying to subvert the system) you need agreement from two-thirds of the participants in the system, plus one. I’m also not sure how, in a system as open-ended as blockchain, you know how many machines are taking part. Which is a prerequisite for knowing when you have reached agreement with 51% of them.
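
To make the thresholds concrete, here is a minimal sketch (my own, not from the readings) of the two quorum sizes in question: a simple majority is enough only when machines merely crash, while tolerating f Byzantine participants requires n ≥ 3f + 1 machines and agreement from 2f + 1 of them. Note that both calculations assume you already know n, which is exactly what an open system doesn’t tell you.

```python
def majority_quorum(n: int) -> int:
    """Smallest group that guarantees any two quorums overlap (crash faults only)."""
    return n // 2 + 1

def byzantine_quorum(n: int) -> int:
    """Smallest group needed for safety with Byzantine faults, given n = 3f + 1."""
    f = (n - 1) // 3      # the most Byzantine participants n can tolerate
    return 2 * f + 1      # "two-thirds of the participants, plus one"

for n in (4, 10, 100):
    print(n, majority_quorum(n), byzantine_quorum(n))
```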

Then there is the use of time in the blockchain system. Time is exceptionally tricky in a large-scale distributed system (you get genuine relativistic effects), so assuming even roughly synchronized clocks is generally a bad idea. Looking at how they use time, it is also pretty clear that all that is needed is a sequence number, which is easier to implement. But the folks doing blockchain seem to not understand this.
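
For what it’s worth, the ordering that is actually needed can come from a plain logical counter. Here is a minimal sketch (mine, assuming nothing about any real blockchain implementation) of a Lamport-style sequence number: it only ever moves forward and requires no synchronized clocks.

```python
class LamportClock:
    """A logical sequence number: ordering events without wall-clock time."""

    def __init__(self) -> None:
        self.counter = 0

    def tick(self) -> int:
        # a local event happened; advance the sequence
        self.counter += 1
        return self.counter

    def receive(self, remote: int) -> int:
        # a message arrived; jump past anything the sender has already numbered
        self.counter = max(self.counter, remote) + 1
        return self.counter

a, b = LamportClock(), LamportClock()
stamp = a.tick()                   # event on node a
assert b.receive(stamp) > stamp    # b now orders itself after a's event
```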

My overall impression is of a system that combines some interesting cryptographic techniques in a distributed system built on a large set of naive or incorrect assumptions. Which wouldn’t be a problem if the system were being used for experiments. But I worry when such a system is being used for money, or contracts, or other legal matters, since then any bug or oversight could lead to an exploit.

And when I try to think of how to change the system so that it doesn’t have the problems I see, I’m stumped. At base, I don’t know how you scale a system based on an assumption of computational difficulty. I feel like I’m watching the modern equivalent of the Children’s Crusade– well-intentioned, with passionate advocates, but doomed not to end well because it isn’t tied to the realities of the situation.

So I’ll dig some more to see if I can find where I’m wrong. I hope I am, but am afraid that I’m not. Til then, I think I’ll not invest in this particular technology.

I would like to thank all of the seminar participants for being great teachers, both in the last session on bitcoin and blockchain and during the rest of the semester. You are a remarkable group of people, and I hope you stay in touch with us and each other. The truth about Harvard is that the faculty learn much more from the students than the other way around, and this seminar was a great example of that. Drop by, drop me a note, but don’t drop out of my life…I miss you all already.

Caught between bad and worse…


We spent a lot of time over the semester talking about fake news and on-line speech, but many of those concerns came together when we focused this last week on social networking. There are a lot of concerns, especially with the big Internet companies like Google and Facebook having so much data that they could reasonably be thought of as knowing more about us than even our closest friends or family members. As Bruce Schneier has said on a number of occasions, he never lies to Google about his interests.

I’m more than a bit conflicted about the notion that Google or Facebook knows much about me. They have the data that could, if looked at, tell them a lot about me. But Google and Facebook are corporations, and corporations aren’t the kinds of things that know anything. There are algorithms that use the data to deliver content to me, but I’m not sure that the algorithms or the machines on which they run know in any interesting sense, either. People are the kinds of entities that know, but I don’t think anyone at Google or Facebook can access all of my information (both companies rely on trust for their business; having anyone there looking at all of this data would destroy that trust). So while they have the data that could let them know a lot about me, I’m not convinced that they in fact know about me.

That being said, there are certainly many unintended consequences that appear to be arising out of the social media environment we have today. Some of the seminar participants talked about the blows to the sense of self that middle-schoolers they know have taken through social networks. The kinds of hate speech, fake news, and cyber bullying that take place over these networks are at least disturbing, and I have a lot of sympathy for those who say that someone should do something about it.

But then I hit the troubling question of what can be done, and who is responsible for doing it?

I’m not at all a fan of the idea that the government, at any level, should start policing standards on the Internet. Except in the most extreme cases (shouting “fire” in a crowded theater), I’m pretty absolutist on free speech. This means that I do defend the right of people to say pretty terrible things, be they racist, sexist, or simply stupid. I wish they wouldn’t say these things, but I also don’t want the government to decide what can and can’t be said, since it isn’t clear that it will be done correctly, or that the notion of correctness is steady over time.

But I’m also not sure I can get behind companies like Google, Facebook, or Twitter becoming the arbiters of what speech is acceptable or what is reality. And I’m not sure whether the notion of these groups deciding what can be posted or what can be deleted is any different than their current ability, via algorithms and the data they have amassed, to decide what you see and don’t see in your search results or your feeds.

Targeting of information is something that has been going on for a long time. Advertisers decide where to place ads by the demographic the ad is meant to reach and the supposed demographic of those who are watching some particular content. I’m always amused to watch the Harvard-Yale game on television; the usual football-broadcast pickup truck ads are replaced by ads for Jaguar and Lincoln. Knowing who your viewers are is important.

But the granularity of the knowledge that is available now means that what I see is as different from what you see as my interests are from yours; we are being pushed apart in our view of the world by the slightest differences that can be determined by our on-line histories. Where the media was once a force that brought us together by showing us a shared reality, the Internet is now allowing the new media companies to divide us into more and more isolated and specialized groups. When we don’t have the shared experience, we have trouble understanding what others are thinking. And when we only communicate with those who think the way we do, there are fewer social checks on the kinds of things that we are willing to say and think.

Perhaps more worrying is that we can no longer be sure that what someone is saying to me is what they are saying to you. If I can target my message, I can also craft different messages to different targets. The idea that others know what has been said to me, and can make sure that the message doesn’t change when it is delivered to someone else, is one of the ways that we ensure our common understanding. With the right targeting using the right information, this is no longer guaranteed.

I’m not at all sure where this all leads. My hope is that this is an adjustment phase, where we become aware of what the new kinds of media can do and as a society react to that new media, much as the introduction of television moved us from a regional to a national sharing. But this seems different, in that it targets us more individually rather than exposing more of us to the same thing. We still need to be able to air our views without others telling us what can and can’t be said. The Internet, which has brought us all together, is now being used to segment us like never before.

Cyber war, cyber crime, and jurisdiction


It’s an odd thing about ‘cyber’ as a prefix– with the exception of cyberspace, it almost always means something bad. We have cyber-crime, cyber-war, cyber-bullying, but never cyber-puppies or cyber-joy. And most of the people working in technology don’t use the term at all. But it is a big thing in government and policy circles.

We had a great discussion in the seminar this week with Michael Sulmeyer about cyber war. The subject is complicated by the difficulty of distinguishing between cyber war, cyber crime, and cyber espionage. There are rules about war, but they were developed for the kind of conflict that occurs in physical space. The rules for conflict in the digital world are not well understood. And the notion that the two spheres of conflict will remain distinct is something that few believe. We have already seen some attacks that move from the digital world to the physical world, but there is little understanding of how an escalation from the digital world to the physical world would work. What are the rules, and what can be expected from adversaries? Without having some notion of reasonable escalation, it is hard to tell where any attack will end.

One worry that I have is that the pace of change in technology is so much faster than the pace of change in the policy and legal worlds. Getting countries to talk to each other about the rules of cyber engagement takes years, and reaching an agreement takes even longer. By the time treaties can be written and agreed upon about some aspect of technology, the technology has changed so much that the agreements are irrelevant. How to get these time scales more in synch is a difficult problem.

But I think a larger problem is getting the right set of players into the discussion. Most countries think that discussions about trans-national conflict need to take place between countries, which is reasonable in the physical world. But when we talk about the cyber world, just having the various countries at the table misses a major set of actors– the technology companies that are building and shipping the technology that makes up the cyber world. As was pointed out in our reading by Egloff, we now live in a world where major players include the corporations, much as was the case during the age of exploration. Keeping these players out of the discussion means that major forces are not represented. Companies like Google or Apple may be based in a single country, but their interests cannot be fully represented by their home government. They are powers themselves, and need to be represented as such.

It may seem strange to think of the tech giants in this way, but no more so than seeing the influence of the East India Company or the Hudson Bay Company during the age of exploration. It took a couple hundred years to work out the law of the sea; I hope that we can do better with cyberspace.

Governing the ungovernable


Many thanks to Jonathan Zittrain for joining us this last week to talk about Internet governance. JZ is always thought-provoking, entertaining, and leaves you thinking about things you haven’t thought about before. I feel lucky to count him as a friend and colleague.

Talking about the Internet Engineering Task Force (IETF) is always fun and interesting. The IETF is governance as done by the geeks; it doesn’t really exist (at least legally), it has participants rather than members, and those participants (even when they work for and are supported by a company or government) are expected to represent themselves, not their employers. It is a technocracy, running on the principles of rough consensus and running code. In many ways, it is just a somewhat larger version of the groups of graduate students who got together when their advisors told them to write the code for the original ARPAnet. But it is hard to argue with the power of what they produced, even if you can’t understand how they could have done it.

The other aspect of the IETF that tends to confuse people steeped in governance is its enforcement mechanism. Passing laws or creating standards isn’t much good if there is no way to encourage or force others to follow those laws or standards. After all, passing a law doesn’t do much good if you don’t have police to enforce the law.

But here the IETF is different, as well. It has no enforcement power. If you don’t implement a standard that the IETF publishes as an RFC, no one will challenge you. There are no fines to pay, and no one goes to jail. Nothing happens.

Except, of course, that you can’t communicate with any of the computers that do implement the IETF standard. Nothing says that a computer has to speak TCP/IP, and nothing happens if the computer doesn’t. Including getting the traffic from the other computers that do implement the standard.

In fact, there are lots of IETF RFCs that haven’t been implemented. There is even a group of them that are famous (well, famous in the IETF community) for being April Fools jokes. Some of my favorites are RFC 3092, an etymology of the term “foo”; and the standard for electricity over IP (RFC 3251). Not all RFCs are taken seriously, even those that are meant to be by the proposers.

But the core RFCs define the interoperability of the Internet, and as such they become self-enforcing. You don’t have to follow them, but if you don’t you are shut out of the community. And if you want to replace them, you need to get others to not only agree to the replacement, but get them to do so simultaneously with everyone else. Which is pretty much impossible. So the best approach is to simply go along with what everyone else is doing, and follow the standard.

This is much of the reason that groups like the ITU or various parts of the United Nations, that would dearly love to have control over the Internet, can’t quite figure out how to take that control. They might declare that they own the standards (they in fact have). They can insist that everyone change to use their standard (they have done this, as well). But they can’t make it worth anyone’s while to make the change, so they have no enforcement mechanism.

It’s enough to make a bureaucrat cry. Which is enough to make the geeks smile, and continue…

Who do you trust?


Our seminar this week was billed as talking about voting and the Internet, but rather rapidly changed into a discussion of fake news, polling, and how to determine what is true. Another technology class going philosophical in front of our eyes. Towards the end of our discussion, we went around the table to say who it is that we each trusted, and the answers were both interesting and revealing. Trust in parents was phrased in the past tense. Most institutions were not considered particularly trustworthy. Most often mentioned were crowd-sourced sites like Reddit, Wikipedia, and Quora. Occasionally a news magazine like the Atlantic was mentioned, but I found the lack of trust in any sort of expert-based or curated site interesting.

Trust in experts seems to be at an all-time low, or maybe it is simply that we don’t recognize experts in some fields. With the advent of the Internet, everyone believes he or she can be a journalist, even though professional journalists go through a lot of training on how to ensure that they have multiple sources, how to balance between the public’s right to know and the safety of releasing information, and the like. One of the real differences I see between the leaking of the Department of Defense and State Department information by Chelsea Manning and the leak of NSA information by Edward Snowden is that Manning gave the information to WikiLeaks (which then released everything) while Snowden gave the information to a team of journalists (who decided what should be released and what should be held back, balancing the right to know with the damage the information could do). One can argue that this is not a material difference, but in the Snowden case there was a reliance on trained expertise that was missing in the Manning case.

There have certainly been times when the experts have made huge mistakes. The Vietnam War has often been blamed on the hubris and self-deception of the “best and brightest” around Robert McNamara. Reports of weapons of mass destruction in Iraq that came from intelligence experts (although, it should be noted, other experts disagreed) led to another war, the consequences of which we are still seeing (and paying). Just because someone is an expert doesn’t mean that they are always right.

But we seem to have come to a point where the possibility of being wrong is confused with the certainty that someone must be wrong, or at least so prejudiced that their conclusions can’t be trusted.  The press is either liberal leaning or conservative, so neither can be trusted. Many people seem to take E.B. White’s stance, that “All writing slants the way a writer leans, and no man is born perpendicular.”

People tend to forget the whole thing that White was saying– the full sentence is “All writing slants the way a writer leans, and no man is born perpendicular, but many men are born upright.” Just because someone has a point of view doesn’t mean that they are wrong, or somehow dishonest, or manipulative. Trust, as Ronald Reagan said, may require verification. But imperfection doesn’t mean that trust is impossible. I certainly believe that the New York Times has a point of view, but I don’t believe that it impacts the truth of what they report. I often see the same reporting in the Wall Street Journal (on the news pages), a publication with a very different point of view. This leads me to trust both (for the news), as opposed to, say, Fox News or the National Enquirer, where I find the stories much more difficult to independently substantiate.

The push towards crowd-sourcing of knowledge and away from trust in expertise appears to rest on the assumption that the prejudices and distortions of a large populace will be evenly spread around the truth, so using the wisdom of the crowd will cancel out the individual prejudices. But I find little or no evidence that this is generally true, in spite of the nice democratic flavor of such a stance. Around matters of technology, I find that there are people who simply know more than others, and are better able to solve certain problems. I trust climate scientists more than, say, Senators on the subject of climate change. This is a form of elitism, but one that I’m willing to live with. It doesn’t mean that these people know about everything, or even that they are right in everything they say about their particular subject. But they are more likely to be right than someone randomly picked.
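
A toy calculation (mine, not from the readings) makes the point: averaging a large crowd cancels independent noise, but a distortion the crowd shares survives no matter how many people you ask.

```python
import random

random.seed(0)
truth = 100.0

# independent noise only: the average converges on the truth
unbiased = [truth + random.gauss(0, 10) for _ in range(10_000)]

# the same noise plus a shared distortion of +15: the average converges on 115
biased = [truth + 15 + random.gauss(0, 10) for _ in range(10_000)]

print(sum(unbiased) / len(unbiased))   # ~100
print(sum(biased) / len(biased))       # ~115
```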

It is tempting, given the difficulty of having to think about what is really true, to take the stance that nothing can be trusted and it is all relative. Unfortunately, the universe doesn’t really care if we believe the facts or not; the facts are as they are. Disbelieving experts can lead to rather bad outcomes; thinking that there is no difference between truth and lies (or mis-statements) can lead to other bad outcomes (as we are seeing). Finding at least the best approximation of the truth can be difficult, but not doing that work is worse.


Empire and Innovation


I’m late in posting again this week, but this time I have a reason. Our discussion last time (and many thanks to David Eaves for visiting and leading the session) was about the interaction of the Internet and government. By coincidence, I had agreed to go to Washington, D.C. on Friday of last week to give a talk to interested Senate staffers about cyber security. So I thought I’d wait until after the trip to see if going to the seat of the federal government would trigger any thoughts.

The trip was fascinating– I had been asked to give a talk that was the conclusion of a month-long series of talks and panel sessions, organized by the Sergeant at Arms for the Senate, on cyber security and social media. The Sergeant at Arms is, essentially, the head administrator of the Senate, running all of the departments and groups that allow the Senate to do its work. My audience was made up of members of these administrative units, along with staff members for the Senators themselves. There were about a dozen people in the room, but the talk was also broadcast on the Senate’s own version of CNN, both within the Senate office buildings (there are many) and to the field offices of the Senators.

The room where I gave my talk was one of the (many) Senate hearing rooms. It was impressive, even by Harvard standards. Beautifully painted ceiling (with a zodiac motif), high ceilings and huge windows, lots of wood and carvings, and a raised area with a table and chair for the Senators (blocked off so no one would enter the space). After the talk I got a great tour of the Capitol itself, one-on-one with a staff member of the computer security group, which let me go all kinds of places that are generally not open to the public. The size of the place, the scale of the rooms, and the history recalled were all pretty awe inspiring and a bit overwhelming.

The only places I could compare it to that I have visited are the Coliseum in Rome, the Doge’s palace in Venice, and St. Peter’s in the Vatican. All monuments to their empires, all built at the height of that empire’s power.

But as I was feeling the awe (and pride) caused by seeing the Capitol, I couldn’t help but think of the places I knew out in Silicon Valley, in the New York tech scene, or in the tech companies around Boston. None of them were as beautiful and impressive as what I was seeing. But there was also a sense of ponderousness, of self-satisfaction, and of slow moving deliberation in the halls of the Senate that contrasted sharply with the feeling of excitement, experimentation, and energy that I remember from the tech companies.

All of which makes me wonder about the effect and interaction between the world of government and the world of technology. We talked some in the seminar about how technology can improve the delivery of services by government, but often that is just government adopting technology that has been used in the rest of the world like reasonably designed web sites and APIs that allow access to information in a way that enables others to write useful applications. This may be new and different in the world of government, but has been the norm in the rest of the world for a decade or more.

David’s stated worry was that government could use technology to impose a surveillance state and become far more controlling than anything thought of by Orwell. We have seen some things (like the Snowden revelations) that might back this view up, but so far I haven’t seen evidence that the government agencies can move quickly enough or competently enough to really carry this off. Nor do I think that the government believes that it has to…the environments in which those running the government work, like the Senate, are designed to make them feel that they are already masters of the world. Why would they need to do something different?

I have a very different worry– that the tech companies will move so fast in comparison to what happens in government that they make the Senate and the rest of government pretty much irrelevant, at least on a day-to-day basis. Yes, we will need the government bodies that deal with defense and foreign affairs to continue dealing with those subjects, but our everyday lives will be molded and controlled by corporations that move so fast and change so rapidly that the government bodies that are supposed to regulate those companies, protect us from their abuse, and ensure that they don’t overstep the bounds of law and ethics are simply left behind. It has taken a year for the federal government to even start investigating what role tech companies like Google, Facebook, and Twitter played in the last election. Imagine how quickly a tech company would go out of business if it took a year to react to something like that.

I’m generally an optimist, and I don’t think that tech companies (or any other kinds of companies, for that matter) are actively evil. But they are currently pretty much unanswerable to the public, and this is beginning to worry that same public. We need to find some way of addressing these issues, but it won’t be by slowing down tech so that it matches the pace of government. The time scales are too different, and the incentives are too far out of alignment. We need a new approach to these problems, one that combines speed with responsibility. Our technologists need to think beyond the technology to the effects of that technology, and our legislators and regulators need to learn to understand both where technology is and where it is going. I don’t see an easy answer, but this is a problem we need to solve.

Projection and Confession


I increasingly find myself in the middle of discussions about what the machines are going to do when they become smarter than we are. This worry has fueled science fiction for as long as there has been science fiction (I have an anthology from the 1940s and early 1950s where the theme shows up). But the conversation has taken on a new immediacy since the deep learning advances of the past couple of years. Machines now play go at the highest level, machine vision is getting much better, and there seem to be new breakthroughs all the time. So it’s just a matter of time, right?

I’m not so sure.

My first area of skepticism is whether, as the AIs get better and better at what they do, that they come closer and closer to being sentient or thinking. Computers play chess in a very different way than people play chess, and I suspect that the new silicon go champions are not approaching the game the way their human counterparts do. I’m always reminded of Dijkstra’s comment “The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” Submarines certainly move through the water well, but it isn’t what I would call swimming. And while computers can do lots of interesting tasks, I’m not sure it makes sense to say that they think.

Saying that they think projects our own model of what it takes to solve a problem or do a task onto (or into) the machines. We tend to build these kinds of models, where we anthropomorphize non-human things, all the time. We even do it with machines– we talk about how our computers are out to get us, or the personality of our cars. Of course, we also project our internal life to other people, where all we have as evidence is that they act and react like we do. But we also share a lot more with other humans (evolution, biology, and the like) that makes the projection seem a bit more reasonable (although not provable, as the problem of other minds is still active in philosophy).

So I tend to be a bit skeptical of the claim that, because machines can do the things that we do, they are therefore able to think and be conscious in the same way we do.

But even if I were willing to grant that at some point of complexity, and with the ability to learn, the computers of the future will become sentient and self-aware, I’m not sure that the worries of many who talk about the singularity are warranted. My skepticism here is about the unstated assumption that if the machines become sentient, they will also behave in the way that people behave. The worriers seem to jump from the singularity to the conclusion that the new, super-intelligent machines will keep us as pets, or try to destroy us, or set themselves up as gods, or simply not need us and treat us as ants.

Maybe this too is projection. If the machines are sentient the way we are, they will act the way we do. I tend to see this as more a case of confession on the part of the people doing the worrying– this tells us what they would do if they were the super-intelligent beings. But the motivations for humans are much more complex than what sentience or even intelligence dictates. We are still wired with desires for food, and reproduction, and all sorts of other things that have nothing to do with learning or being intelligent (if you think human behavior is driven by intelligence, you haven’t been paying attention).

So I’m not at all sure that machines will ever be intelligent or sentient, but if they are, I’m even less sure I know what will drive their actions. A super-intelligent machine might decide to go all Skynet on us, but I think it is just as likely to ignore us completely. And just as we don’t understand how many of the current machine algorithms actually work, we might not understand much about a super-intelligent machine. Because of this, on my list of worries, the singularity doesn’t make the cut…

Prematurely right…


Part of our reading for the discussion of the Internet of Things this last week was a selection from the National Academies study Embedded Everywhere. I was part of the study group, which published the book back in 2001 (after a couple years of the study). It’s interesting to look back at something like this– we got so much of the big picture right that I’m surprised, especially since we got almost all of the details wrong.

What we got right was how ubiquitous networked sensors were going to be. We saw them being used to help with traffic congestion, agriculture, climate science, seismology, and a host of other things. All of which is being done.

What we got wrong was how all this would happen. We talked about “smart dust,” a notion that the miniaturized sensors, with low-power networking and long-lasting batteries, would be sprinkled around the environment so that the data they collected could be fed back to banks of servers. There was even a project at UC Berkeley that produced what they called motes that were seen to be a first step along this path. Somewhere in my office I think I still have one or two of these hanging around. But it turned out that these were never as small as we had hoped. Batteries didn’t get better as fast as we thought they might. And calibration of large numbers of sensors turns out to be an amazingly difficult problem to solve unless you have some way to get a human to deal with the individual sensors.

Instead, all this happened through cell phones. There were cell phones back when we wrote the study, but they were used as, well, phones. They didn’t have GPS embedded. They didn’t include accelerometers, or other forms of sensing. The networks they connected to were cell networks, optimized for voice and pretty bad (and limited) when transferring data. They were interesting and useful devices, but they weren’t the kinds of sensor platforms we were envisioning.

Like the early users of the ARPAnet and the Internet, we didn’t see what Moore’s Law and the acceleration of network technology were going to do to our little phones. Within a short number of years Apple introduced the iPhone, which was really a small networked computer with enough sensors to make things interesting. Restrictions on the accuracy of civilian GPS were lifted just before the study was published, but we had no idea the size of the impact that would have. As sensors, cameras, and microphones became smaller and better, they got pushed into the phones. The networks for the cell phone system got better and better, both in bandwidth and reliability. Calibration ceased to be a problem, since all of the sensors were paired with a person who could help with the calibration. Soon all of the data we had hypothesized being sent to clusters of servers was being gathered and sent. Just by a different set of technologies than we had been able to imagine.

The connection of people to the networked sensors caused other good things to happen, as well. People could get involved in the projects that we originally thought would be machine-only. There were community government projects to allow people to report pot-holes (automatically, based on bumps that their cars encountered), damaged sidewalks (where pedestrians could take a picture of the sidewalk, with a GPS tag, and send it to the local government), and the monitoring of air pollution or how well transit systems were keeping to their published schedules (which have given way to real-time schedules that tell you where the busses or trains are, not where they are expected to be).
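
The pothole reporting is a nice illustration of how simple the sensing side can become once a phone is in the loop. A rough sketch (purely illustrative; the field names and the threshold are my assumptions, not any real city’s system) might look like this:

```python
from dataclasses import dataclass

@dataclass
class Sample:
    lat: float
    lon: float
    accel_z: float            # vertical acceleration, in g

BUMP_THRESHOLD_G = 2.5        # assumed; a real system would calibrate per phone and vehicle

def pothole_reports(samples):
    """Yield a GPS-tagged report for every sample that looks like a hard bump."""
    for s in samples:
        if abs(s.accel_z) > BUMP_THRESHOLD_G:
            yield (s.lat, s.lon)

trip = [Sample(42.37, -71.11, 1.0), Sample(42.38, -71.12, 3.1)]
print(list(pothole_reports(trip)))    # [(42.38, -71.12)]
```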

It’s another reminder to all of us technologists. We tend to think of what the technology can do on its own. But the most valuable uses of the technology pair the technology with the people who use it, sometimes in unexpected ways. We think of how we can use machines to replace us, rather than how we will use machines to enhance what we do. But the valuable uses of technology are in enhancing the human enterprise, and that’s how we end up using the technology, even when it wasn’t designed for that. A lesson that we need to keep in mind, since we seem to constantly forget it.


Good service or privacy invasion…


You wouldn’t know it now, but there was a time when I was a pretty good dresser. Fashionable, even.

I will admit that this was through no fault (or effort) of my own. But when I was a teenager, I bought my clothing at a single store, where there was a clerk who essentially picked out all of my clothing. I would go in a couple of times a year, and he would tell me what to buy, and what went with what, and how I could combine what I was buying with what I already owned. He didn’t quite stitch the garanimals on the clothing for me, but it was close. He knew everything that I owned, and all I had to do was follow his instructions.

When I went off to college, I was on my own and things degraded quickly. But for a while there I had great service, and was willing to pay a price, knowing that there was someone who knew more about what was in my closet than I did. He also cared more than I did. But I liked the result.

I can now foresee a time when this sort of service could be offered again, pretty much to everyone. It could be done via the web, using an AI that was able to filter my preferences (taken from what I had bought in the past) and add some sense of fashion taste, and offer me the right clothing to buy. It would know more about what was in my closet than I did, and could customize itself to my taste and preferences.
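
To be clear about what I have in mind, the filtering itself doesn’t need to be exotic. Here is a toy sketch (entirely hypothetical; the items and attributes are made up) of scoring a catalog against past purchases so that related items rank above repeats of what I already own:

```python
from collections import Counter

closet = [{"navy", "blazer", "wool"}, {"white", "shirt", "cotton"}]

catalog = {
    "grey wool trousers": {"grey", "trousers", "wool"},
    "navy blazer":        {"navy", "blazer", "wool"},    # a repeat of something owned
    "beach shorts":       {"orange", "shorts"},
}

owned = Counter(attr for item in closet for attr in item)

def score(attrs):
    """Reward overlap with past purchases, but push exact repeats to the bottom."""
    if any(attrs == item for item in closet):
        return -1
    return sum(owned[a] for a in attrs)

for name in sorted(catalog, key=lambda n: score(catalog[n]), reverse=True):
    print(name, score(catalog[name]))
```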

But we tend to worry about all of the data that on-line corporations like Amazon and Google know about us, given all of the data that they gather. We worry about the micro-targeting of ads (which can also be seen as only showing us ads about things in which we are interested) and the honing of our news feeds that put us in a bubble. Because of this, there is talk of regulating the data that these companies gather about us, limiting their ability to know what we have done and predict what we will do.

While I share a lot of these concerns, I also wonder if we are letting the unfamiliarity and “creepiness” of the situations dictate an emotional response that may not be justified. When I hear people talking about how much, say, Google knows about me, I wonder who it is that actually knows. Corporations don’t know things; they are abstract entities. The computers at Google don’t know things, either; they process data and perform algorithms, but they no more know anything than submarines swim. Is there any person at Google who knows all of this information about me? (I have friends at Google who know a lot about me, but that isn’t from tracking the data I enter on-line.) There might be someone at the NSA who knows about me (although I doubt I’m that interesting), but I don’t think there is anyone at Google.

One big difference between the technologies that know about me and the clothing store clerk of my youth is that I trusted the clerk. He was another human, and I talked with him and interacted in a way that led me to respect his opinions (and do what I was told). There is no such trust relationship with the technologies with which I interact. But that could change, if the sophistication of the algorithms improves. Rather than showing me things I have just purchased to see if I want to purchase them again, maybe the algorithms will start showing me things related to those purchases. The more they could be like the trusted clerk, the less I would be creeped out.

I don’t think they will ever get to the point that I will be a fashionable dresser again. But it might mitigate some of my worries about privacy. Maybe…

Design…


End-to-end Arguments in System Design is one of my favorite papers in the computer science universe. It is well written, and it clearly states a design principle that was followed in the creation of the network we now know as the Internet. It gives some nice examples. What more could you want?

Well, I’d like more papers like that. In the computer science/software engineering/programming community, we don’t often talk about what makes a good design, or why we made the design decisions that we make. Oh, there are lots of books and articles that tell you how you ought to go about doing design. Maybe you should use pattern languages, or you should adopt an agile methodology, or you should do everything in UML diagrams. I’ve even seen combinations, saying that you should use an agile pattern language (no doubt producing UML diagrams). All of these can be useful as mechanisms for communication of a design, but I don’t find any of them actually talking about what makes a good design good, or how to go about doing such a design.

Writing about good design is far rarer. There is Butler Lampson’s classic Hints for Computer System Design. This is a great paper, but it is also pretty old (although, surprisingly, not outdated). There are a couple of books that tried to talk about design at different levels (Beautiful Code for programming, and Beautiful Architecture for collections of programs), but the results in both are mixed (full disclosure: I wrote a chapter in the second and am not at all sure that it was much of a success). I’ve always liked Paul Graham’s Hackers and Painters (the blog rather than the book), but it is more a contemplation on the art and craft of programming than on design. I’ve tried to talk about how to get good design, but it is a very slippery topic. Even Fred Brooks has written a book on the subject, which I like but is also, in the end, somewhat less than satisfying.

One of the reasons, I believe, for the lack of literature on the subject is that it is so hard to say anything that doesn’t seem to be either trite or just wrong. We all can agree that simplicity is a good thing, but what makes a design simple? Breaking a system into modules that make sense is trite; saying what it is for a module to make sense is difficult. You can recognize a good design when you see one, but explaining what a good design will be before the fact– well, I’ve never seen that done all that well.

After thinking about this for some time, I’ve come to the conclusion that good design is more art than science, more craft and taste than knowledge and process. As Brooks says, often the best that one can say is that good design comes from good designers. Good designers are often trained by having apprenticed with other good designers, but that’s about as far as you can go with the explanation. Even that training (which is really an apprenticeship) may not be all that conscious. I know of companies that set up mentoring programs in the hopes of getting good design training, but my apprenticeship (and I remember it well) was much more like that in Kill Bill, Vol. II– I spent a lot of time doing the computing equivalent of carrying water, and often felt bruised and battered. But I learned.

This is another way in which system design is like art– the way you learn to be an artist is to try, hear criticism from a  more experienced artist, and try again. Like a painting, the design of a system is never finished– there are always things you would like to change, decisions you would like to revisit. Sometimes you decide that it is good enough, and you ship. But you know you could always do better.