
Archive for the 'What is the Internet?' Category

Back on the air…


Hard to believe that I haven’t written since last November, but there it is. Now the second version of our Freshman Seminar is up and running, so I have a forcing function to write…

I’m always fascinated by reading the early history of the ARPAnet/Internet. The whole effort was an amazing leap of faith by a group that didn’t know if what they were trying to do would ever work. They certainly had no conception of how it would scale and the importance it would have in the current world. They were just trying to get something that would let them do their work more easily.

Which leads me back to a question I often consider: how can standards best be set? The standards that enable the Internet (TCP/IP, DNS, and the like) were, for the most part, built to solve problems people were actually having. They weren’t built by an inclusive, multi-stakeholder, democratic standards organization. The IETF, the organization that has the most to do with Internet standards, is run by a process that causes governance people to faint. It’s a bunch of geeks figuring out how to get the job done. But the results are pretty good.

I’ve argued elsewhere that the best technology standards are the ones that describe an existing tool or technology that everyone uses because it is useful. The C programming language, for example, was standardized well after everyone who was going to use it was already using it. The purpose of the standard was to write down what we all knew. The early days of Java(tm) were the same, as was Unix (which had, in its early days, two standards: one from AT&T and the other from Berkeley). I think of these standards efforts as writing down what are already de facto standards. This is valuable, and the standards seem to work.

I contrast these standards with those that begin with a standards process. These are technologies like Ada, the object-oriented language mandated by the Department of Defense and built by committee, or the ISO networking standards, produced by a full multi-stakeholder process. While they may have their good points, they tend to be clumsy, ugly, and not very useful (and therefore not very used). These are attempts to impose standards, not describe them. I used to claim that the reason managers liked these standards is that they turned technical questions (which the managers didn’t understand) into political questions (which they did). But perhaps I’ve mellowed with age; I now think an even more basic problem with these committee-invented standards is that they are created without a particular problem to solve, and so solve no problems at all.

One of the real blessings of the early Internet was that no one other than the engineers working on it thought it was important enough to be worth putting under some form of governance. The geeks got to solve their problems, and the result is a system that has scaled over many orders of magnitude and is part of the infrastructure of all technology. But one of the reasons that it works is that those who started it had no idea what they were doing, or what it would become.

Technology and government


Our discussion in our Freshman Seminar this last week concerned how technology in general and the Internet in particular could be used to help governments (at all levels) be more responsive and deliver services better. We were fortunate to have David Eaves join us; he has been an advocate for open data and using technology in government for some time, so getting advice from someone who has been there is really helpful.

What I keep thinking about, after the discussion, is the number of large technical projects within various parts of government that fail. There are the really visible failures, like healthcare.gov (on its initial rollout), the FBI’s attempts to implement a case management system, or the repeated attempts to replace the air traffic control system. Depending on the statistics you want to cite, 72% of government IT projects were in trouble in 2009, or 70% of government IT projects fail, or only 6.4% are successful. David claimed that things are not that different outside of government, and you can certainly find studies that agree with this. In fact, reading these studies, it is surprising that anything ever works at all.

My problem with all these studies is that they fly in the face of my own experience. I spent about 30 years developing software in industry. I was on lots of projects. There were some that weren’t successful in the market for one reason or another. There were a couple that were stopped when the company I was working for got bought and the new owners decided that those projects weren’t the sorts of things they were interested in. But I was never on one that failed because we just couldn’t get the system to work.

David did distinguish between projects that were done in “technology” companies versus those done by everyone else, and I certainly worked in technology companies. But over the past 6 years I’ve been working in one part or another of Harvard Information Technology. Harvard is hardly a technology company (don’t get me started…), but in that time we have successfully rolled out a new course management system, a new student information system, revamped the Identity and Access Management system, moved most of our email from local servers to the cloud, and done a ton of other projects as well. Not all of them have done exactly what everyone hoped they would do, but they have all pretty much worked. None had to be abandoned, or re-written from scratch, or got deployed and then turned into a disaster.

So what is the difference between the facts and figures that we see about project failure and my own experience? Maybe I have some sort of magic about me, so that projects I join or observe are somehow saved from the fate of all of these others. That would be really useful, but I don’t think it is the right explanation. I think I’m good, but I don’t think I’m magic.

I’m more inclined to think that the difference has to do with what the managers of the projects care about. In most of the government projects I’ve heard about, and in lots of the non-governmental projects that have failed, managers have been more concerned about how things get done than about anything else. That is, the worry is about what kind of process gets followed. Is the project being run using the waterfall model (which was first discussed in a paper saying that it was the wrong way to manage a software project), or various forms of agile development (which is a new cult), or some other method? These are approaches that managers really hope will make the development of software predictable, manageable, and, most importantly, independent of the people who are on the project. All of these models try to make the developers interchangeable parts who just do their job in the same way. It doesn’t matter who is on the project, as long as the right process is followed.

This is in contrast to what I saw through my career, and what I see in companies that might be thought of as “tech” companies now. In these projects, the worry was all about who was on the project. There was a time I gave talks about what I called the Magnificent Seven approach to software projects. The process was straightforward: hire a small group of experienced professionals, let them deal with the problem as they saw fit, and if you found a kid who could catch fish barehanded, ask him or her along. This was hardly an idea that I came up with by myself; you can see it in The Mythical Man-Month and other things written by Fred Brooks.

A process-based approach seems a lot more egalitarian, and in some ways a lot more fair. It means that you never have to tell someone that they aren’t good enough to do the job. It is good for the company, because you don’t have to pay outrageous salaries to engineers who are probably a pain in the tail to manage because they think (often rightly) that the company needs them more than they need the job (since, if they really are that good, they can find another job easily). So I certainly understand why managers, and large bureaucracies like various levels of government, want to focus on process rather than individual talent.

But then you have to explain the difference in success rates. If focusing on process gives a success rate somewhere between 40% and 5%, and focusing on talent does a lot better (I don’t have numbers, but my anecdotal experience would put the success rate of really high-performance teams in the 85%+ range), then maybe making quality distinctions isn’t a bad idea. I’m not sure how you get the various levels of government to accept this, but I think if we are going to have governments that are dependent on good technology, we need to figure out a way.

The Singularity, and Confessional Language


In our seminar this last week, we talked about the Singularity, that point at which machines become smarter than people, and start designing machines that are even smarter so that the gap between humans and computers running AI programs just gets larger and larger. Depending on who you listen to, this could happen in 2045 (when the computing power of a CPU will, if current trends continue, be greater than that of the human brain), or sooner, or later. There are people who worry about this a lot, and in the past couple of weeks there have even been a couple of Presidential studies that address the issue.

I always find these discussions fascinating, as much for what is presupposed in the various positions as for the content of the discussion. The claim that machines will be as “smart” as humans when the complexity of the chips equals the complexity of the human brain assumes a completely reductionist view of human intelligence, where it is just a function of the number of connections. This may be true, but whether it is or not is a philosophical question that has been under discussion at least since Descartes. Consciousness is not something that we understand well, and while it might be a simple function of the number of connections, it might be something else again. In which case, the creation of a computer that has the same level of complexity as the human brain would not be the same as creating a conscious computer, although it might be a step in that direction.

Then there is the assumption that when we have a conscious computer, we will be able to recognize it. I’m not at all sure what a conscious computer would think about, or even how it would think. It doesn’t have the kinds of inputs that we have, nor the millions of years of evolution built into the hardware. We have trouble really understanding other humans who don’t live like we do (that is the study of anthropology), and this goes back at least to Wittgenstein’s dictum that “to understand a language is to understand a way of life.” How could we understand the way of life of a computer, and how would it understand ours? For all we know, computers are in some way conscious now, but in a way so different that we can’t recognize it as consciousness. Perhaps the whole question is irrelevant; Dijkstra’s aphorism that “The question of whether machines can think is about as relevant as the question of whether submarines can swim” seems apt here.

Beyond the question of whether machines will become more intelligent than humans, I find the assumptions of what the result of such a development would be to tell us something about the person doing the speculation. There are some (like Bill Joy) who think that the machines won’t need us, and so will become a threat to our existence. Others, like Ray Kurzweil, believe we will merge with the machines and become incredibly more intelligent (and immortal). Some think the intelligent machines will become benevolent masters, others that we are implementing Skynet.

I do wonder if all of these speculations aren’t more in the line of what John L. Austin talked about as confessional language. While it appears that they are talking about the Singularity, in fact each of these authors is telling us about himself– how he would react to being a “superior” being, or his fears or hopes of how such a being would be. These things are difficult to gauge, but the discussion was terrific…

Money, bits, and the network


We had an interesting discussion in the freshman seminar on the impact of the Internet on the economy. We all see some of the disruptions– Uber and Lyft are causing major changes to the taxi companies, Amazon has done away with most brick-and-mortar bookstores (and many other kinds, as well), and the notion that you need to actually go somewhere in the physical world to buy something (that you will see, touch, and perhaps try out before you buy) is being restricted more and more to items that seem oddly special for just the reason that you need to see them before you buy. Even the paradigm case, the car, is having its buying habits changed by companies like Tesla.

We only touched on a more fundamental change in the economy that the computer and networking world is bringing about– the very notion of money. It wasn’t that long ago that what stood behind our currencies was a hot political topic– there were those who worried about currency that was only backed by silver, rather than by gold. The idea was that the value of money needed to be directly traced to some precious metal that the money represented. A $100 bill got its worth by being capable (at least in theory) of being exchanged for $100 in gold (or at least gold coins).

This notion was abandoned by most countries in the early part of the 20th century. What made $100 worth something is that someone else was willing to part with some set of goods and services in exchange for the bill. The final arbiter was that you could pay your taxes with such bills; the government always needs to be paid, so that kind of worth is a form of guarantee.

Now, the real money is just bits in a computer. Banks exchange the bits with each other, and we all hold our money in those banks, where it is represented as bits, and where we can transfer the bits via computers, or credit cards, or by direct electronic means. On occasion we exchange some bits for bits of paper (bills) that we can carry around with us and use to pay, but that is becoming less and less necessary as we all become used to the notion of money as bits.

This can be very convenient. I was recently in England for most of a week. My ATM card could be used to get English currency because bits are easy to transfer internationally. But I didn’t need the currency all that often; mostly I paid with credit cards that were tokens allowing me to directly move bits from one account to another. I remember many years ago when I first went to England, and needed to worry about getting traveller’s checks beforehand so I would have money while I was there. Just not needed any more.

All of which leads to the question of what money is now. It isn’t a representation of a precious metal. It’s more a consensual hallucination that we all believe will continue to be exchanged for goods, services, or (as a last resort) taxes. But it isn’t so much a thing as a representation on the Internet.

All of which brings me to one of my favorite characters in this space, J.S.G. Boggs. Boggs became (in)famous for drawing complex pictures of obviously fake U.S. bills (he often put his own face on the bill), and then passing the art at the value shown on the bill. So if the bill showed $100, he would ask for $100 worth of goods (or change). Merchants were happy to do this, since the bills were worth much more than their face value (as art). But it drove the U.S. Secret Service (which has as part of its job the enforcement of anti-counterfeiting laws) somewhat bananas. The courts finally sided with Boggs, who claimed that the work was performance art. But it was also something that had value of a sort that, at least in the initial transaction, was not like other forms of art.

Things are getting more complex today. Bitcoin is hardly performance art, but whether or not it is currency is under debate. It appears to have value (although it fluctuates considerably), it can be used for payments, and is certainly causing some authorities consternation. How much it changes the economy is yet to be seen, but it will certainly have an impact no matter what happens.

Design principles


As part of the seminar on the Internet that I help to teach, we read End-to-end arguments in system design, one of my favorite papers in computer science. What makes this such a great paper is that it takes the notion of design seriously. The reason to read the paper isn’t to learn about a nice implementation, or the proof of some theorem, or even how to make an algorithm that runs faster than others when doing some task. Instead, this paper talks about a general principle that informs any work that might be done that involves a network.

Simply put, the end-to-end argument tells us to keep the network simple, and do the work at the endpoints, since those endpoints will need to do most of the work themselves anyway. Worried about who you are talking to? Well, you need to authenticate at the endpoints, so there is no need to do that in the network. Need to check that the message hasn’t been corrupted in transit? That has to be done at the endpoint, so there is no reason to do so in the network, as well. It is an outgrowth of the idea that you don’t want to do work twice. So find the right place to do the work, and ignore it everyplace else. The result of following this principle is that the network we now use is simple, has scaled remarkably, and can be used for all sorts of things it was never intended to be used for. To introduce a new bit of functionality, all that needs to be changed are the end points that need that functionality, not the network itself.
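To make the argument concrete, here is a minimal sketch (my own illustration, not code from the paper) of an end-to-end integrity check in Python: the sender attaches a checksum, the network is free to corrupt what it carries, and only the receiving endpoint verifies the data and triggers a resend.

```python
import hashlib
import random

def send(payload: bytes) -> dict:
    """Endpoint A: attach an end-to-end checksum to the payload."""
    return {"data": payload, "digest": hashlib.sha256(payload).hexdigest()}

def unreliable_network(packet: dict) -> dict:
    """The network makes no guarantees; it may corrupt what it carries."""
    if random.random() < 0.1:  # simulate occasional corruption in transit
        packet = {**packet, "data": packet["data"] + b"!"}
    return packet

def receive(packet: dict):
    """Endpoint B: verify the checksum; returning None means 'send it again'."""
    ok = hashlib.sha256(packet["data"]).hexdigest() == packet["digest"]
    return packet["data"] if ok else None

# The endpoints, not the network, are responsible for correctness:
# the sender simply retries until the receiver's check passes.
message = b"end-to-end arguments in system design"
while (result := receive(unreliable_network(send(message)))) is None:
    pass  # the check failed somewhere in transit; send again
assert result == message
```

Notice that the network function knows nothing about the checksum; all of the correctness machinery lives at the two endpoints, which is exactly the point of the principle.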

It seems obvious now, but this principle was pretty radical at the time it was proposed. A lot of people thought it would never really work, never really scale, never really perform. It was fine for experimentation and toy applications, but not for real work (that needed token rings, or something far more reliable and guaranteed).

Articles that enunciate general design principles are few and far between, and should be treasured when they are found. My other favorite is Butler Lampson’s Hints on Computer System Design, a paper written in 1983 but still relevant today. The examples may be somewhat outdated, but the hints are still important to understand. The details of the work may change, but the underlying design principles are much closer to being timeless.

In the seminar, we also talked about the design notion that you can solve a problem by introducing a level of indirection. When the ARPAnet was first built, the networking was taken care of by simple, small computers called Interface Message Processors (IMPs), which were responsible for talking to each other, with each IMP connected to a host system. This isolated the differences between the host systems to a local phenomenon; if I wanted to connect, I had to deal with my local characteristics locally, but not with the special characteristics of the remote hosts I wanted to talk with or connect to. The IMPs offered a level of indirection. When it was realized that there were local networks that wanted to be connected, another level of indirection was introduced. This level looked to an IMP like a host, and to the local network like part of that network. Thus was the gateway born, allowing the idiosyncrasies of the local networks to be dealt with locally, not globally.

Each of these levels of indirection can also be seen as adding a layer of software into the system that translates from the local environment to the more global one. Each local environment may have a different gateway, but that is masked from the global environment. The power of software is that it allows a common external interface to be implemented in different ways that are hidden from those who don’t need to know about them.
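A small sketch may help make the idea concrete (this is my own illustration, with invented names, not the actual IMP or gateway code): each local network speaks its own frame format, a per-network gateway translates it into one common packet format, and everything on the global side only ever sees the common interface.

```python
from dataclasses import dataclass

@dataclass
class CommonPacket:
    """The one format the wider network agrees on."""
    destination: str
    payload: bytes

class Gateway:
    """The level of indirection: translate a local network's quirks
    into the common format, hiding them from everyone else."""
    def to_common(self, local_frame) -> CommonPacket:
        raise NotImplementedError

class EthernetLikeGateway(Gateway):
    def to_common(self, local_frame: dict) -> CommonPacket:
        # Local quirk: this network names its fields 'dst' and 'body'.
        return CommonPacket(local_frame["dst"], local_frame["body"])

class TokenRingLikeGateway(Gateway):
    def to_common(self, local_frame: tuple) -> CommonPacket:
        # Local quirk: this network hands around (payload, destination) tuples.
        payload, destination = local_frame
        return CommonPacket(destination, payload)

# The global side of the system handles only CommonPacket; each local
# idiosyncrasy is dealt with locally, inside its own gateway.
packets = [
    EthernetLikeGateway().to_common({"dst": "host-a", "body": b"hello"}),
    TokenRingLikeGateway().to_common((b"world", "host-b")),
]
print(packets)
```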

Discussions of software design have recently centered around various sorts of patterns. While these may be interesting, I do wish we as a community would talk more about the general principles that can inform a wide variety of designs. They are hard to find, often look trivial once they are known, but are important ways to think about how we build our systems.

Back again


It has been way too long…amazing how life intrudes on the best of intentions. But time to get back.

To help force myself, I’m teaching a freshman seminar (along with Mike Smith) in which we are requiring that the students keep a blog of their thoughts about the content of the class. And since it seems unfair to ask others to do what you are unwilling to do yourself, we committed to do the same. It’s one way to get back to writing.

The seminar’s topic is What is the Internet, and What Will It Become? One of the pleasures of teaching a freshman seminar is that the topic can be wide open, pretty much unconstrained, and far more interesting than tractable. This topic fits the bill pretty well. It reminds me of my past as a philosopher– the more I think about the topic, the less sense I can make of it. Is the Internet just TCP/IP? Is it a suite of protocols, or a consensual hallucination?

Beyond the topic, we get to discuss all of this with (and in the process get to know) a small group of what appear to be spectacular students. I always learn more from them than they learn from me, and I’m looking forward to being taught by them.

We are starting by looking at the history of the development of the Internet. We have been reading Hafner and Lyon’s Where Wizards Stay Up Late, as accurate a single-volume history as we could find. History is a funny thing, especially when there are still those around who were involved in the events. It is hard to get everyone to agree who did what when, and even more difficult to get everyone to agree on the impact and import of much of what went on. It’s so much easier when no one is around who can say “well, I was there, and it didn’t really happen that way.”

There are lots of interesting lessons to learn from the way the early Internet was constructed. There seemed to be some ideas that permeated the air but were completely counter to the existing orthodoxy, such as packet switching. It was clear that there was no real agreement on what the end state of the experiment that was ARPAnet was going to be. And reading the history it becomes apparent that then, as now, much of the real work was done by graduate students, who seemed to have a better idea of what it was all about than the people who were supposedly running the project.

What I find most interesting, though, is the contrast in notions of how to build a reliable network. The packet network advocates started with the assumption that the network could never be made reliable, and that was just the way the world was. So they spent a lot of time figuring out how to build reliable transmission on top of an unreliable network, thinking through things like retries, congestion control, and dynamic routing. Errors, in this design philosophy, are a given, and so the users of the network need to acknowledge that and build reliability on top of the unreliable pieces.
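As a toy illustration of that philosophy (a sketch of a stop-and-wait style retry loop, not how TCP actually implements reliability), all of the reliability lives in the sending endpoint’s willingness to retransmit until it hears an acknowledgment:

```python
import random

def lossy_send(seq: int, data: str) -> bool:
    """Pretend network: the packet, or its acknowledgment, may be lost."""
    delivered = random.random() > 0.3            # the packet may be dropped
    acked = delivered and random.random() > 0.3  # so may the ack coming back
    return acked

def reliable_send(seq: int, data: str, max_tries: int = 20) -> None:
    """Build reliable delivery on top of the lossy channel by retrying."""
    for attempt in range(1, max_tries + 1):
        if lossy_send(seq, data):
            print(f"packet {seq} acknowledged after {attempt} attempt(s)")
            return
    raise RuntimeError(f"gave up on packet {seq}")

for seq, chunk in enumerate(["errors", "are", "a", "given"]):
    reliable_send(seq, chunk)
```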

This is a huge contrast to the network engineers of the time at, say, the Bell System. The phone company (and there was only one in the U.S. back then) was all about building a reliable network. They did a pretty good job of this; I remember when not getting a dial tone on your (AT&T-owned) phone was a sign of the Zombie Apocalypse (or, given the times, the beginnings of nuclear war). But making the system reliable was difficult and expensive, and it limited what could be done on the network (since lots of assumptions about use got built in). It is hard to remember, now that the Internet is the backbone of most everything, that it wasn’t clear for about 20 years which of these approaches was going to be best. Big companies backed “reliable” networks well into the 90s. But in the end, simplicity at the network level won out, giving us the networks we have today.

I suppose my interest in this evolution is not surprising, given that I have spent most of my life working in distributed systems, where the same argument went on for a long time (and may still be going on). Building a reliable computing platform can be done by trying to ensure that the individual components never fail. When you build like this, you worry about how many 9s your system has, which is a reflection of what percentage of the time your system is guaranteed to be up. Four 9s is good (the system is guaranteed to be up 99.99% of the time), five 9s is better (now you have a guarantee of 99.999% uptime). But moving from four 9s to five 9s is expensive.
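The difference sounds small until you turn the nines into allowed downtime; a quick back-of-the-envelope calculation (in Python, just for illustration) shows what each level of guarantee actually buys you per year:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

for nines, availability in [(3, 0.999), (4, 0.9999), (5, 0.99999)]:
    downtime = (1 - availability) * MINUTES_PER_YEAR
    print(f"{nines} nines: about {downtime:.1f} minutes of downtime per year")

# 3 nines: about 525.6 minutes of downtime per year
# 4 nines: about 52.6 minutes of downtime per year
# 5 nines: about 5.3 minutes of downtime per year
```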

The alternative, best exemplified by cloud computing, or the approach taken by Amazon, Google, or Facebook, is to build a reliable system out of lots of unreliable components. You assume that any of the servers in your server farm is going to fail at any time, but build your system around redundancy so that the failure of one doesn’t mean that the system is no longer available. It is a more challenging design, since you have to worry about failure all the time. But it is one that works well.
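A rough way to see why this works is a simplified availability model that assumes replica failures are independent (a big assumption in practice): even if each individual server is up only 99% of the time, a few redundant replicas quickly beat a single heroic five-nines machine.

```python
def system_availability(per_node: float, replicas: int) -> float:
    """Probability that at least one of the redundant replicas is up,
    assuming failures are independent (a significant simplification)."""
    return 1 - (1 - per_node) ** replicas

for n in range(1, 5):
    print(f"{n} replica(s) at 99% each -> {system_availability(0.99, n):.6f}")

# 1 replica(s) at 99% each -> 0.990000
# 2 replica(s) at 99% each -> 0.999900
# 3 replica(s) at 99% each -> 0.999999
# 4 replica(s) at 99% each -> 1.000000
```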

Just like the Internet.