Back again


It has been way too long…amazing how life intrudes on the best of intentions. But time to get back.

To help force myself, I’m teaching a freshman seminar (along with Mike Smith) in which we are requiring that the students keep a blog of their thoughts about the content of the class. And since it seems unfairĀ to ask others to do what you are unwilling to do yourself, we committed to do the same. It’s one way to get back to writing.

The seminar’s topic is What is the Internet, and What Will It Become? One of the pleasures of teaching a freshman seminar is that the topic can be wide open, pretty much unconstrained, and far more interesting than tractable. This topic fits the bill pretty well. It reminds me of my past as a philosopher– the more I think about the topic, the less sense I can make of it. Is the Internet just TCP/IP? Is it a suite of protocols, or a consensual hallucination?

Beyond the topic, we get to discuss this (and in the process get to know) a small group of what appear to be spectacular students. I always learn more from them than they learn from me, and I’m looking forward to being taught by them.

We are starting by looking at the history of the development of the Internet. We have been reading Hafner and Lyon’s Where Wizards Stay Up Late, as accurate a single-volume history as we could find. History is a funny thing, especially when there are still those around who were involved in the events. It is hard to get everyone to agree who did what when, and even more difficult to get everyone to agree on the impact and import of much of what went on. It’s so much easier when no one is around who can say “well, I was there, and it didn’t really happen that way.”

There are lots of interesting lessons to learn from the way the early Internet was constructed. There seemed to be some ideas that permeated the air but were completely counter to the existing orthodoxy, such as packet switching. It was clear that there was no real agreement on what the end state of the experiment that was ARPAnet was going to be. And reading the history it becomes apparent that then, as now, much of the real work was done by graduate students, who seemed to have a better idea of what it was all about than the people who were supposedly running the project.

What I find most interesting, though, is the contrast in notions of how to build a reliable network. The packet network advocates started with the assumption that the network could never be made reliable, and that was just the way the world was. So they spent a lot of time on figuring out how to build reliable transmission on top of an unreliable network, thinking through things like re-tries, congestion control, and dynamic routing. Errors, on this design philosophy, are a given, and so the users of the network need to acknowledge that and build reliability on top of the unreliable pieces.

This is a huge contrast to the network engineers of the time at, say, the Bell system. The phone company (and there was only one in the U.S. back then) was all about building a reliable network. They did a pretty good job of this; I remember when not getting a dial tone on your (AT&T owned) phone was a sign of the Zombie Apocalypse (or, given the times, the beginnings of nuclear war). But making the system reliable was difficult, and expensive, and limited what could be done on the network (since lots of assumptions about use got built in). It is hard to remember, now that the Internet is the backbone of most everything, that which of these approaches was going to be best wasn’t clear for about 20 years. Big companies backed “reliable” networks well into the 90s. But in the end, simplicity at the network level won out, giving us the networks we have today.

I suppose my interest in this evolution is not surprising, given that I have spent most of my life working in distributed systems, where the same argument went on for a long time (and may still be going on). Building a reliable computing platform can be done by trying to insure that the individual components never fail. When you build like this, you worry about how many 9s your system has, which is a reflection of what percentage of the time your system is guaranteed to be up. Four 9s is good (the system is guaranteed to be up 99.99% of the time), five 9s is better (now you have a guarantee of 99.999% uptime). But moving from four 9s to five 9s is expensive.

The alternative, best exemplified by cloud computing, or the approach taken by Amazon, Google, or Facebook, is to build a reliable system out of lots of unreliable components. You assume that any of the servers in your server farm is going to fail at any time, but build your system around redundancy so that the failure of one doesn’t mean that the system is no longer available. It is a more challenging design, since you have to worry about failure all the time. But it is one that works well.

Just like the Internet.

Furthering Human Knowledge
Design principles

1 Comment »

  1. Mike Smith

    September 9, 2016 @ 3:00 pm


    Great post, Jim. Your paragraph mentioning the years of companies proposing new “reliable” networks that would be superior for this use or that brings back memories. I remember working for Honeywell in the 80s and listening to the arguments about which network “standard” they were going to back and why. Unlike today, the systems we used couldn’t talk to anything but a small percentage of the systems on which I could work. Computing systems were islands or small archipelagos, at best.

    The 90s saw more standardization, but by then I was fully back into academic research. Now it was my friends in academia who were certain that they could see a future with an alternative, non-TCP/IP dominant standard. As you say, “in the end, simplicity at the network level won out.” Simplicity, a tolerance of errors, and the unrelenting progress of computing power, network bandwidth, and storage capacity. It was, and continues to be, a combination hard to beat.

Leave a Comment