You are viewing a read-only archive of the Blogs.Harvard network. Learn more.

It Can’t Be Over Already

2

Today was the last session of this fall’s Freshman Seminar. I can’t believe we have reached the end of the semester. Each year’s class is wonderful, but there was something about this group of students. Engaged. Funny. Thoughtful. Spirited. Excited about the future and the potential it holds for all of us. I hope each of you keeps in touch and calls on me when you have something to share or want to just talk through different opportunities. Thank you for making 50N something to which I looked forward each and every week.

On the other hand, as you probably sensed in class today, I am amazed that the exuberance around blockchain and crypto-currencies continues to grow. When will we see this exuberance pop? I don’t know. Jim would know better than me, but there are a number of hard distributed computing problems that seem to have been swept under the rug, which might be ok in small networks, but worry me in large ones. I was also struck while reading with the lack of careful consideration of the threats against blockchain and the systems built upon it. Maybe such careful threat modeling is done elsewhere, but it continues to feel like we’re repeating the same problematic approach to the world that we saw throughout the development of the Internet and its applications: rush to trumpet the functionality and worry later about the threats.

The first Bitcoin paper says that double spends are not a problem because you can ignore them. But can an adversary flood the network with double spends? Would such an approach become a type of denial of service? Is a double spend request just a normal timing problem in the network, or is it something nefarious? This is just one example I wish I had seen considered.

In a related way, I found the Ether Thief paper fascinating. It appeared that no one deeply involved with the development of Ethereum ran a tabletop exercise to determine and practice what every person involved should do when the inevitable attack happens. You hope your bank is never robbed, but you can bet your bottom dollar that banks run tabletop exercises regularly to make sure that everyone knows what to do when a robbery happens. I suggest that you ask before you put your money into one of these blockchain-based systems what they’ve done to protect themselves against theft and how often they practice doing what they’ll do during an actual theft. No theft will be a textbook theft in all ways, but preparation matters and can help with prevention. The Harvard Kennedy School has experts in the field of Crisis Leadership, and while it might be no fun to be grilled by the likes of Professor Dutch Leonard in one of these mock sessions (from experience), you learn a lot about what you’re ready to handle and what you need to prepare yourself to handle.

Lastly, I’d like to touch upon the Iansiti and Lakhani paper. I thought it was quite good for what it was: an argument against blockchain as a disruptive technology. However, I don’t think blockchain is like TCP/IP. TCP/IP had competitors, but forks of the technology weren’t a long-term issue. If you found a mistake in the current implementation of TCP/IP, you fixed it and everything easily moves over to the new implementation. If you had a dispute about which implementation works better, you put both out there and one eventually won out (everyone moved over to it). Of course, some things are harder to change than others (e.g., consider how long it took to move from IPv4 to IPv6). Even so, we haven’t experienced in TCP/IP the threat of many persistent forks in the way we’re seeing in crypto-currencies. The network effect made the Internet take off quickly and that was fueled by basically a dominant foundational technology. I am not so convinced in the blockchain world.

But then, I’m old. I have often been wrong.

Make Up Your Mind

ø

I’m going to cheat a bit this week and write a short post. The reason is that I want you to read the article titled “Should Facebook and Twitter Be Regulated Under the First Amendment?” by Lincoln Caplan. It does a better job than I ever could at explaining the seeming contradictions (e.g., President Trump can announce U.S. government policy on the Twitter account @realDonaldTrump and these announcements must be preserved under the Presidential Records Act, but the president can block other accounts from responding to his tweets because some lawyers agree that the account @realDonaldTrump is the person and not the president, and thus his actions on Twitter are protected under the First Amendment) and significant differences between the United States’ and Europe’s free and hate speech laws (and the resulting effect on social media companies and their different approaches).

What a quagmire.

As we discussed this week, what happens online matters. It matters to our wellbeing as individuals, our ability to productively interact with each other in the real world as communities, and as the article states, our “ability to [have] an informed citizenry” as a democratic nation. The more we discuss these topics, the more convinced I become that social media companies aren’t wires, “passive conduits,” or simple gathering spaces. They are collections of algorithms that have real effects on us and the world in which we live.

What do you think? What will you do?

Toward Better Corporate Security

ø

One of the topics touched upon briefly this week was how we might approach the securing of a nation’s corporate sector against cyber attacks, whether from foreign powers or organized crime. For the purposes of a simpler discussion, let’s assume that the corporations in question reside entirely within our nation (call it the U.S.). In the physical world, U.S. corporations rely on the U.S. military to protect its national borders against foreign incursions, on the National Guard and local police force to protect its property and maintain peace and order, and on their own contracted security personnel for day-to-day security precautions (e.g., to ensure that only authorized people enter the corporate facilities). What’s the equivalent in cyberspace?

Well, one thing to notice is that the training and the capabilities of the different forces in the physical world decrease in sophistication as you go from the highly trained U.S. military forces down to your run-of-the-mill security guard. You could ask the U.S. military to cover your company’s day-to-day security precautions, but it wouldn’t be a good use of the skilled people in the military. Plus, as was mentioned on Monday, corporations probably don’t want the U.S. military tromping around their site. It seems wrong to Americans that the U.S. government would display that much of a show of force within the country’s borders. In cyberspace, the situation is that much worse because you can’t just post guards at the door and around the property. Anti-virus software, for example, works because it scans everything on your system and is trusted more than anything else in your system (except the operating system kernel). I bet most companies would not want the U.S. military looking through every one of their drawers and files.

If we can’t rely on the sophisticated expertise of the U.S. military’s cyber division, what should one do? Well, I founded a software security company back in 2001 with this mission, and I thought I’d show you some of what I wrote 15 years ago on this question. This was a trip down memory lane for me. I hope you enjoy reading it (unedited), even if it is far from perfect.

**** Our [company’s] beliefs and philosophies

  1. Our business focus is enterprise security. This security focus encompasses the protection of all sensitive digital documents within the enterprise as well as the operation of the enterprise’s distributed computing infrastructure. It does __not__ include protection of digitally-based products (e.g. music files) sold, rented, or otherwise involved in a financial transaction by the enterprise to consumers or other corporations. Though our technology can be used for such purposes, the balance between security and ease of use differs in these two market opportunities.
  2. We are interested in averting strategic disruption (i.e., loss of strategic information) as well as operational disruption (i.e., loss of some or all of the capabilities of the enterprise’s computing infrastructure); disruption that occurs through the unintentional misuse or even malicious use of corporate information or resources. We are __not__ directly addressing the wide range of illegal activities associated with digital commerce.
  3. We believe that security is dynamic. The security concerns that enterprises have will change over time, and thus our security solution must be flexible and extensible to adapt to these changes. The perceived importance of a security threat and the willingness of the enterprise and its employees to change their behavior to protect the enterprise and themselves against particular threats varies, and thus our security solution must support this variability directly.
  4. It is very difficult for an enterprise to quantify how much it should spend on security, and thus enterprises typically purchase a security product only if it is known to be a “best practice.” To become a best practice, a security product must be widely deployed. How does a security product become widely deployed if it is difficult to quantify the benefit of security? The answer, as demonstrated by other successful security products like anti-virus solutions and VPNs, is to provide a meaningful level of security while simultaneously being easy to deploy and essentially transparent to the enterprise user. These three axes work together to define what we call the Security Success Triangle (SST). In our business space, the SST says that we must avoid operational disruption due to the deployment or use of our security solution, since operational disruption is one of the two reasons why the enterprise is purchasing our solution in the first place.
  5. Once we have become a best practice, the SST says that we can increase the amount of meaningful security our solution provides. Besides the powerful and profitable business models that this enables, this observation again reinforces the need for our approach to be flexible and extensible.
  6. A focus on ease of deployment and usability also implies that our security solution must not be tightly coupled to the rest of the enterprise’s computing infrastructure, except when the security solution is enforcing the security policies of the enterprise. In other words, our security solution should be tightly coupled to the enterprise’s applications when those applications are running, but it should be only loosely coupled to the application infrastructure for purposes of maintenance and upgrades, etc. Note that this coupling eases the maintaining and upgrading of both the security infrastructure as well as the application infrastructure.
  7. The best way to obtain a security solution that is easy to deploy and transparent to the end user is to implement the security solution in such a manner that it is possible and straightforward to understand the user’s intentions and to be able to differentiate between normal and abnormal behavior. Security solutions that are implemented far from the end user and deep in the lower layers of the computing infrastructure cannot achieve the level of understanding and differentiation that we desire. Thus, we are driven to an approach to enterprise security that can track and affect the operations performed within applications. (Say something stronger about the need to avoid false positives and user dialog boxes?)
  8. Our business is focused on the distributed computing infrastructure in today’s enterprises, e.g., devices like personal computers, laptops, and PDAs. This infrastructure is not well covered by today’s security solutions, especially when it is not clear who owns or has the right to configure (or even understands how to configure) the device. Personal computing devices are just that–something that employees would like to use for both personal as well as corporate computing. Our security solution must support the often-conflicting needs and requirements created by these two worlds. It is not a viable solution to force the user to work with two separate sets of applications.
  9. Since our business focus is on the enterprise’s distributed computing infrastructure, our approach must support a wide variety of platforms. We cannot rely on special hooks that are unique to one application or computing platform. Also, we must minimize the work necessary to port our infrastructure to new platforms.

There it is. It is missing one thing that the company learned as the business started to grow, and that was a capability to assess what went wrong when something eventually did go wrong. For example, companies definitely wanted to know who accessed what files when so that when a file’s security was breached, you could review the operations associated with that file and determine how your (supposedly correct) security policy failed.

Community Standards

2

I want to publicly thank Professor Jonathan Zittrain (JZ) for his wonderfully informative and absolutely riveting discussion on the topic of Internet governance. It’s not a topic around which it is easy to get your arms. It’s a mix of individual actors, corporate entities, government agencies, and open communities. There is nothing straightforward about this conglomeration of actors, and I’ve always struggled to know where to start. Luckily, Jim and I know JZ, and it turns out that that’s always a great place to start. So, thank you, JZ!

Much of his presentation — on how jurisdiction and regulation happened as the Internet evolved — was told through the stories of a few key people. It was a great way to give all of us a narrative foundation on which we could anchor further discussion. And that’s what I’d like to try to do here!

While most of Monday’s discussion looked at the past, this issue remains important as the Internet continues to evolve, and some of the most interesting pieces of the current evolution take place in our social media platforms. This got me thinking, “How does Facebook handle this ongoing evolution? Or more specifically, how has Facebook’s own regulation of its platform evolved?”

While I could call up former students that currently work at Facebook, I took a different approach: I decided to look at how the Community Standards page on facebook.com has changed over time. A great way to do this is to take advantage of the Internet Archive Wayback Machine.

The first question I investigated was simply, “How much did the Community Standards page change over the nearly seven years captured by the Wayback Machine?” Instead of looking at every minor change in the page, I focused on the point where the look of the page changed dramatically to the format it has today (see the page as it looked on March 14, 2015 and then its new, basically current look on March 18, 2015). Then I asked, “How different is the content of the page today as compared to the first captured day of its current look?”

I was surprised by the answer: Very little. In my mind, a lot has happened in the 32 months between March 2015 and November 2017. This doesn’t mean that a lot didn’t happen behind the scenes (i.e., the code that automates some of the process, and the policies that the people involved in the “dedicated teams working around the world to review things you [the user] report to help make sure Facebook remains safe”). In a moment, I’ll dig into this behind-the-scenes question, but first I’ll summarize the differences between the content of the Facebook Community Standards page in March 2015 and November 2017.

Briefly, there’s now a video link to help the users “learn more about how it works” or in particular how Facebook decides to remove (or not remove) content, described at a high level with an emphasis on why rather than how. Facebook’s claimed mission has changed slightly, from “Our mission is to give people the power to share and make the world more open and connected” to “Our mission is to give people the power to build community and bring the world closer together.”

There was also an important addition in the second paragraph of the page stating, “Sometimes we will allow content if newsworthy, significant or important to the public interest – even if it might otherwise violate our standards.” Very little else changed – only the title of the category of “Nudity” under “Encouraging respectful behavior” which became “Adult Nudity & Sexual Activity”). So, the biggest change according these differences is the power of Facebook to overrule its own Community Standards. Probably a lot more could be said about this change.

But what actually happens behind the scenes? In the early days of the Internet, these sorts of questions were debated in open forums like the IETF community meetings. The best I could find around Facebook’s Community Standards work (I will admit that I didn’t spend more than an afternoon looking) were the following two articles:

Bickert talks about how hard it is to draw the line and how daunting the task turns out to be on a social network as large as Facebook’s. The key sentence in the post for me was, “We don’t always share the details of our policies, because we don’t want to encourage people to find workarounds – but we do publish our Community Standards, which set out what is and isn’t allowed on Facebook, and why.” I encourage you to think about whether this is an acceptable answer to you.

I’m not 100% decided, but I lean toward more transparency. I’d like to know how Facebook filters what I see, especially if I am using Facebook to “see the world through the eyes of others” as they state in the first paragraph on their Community Standards page. I may not want to see what they filter, but I want to know exactly what they filter in a manner more detailed than their standards. As Bickert says, what is art to one person might be pornography to another.

Finally, in the Morse article, I’m glad to read that Facebook hasn’t replaced their team of humans that do this messy work with AI. As we’ve discussed in this seminar, AI can be even less transparent than people about the decisions it makes. But that’s my take. Yours might be different.

A Short Story from the Future

ø

“Pops, what do you think of the new president?”

Samantha paused and looked up from her flexEpaper article. It wasn’t clear that her father had heard her. But then his spoon full of cereal slowed its progress toward his mouth, and he looked up. His hair, what was left of it, had gone completely gray years ago, but his eyes remained as bright as ever.

“President Trump thinks he’s king. How can a man run for U.S. president and not understand … no, and not respect the fundamental idea of separation of powers in our system of government?”

She had heard her dad slip into this rant many times, and she knew you either cut it off quickly, or you’d better make yourself comfortable for it would be awhile before it played itself out. “Dad, Trump hasn’t been president for 12 years.”

Her dad’s confusion was understandable. Trump was the last U.S. president elected the old-fashioned way. The days when you had to register to vote, and had individuals declaring themselves candidates for president, were long gone. No more stump speeches. No more rallies. No more, thank god, endless political advertisements and no more cheap theater marketed as candidate debates. The Tuesday next after the first Monday in the month of November – what a crazy definition – remained the U.S. Election Day, but no U.S. citizen did anything special on that day these days. You simply woke up and learned who we had “elected” based on an analysis of the data collected by the companies constantly mining us for our preferences. That’s what was in today’s headlines. Today was Tuesday, November 2, 2032.

It was a confluence of factors that brought about the change. The Russians had demonstrated how easy it was to manipulate us through our social media platforms and that led us to question what our vote for the 2016 presidential candidates actually meant. To that point, the fears that kept us from electronic voting had focused on the threat of a disgruntled hacker or foreign power changing the actual vote count. But why go to that trouble when you can simply make a mess of the entire campaign process? Subtly manipulating us through our most popular social media platforms each and every day during the ever-increasing length of our presidential campaigns turned out to be much more appealing than trying to surreptitiously change a country-wide vote count in one evening.

Frustration over money in politics certainly was factor too. CBS News estimated that $6.8 billion was spent during the 2016 elections. Seven billion dollars! Well, if you can’t decide how to regulate money in politics, the next best thing, it appears, is to just eliminate the need to spend it at all. There was angst when the suggestion was made, but now everyone wonders why we didn’t make this change earlier given the boost that spending those same dollars on stuff other than political advertising gave to the U.S. economy, especially in the states that had been struggling economically.

The biggest factor, possibly, was Mark Zuckerberg. While his ambitions for political office might have started earlier, 2016 was the year that the media started taking notice. The Silicon Valley tech sector liked to think of itself as disrupting industries to create a better future, and Zuckerberg simply put two and two together. Why should he be forced to get elected the old fashion way? It was messy, expensive, and terribly inefficient.

In 2016, Facebook, Google, and Amazon alone already knew more about each of our likes and dislikes than we knew about ourselves. And they saw that they could acquire this information in what’s today called the shrinking inch, which is a play on the last mile from the telecomm days when telecommunication companies were the most important commercial entities involved in the Internet. No more. Commerce, social, and search companies are now the ones in power, and they have relentlessly moved to eliminate the space between you and their platforms. First they sat on fixed-location desktops fighting for your attention among the many windows on the screen. They then moved to a more prominent position on our smart phones, which we quickly learned were more important than our wallets. Then these companies scattered devices throughout the spaces we live, collecting everything about our every moment. And in their labs right now, these companies are working on ways to integrate their platforms directly into our bodies.

Finally, the incentives align, much to the regret of the Russians. Our data are gold to the Facebooks, Googles, and Amazons of the world. It’s in their interest to ensure that these data are timely and authentic. They still may not protect our data from being eventually stolen, but they benefit from knowing exactly how we feel today.

In Zuckerberg’s amazing mind, this trend was the opportunity. The challenge then became simply a question of the right matching algorithm: integrate, over the entire voting-aged population, what these companies knew about the wants and desires of each U.S. citizen, and then match the result against the characteristics of every U.S. citizen old enough to be the U.S. president, weighted slightly by the requirements of the role of U.S. president.

It is somewhat ironic that a man pushed for this system in the hope that it would improve his chances for election, as we have had nothing but women elected to the Office of the President since.

Samantha tried to think about a range of things she could say next, but only one thought remained. “I wonder if Pops knows that I was elected president.”

Watching Big Brother

2

Our discussion this week with David Eaves on open government, government-directed censorship, and different governments’ views of local and global stability gave me the perfect excuse to catch up on a bit of reading that I’ve had sitting on my desk for quite some time now. In particular, I read two papers by Gary King, Jennifer Pan, and Margaret E. Roberts. The first paper is titled How Censorship in China Allows Government Criticism but Silences Collective Expression (2013), and the second is titled How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, Not Engaged Argument (2017). They are a fascinating read, showing how data collected from the Internet in China can provide a clear indication of what the Chinese government is trying to accomplish. The data collected not only informs, but it overturns long-held theories about the activities and goals of the Chinese government. If you’re interested in government and the Internet, I highly recommend these papers.

In the section titled “Types of Censorship” on page 3 of the 2013 paper, the authors describe the three known ways that China’s regime censors free speech and human expression: (1) the Great Firewall of China, which blocks entire websites in China; (2) keyword blocking, which prevents social media containing prohibited words and phrases from ever being posted; and (3) hand censoring of the social media posts that slip through the first two mechanisms. The first paper takes advantage of this last manual operation to create a dataset using an automated system that collects a huge number of social media posts and then monitors and analyzes the government’s actions toward certain posts. As my title hints, I find this use of the Internet and automated technology – technology that we discussed in class being used by governments (or corporations) to watch individuals – to watch a secretive government an intriguing turnabout.

If you think about it, this is a very different tack on open government using the Internet. If the government is going to use the Internet to monitor and influence our individual actions, the same openness of the Internet can be used to monitor and understand the actions of a secretive (or supposedly open) government. The 2013 paper says this explicitly: “In this article, we show that this program, designed to limit freedom of speech of the Chinese people, paradoxically also exposes an extraordinarily rich source of information about the Chinese government’s interests, intentions, and goals—a subject of long-standing interest to the scholarly and policy communities.”

So what do the two papers say the Chinese government is doing with its censorship and monitoring activities? The 2013 paper concludes that “the purpose of the censorship program is to reduce the probability of collective action by clipping social ties whenever any collective movements are in evidence or expected.” It goes on to say that “collective expression organized outside of governmental control equals factionalism and ultimately chaos and disorder.” The 2017 paper goes beyond censoring and identifies government-directed posts (fake user comments) meant to “distract and redirect public attention from discussions or events with collective action potential.” The authors estimate from the data they’ve gathered that “the [Chinese] government fabricates and posts about 448 million social media comments a yearlargely comprised of cheerleading and distraction rather than engaged argument.”

Here we have a government trying to keep order by squashing collective action through the cutting of social ties (uprisings can’t begin in virtual space and then spill over into our physical space because nothing that might incite collective action is allowed to take root in virtual space) and the flooding of any potentially inflammatory thread with lots of distracting topics (before a post ever rises to the level of something the government might want to censor).

If you’re a government using this approach to keep the piece at home, it isn’t much of a leap to think that you might turn this approach into a weapon to attack another country. Instead of injecting distracting topics into a potentially inflammatory thread you’ve identified, what if you inserted posts that engaged more deeply in the emerging argument in an strategic attempt to raise the level disagreement and prevent compromise? What if instead of blocking social media sites or posts that created collective action, you created sites or posts in a foreign nation’s social media environment explicitly for collective action meant to strategically undermine that foreign government?

I’m not saying I know that any government is doing this to another government in today’s world. I’m just asking if you think it is possible in 2017…

Our Corporate Lords

1

I don’t know about you, but I am amazed at the amount of mainstream media attention on topics related to our class. I can’t keep up with all the stories, and it feels significantly greater than at this time last year (or any other point in my lifetime). This is just an observation. I don’t know what it means.

As you might expect from this observation, I had a hard time picking a single topic for this week’s blog post. In the end, I thought I might try to meld ideas from the last two weeks: big data from the growing emphasis on the Internet of Things and the public and private sectors’ obsession with AI everything. I hope the following makes sense and it causes you to think. If it does, I’d love to hear your reactions, since I certainly don’t have the answers.

I’ll start with a Wired article I came across this week, although the article was written last February. It’s contains an interview with Yuval Harari, who had just finished his book titled Homo Deus: A Brief History of Tomorrow. I haven’t had a chance to read the book, but as the interview points out, it touches upon the topics of the last two weeks: artificial intelligence, big data, algorithms and hardware, and extending human capabilities through computational and networking enhancements (i.e., what Harari refers to as techno-humanism).

Techno-humanism is one of the two “new religions” that Harari believes will emerge going forward. The other is dataism, a fascinatingly complex topic. One example of dataism, mentioned in class and in the article, is Google Maps (or Waze), which amplify our ability to get from point A to point B while also reducing the need to navigate between the same two points using older technologies (e.g., AAA maps, compasses, or the stars). The question is not whether this advance is a good thing or a bad thing, but whether the quality-of-life improvements it brings outweighs our increasing dependence upon technology.

Dataism is not a new topic. I can remember reading about it in 2013 in an editorial by David Brooks. David begins the article talking about our increasing desire to measure everything we do, the assumption that analyses based on data are objective decisions free from bias and ideology, and dataism as a reliable way to foretell the future. Today, we see even more of the first (e.g., biosensors and health apps) while we’ve learned the second was a flawed assumption. And interestingly, the first and the third are leading Alphabet’s Sidewalk Labs to undertake the latest effort in planned cities.

The announcement about Toronto’s waterfront feels, for me, like technology corporations, especially those betting on AI, are quickly replacing religion and government as the biggest influence on our day-to-day lives. For many of us, the extensive and detailed view these corporations have into our online and physical lives is greater than anything the U.S. government gleams or that I’ve told my minister and the people my family regularly sees through our church.

So, connecting this train of thought back to the topic that consumed most of our class last week, I guess I’m less worried about the possibility of an AI singularity than I am about the creeping change described above. My view is certainly colored by my belief that we are quite good at creating things for specific tasks, but I’ve seen us be less successful (less successful than evolution) at creating technologies for general tasks. Just because we can construct lots of something, it doesn’t mean that those lots of something can work together to be more than the sum of its parts. Lots of cells do not make a human, which is a masterpiece of differentiation, coordination, and resilience.

In conclusion, I think our irrelevance to machines is farther off than we think, but the threat of other people/corporations manipulating our everyday lives to their benefit is closer than we think. While I think about the future, I’m going to keep an eye on the very real present.

Babel Fish in Just Black, Clearly White, and Kinda Blue

2

After Wednesday’s announcement by Google of its new hardware product line, a colleague of mine stopped by my office to tell me how excited he was about the new Google Pixel Buds that perform automatic language translation. You can see a video of them in action here.

While Google might be pushing Apple on a fully connected product line, I don’t think their events have reached the polish of Apple’s events. That’s beside the point of the event, which was to push Google’s move from “Mobile First to AI First.” According to Google, we’re moving from a world where we take our devices with us wherever we go to one where these devices know more about us than we do. Google is working toward a future where you have a seamless interaction with their smart computing devices. And it won’t be a single device, but multiple devices that melt into the background of every environment in which you find yourself. These devices will constantly collect data about you so that they can adapt themselves to what you’ll need hopefully before you know you need it. If you’re nervous today about your devices collecting information about you and your actions, and sharing that information with the company that created the device (and that company’s partners), just wait. It will get worse.

But back to the Pixel Buds. Automatic translation isn’t new to Google. If you used Google Chrome, it has for a long time automatically translated web sites written in a language different than the one you use to your language. Pixel Buds are just a natural next step, and one that has long been discussed in science fiction literature.

Once my colleague had finished expressing his excitement, I asked him two questions. First, I asked if Harvard College should continue to require our undergraduates to learn a second language. There are clearly advantages to learning about a different culture through their native language, but with automatic translation available to everyone everywhere, should our language requirement be rethought to take account of this significant change? I don’t know the answer and neither did he.

For my second question, I asked, “What kinds of new threats and biases will surface when we rely on machines for language translation?” Just last week, we talked about industries where the Internet and computing technology had drastically changed the types of jobs available, and already there’s talk about the future job prospects of professional translators. (The quotes in the linked article sound like people with their heads in the sand.) And this past week we talked about the lack of thought about security being put into the IoT world. Will we see attacks in the future where someone messed with a Pixel Buds’ translation to subvert an important conversation? Moving from security to the world of bias, will we see bias in a Pixel Buds’ translation depending upon what Google has learned that you like? If you’re a Democrat, will you get one type of translation and a different one if you’re a Republican? Will Pixel Buds translations continue societal biases like men are better than women at mathematics? I hope not on both counts, but I don’t see these issues in the Google announcements I’ve watched.

The Ever-Changing TV Landscape

ø

I was glad to hear that most of you enjoyed the Wired article titled “The Long Tail” by Chris Anderson. It’s not a new article, but it has had real staying power. One of the things that I like best about the paper is its implication that taste is not as mainstream as once thought. It’s great to have things that most of the population finds interesting and exciting, but as the paper powerfully points out, this same population of individuals also have their own unique tastes. We are multidimensional in our tastes. Some of our dimensions tie into large communities and some into small communities. Some of us might be members of only small communities, and we shouldn’t assume that the small communities are necessarily subsets of any large community. Tolerance of unique interests turns out to be good business!

Speaking of taste, not everyone in my family understands my excitement with the new Star Trek series that CBS recently launched. Star Trek: Discovery tries both to hook a new generation of Trekkies into the media franchise of space-age morality tales created from the mind of Gene Roddenberry and to appeal to the older generations that grew up on the Original Series and many subsequent series. I won’t try to convince you to watch the new series, but simply encourage you to think about the churn that continues in television and video markets. Unlike a lot of TV shows today, you can watch Discovery on Sundays at 8:30pm ET, but if you miss the broadcast, you can’t catch up on the streaming site for CBS without subscribing directly to CBS. Stated another way, your cable subscription does not give you streaming access to Star Trek: Discovery like it does for CBS’s primetime show Salvation. Further complicating things, you can get the show on Netflix, but only international viewers outside the U.S. can access it there.

The streaming world is a lucrative business, as the Long Tail paper points out, and it is not surprising that CBS wants to get into it and not just provide content for it. However, we’re going from a world where we pay for cable access and a few streaming services, which collect all the back content, into a world where every studio, network, and cell-phone manufacturer wants to launch its own streaming service. I can’t imagine that this will make Joe and Jane Consumer happy; I certainly don’t want to pay for multiple streaming services. It’s bad enough that I have multiple apps on my iPad, but then at least, I pay only one bill. (Ok, two. One to my cable provider and one to Netflix.) It seems like this space is ripe for disruption again. It will be interesting to watch.

Scattered, or Great Design Thinking?

1

Jim and I talked about a range of design topics and a bit about how today’s Internet works in the seminar this week, and I thought I’d write my blog in the same “scattered” fashion. While at first seemingly scattered and long, if you bear with me, we might get to the second half of this blog’s title. Ready?

I’ll start with a pointer to a Wired article about DNS hijacking. On Monday, we discussed, at a high level, how the Domain Name System (DNS) on the Internet helps people to get to web sites (or other resources on the Internet) by converting easy to memorize domain names (e.g., college.harvard.edu) to the numerical IP addresses (i.e., 184.73.172.23 in my example). The IP address is simply a standardized name space that works like U.S. postal addresses. For example, the address of Pinocchio’s Pizza & Subs shop is 74 Winthrop St., Cambridge MA. You’d find it by traveling to Massachusetts and then to the Massachusetts’s town of Cambridge. In Cambridge, you’d look for Winthrop Street and hope you’d find a building with the number 74 on it. IP addresses work the same way. The first numbers take you to a particular part of the network, and the last number corresponds to some particular machine (house) on the identified subnet (street). The Wired article talks about how adversaries have attacked this system to cut certain resources out of the Internet for a short time.

Unfortunately, adversaries are tougher to defeat than Mother Nature. So let’s stick with getting around the problems that Mother Nature throws at us for the rest of this blog post.

While Jim and I were repeatedly expressing our deep admiration for the Saltzer et al paper this week, we somehow stumbled into a discussion of bit-error detection and correction. I’m worried that I glossed over things too quickly. So, here’s a bit more explanation of how this stuff works. There’s a whole science behind this, but I’ll attempt to just give you a flavor of how it works and why it is powerful.

As we discussed, you can detect a single bit error by adding a parity bit to any group of bits. For example, if I wanted to protect a 4-bit data string against single bit errors while the packet was in transmission, I would add a single bit to the data string (creating a 5-bit packet), and the value of this fifth bit would be the sum of the other four bits modulo 2. As described, I’ve added what’s called an even parity bit to my packet. It’s called this because the addition of the parity bit makes the number of ones in the 5-bit packet even (or zero). Lots of things in the digital world have duals, and the same is true here for we could also create an odd parity bit that would make the number of ones in the packet odd. Let’s stick with even parity.

How does this work in practice? The machine sending the original 4-bit data string computes the even parity bit, appends it to the end of the four bits of data, and sends a 5-bit packet. The receiving machine grabs the 5-bit packet and computes the packet’s parity (over all 5 bits). If the result of this computation is zero, the first 4 bits are the data it wanted without any single-bit errors. If the result is one, then there was an error in transmission and the packet needs to be resent.

Notice that we can’t correct the error without retransmission because we don’t know where the error occurred in the 5 bits we received. One of the 1s was flipped to a 0 or a 0 was flipped to a 1, but all we know is that a bit was flipped somewhere.

Of course, this scheme isn’t guaranteed to work for 2 bit errors. If you flip any two bits, the effect of their flips cancels each other out in the parity calculation. I’ll let you figure out what happens when you have more than 2 bit errors.

We, however, were interested in fixing (Mother-Nature-type) errors in the network so that you could create a reliable network and remove the need for retransmissions. To explore this question, let’s assume that your network never has anything more than a single bit error in a packet no matter what its size (an unrealistic, but pedagogically interesting situation). What would I have to do to guarantee that the receiving machine had enough information to correct any single bit error in the packet?

Well, you could send the same data string twice in a single packet, which guarantees that you would get a clean copy of the data bits. Unfortunately, if the two received copies don’t match, you don’t know which copy of the duplicated data bits is the one you want. You simply made a very space-expensive, single-bit error detector that not only tells you that an error occurred, but where it occurred in the string of data bits.

Ok, that design failed, but fear not. We can always try again. Let’s try sending three copies of the data bits in a single packet. Now in the case of a single bit error in the packet, we can use a simple voting mechanism to decide what is the correct string of data bits. Success! The cost of this success was that we need a packet three-times longer than the data we wanted to send. Hmmm, maybe we can do better.

This is where the work of Richard Hamming comes in. Hamming did pioneering work in the middle of the last century on error detection and how to build efficient error correcting codes. Continuing with our example of wanting to send 4 data bits over a network, Hamming showed that you could add just 3 parity bits to the 4 data bits (creating a 7-bit packet) and correct any single bit error that occurs in the entire 7-bit packet. That’s less space overhead than we added in our first duplication scheme that failed to achieve error correction! More impressive, the overhead of Hamming’s approach decreases very quickly as you send longer and longer strings of data bits. For example, when we send 4 data bits in a 7-bit packet, the efficiency of the transmission is (4/7 =) 0.571. If we wanted to send 247 data bits, we’d need only 8 parity bits to cover all (247+8 =) 256 bits in the packet for a (247/256 =) 0.969 efficiency! Nearly every bit sent is a useful data bit! That’s the power of exponentials and clever encodings.

To see exactly how this works, I encourage you to look at the example on the Hamming(7,4) Wikipedia page. The matrix arithmetic might look intimidating, but it’s just a mathematical way of saying that the three parity bits looked at as a 3-bit number can cover all 7 locations in our 7-bit packet, allowing us through clever encoding to know exactly which bit was flipped, if one was. If no bit was flipped (i.e., no single bit error occurred), the eighth encoding of the three parity bits means no single bit error occurred.

In class, I said we needed to add only three parity bits to correct any single bit error in eight data bits. Off by one error! You clearly need four parity bits to correct a single bit error in eight data bits (a 12-bit packet). Please fix your notes if you took some.

I want to bring us back to design as I end this blog post. I often cold call on some of you during our seminar asking, for example, “Yasmin, what would you do to make this work?” In asking this, I’m asking a design question, and design questions are, at first, hard. Why? Because design questions have no right answer. (Please go back and read the previous line again.) And because they have no right answer, you can answer any way you like! The worst thing that will happen is that you’ll describe something that doesn’t do what we hoped it would do, as I did above in the doubling-the-data-string example. No problem with that. Maybe we’ll design something cool for a different application, or we’ll learn something and restart our design process a bit smarter.

Now, most of you, probably, have never done any significant design. That’s ok. More importantly, in my humble opinion, the only way to learn to do design is to do it. How might you start doing it? (How might you answer my question to Yasmin?) Well, you know where you want to go and so just take a small step in that direction. Suggest something that sounds like it moves us in the desired direction and maybe give the reason why you think it moves us in the right direction. We did that above. With enough practice, you’ll see patterns in design that lead to fundamental advances like Hamming did, and like Saltzer and his colleagues did.

Log in