
Blockchain: The Solution to All Problems?


In the past few years, mentions of “blockchain” and “cryptocurrencies” in popular media have skyrocketed. Blockchain is often heralded as a revolutionary technology that will change everything from how businesses operate to daily life. A quick Google search of the term proves fruitful, with millions of hits linking to newly authored books about the revolution, newly created businesses riding the frenzy, and overnight millionaires who made their fortunes by trusting a system built on a lack of trust. But what is blockchain, really? How does it work? And is the technology as revolutionary as people say it is? For the remainder of this post, I’ll provide a high-level overview of the first two questions and close with some thoughts about the role of blockchain technology going forward.

 

First, what is blockchain?

Blockchain is the technology behind cryptocurrencies like Bitcoin. At its core, it is a decentralized, growing list of immutable records that are linked using cryptography. That is, the blockchain is literally just a chain (the links) of blocks (think of a spreadsheet or a list containing records) that isn’t stored in any single place (decentralized), cannot be changed (immutable), and is extended using common cryptographic methods. The fact that the list, commonly referred to as the ledger, is both decentralized and immutable is key to understanding the technology. The technology itself consists of three fundamental parts: (1) the ledger (the list of transactions), (2) a consensus algorithm (more on this shortly), and (3) the digital currency. Each of these pieces is essential to keep the blockchain active and growing.

First, the ledger. The ledger, as mentioned above, is simply a list. You can think of a spreadsheet with columns and rows in which records are kept. So what is in the list? It stores all previous transactions involving the cryptocurrency, along with contracts, records, or other information. The ledger is equivalent to the blockchain itself: it is a chain of blocks linked together, where each block is a batch of newly added transactions appended to the previous ledger, producing a new ledger with the new block chained at the end. One notable aspect is that this list is decentralized; the same list exists on many computers across a network of nodes. The list is also immutable, meaning that once a transaction is added, it cannot be changed or removed. Two natural questions arise at this point: (1) how is it that the list is immutable? and (2) what happens if two copies of the list have conflicting values (given that the lists are decentralized)? To explore the first question, I will describe at a high level where this immutability comes from (ultimately, it relies on our inability to solve certain difficult mathematical problems quickly).

When new blocks are added to the existing ledger, they are added cryptographically, using a hash function. A hash function is a one-way function: easy to compute in one direction but not the other, usually based on mathematical problems like prime factorization (it is easy to multiply two large primes together but difficult to factor their product). It takes a large input (in this case, the previous ledger combined with the new block) and outputs a condensed, random-looking string of characters called a hash. The previous ledger is itself represented by the hash of the ledger before it plus the block that was added then, and so on, creating the chain. The immutability comes from the fact that if any part of the ledger is changed, the hash changes completely. Anyone can check that a copy of the ledger matches the legitimate one by running the hash function over it and making sure the result matches the published hash.
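To make the chaining concrete, here is a minimal Python sketch (standard library only; the block structure and transaction strings are my own illustrative choices, not Bitcoin’s actual block format) showing that each block’s hash depends on the previous hash, so altering any earlier block invalidates everything after it:

```python
import hashlib
import json

def block_hash(prev_hash: str, transactions: list) -> str:
    """Hash the previous block's hash together with the new transactions."""
    payload = json.dumps({"prev": prev_hash, "txs": transactions}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# Build a tiny three-block chain.
genesis = block_hash("0" * 64, ["alice pays bob 5"])
block_2 = block_hash(genesis, ["bob pays carol 2"])
block_3 = block_hash(block_2, ["carol pays dave 1"])

# Anyone can re-run the hashes to verify the published chain...
assert block_3 == block_hash(block_hash(genesis, ["bob pays carol 2"]),
                             ["carol pays dave 1"])

# ...and any tampering with an earlier block changes every later hash.
tampered = block_hash(genesis, ["bob pays carol 200"])
print(block_hash(tampered, ["carol pays dave 1"]) == block_3)  # False
```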

The second question, about conflicting values, leads us to the second core component of blockchain technology: the consensus algorithm. The consensus algorithm takes different ledgers that may have gotten out of sync and chooses the ledger everyone will agree on, based on which version most nodes hold. This is what allows the ledgers to be decentralized, so that no single authoritative body controls the ledger, yet all copies stay in sync and updated over time.
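As a rough sketch of the rule described above (a simplified “most common ledger wins” vote; real systems like Bitcoin instead follow the chain with the most accumulated proof of work), conflict resolution might look like this:

```python
from collections import Counter

def resolve_conflict(ledgers: list[tuple[str, ...]]) -> tuple[str, ...]:
    """Pick the ledger version most nodes currently hold (ties broken arbitrarily)."""
    counts = Counter(ledgers)
    winner, _ = counts.most_common(1)[0]
    return winner

# Three nodes agree, one node has drifted out of sync.
node_ledgers = [
    ("tx1", "tx2", "tx3"),
    ("tx1", "tx2", "tx3"),
    ("tx1", "tx2", "tx3"),
    ("tx1", "tx2"),          # stale node
]
print(resolve_conflict(node_ledgers))  # ('tx1', 'tx2', 'tx3')
```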

The final component of the blockchain is the digital currency itself, such as Bitcoin. This piece is incredibly important because it provides the incentive for people to buy into the system and keep it going. There is no physical manifestation of the currency; it is represented digitally and kept on the ledger. This leads us to the bitcoin “miners” you may have heard about. When a block is hashed along with the previous ledger, the hash function produces a string of effectively random base-16 characters (digits and the letters a through f). However, the protocol only accepts hashes that start with a certain number of leading zeros, say six. In other words, a miner must keep trying (by varying an extra value in the block) until the resulting hash happens to start with six leading zeros. Because the hash output is effectively random, each attempt has roughly a 1 in 16^6 (about 17 million) chance of being valid. When someone generates a valid hash and publishes it, they are rewarded with a fixed amount of bitcoin, and the system carries on. Here is a helpful graphic from this website:

(Graphic: “What is Blockchain Technology?”)
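To illustrate the mining process described above, here is a minimal proof-of-work sketch (the fixed leading-zeros target is the post’s simplification; Bitcoin’s real difficulty target adjusts over time and the block format differs):

```python
import hashlib
from itertools import count

def mine(prev_hash: str, transactions: str, difficulty: int = 6) -> tuple[int, str]:
    """Try nonces until the block hash starts with `difficulty` leading zeros."""
    target = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{prev_hash}{transactions}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest

# With difficulty=4 this finishes quickly; difficulty=6 takes noticeably longer.
# That is exactly the point: valid hashes are cheap to verify but costly to find.
nonce, digest = mine("0" * 64, "alice pays bob 5", difficulty=4)
print(nonce, digest)
```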

With this quick, high-level overview in hand, the main draws of the technology become clear: it offers a secure, decentralized way to make transactions of just about anything. Seeing these benefits in context also helps us see their limits. The technology is cool, but it likely will not revolutionize every industry, as some hope. What it does give us is a secure way to make transactions while keeping a list that we can trust is safe from tampering, which is helpful, but it is a narrower use case than much of the hype suggests. Rather than a solution to all problems, blockchain is a potential solution to one problem, albeit a large one: the problem of making secure transactions.

 

Intelligence and Law Enforcement: Back Doors and Golden Keys in Cryptography


This past week, we’ve had some engaging discussions about the desire for, reasons against, and feasibility of back doors or golden keys in cryptography (a back door is a way of subverting an encryption algorithm; a golden key is a theoretical key that would allow the holder to decrypt any of the specified encrypted data). I’ll start by setting up the context of the discussion and defining some key terms. The context comes from the “Crypto Wars” of the 1990s [read here], in which parts of the US government wrestled with private companies and individuals over the use of strong cryptography, which at the time was classified as a munition that could not legally be exported (a bizarre rule that led to much confusion, given that other countries already had strong cryptography, enforcement was difficult, and the punishments were harsh). In summary, one side argued that the government should have some ability to access people’s encrypted data, as it does with phones and wiretaps. The other argued that this was either impossible or simply unethical. The government’s attempt to push a solution, the Clipper Chip built on key escrow technology, ultimately failed and effectively ended the argument, with cryptography allowed for public use. Fast forward about 20 years and we get to the case of Apple and the FBI [read here], a privacy case in which Apple refused to provide a way for the FBI to break into iPhones. Why did Apple do this? Was it just out of spite, or to protect the company’s image? Should authorities be given exceptional access? Is that even possible? I will focus on these final two questions and argue that any exceptional access explicitly given to authorities would undermine the core idea of a global internet, for the following two reasons.

1. First, there is simply no way to create a golden key, or any other intentional means of breaking cryptography, without opening up potential security flaws that someone else (an adversary, say) may be able to exploit. If that is the case, then exceptional access undermines the very goals that motivate it in the first place, namely increased security and safety; the original goal of safety wouldn’t necessarily be served by exceptional access, since new safety threats (or at least privacy threats) would come into play as other actors work to exploit the hole.

2. Second, the question of which governments get access to this hypothetical golden key will inevitably affect global internet activity and commerce in ways that can’t be immediately understood. If it is generally known that certain governments can break the cryptographic schemes underlying certain parts of the internet, it may change people’s behavior, especially abroad, where users may not want a foreign government to have that much oversight over what they do online.

Ultimately, it seems the cryptography debate will turn not so much on a technical solution as on the political and human side of the problem, and on what each party’s goals actually are.

Quantum Computing and Cryptography


If you Google “Is Quantum Computing Dangerous”, you will find headline after headline about the imminent dangers of quantum computing: articles instilling fear in readers by calling quantum computing a “tool of destruction” or the “end of cryptography”. Articles such as this one argue that modern cryptography will be defeated, and it even ends with the following statement: “Should the Russian government break all of our encryption before the US develops countermeasures, stolen elections will seem like small potatoes. Welcome to the cyber-battlefield of the 21st century.” Where are these fears coming from, and are they substantiated? Should we actually fear a complete breakdown of cryptographic methods if quantum computing advances? Will it advance to that state?

 

First, the fears about cryptography. Many cryptographic schemes, such as RSA, are built on the difficulty of inverting one-way mathematical functions, such as prime factorization. These problems are easy to compute in one direction, but the best known classical algorithms for the reverse direction take super-polynomial (roughly exponential) time, so breaking a scheme like RSA by brute force is not feasible with modern computers. However, quantum computers should, in theory, be able to solve certain of these problems far faster; Shor’s algorithm, for example, factors integers in polynomial time on a quantum computer. On the face of it, this would mean that our current systems, i.e. banks, national security, etc., would be compromised if someone had a quantum computer that truly worked as one (a qubit, rather than being definitely 0 or 1 like a classical bit, can exist in a superposition of 0 and 1, and the state space of many qubits grows exponentially with their number, which is where the additional computing power would come from). It’s easy to see the fear in this: at face value it boils down to an arms race over a crazily powerful machine that could break widely used cryptographic schemes by solving, in polynomial time, problems that classically take exponential time.
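To see the asymmetry being described, here is a toy sketch with deliberately small numbers (real RSA moduli are hundreds of digits long): multiplying two primes is one cheap operation, while recovering them by trial division blows up with the size of the modulus.

```python
import math

def factor(n: int) -> tuple[int, int]:
    """Recover p and q from n = p * q by trial division (exponential in the digit count)."""
    for candidate in range(3, math.isqrt(n) + 1, 2):
        if n % candidate == 0:
            return candidate, n // candidate
    raise ValueError("no odd factor found")

p, q = 2_147_483_647, 2_305_843_009_213_693_951   # two Mersenne primes
n = p * q                # the "easy" direction: a single multiplication
print(factor(10_403))    # (101, 103): fine for tiny numbers
# factor(n) would take tens of trillions of trial divisions here, and real RSA
# moduli are vastly larger still; quantum algorithms like Shor's target exactly
# this gap between multiplying and factoring.
```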

 

But in reality this fear seems a bit too strong. What is the current progress of quantum computers? Is it even possible to create a fully functional quantum computer with more than 50 qubits (roughly the number cited for quantum supremacy, i.e. a quantum computer that cannot be simulated on a classical computer)? Currently there are serious roadblocks to building such a machine in practice, with IBM perhaps closest to doing so. And even if such a machine arrives, “it isn’t obvious how useful even a perfectly functioning quantum computer would be. It doesn’t simply speed up any task you throw at it; in fact, for many calculations, it would actually be slower than classical machines. Only a handful of algorithms have so far been devised where a quantum computer would clearly have an edge. And even for those, that edge might be short-lived. The most famous quantum algorithm, developed by Peter Shor at MIT, is for finding the prime factors of an integer. Many common cryptographic schemes rely on the fact that this is hard for a conventional computer to do. But cryptography could adapt, creating new kinds of codes that don’t rely on factorization.” (https://www.technologyreview.com/s/610250/serious-quantum-computers-are-finally-here-what-are-we-going-to-do-with-them/)

The truth is, there are serious difficulties ahead in advancing quantum computers to a useful state, and there do not currently seem to be many practically useful applications for them. Moreover, even if a capable quantum computer were created, cryptography could adapt to rely on problems believed to be hard even for quantum computers (so-called post-quantum schemes, which an attacker would need some new quantum algorithm to break in the first place). All in all, the technology is incredibly interesting, but much of the current fear does not seem merited, at least for the time being.

Psychometric Targeting in Political Campaigning: Is there an Issue?


Following the recent presidential election, news came out that Cambridge Analytica had used people’s private Facebook data to help the Trump campaign win. Based on recent articles such as this one, the data-driven analysis and decisions appear to have been effective in increasing turnout among those who ended up voting for Trump, as there were many new, unpredicted voters in this election. However, there was also an uproar over the privacy implications. The data Cambridge Analytica used was largely publicly accessible or gathered through voluntary online quizzes and tests, and the targeting was based on psychometric methods that predicted how a person might vote. Initially this may seem wrong in some way, but how is it actually different from previous modes of political campaigning? Is it that psychometric methods are more accurate? Or is there something inherently different about targeting an individual based on interests and personality rather than on general demographics?

This post will focus on the question of what the real difference is between psychometric targeting for political campaigning today and the regular political campaigning of the past. What made the Cambridge Analytica scandal in the recent presidential election so controversial compared to the campaigning of previous elections? Both types of campaigning targeted people in order to mobilize potentially beneficial voters to go out and vote. I will offer a few candidate differences and walk through the plausibility of each.

  1. The psychometric targeting is more individual and specific than the demographic targeting of the past.
  2. The psychometric targeting can be done on a much larger scale than the older political campaigning.
  3. Only one side in the previous election used the effective psychometric political targeting and so may have had an advantage that pushed them over the edge.

These are just some of the candidate differences I came up with. The first doesn’t seem to be the significant one, because political campaigns of the past also targeted people in very specific ways, just locally rather than on a national scale. Even the founding fathers would capitalize on information that a person would vote a certain way when deciding whether to spend extra time encouraging that person to vote. This leads to the second candidate: the scale of data-driven political campaigning. Is it really the scale that troubles people? This also doesn’t seem to be the issue, given that most people agree voter mobilization in general is good, and the stated goal of psychometric, data-driven campaigning is simply to increase voter turnout. Of course, the more refined goal is to increase turnout for one’s own party. So is that the real distinction, that psychometric targeting only mobilizes the side of whoever uses it? This doesn’t seem to be the issue either, because all political campaigning is biased toward increasing turnout among the constituents who would help the candidate one supports. In this case, it just so happened that a single group decided to use data-driven psychometric targeting; there might have been no controversy if both sides had used it, increasing voter turnout overall.

Then what is the real difference between the two if not the three above?

  • The data used in this case should be private.
  • The method of influencing people to mobilize them to vote seems dishonest in some way.

Maybe it has to do with the data itself, namely that the data used in this case should have been private? But this doesn’t seem like a workable distinction either, because in the age of social media and the internet, data about people is increasingly available to anyone. Even when companies stay within privacy guidelines, identifying data can still be accessed for people on a large scale. The only other possibility seems to be that the method of influencing people to mobilize them to vote feels dishonest in some way. But this is not so different from previous political campaigning, in which misleading messaging and fake news also existed (just not at today’s scale). All in all, the real difference seems to be the (alleged) increased accuracy and effectiveness of psychometric targeting, and the trouble seems to stem from the fact that only one party used it. Would the same reaction have come if both parties had used the method to mobilize voters? Is the issue with the method itself? If so, the more pertinent questions are how we can strengthen people’s privacy protections in the digital age, and how much of that privacy is necessary, desired, or even possible.

Algorithmic Bias: Where is the Problem?


What is algorithmic bias? That is, how can we actually define it in a meaningful, constructive way that can help us to ultimately create a more equitable society to live in? To begin thinking about this question more deeply, we must consider a few different ideas of algorithmic bias, some clarifications, and then what benchmark we are comparing algorithms to.

When I first read the ProPublica article about predictive policing (found here) over a year ago, I was caught off guard. I was convinced that there was some sort of problem, but I had to work through what exactly the problem was and what it meant for algorithms and for the responsibility of the software engineers who build them. However, after reading some clarifications of the ProPublica article and some statistical studies showing that the underlying data itself was biased (analyses based on the data ProPublica published; to their credit, they responsibly released all of the data they used), my sense of where the problem lies shifted. It’s also important here to define what I mean by bias. In the colloquial sense, I simply mean that the data shows a disparate impact against a group of people based on ethically non-essential characteristics, like race. I believe this is a common use of the term when speaking about bias in this context.

Following the ProPublica article, a common reaction is to be up in arms about the dangers of a technology like predictive policing: will it increase the disparity? Will it lock the disparity in so that we can’t improve it? While these fears are legitimate, it is important first to acknowledge the real problem the article unearths, and then not to jump to a conclusion about what is to blame, namely the algorithms. It is a major danger and error to reach a conclusion based on a lack of information, that is, to blame algorithms for all bias simply because we don’t understand what the algorithm is doing. In this case, it turns out that the algorithm itself was working as designed, but the data was skewed because of bias that already exists in the world. The important point is that the algorithm did its job exactly, with no “bias” or mistakes of its own; the data it used to make predictions, in this case about which areas were more likely to see crime, reflected a disparity already present in the world.

If we compare algorithms to a benchmark of perfection, they will fall short. By the nature of uncertainty there will always be false positives and false negatives, although a very good algorithm can limit them. But there are also false positives and false negatives when humans make important decisions. As an example, consider the study that found harsher sentencing following a home football team’s loss (https://psmag.com/news/football-team-losses-can-impact-prison-sentences). So what benchmark should we compare against? If an algorithm consistently performs better and more equitably than a human, should we use it? The rational answer seems to be a clear yes, but when we really think about putting life-or-death decisions into the hands of a machine, many would likely say we should not. We should then consider what the real difference between the two cases is and why one might not be comfortable choosing the algorithm. Is it a lack of transparency (in which case algorithmic transparency and education about the algorithm might help)? Or is it the lack of some intangible humanity? Moving forward, it will be exceedingly important for us as a society to think about the different ways we can define algorithmic bias constructively and to consider in which situations the use of algorithms might be acceptable, and why.
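One way to make the talk of false positives and disparate impact concrete is to compare false positive rates across groups, which is roughly the kind of comparison the ProPublica analysis made. The sketch below uses made-up toy records purely for illustration:

```python
def false_positive_rate(records):
    """FPR = flagged-but-did-not-reoffend / total who did not reoffend."""
    negatives = [r for r in records if not r["reoffended"]]
    flagged = [r for r in negatives if r["predicted_high_risk"]]
    return len(flagged) / len(negatives)

# Toy, made-up records: a model can look "accurate" overall while its
# mistakes fall unevenly across groups.
data = [
    {"group": "A", "predicted_high_risk": True,  "reoffended": False},
    {"group": "A", "predicted_high_risk": False, "reoffended": False},
    {"group": "A", "predicted_high_risk": True,  "reoffended": True},
    {"group": "B", "predicted_high_risk": False, "reoffended": False},
    {"group": "B", "predicted_high_risk": False, "reoffended": False},
    {"group": "B", "predicted_high_risk": True,  "reoffended": True},
]
for g in ("A", "B"):
    subset = [r for r in data if r["group"] == g]
    print(g, false_positive_rate(subset))   # A: 0.5, B: 0.0
```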

Universal Identity: Estonia e-residency


This week we discussed a case study of a country that is actually implementing a universal identification system for citizens through technology. In Estonia, the government has already, successfully by many measures, implemented an identity system in which every citizen is issued a uniquely identifying public/private key pair, generated by the government, so that citizens can fully identify themselves online. This opens up new opportunities for all citizens to have an official identity and to use it to vote online, complete tax returns, obtain and fulfill prescriptions, set up businesses, sign contracts, and so on (https://www.theregister.co.uk/2015/06/02/estonia/). There are clearly many benefits to a system like this, and loosely similar systems exist at smaller and less successful scales in other countries (like Social Security numbers in the US), but the scale and success of the Estonian operation, in terms of the percentage of citizens who enroll, makes it fundamentally different from any other. Many interesting questions arise upon further investigation, and for the rest of this post I’ll focus on the effectiveness of certain capabilities created by the Estonian system, namely the ability to vote online.
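At a technical level, an identity scheme like this rests on ordinary public-key signatures. Here is a minimal sketch of that general mechanism using the third-party Python cryptography library and an Ed25519 key pair; it illustrates how signing and verification work in principle, not Estonia’s actual card infrastructure, which uses its own certificates and algorithms:

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# The "citizen" holds the private key; the public key is what the state
# (or anyone else) uses to check that a document was really signed by them.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

document = b"I hereby cast my vote / sign this contract"
signature = private_key.sign(document)

try:
    public_key.verify(signature, document)          # passes: document untouched
    public_key.verify(signature, document + b"!")   # raises: any change breaks it
except InvalidSignature:
    print("signature does not match the document")
```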

 

One article on the effectiveness of Estonia’s digital government (here) suggests that after the system enabled e-voting, e-voting was actually less popular than expected, stating that “electronic voting is less popular because Estonians value their new found freedom to choose and many dress up in order to go to their polling station.” This is interesting, because I wonder whether voter turnout as a whole increased because of the e-voting initiative even though fewer people actually chose to vote online; that is, even though e-voting itself is less popular, more people may have been compelled to go out and vote after e-voting was introduced. Given the potential of this technology and the social dynamics created by the Estonian e-government system, I am hopeful there is a way to meaningfully increase voter turnout and participation in other functions such as the census. This would be an interesting social phenomenon to study. Given the paradigm shift the Estonian government has brought about, I feel there is potential for many fundamental issues of national citizenship to be addressed more effectively through this new system.

Watching the Science Center: Video Surveillance in Public Spaces


This past week, we touched on topics of privacy with regard to the widespread use of video surveillance in public spaces. The motivation for this post comes from a recent realization that there is a video camera overlooking the Harvard Science Center Plaza, one of the main student spaces between the dorms and classes, that is active 24/7 and whose feed is available for anyone to view. You can look at the live feed now. This fact troubled me a bit, given that most students aren’t actively aware that they are being monitored. It raised many questions for me, including the following: Is the footage saved for future use? Why does anyone have access to this footage? More importantly, what is the purpose of having the camera there in the first place? For this post I’ll focus on the final question: what are the different purposes for having surveillance and video cameras in public spaces, and what further questions does this raise?

Although there isn’t an explicit reason for having this camera outlined on the website, we can safely assume it is a mix of the following common reasons: (1) To monitor events in a large public space for the safety of the community (similar to the way the Boston Marathon bomber was caught retroactively using security footage), or (2) As a fun way to allow people to see what is going on at the heart of Harvard’s campus.

The first reason seems plausible, given that this sort of surveillance of public areas is an increasing trend across many institutions in the United States. However, it is difficult to know for sure, for two reasons: (1) there is no information about whether the footage is saved for later use (my hunch is that it probably is), and (2) it seems strange that the University would publish the footage publicly if monitoring were the primary purpose. This leads me to the second reason, which I think is the most likely one. Given that the footage is published on the commonspaces website promoting Harvard’s public spaces, the primary purpose is probably to give people a chance to see the plaza. This seems plausible; some people may just be curious what a common space actually looks like, and what better way than watching a live-streamed video of it?

Would the general Harvard community act differently if they knew they were constantly being watched while in the plaza? Does capturing every moment in these public spaces take away from the ability to fully relax, without fear that a video may be taken out of context and used against a person? Will constant surveillance in public spaces like the Science Center Plaza lead to a kind of social cooling, or modified social behavior, as implied in Foucault’s notion of the Panopticon? Foucault’s idea was that in a circular prison where the prisoners cannot see the guard but are always aware that the guard can survey them at any time, the prisoners will behave differently as a result of believing they are always under surveillance. This idea can be pushed further into the notion of social cooling, in which members of society become less likely to take risks and be themselves, and instead conform to the norm, because they are being watched. Will an increased awareness that we are being watched change our behavior and ultimately change, or take away from, the Harvard experience, or is it just harmless and a bit creepy?

Harvard Data Privacy Policy — Too Much or Too Little?


Over the last few days, I’ve had the opportunity to read through Harvard’s “Policy on Access to Electronic Information” for the first time as an undergraduate at the College. To be frank, this is one of the few privacy policies I’ve actually read in its entirety, despite accepting hundreds of them all the time (e.g. Google Search, Gmail, Apple products, etc.). The policy itself is remarkably short and readable, unlike most privacy policies we are presented with when using different products — think of the long agreements required before using just about any software product. Its readability meant I was able to fully understand the policy in a short amount of time. (A copy of the policy can be found here: policy_on_access_to_electronic_information.)

The privacy policy itself is entirely grounded on six important principles, which are the following.

  1. Access should occur only for a legitimate and important University purpose.
  2. Access should be authorized by an appropriate and accountable person.
  3. In general, notice should be given when user electronic information will be or has been accessed.
  4. Access should be limited to the user electronic information needed to accomplish the purpose.
  5. Sufficient records should be kept to enable appropriate review of compliance with this policy.
  6. Access should be subject to ongoing, independent oversight by a committee that includes faculty representation.

Initially, I was a bit aghast at the ambiguity of wording that seemingly allows the University broad power to access information when it sees fit. For example, the third principle states that “notice should be given when user electronic information will be or has been accessed.” Notifying a user that his or her data is being investigated and accessed should be a key principle of any privacy agreement. However, the ambiguity of this statement already allows the University to wait an undisclosed amount of time before giving notice. There is no specific deadline after electronic information is accessed by which the University must let the user know, which is troubling: it could technically avoid ever notifying a user on the grounds that it was still planning to do so in the future.

Moreover, in section III of the policy, under the “Notice” heading, the University qualifies the principle further by saying that “notice ordinarily should be given to the user. All reasonable efforts should be made to give notice at the time of access or as soon thereafter as reasonably possible.” What exactly counts as a “reasonable effort,” and what counts as “soon thereafter”? There is a further issue in the ambiguity of the word “ordinarily”: what situation would be considered out of the ordinary, such that no notice has to be given to the user at all?

These were some of the first questions that came to mind. However, I soon realized I was judging Harvard’s privacy policy against a baseline of complete and full ownership over all of my data. In reality, that isn’t a fair comparison, for two reasons: (1) full ownership over data comes with tradeoffs and drawbacks, and I am willing to hand over my data for certain benefits (such as automatic backups), and (2) compared with most other corporations in the United States, Harvard provides much more protection for the privacy of student and faculty data. I don’t know exactly where Harvard stacks up against other universities, but based on a few quick searches (Boston University’s electronic information policy is here), it seems Harvard has more explicit guidelines on who can and cannot access data. As a side note, the policy itself is relatively recent; it was signed and put into effect on March 31, 2014. Moreover, in practice Harvard has, in response to this policy, provided a greater sense of security in that requests to search electronic data across the University must follow strict formal procedures that determine whether the search is permissible.

While the ambiguity of some of the wording in the privacy policy may be cause for concern, the current reality of the situation created by the policy is more reassuring than not and seems to be a step forward in the University’s efforts to protect its students and faculty.

Intro Blog Post


Hi all!

This is my first official blog post. Its purpose is just to introduce the blog and set the stage for future posts. I will be posting weekly on whatever current philosophical and legal issues and topics I find interesting, drawn either from discussions in class or from the assigned readings for IGA 538. The goal of this blog is to continue and push forward the discussion of these incredibly relevant and important issues, which I hope to do by offering my own opinions in a public forum. You can also comment on posts, which can help further the dialogue.

Thanks for reading the blog post! Enjoy.

-Sam
