Some thoughts on GSoC 2017 and LibraryBox

This summer, I was a part of Summer of Code working with Jason Griffey on his Berkman Klein project, LibraryBox.

I wrote about parts of the project on Medium, but I wanted to reflect here on some of the things that I learned.

First, I’d like to say that this summer was a blast. The end was a bit hectic, since I’d gotten right into the thick of getting everything pinned down when I went back to school at the University of New Mexico (shoutout to my burqueno peeps!) and ran into professors who like packing the first few weeks with content. When I started working on LibraryBox this summer, I was more or less familiar with its usage. I’d previously played around with PirateBox and gotten things working on a laptop, having gone to my fair share of DefCon events where I saw them pop up for ad hoc file sharing. There’s a lot of great stuff that LibraryBox and PirateBox do that I wanted to help make better.

Lesson 1: Real, working hardware is the greatest debugging platform.

When I picked up porting it to the Raspberry Pi, I ran into a lot of tiny issues here and there. As a note to future people doing things like this: be ready to do some footwork when moving from an embedded platform like OpenWRT onto a platform like the Pi. There are subtle differences, and I can’t stress enough that until I was about halfway through the summer, I really didn’t know what I was doing. What I learned is that getting things running on real hardware as fast as possible is your best bet for making sure that your life is going to be just fine. Once I got my software running on the Pi, I had a lot more fun moving the project forward.

Lesson 2: A day you don’t code isn’t a day wasted

It’s easy to think that a day you write zero lines of code is a day you’ve wasted. The reality is that this is a terrible way to handle the situation and will only drive you to the edge of burnout before you realize what you’re doing. Take days to think, work on something different, or even just not work at all. All-nighters are not worth it unless you have a good reason, and the more I tried to force myself to work, the less I got done.

The reality of the matter is that a day spent working out notes and laying out the design on paper can be your most productive day. I did a lot of “go to Starbucks and sketch out the problem I’m having”, and as a result my notebook is filled with pages like this:

Those notes were a few hours of thinking, sketching out ideas and getting things down on paper. Some notes I took while on a plane, some notes I took at dinner, other times notes were the byproduct of someone mentioning something.

Lesson 3: You have people around you who can solve your problems.

Software dev isn’t done in a vacuum. It’s tempting to hunker down and hide away and not talk to people, but there are a lot of things that your friends and colleagues know that you don’t know they know. I learned that one friend of mine was dealing with some of the same container problems I was facing, but in a slightly different context. This helped me find the source of one of my woes and hammer down what I really needed to solve. I had burned a week trying to figure out an obscure problem with systemd and how to make chroots work, then a friend mentioned systemd-nspawn and the skies opened up and this happened:

true story.

This was a good week of me being unhappy with my work, getting progressively more and more frustrated. Ten minutes of conversation solved a week of me hiding from people and just beating at the problem. It was bad enough that I hadn’t bothered to really mention it to my partner, who was a sounding board for so many other things this summer.

Wrapping up

These were my experiences. My mentor helped me get on track when I had been wavering and helped set up a schedule, and my partner really kept me in line. We build software not in isolation but by looking to others for help. I hope to be back next year for more of this and get to know more people as I refine my skills! This summer had challenges, things that I didn’t expect. Did I hit every goal? Not by a long shot. But I hit the important goals, getting somewhere that I felt was more than just a proof of concept and closer to something that can be actively demoed soon.

SwellRT/Wave E2E Encryption: Overview

The code can be downloaded from this git branch (compare changes).

Synopsis

Apache Wave is a software framework for online real-time collaborative
editing. Like Google Docs and Etherpad, it uses Operational Transformations
to manage user collaboration.

During this Google Summer of Code we added end-to-end encryption to Wave
documents. This means that only people who know a particular key can access
a document and edit or retrieve its contents, which protects the privacy of
Wave users.

We based our work on this awesome paper, which explains how some
researchers encrypted Google Docs’ Operational Transformations. We took
their ideas and adapted them to Apache Wave’s architecture.

Produced work

To summarize the work we have produced, we recorded this video:

To encrypt the messages we used the AES-GCM algorithm from the WebCrypto
API, calling it from our Java classes through JsInterop bindings.

Messages are properly encrypted and decrypted when they are sent and received
by the clients. The text of a document is also properly recovered from the
server’s snapshot. Everything seems to run smoothly, except for some annoying
bugs that appear sporadically and a serious user interface bug that prevents
users who did not create the wave from decrypting its snapshot. My mentor and
I think that we can fix them quickly, just after the program has ended.

How to use it

Running our modified version of Wave does not require any additional
configuration; just use the Gradle commands as usual. To compile the code and
run the server, use:

$ ./gradlew run

Then open the URL http://localhost:9898/ in any browser. Once registered and
logged in, use the “New Encrypted Wave” button to create a new encrypted wave.

Encrypted Wave button

In its URL you can see that the new wave’s identifier starts with “ew+” instead
of “w+”, which is the usual prefix for ordinary waves. Also, a symmetric
cryptographic key is attached after the wave identifier, separated by an
exclamation mark (!).

Encrypted Wave URL

The user must preserve that URL (or at least the key part) in order to open the
wave again in the future.
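
As a rough illustration of that layout, the sketch below splits a wave
reference into its identifier and key parts. It is written in Python for
brevity and is not the project's code: the function names and the sample
identifier and key are hypothetical, and it assumes the reference has already
been extracted from the URL in the form "identifier!key".

```python
from typing import Optional, Tuple


def split_wave_reference(reference: str) -> Tuple[str, Optional[str]]:
    """Split 'identifier!key' into (identifier, key).

    The key is None when no '!' separator or no key text is present.
    """
    wave_id, separator, key = reference.partition("!")
    return wave_id, (key if separator and key else None)


def is_encrypted_wave(wave_id: str) -> bool:
    """Encrypted waves use the 'ew+' prefix; ordinary waves use 'w+'."""
    return wave_id.startswith("ew+")


# Hypothetical identifier and key, for illustration only.
wave_id, key = split_wave_reference("ew+AbC123!c2VjcmV0LWtleQ")
print(wave_id, is_encrypted_wave(wave_id), key is not None)
```

Keeping only the part after the exclamation mark is enough to reopen the
wave later, provided the identifier is known from elsewhere.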

Future work

AES-GCM ensures both confidentiality and integrity for the messages written by
the legitimate users, but an attacker who controls the server can still do a
lot of harm:

  • Only the text of a document is encrypted, but not other parts, such as the
    content of its hyperlinks. We should extend the encryption beyond the
    inserted characters.
  • The authentication could also be extended to all the components, not only
    text ones. Also, as the paper states, the history of a document should
    be authenticated as well (see appendix A.2).
  • It is unlikely that we can hide the structure and format of the document
    from the server, but we may be able to hide some more information, like
    users’ typing traits.

On the other hand, it is not convenient to have users handle symmetric keys
themselves. Keys should be encrypted and stored on the server as user data. To
do so, we should derive a key from the user’s password using PBKDF2 (available
in the WebCrypto API) and use it to encrypt all the keys a user generates or
registers for her waves.
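
The derivation step could look roughly like the sketch below. It uses
Python's standard hashlib rather than the WebCrypto API, so it only
illustrates the idea; the function name, the iteration count, the salt size
and the choice of SHA-256 are assumptions, not decisions taken in the project.

```python
import hashlib
import os

ITERATIONS = 200_000  # assumed work factor, not taken from the project
SALT_BYTES = 16       # assumed salt size
KEY_BYTES = 32        # 256-bit key, a size usable with AES-256-GCM


def derive_key(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a fixed-length key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=KEY_BYTES
    )


# The salt is random per user and would be stored next to the encrypted keys;
# it does not need to be secret.
salt = os.urandom(SALT_BYTES)
wrapping_key = derive_key("correct horse battery staple", salt)
print(len(wrapping_key))
```

The derived key would never leave the client. It would only be used to
encrypt the per-wave keys before they are uploaded, so the server stores
ciphertext plus the salt and iteration count.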

The users could use public key cryptography to invite each other to edit a
wave document. This feature was part of the original plan of work for this
summer, but we did not have enough time to develop it.