
Update on Chinese Circumvention Tool Snooping

Peter Li from the Global Internet Freedom Consortium has responded in the comments to my post about snooping by Chinese circumvention tools:

We apologize for the confusion here. The anti-censorship ranking service is provided by one of the GIFC partners. It only publishes the popularity ranks of destination websites users visit through our anti-censorship tools. It is similar to alexa.com but is only limited to anti-censorship web traffic.

The ranking service is not authorized to access, nor can it access, the data users transmit on the wire. It is not authorized to release logs containing information on the websites any individual user visits either.

The FAQ for the ranking service was not written properly, as originally “user” there meant website owners who may be interested in getting detailed statistics on how their websites are visited through our anti-censorship tools. We apologize that we have overlooked the wording.

The GIFC partner who runs the ranking service, the World Gates’ Inc, has been notified, and that FAQ entry has been removed. Thank you for discovering the problem.

Peter Li
Global Information Freedom Consortium

Also, Rebecca MacKinnon has written an excellent follow-up to the post that includes a response from Bill Xia of Dynamic Internet Technologies / Dynaweb that ‘DIT never gives out “personal-identifying user data”’ and the following quote from Peter Li:

Yes, in some cases FBI asked us to provide logs for certain websites or destination IPs in some particular time periods, for example, they would request something like the original IPs who visited xyz.com at Jan 12, 2007, 12:20-30 EST, and the visited web pages. We provided such information as we feel we are obligated to work with law enforcement agencies in the free world.

Note that the above quote does not imply any sort of quid pro quo for FBI access to data. If Dynaweb is storing the data about individual users, they are required by U.S. law to give access to that data in response to government warrants and subpoenas.

Rebecca also gets the issue of the trust invested in circumvention tools precisely right:

The moral of this long story is important: when using circumvention tools, make sure you understand enough about how they work, what they’re meant to be used for, and who runs them, so that you’re not taking a leap of faith with people you would rather not trust.

The decision about who to trust is a personal one: I am more inclined to trust a VPN operating in the U.S. which is subject to FBI requests than a Beijing Telecom connection subject to Beijing public security bureau requests, but that’s just me. Other people might feel very differently and make different choices. Some people may feel very comfortable trusting the Falun Gong… others, well, might not… It appears that the VOA, RFA, and HRIC have decided to trust them and to recommend these services to their users.

Where does this leave the issue?

I’m happy that the data is no longer for sale on the website, but given all of these factors, I’m still concerned with the amount and sensitivity of the data being stored, the lack of disclosure to users about what data is being stored and how it is being used, and the care with which the data is being protected.

I want to make clear first that I am not attacking the motives of the developers of these tools. I have every reason to believe that the people building, distributing, and running these tools are doing so in honest resistance to the restrictive Internet policies of the Chinese government. I should have made that fact clear in my original post. I don’t think anyone was selling data to make a quick buck. I think any money made out of any hypothetical sale of personal data would have been plowed back into the circumvention projects.

Still, I am somewhat skeptical of Peter’s explanation that the issue was merely confusion arising from a misunderstanding of the word “user.” The key sentence seems pretty clear to me: “But data that can be used to identify a specific user are considered confidential and not shared with third parties unless you pass our strict screening test.” To the degree that websites are “identified,” they are already identified in the public aggregate data (google.com, live.com, etc.). What additional, confidential data would be published about a website? I think it more likely that the confusion here is between the various projects contributing data and the ranking.edoors.com site displaying the data. In any case, the FAQ entry in question has now been removed from the site, so if they were offering to sell data, they are not anymore.

But Peter’s further comment about sharing data with the FBI indicates that, whether or not they are actively selling individual user data, they are definitely storing the data on an individual level. This fact alone is cause for concern, or at least for disclosure. There is no law in the U.S. that requires storage of web browsing histories, though the EU data retention directive does require that EU ISPs store the source and destination IP address of every Internet communication. The data flowing over the networks of these circumvention tools is particularly sensitive, since most of the users of the tools are breaking the laws of their countries merely by using them. Any data that is stored can be shared, stolen, subpoenaed, warranted, and otherwise distributed. The fact that some or all of the GIFC circumvention tools are storing browsing histories of individual users vastly increases the level of trust those users are investing in the tools: they must trust the tools not only to refrain from intentionally misusing the data but also to safeguard it from attack and from misuse by partners trusted with the data. The current confusion over what data is being shared with the ranking service and what the ranking service is doing with the data is a demonstration of the inherent dangers of storing and sharing the data, even with trusted partners.

Compounding this problem is the fact that none of these tools has anything that even looks like a privacy policy. U.S.-style privacy policies have many, many problems, but they do provide some baseline view of what data an organization is collecting and what it is doing with the data. These tools should at the very least be clear with their users about what data they are storing, whether or not they are storing data at an individual level, and with whom and under what circumstances they are sharing the data. In general, there’s nothing wrong with collecting personal data as long as you are explicit about what data you are collecting and what you are doing with the data. None of the involved tools (freegate, gpass, firephoenix) makes any attempt to disclose what sort of data they are collecting and storing. And they all make broad claims about protecting the anonymity of their users.

A user should be able to make an informed decision between using a tool that tracks her activity (like dynaweb, gpass, and firephoenix) and a tool that does not (like anonymizer). Note that this is not a personal recommendation on my part to use any tool over any other. Lots of folks have responded to my original post by saying “See, you should use Tor!” I think Tor is a great project, but without going into depth, it is very open about the ways that it does and does not protect the privacy of its users. As Rebecca says, before using a tool, you should be aware of how it works and what it is doing with your data and then make your decision about what and whom to trust. But projects have to disclose what they are doing with their users’ data for users to be able to make this choice.

Update: Edited to remove sloppy wording that wrongly implied a connection between State Department funding and access to data.
