The Web is Still Small

December 9th, 2009 by Christian

Given that we are fumbling toward a video-enabled Web, moving large video files around should be increasingly ordinary.  Services like Netflix have invested in streaming movies because they want to get out from under their $300 million yearly postal bill (says Nightline).  All sorts of new home technologies are trying to address the problem of moving big HD video files around the many machines in the bourgeois home.  Verizon is investing over $10 billion in fiber infrastructure because they think we’ll want to move around really large files.

But all is not smooth and easy.  Zune is trying to sell me downloadable TV episodes yet my new XBox 360 is already full.  And I think I bought the biggest one?  (Can’t remember.)  Let’s consider something incredibly simple:  clicking on a link in a web browser to download a large file.

Yes, it’s amazing how difficult it is to move around a large file!  Here’s a test:  I’m using a wired ethernet connection from my University office with a fast new computer and 760 GB free on my disk.  You’d think: “no problem.”  (Or maybe “Bring it on!”)

Let’s try to download a 1.7GB file using a variety of methods.  That’s really not a very big file.  A Superbit edition DVD from Columbia Tri-Star is at least 4.3 GB for a 90-minute movie in plain old DVD format (not Blu-Ray).  The movie “Ice Age 2” in Blu-Ray is 22 GB.  So hey, not a very hard test, right?

Web browsers, however, are not up to the job.  Google Chrome and whatever version of Internet Explorer I have can’t download files over 2GB because of a limitation in Windows.  Firefox built in a workaround to the limit, but it took me 3 hours to download the file and then I couldn’t open it because it had errors.  Errors?! What is this, the dawn of Fetch?  The early days of FTP?  I’m getting CRC error flashbacks.

[Image: Windows CRC error]
Flashback to the early Internet!

So I tried some specialized downloaders with interesting results.

Here’s the summary:

Internet Explorer: FAIL (1)
Google Chrome: FAIL (1)
Firefox: 3 hours, then FAIL (2)
DownThemAll: 7 hours, then FAIL (3)
FlashGet: 12 minutes

Reasons:

(1) – due to HTTP download limit (probably in Windows?)
(2) – downloaded with errors, file unreadable
(3) – I think it fails while trying to preallocate space on disk — hard to know what it is doing
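
Failure (2) is the sort of thing a downloader can at least detect. Here is a minimal Python sketch of the idea, not what any of the tools above actually do: stream the file to disk in chunks, keep a running CRC-32, and compare the byte count against the server's Content-Length when one is sent. The function names, the 1 MiB chunk size, and the notion that the publisher gives you a CRC-32 to compare against are all assumptions made for illustration.

```python
import urllib.request
import zlib

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def download_with_crc(url, dest_path, chunk_size=CHUNK_SIZE):
    """Stream url to dest_path; return (bytes_written, crc32, expected_size)."""
    crc = 0
    written = 0
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # Content-Length is optional, so expected_size may be None.
        length = response.headers.get("Content-Length")
        expected_size = int(length) if length is not None else None
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
            # Update the checksum chunk by chunk, so the whole file
            # never has to sit in memory at once.
            crc = zlib.crc32(chunk, crc)
    return written, crc & 0xFFFFFFFF, expected_size


def looks_intact(written, crc, expected_size, expected_crc=None):
    """Return False if the size or the (hypothetical) published CRC disagrees."""
    if expected_size is not None and written != expected_size:
        return False  # the transfer was cut short or padded
    if expected_crc is not None and crc != expected_crc:
        return False  # the bytes arrived, but not the right bytes
    return True
```

A check like this would not have repaired the 3-hour Firefox download, but it would have reported the problem at the end of the transfer instead of leaving it to be discovered when the file would not open.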

C’mon web browsers.  Let’s get it together here.

7 Responses to “The Web is Still Small”

  1. Anon Says:

    Haven’t tried it myself, but any thoughts on bittorrent sites that also allow you to download files via http? (The one and only example I can think of lives on http://www.kickasstorrents.com/)

  2. A123 Says:

    According to http://support.microsoft.com/kb/298618 “Windows Internet Explorer 8 can reliably download files over 4 GB in size”

    I’m pretty sure Chrome and Firefox can reliably download such files as well.

    Either there is a problem with the web server (not all servers on all platforms are able to send files larger than 2GB), or your internet connection, or the file system you are using (you have to use NTFS, otherwise you won’t be able to store files larger than 2GB).

  3. Christian Says:

    Yes and we know WHY they have that Microsoft KB article — it doesn’t work reliably so they have to say something reassuring when all of the angry users come at them with pitchforks. I used IE7 in this test and it didn’t work even though IE7 is “rated” to 4GB (according to that KB article).

    I admit this was a pretty unscientific test but it does clarify some of the points you mentioned.

    In this test it’s the same server each time, so if the server couldn’t send large files then it would never have worked at all. So that’s not the problem.

    Chrome and Firefox didn’t reliably download this file. I note your vote of confidence for them but… they didn’t work.

    I think FAT32 chokes at 4GB files, not 2GB. Anyway this was NTFS so that wasn’t the problem. Indeed, since the download worked in the end the problem couldn’t be the file system not storing it. Same hard disk each time.

    Internet conditions do vary but I’m having this problem from a wired ethernet connection at a major research university. Those are gold-plated compared to the average consumer’s ISP.

    I hate to think what Joe DSL is going through!

  4. Ben Colmery Says:

    To piggyback on what Anon said, bit torrent technology gets around this nicely. And, I’d say there are a number of other important uses for this technology, which is why I don’t think we should only think of it in a dark light (which I delve into more here http://aimd.wordpress.com/2009/08/12/bit-torrent-technology-as-a-tool-for-change/ ).

  5. Christian Says:

    I’m totally with you Ben. For this to move forward we have to build it into things so that people don’t have to geek out and fuss with special software. A great example: Hulu. When you are watching a video on Hulu you are moving around a large file (the video).

    While Hulu may not use the BitTorrent protocol it does use p2p. It’s built in — at least according to this p2p blog post from last year http://www.p2p-blog.com/?itemid=765 This also makes it clearer that p2p as networking architecture can be employed by giant companies like NBC Universal just as it can be used by scrappy pirates.

    To reiterate, I am still surprised at how difficult it is to click on large files with my browser and download them. It seems to me this is one of those things that should “just work” by now. Maybe building p2p into everything would help. Key wrinkle: as long as we are downloading POPULAR things.

    Christian

  6. Matthias Bärwolff Says:

    I would think (and this is unscientific thinking, too) that downloading really huge files with a web browser is still something of a niche application; if that kind of use would become more widespread, then surely browsers would enable users to reliably download large files. After all, downloading files of any size is a trivial operation these days. I tend to use a bash shell and curl -O for any downloads larger than, say, 5 MB.

    A browser would probably be best off spawning an application or add-on or whatever to offload the task of downloading such files; they are not for “browsing” anyway, but just need to be put into a place where the user can find them afterwards. A little aside here: the other day I put a large ZIP file on my website and sent the link to my uncle. He downloaded it (with his web browser), but then sent me an email saying he cannot find the file. Hmmm.
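
The "cannot find the file" problem in the comment above comes down to deciding, and then announcing, where a download lands. `curl -O` saves into the current directory under the remote file's name. The Python sketch below imitates that naming rule only roughly (curl's exact behavior differs in details), and the function names and the `download.bin` fallback are made up for illustration.

```python
import os
import posixpath
from urllib.parse import urlsplit


def local_name_from_url(url, fallback="download.bin"):
    """Pick a local file name from the last path segment of the URL."""
    path = urlsplit(url).path          # drops the query string and fragment
    name = posixpath.basename(path)    # "" when the URL ends in "/"
    return name or fallback            # fallback name is an assumption of this sketch


def destination_path(url, directory="."):
    """Return the absolute path a download of url would be saved to."""
    return os.path.abspath(os.path.join(directory, local_name_from_url(url)))


if __name__ == "__main__":
    # Hypothetical URL; printing the full path tells the user where to look.
    print(destination_path("http://example.com/files/big.zip?session=1"))
```

Printing the absolute path after the transfer finishes is a small thing, but it answers the question the uncle had to ask by email.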

  7. Ion Enache » Blog Archive » Articles of Note: December 2009 Says:

    […] • Downloading really LARGE files can be terribly burdensome, as Christian Sandvig finds #; […]
