Re: downloading a single file using multiple threads



On Wed, 28 Mar 2007 13:13:07 -0700, Willy Denoyette [MVP] <willy.denoyette@xxxxxxxxxx> wrote:

[...]
One client A --- one Server,
two client threads,
sharing the same socket connection,
thread 1 issues asynchronous range request x...
thread 2 issues asynchronous range request y...
server accepts request x and at the same time accepts request y, starts to return data for x and y.
AFAIK the above is not possible, but that doesn't mean it *fails* visibly.
IMO the server will choose to handle request x and queue request y, which means it will return the data for x before it starts to return data for y.

I think we all agree that this particular scenario either won't work or isn't useful: as you say, even if you have enabled reuse of the connection, repeated requests for different ranges will simply result in queuing of the replies.

The only way to get some parallel action is to have multiple connections.
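
As a rough illustration of the multiple-connection approach, here is a minimal Python sketch that only works out which byte range each connection would ask for. It assumes the server honors HTTP Range requests; the helper names `split_ranges` and `range_header` are hypothetical, and no network I/O is shown.

```python
def split_ranges(total_size, chunk_size):
    """Return inclusive (start, end) byte offsets that cover total_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    start = 0
    while start < total_size:
        # HTTP byte ranges are inclusive, so the last byte is end = start + n - 1.
        end = min(start + chunk_size, total_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_header(start, end):
    """Format the value of an HTTP Range header for one section."""
    return f"bytes={start}-{end}"
```

Each (start, end) pair would then be requested on its own connection, so the replies are not queued behind one another on a single socket.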

[...]
Yep, but say you have 1Mbps to the US and 50Kbps to Africa: is your download manager clever enough to request all chunks from the US instead of waiting for a chunk to arrive from Africa, or is it clever enough to cancel the request and re-issue the same request over the US connection?

Any "download manager" worthy of the name ought to do that. I'm sure there are plenty of download managers NOT worthy of the name, but at the very least I would expect a download manager to break the download into small enough chunks that if one connection is considerably slower than another, the faster connection can wind up with the bulk of the effort.

For example, downloading 1MB might be broken into 100K sections. As each section completes, a new request is issued, meaning that while a slow server spends a long time on a single 100K section, a faster server gets to service the rest of the 100K sections.
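
To make that concrete, here is a small Python simulation of the scheme (not a real downloader). It assumes each connection has a fixed throughput and takes the next section as soon as it becomes free; the function name `simulate` and the rates in the usage note are illustrative assumptions.

```python
import heapq


def simulate(num_chunks, chunk_bytes, rates):
    """Simulate connections pulling equal-sized sections from a shared queue.

    rates maps a connection name to its throughput in bytes per second.
    Returns (sections served per connection, time the last section finishes).
    """
    counts = {name: 0 for name in rates}
    # Heap of (time at which the connection is free, connection name).
    free_at = [(0.0, name) for name in sorted(rates)]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_chunks):
        now, name = heapq.heappop(free_at)      # next connection to become idle
        done = now + chunk_bytes / rates[name]  # when it finishes this section
        counts[name] += 1
        finish = max(finish, done)
        heapq.heappush(free_at, (done, name))
    return counts, finish
```

With ten 100,000-byte sections, a 125,000 bytes/s connection (about 1Mbps) and a 6,250 bytes/s connection (about 50Kbps), `simulate(10, 100_000, {"fast": 125_000, "slow": 6_250})` gives the fast connection nine sections and the slow one a single section. In this model that one slow section still finishes last, at 16 seconds.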

A more sophisticated program may even be intelligent enough to notice one connection is faster and start issuing larger requests on that connection, to avoid wasting too much time on the latency for a single GET. An even more sophisticated program might even have logic to do as you suggest, and cancel the slower connection when it becomes apparent that re-issuing the remaining data from that request on a faster connection will net an improvement (presumably the data already retrieved would not be discarded).
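
Both ideas can be sketched in a few lines of Python. This is only one possible heuristic, under the assumption that the program already has a throughput estimate per connection; the function names, the 10-second target and the size limits are all hypothetical choices.

```python
def next_chunk_size(bytes_per_sec, target_seconds=10.0,
                    min_size=100_000, max_size=10_000_000):
    """Pick a request size that should take about target_seconds to transfer."""
    wanted = int(bytes_per_sec * target_seconds)
    # Clamp so slow connections still get a usable section and fast ones
    # cannot claim an unbounded share of the file in one request.
    return max(min_size, min(max_size, wanted))


def should_reissue(remaining_bytes, slow_rate, fast_rate, fast_busy_seconds=0.0):
    """Return True if moving the remaining bytes to the fast connection wins.

    fast_busy_seconds is how long the fast connection is still occupied
    with its current work before it could take over the remainder.
    """
    if fast_rate <= 0:
        return False  # nothing to move the work to
    if slow_rate <= 0:
        return True   # the slow connection has stalled completely
    time_if_kept = remaining_bytes / slow_rate
    time_if_moved = fast_busy_seconds + remaining_bytes / fast_rate
    return time_if_moved < time_if_kept
```

Only the remaining bytes are considered for the move, which matches the idea that data already retrieved on the slow connection is kept rather than discarded.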

It's clear that some amount of thought does need to go into implementing a serviceable "download manager" that actually improves the situation when downloading files. But I've seen plenty of people just in this newsgroup with the insight to be able to do this (I'd dare say the three of us making so much noise in this thread are among them :) ), and I don't think any of these issues in and of themselves suggest that there's no benefit to parallel downloads of different sections of the same file.

This requires permanent real-time monitoring of the transfer rates, something that can't reliably be done at the application layer,

Why do you say that real-time monitoring of the transfer rates can't be reliably done at the application layer? I agree that you can't get detailed information about the exact rate of transfer, and especially of why the speed is fast or slow (a fast connection might have low throughput due to errors, for example). But at the granularity required to monitor a download and automatically compensate for slow and fast connections, sufficient for a user to notice an improvement in download performance, I'd say at the application layer you can easily get sufficient information.

Given that a download manager is most useful for downloads that take tens of minutes or even more, just following the averages is plenty sufficient to significantly improve performance. And of course, the bulk of the improvement would come simply from having parallel connections, which doesn't require any monitoring of throughput anyway (yes, with such a simple approach there would be cases where things didn't get better, or even got worse, but most of the time there would be a net improvement).
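
As a sketch of what "following the averages" could look like at the application layer, here is a tiny Python throughput meter that simply divides the bytes received by the elapsed time. The class name `ThroughputMeter` and its methods are hypothetical; the clock is injectable only so the example can be checked without waiting on real time.

```python
import time


class ThroughputMeter:
    """Track the average transfer rate of one connection in bytes per second."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self._bytes = 0

    def add(self, num_bytes):
        """Record bytes just read from the connection."""
        self._bytes += num_bytes

    def average(self):
        """Return bytes per second since the meter was created."""
        elapsed = self._clock() - self._start
        return self._bytes / elapsed if elapsed > 0 else 0.0
```

A download loop would call `add(len(data))` after each read and compare the `average()` of its connections when deciding where to send the next request.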

That's what I meant when I said you really need sophisticated software, and you need to be sure to measure (as always) the real benefits. I guess (expensive) download managers can do it while others only pretend they can. Big difference!

Assuming there's such a thing as an "expensive download manager" (I wouldn't know, not being in the market for one), then yes...I'd hope they would at least go to as much trouble as described above. But I think all that can easily be handled at the application layer.

Pete