5

I would like to download a range of a file which I have explicitly defined. As far as I know:

wget --header="Range: bytes=1024-2048" http://www.example.com/file.tmp

should run well. Yet, it fails to do so with the following error when debug mode is on,

Registered socket 300 for persistent reuse.
Disabling further reuse of socket 300.
Closed fd 300

Why does it even gives that error and retries and how I can fix it?

The following are the actual full logs of the process.

Manually assigned resumable download

SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
Setting --server-response (serverresponse) to 1
Setting --page-requisites (pagerequisites) to 1
Setting --recursive (recursive) to 1
Setting --tries (tries) to 1
Setting --header (header) to Range: bytes=10024-
DEBUG output created by Wget 1.11.4 on Windows-MinGW.

Enqueuing http://www.example.com/file.tmp at depth 0
Queue count 1, maxcount 1.
Dequeuing http://www.example.com/file.tmp at depth 0
Queue count 0, maxcount 1.
--2012-01-11 07:02:46--  http:/www.example.com/file.tmp
www.example.com çözümleniyor... seconds 0,00, 127.0.0.1
Caching www.example.com => 127.0.0.1
www.example.com[127.0.0.1]:80 bağlanılıyor... seconds 0,00, bağlantı
kuruldu.
Created socket 300.
Releasing 0x0036a108 (new refcount 1).

---request begin---
GET /file.tmp HTTP/1.0
User-Agent: Wget/1.11.4
Accept: */*
Host: www.example.com
Connection: Keep-Alive
Range: bytes=10024-

---request end---
HTTP isteği gönderildi, yanıt bekleniyor...
---response begin---
HTTP/1.1 206 Partial Content
Server: nginx/0.7.65
Date: Wed, 11 Jan 2012 05:03:57 GMT
Content-Type: application/vnd.ms-powerpoint
Content-Length: 37651672
Last-Modified: Tue, 01 Nov 2011 21:18:50 GMT
Connection: keep-alive
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Cache-Control: max-age=315360000
Content-Range: bytes 10024-37661695/37661696

---response end---

  HTTP/1.1 206 Partial Content
  Server: nginx/0.7.65
  Date: Wed, 11 Jan 2012 05:03:57 GMT
  Content-Type: application/vnd.ms-powerpoint
  Content-Length: 37651672
  Last-Modified: Tue, 01 Nov 2011 21:18:50 GMT
  Connection: keep-alive
  Expires: Thu, 31 Dec 2037 23:55:55 GMT
  Cache-Control: max-age=315360000
  Content-Range: bytes 10024-37661695/37661696
  Registered socket 300 for persistent reuse.
  Disabling further reuse of socket 300.
  Closed fd 300
  Vazgeçiliyor.

Wget supported resumable download (command: -c)

SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
Setting --server-response (serverresponse) to 1
Setting --continue (continue) to 1
Setting --http-keep-alive (httpkeepalive) to 1
DEBUG output created by Wget 1.11.4 on Windows-MinGW.

--2012-01-11 07:12:51--  http://www.example.com/file.tmp
www.example.com çözümleniyor... seconds 0,00, 127.0.0.1
Caching www.example.com => 127.0.0.1
www.example.com[127.0.0.1]:80 bağlanılıyor... seconds 0,00, bağlantı
kuruldu.
Created socket 300.
Releasing 0x0003a0b0 (new refcount 1).

---request begin---
GET /file.tmp HTTP/1.0
Range: bytes=557172-
User-Agent: Wget/1.11.4
Accept: */*
Host: www.example.com
Connection: Keep-Alive

---request end---
HTTP isteği gönderildi, yanıt bekleniyor...
---response begin---
HTTP/1.1 206 Partial Content
Server: nginx/0.7.65
Date: Wed, 11 Jan 2012 05:14:01 GMT
Content-Type: application/vnd.ms-powerpoint
Content-Length: 37104524
Last-Modified: Tue, 01 Nov 2011 21:18:50 GMT
Connection: keep-alive
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Cache-Control: max-age=315360000
Content-Range: bytes 557172-37661695/37661696

---response end---

  HTTP/1.1 206 Partial Content
  Server: nginx/0.7.65
  Date: Wed, 11 Jan 2012 05:14:01 GMT
  Content-Type: application/vnd.ms-powerpoint
  Content-Length: 37104524
  Last-Modified: Tue, 01 Nov 2011 21:18:50 GMT
  Connection: keep-alive
  Expires: Thu, 31 Dec 2037 23:55:55 GMT
  Cache-Control: max-age=315360000
  Content-Range: bytes 557172-37661695/37661696
Registered socket 300 for persistent reuse.
Uzunluk: 37661696 (36M), 37104524 (35M) kalan [application/vnd.ms-powerpoint]
Saving to: `file.tmp'

 1% [                                       ] 622.314      149K/s              ^
Dennis
  • 50,701

1 Answers1

7

Depending on how you look at it, this is either a bug or a missing feature.

The headers specified with the --header only get sent by Wget, but they don't get interpreted.

In src/http.c of the tarball of Wget 1.13.4, there's a sanity check for partial content:

  if ((contrange != 0 && contrange != hs->restval)
      || (H_PARTIAL (statcode) && !contrange))
    {
      /* The Range request was somehow misunderstood by the server.
         Bail out.  */
      xfree_null (type);
      CLOSE_INVALIDATE (sock);
      xfree (head);
      return RANGEERR;
    }

The if condition covers two cases:

  • If there's a content range set, it must coincide with the number of missing bits of the file that's being downloaded.

  • If partial content is being sent, content range must be specified by the server.

The second case doesn't cause any problems, since Wget interprets the server response well. The first, however, does, since Wget didn't interpret the client-specified range.

To solve this problem, if you're willing to compile your own version of Wget, you can change the above source code to the following:

  if (H_PARTIAL (statcode) && !contrange)
  {
      xfree_null (type);
      CLOSE_INVALIDATE (sock);
      xfree (head);
      return RANGEERR;
  }
  if (contrange != 0 && contrange != hs->restval)
    hs->restval = contrange;

Now, Wget will deduct the number of missing bits of the file from the content range.

You could also give cURL a try. It has a built-in --range switch.

Dennis
  • 50,701