Date:      Thu, 14 Oct 1999 23:27:47 -0700 (PDT)
From:      odip@bionet.nsc.ru
To:        freebsd-gnats-submit@freebsd.org
Subject:   ports/14343: [patch] wget-1.5.3 failed to continue retrieving files from true HTTP/1.1 web servers
Message-ID:  <19991015062747.C58C9152ED@hub.freebsd.org>

>Number:         14343
>Category:       ports
>Synopsis:       [patch] wget-1.5.3 failed to continue retrieving files from true HTTP/1.1 web servers
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-ports
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Oct 14 23:30:01 PDT 1999
>Closed-Date:
>Last-Modified:
>Originator:     Dmitry Grigorovich
>Release:        3.2-RELEASE
>Organization:
Institute of Cytology and Genetics
>Environment:
FreeBSD ghost.bionet.nsc.ru 3.2-RELEASE FreeBSD 3.2-RELEASE #3: Thu Sep 16 17:40:21 NOVST 1999     root@ghost.bionet.nsc.ru:/usr/src/sys/compile/ODIP  i386
>Description:
Under certain conditions, the wget-1.5.3 port fails to resume an interrupted download from standards-compliant (and widely deployed) web servers that support HTTP/1.1.
(The port's patch-aa file is a dirty hack and is incorrect; the original, unpatched wget works correctly.)

First condition: time-stamping is enabled (wget's -N switch).
Second condition: the connection is lost during the transfer.

After the connection is lost, wget tries to resume the download.
It sends a "GET" request to the server with the additional headers "If-Modified-Since" and "Range:"; the full debug listing is shown under "How-To-Repeat".
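
To make the difference between the two retry requests concrete, here is a minimal Python sketch that rebuilds the request text seen in the debug listings. The function name build_retry_request is hypothetical and exists only for illustration; this is not code from wget's http.c or from patch-aa.

```python
def build_retry_request(path, host, offset, if_modified_since=None):
    """Rebuild the text of a resume request like the ones in the debug listings.

    Hypothetical helper for illustration only; not taken from wget.
    """
    lines = [
        f"GET {path} HTTP/1.0",
        "User-Agent: Wget/1.5.3",
        f"Host: {host}",
        "Accept: */*",
        f"Range: bytes={offset}-",
    ]
    # The patched port additionally sends If-Modified-Since on the retry.
    if if_modified_since is not None:
        lines.append(f"If-Modified-Since: {if_modified_since}")
    # Header lines end with CRLF; an empty line terminates the request.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Retry request as sent by the patched port (values from the first listing).
    print(build_retry_request("/dist/apache_1.3.6.tar.gz", "www.apache.org:80",
                              4555, "Fri, 15 Oct 1999 04:59:14 GMT"))
    # Retry request as sent by the original wget (values from the second listing).
    print(build_retry_request("/dist/apache_1.3.6.tar.gz", "www.apache.org:80", 3107))
```

The only difference between the two requests is the If-Modified-Since line.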

Because the file on the server has not been modified since the date given in "If-Modified-Since", a compliant HTTP/1.1 server answers "HTTP/1.1 304 Not Modified", as specified in RFC 2616, section 14.25.
Wget then stops downloading the file even though only the first 4555 bytes have been received!
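
The server-side decision can be sketched in Python as follows. This is a simplified illustration of the rule in RFC 2616, section 14.25, not the code of any real server; the function name conditional_get_status is hypothetical, and details such as invalid dates or unsatisfiable ranges are left out. The dates in the example are taken from the debug listing under "How-To-Repeat".

```python
from email.utils import parsedate_to_datetime


def conditional_get_status(last_modified, if_modified_since=None, range_header=None):
    """Pick the status code for a GET (simplified; hypothetical helper).

    The If-Modified-Since check comes first: if the file has not changed
    since that date, the answer is 304 and the Range header never matters.
    """
    if if_modified_since is not None:
        modified = parsedate_to_datetime(last_modified)
        since = parsedate_to_datetime(if_modified_since)
        if modified <= since:
            return 304  # Not Modified: no body is sent at all
    if range_header is not None:
        return 206  # Partial Content: only the requested bytes are sent
    return 200  # OK: the whole file is sent


if __name__ == "__main__":
    last_modified = "Tue, 23 Mar 1999 22:50:08 GMT"
    # Retry from the patched port: both headers present.
    print(conditional_get_status(last_modified,
                                 "Fri, 15 Oct 1999 04:59:14 GMT",
                                 "bytes=4555-"))
    # Retry from the original wget: Range only.
    print(conditional_get_status(last_modified, None, "bytes=3107-"))
```

With the date from the listing, the first call yields 304 and the second yields 206, which matches the two server answers in this report.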

To repeat: the bug is in the port's patch-aa file.
That file patches http.c of the original wget so that it generates the "If-Modified-Since" header and processes the response to it.
But the logic in wget's http.c has an elaborate design, and modifying it is a difficult task!

I tested the problem against the following web servers:
apache-1.2.6, apache-1.3.6, apache-1.3.9, two IIS4 servers, and others.
Some servers, such as one of the IIS4 servers, do not process the "If-Modified-Since" header correctly, but the second IIS4 server does process it correctly, and consequently wget fails to resume there :(

>How-To-Repeat:
Save the URL of a file (http://www.apache.org/dist/apache_1.3.6.tar.gz) in the file url2.
Note that the web server MUST be a compliant HTTP/1.1 server!
Start downloading the file, then emulate a lost connection during the download :)

The simplest way may be to pull out the network cable, but I used ipfw firewall rules to emulate the lost connection.
To speed up the emulation, run wget with a short timeout of 30 seconds (switch -T 30).
We also need a short pause after the connection is lost, so that there is time to restore it by plugging the cable back in or removing the firewall rules (switch -w 30).

Now run a command like "wget -d -N -i url2 -T 30 -w 30".
Wait until wget has downloaded the beginning of the file, then emulate the lost connection.
Wait about 30 seconds until wget detects the timeout.
Wget will then pause for 30 seconds. During that time, restore the connection.
After the pause, wget tries to resume the retrieval, and oops: the server answers "304 Not Modified" and we do not receive the file!

The procedure is not the simplest, so I include the full debug listing of my test:

--------------------->

odip@ghost$ wget -d -N -i url2 -T 30 --dot-style=micro -w 30
DEBUG output created by Wget 1.5.3 on freebsd3.2.

Loaded url2 (size 47).
parseurl ("http://www.apache.org/dist/apache_1.3.6.tar.gz") -> host www.apache.org -> opath dist/apache_1.3.6.tar.gz -> dir dist -> file apache_1.3.6.tar.gz -> ndir dist
--11:58:39--  http://www.apache.org:80/dist/apache_1.3.6.tar.gz
           => `apache_1.3.6.tar.gz'
Connecting to www.apache.org:80... Created fd 3.
connected!
---request begin---
GET /dist/apache_1.3.6.tar.gz HTTP/1.0
User-Agent: Wget/1.5.3
Host: www.apache.org:80
Accept: */*

---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Fri, 15 Oct 1999 04:58:41 GMT
Server: Apache/1.3.10-dev (Unix) ApacheJServ/1.0 PHP/3.0.6
Cache-Control: max-age=86400
Expires: Sat, 16 Oct 1999 04:58:41 GMT
Last-Modified: Tue, 23 Mar 1999 22:50:08 GMT
ETag: "230ad6-14f078-36f81aa0"
Accept-Ranges: bytes
Content-Length: 1372280
Connection: close
Content-Type: application/x-tar
Content-Encoding: x-gzip


Length: 1,372,280 [application/x-tar]

    0K -> ........ ........ ........ ........ ...               [  0%]

Closing fd 3
11:59:19 (126.93 B/s) - Read error at byte 4555/1372280 (Operation timed out). Retrying.

--11:59:49--  http://www.apache.org:80/dist/apache_1.3.6.tar.gz
  (try: 2) => `apache_1.3.6.tar.gz'
Connecting to www.apache.org:80... Created fd 3.
connected!
---request begin---
GET /dist/apache_1.3.6.tar.gz HTTP/1.0
User-Agent: Wget/1.5.3
Host: www.apache.org:80
Accept: */*
Range: bytes=4555-
If-Modified-Since: Fri, 15 Oct 1999 04:59:14 GMT

---request end---
HTTP request sent, awaiting response... HTTP/1.1 304 Not Modified
Date: Fri, 15 Oct 1999 04:59:51 GMT
Server: Apache/1.3.10-dev (Unix) ApacheJServ/1.0 PHP/3.0.6
Connection: close
ETag: "230ad6-14f078-36f81aa0"
Expires: Sat, 16 Oct 1999 04:59:51 GMT
Cache-Control: max-age=86400


Length: unspecified
Closing fd 3
Last-modified header missing -- time-stamps turned off.
11:59:52 (0.00 B/s) - `apache_1.3.6.tar.gz' saved [0]

>Fix:
No problem. Remove the file patch-aa from the patch directory of the wget port.
Rebuild the wget port and reinstall it.

The test procedure is described under "How-To-Repeat".
After removing patch-aa and rebuilding, I ran the test again.

The full debug listing follows.
Now the server answers "HTTP/1.1 206 Partial Content" and the retrieval of the file continues.

------------->
odip@ghost$ ./wget -d -N -i url2 -T 30 --dot-style=micro -w 30
DEBUG output created by Wget 1.5.3 on freebsd3.2.

Loaded url2 (size 47).
parseurl ("http://www.apache.org/dist/apache_1.3.6.tar.gz") -> host www.apache.org -> opath dist/apache_1.3.6.tar.gz -> dir dist -> file apache_1.3.6.tar.gz -> ndir dist
--12:01:43--  http://www.apache.org:80/dist/apache_1.3.6.tar.gz
           => `apache_1.3.6.tar.gz'
Connecting to www.apache.org:80... Created fd 3.
connected!
---request begin---
GET /dist/apache_1.3.6.tar.gz HTTP/1.0
User-Agent: Wget/1.5.3
Host: www.apache.org:80
Accept: */*

---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Fri, 15 Oct 1999 05:01:44 GMT
Server: Apache/1.3.10-dev (Unix) ApacheJServ/1.0 PHP/3.0.6
Cache-Control: max-age=86400
Expires: Sat, 16 Oct 1999 05:01:44 GMT
Last-Modified: Tue, 23 Mar 1999 22:50:08 GMT
ETag: "230ad6-14f078-36f81aa0"
Accept-Ranges: bytes
Content-Length: 1372280
Connection: close
Content-Type: application/x-tar
Content-Encoding: x-gzip


Length: 1,372,280 [application/x-tar]

    0K -> ........ ........ ........                            [  0%]

Closing fd 3
12:02:22 (87.29 B/s) - Read error at byte 3107/1372280 (Operation timed out). Retrying.

--12:02:52--  http://www.apache.org:80/dist/apache_1.3.6.tar.gz
  (try: 2) => `apache_1.3.6.tar.gz'
Connecting to www.apache.org:80... Created fd 3.
connected!
---request begin---
GET /dist/apache_1.3.6.tar.gz HTTP/1.0
User-Agent: Wget/1.5.3
Host: www.apache.org:80
Accept: */*
Range: bytes=3107-

---request end---
HTTP request sent, awaiting response... HTTP/1.1 206 Partial Content
Date: Fri, 15 Oct 1999 05:02:53 GMT
Server: Apache/1.3.10-dev (Unix) ApacheJServ/1.0 PHP/3.0.6
Cache-Control: max-age=86400
Expires: Sat, 16 Oct 1999 05:02:53 GMT
Last-Modified: Tue, 23 Mar 1999 22:50:08 GMT
ETag: "230ad6-14f078-36f81aa0"
Accept-Ranges: bytes
Content-Length: 1369173
Content-Range: bytes 3107-1372279/1372280
Connection: close
Content-Type: application/x-tar
Content-Encoding: x-gzip


Length: 1,372,280 (1,369,173 to go) [application/x-tar]

    0K -> ,,,,,,,, ,,,,,,,, ,,,,,,,, ........ ........ ........ [  0%]
    6K -> ........ ...^C
odip@ghost$

------------------------>
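
The numbers in the 206 response above can be cross-checked with a small Python sketch. It parses a Content-Range value of the form "bytes first-last/total" and computes how many bytes the response carries. The function name remaining_bytes is hypothetical, and only this one form of the header is handled.

```python
import re


def remaining_bytes(content_range):
    """Return (bytes in this response, total file size) for a Content-Range value.

    Hypothetical helper; handles only the "bytes first-last/total" form.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", content_range)
    if match is None:
        raise ValueError(f"unsupported Content-Range: {content_range!r}")
    first, last, total = (int(group) for group in match.groups())
    # Byte positions are inclusive on both ends, hence the +1.
    return last - first + 1, total


if __name__ == "__main__":
    print(remaining_bytes("bytes 3107-1372279/1372280"))
```

For the header in the listing this gives 1369173 bytes still to transfer out of 1372280, which agrees with the Content-Length of the 206 response and with wget's "(1,369,173 to go)" line.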

Sorry for the long problem report!


>Release-Note:
>Audit-Trail:
>Unformatted:

