Date: Fri, 5 Nov 2010 16:24:28 -0700
From: Josh Carroll <josh.carroll@gmail.com>
To: Dominic Fandrey <kamikaze@bsdforen.de>
Cc: freebsd-stable@freebsd.org, Jeremy Chadwick <freebsd@jdc.parodius.com>
Subject: Re: fetch hangs when trying to http-download from http://ftp5.de.FreeBSD.org/
Message-ID: <AANLkTi=hab6NTx3_LCjSaQn2SDn=XN9u-XsB56xx4i%2BL@mail.gmail.com>
In-Reply-To: <AANLkTi=Ekxg=Z1kfsEF34iTSFkQYx6J39EeW8OH%2BpHBW@mail.gmail.com>
References: <4CD44A23.2030707@bsdforen.de> <4CD46018.8060207@bsdforen.de>
 <20101105195948.GA29963@icarus.home.lan> <4CD46530.9080106@bsdforen.de>
 <AANLkTi=Ekxg=Z1kfsEF34iTSFkQYx6J39EeW8OH%2BpHBW@mail.gmail.com>

> Here's the last 30 lines of the output from kdump after it has hung
> (the trace file no longer gets written to once the fetch process
> hangs):

*snip*

>  38016 fetch    RET   read 53/0x35
>  38016 fetch    CALL  read(0x3,0x81006835,0x3cb)

I believe this read corresponds to this part of fetch.c (with line
numbers for reference from stable/8 svn):

    625                 if ((size = fread(buf, 1, size, f)) == 0) {
    626                         if (ferror(f) && errno == EINTR && !sigint)
    627                                 clearerr(f);
    628                         else
    629                                 break;
    630                 }

This fread() never returns the second time through the loop in the bad
case. Since I'm not very good with gdb, I just added some printf()'s
throughout this section of the code and pulled the fread() out of the
if() so I could check the return value explicitly.

Comparing a fetch from that http server and from my own local http
server (with a copy of the file in question) shows the following. The
first time through the loop, the first 4096 bytes are properly read in:

    before stat_start()
    ecore-txt-0.9.9.042.tbz 0% of 6594 B 0 Bpsafter stat_start()
    reset sigalrm, siginfo and sigint to 0
    setup SIGINFO handler
    while we don't get a sigint
    size set to B_size: 4096
    Before calling: size = fread(buf, 1, 4096, f) (fileno(f) is: -1)
    after fread(), fread returned setting size = 4096
    After check for size ?= 0 and fread()
    After stat_update()
    while we don't get a sigint
    size = 6594 - 4096 = 2498
    size after: 2498
    Before calling: size = fread(buf, 1, 2498, f) (fileno(f) is: -1)

But this is where it hangs in the case of that particular server/file
combination.
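
For anyone who wants to repeat this kind of instrumentation, here is a
rough, self-contained sketch of the idea: the fread() is pulled out of
the if() and logged before and after each call. This is not the actual
fetch.c code. The function name read_logged(), the log parameter and
the fixed 4096-byte buffer are made up for illustration, and the
!sigint check and the stat_*() calls are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from fetch.c): read up to `total` bytes
 * from f in chunks of at most 4096 bytes, logging around every
 * fread() call. Returns the number of bytes actually read.
 */
size_t
read_logged(FILE *f, size_t total, FILE *log)
{
	char buf[4096];
	size_t count = 0, size;

	assert(f != NULL && log != NULL);
	for (;;) {
		/* Ask for whatever is left, capped at the buffer size. */
		size = total - count;
		if (size > sizeof(buf))
			size = sizeof(buf);

		fprintf(log, "before fread(buf, 1, %zu, f)\n", size);
		size = fread(buf, 1, size, f);	/* a hang would show up here */
		fprintf(log, "fread returned %zu\n", size);

		if (size == 0) {
			if (ferror(f) && errno == EINTR)
				clearerr(f);	/* interrupted: try again */
			else
				break;		/* EOF, error, or nothing left */
		}
		count += size;
	}
	return (count);
}
```

Run against a regular file this only shows the control flow, since a
short file simply hits EOF and the loop exits. It cannot reproduce the
hang itself, but a "before" line with no matching "returned" line is
the pattern seen in the trace above.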

If I fetch the same file from my local apache server, I see it properly
read the remaining 2498 bytes and finish up:

    after fread(), fread returned setting size = 2498
    After check for size ?= 0 and fread()
    After stat_update()
    while we don't get a sigint
    size = 6594 - 6594 = 0
    size after: 0
    Before calling: size = fread(buf, 1, 0, f) (fileno(f) is: -1)
    after fread(), fread returned setting size = 0
    fread() returned 0
    We weren't interrupted, break out of while()
    AFTER large while(!sigint) loop
    !sigalarm
    Set SIGINFO back to SIG_DFL
    Before stat_end()
    ecore-txt-0.9.9.042.tbz 100% of 6594 B 59 MBps
    after stat_end()

So for some reason it's hanging during that second fread() for that
particular file for that particular server. Perhaps the pcap Taras
provided will shed some light on why this fread() is hanging.

I was able to fetch a different tarball (zsh-4.3.10_4.tbz) from that
server without any problem, so there is something in particular about
the combination of that file/server that is causing the problem.

Thanks,
Josh