Date: Sat, 27 Dec 2003 13:31:48 +0100
From: Oliver Brandmueller
To: freebsd-current@freebsd.org
Cc: David Malone
Subject: Re: file descriptor leak in 5.2-RC
Message-ID: <20031227123148.GB77531@e-Gitt.NET>
In-Reply-To: <20031227001820.GA89334@walton.maths.tcd.ie>

Hello David, hello everybody.

On Sat, Dec 27, 2003 at 12:18:20AM +0000, David Malone wrote:
> > during the machine is running on high load and after going to single
> > user mode. You can clearly see, that even though kern.openfiles still
> > shows a high number, pstat -f only finds very few files.
>
> Ahhh crud - the kern.file sysctl isn't completely calculated from
> the list of all open files - it iterates through all the processes
> to form the final list. Could you try rerunning pstat with the patch
> below - it walks the full open file list, rather than checking each
> process (this may leak open file info to people within jails on the
> machine, hopefully that is not a problem for you...)

Though I'm running out of time soon, the machine is still not in
production. I do not have users or jails, so no problem at all.

> (You'll need to recompile your kernel, but not anything else...)

No problem here either, not even with building a new world ;-)

> If the files start to show up here, then we can begin to figure out
> where they're coming from.

OK, fstat and lsof still don't see the files, but pstat does now! The
output is quite long and I'm not sure everybody here likes mails of
250 kilobytes, so I give the URL here:

http://the.addict.de/~ob/pstat-patched.txt

(If someone would like to see that in a mail, I can send it of course.)

The main thing here is now:

4333/262144 open files
LOC      TYPE  FLG CNT MSG DATA     OFFSET
c7757154 inode RW    5   0 c7540000 474d
c75a2e14 inode W     1   0 c861f820 0
c8606198 inode RW    1   0 c7540000 0
c8594220 inode RW    1   0 c72ebb2c 0
c85a1b28 inode RW    1   0 c72ebb2c 0
c84b2dd0 inode RW    1   0 c72ebb2c 0
c852d908 inode RW    1   0 c72ebb2c 0
[...]

I had a quick look over the rest of the table, and it seems that nearly
all the other lines look the same as the last few lines shown above,
except for the LOC value changing. kern.openfiles shows 4332, so the
pstat -f output corresponds with that value just fine.

Does the same value for "DATA" mean that it is the same file being
opened over and over? Can I somehow find the corresponding file?

Thanks for the help,
   Oliver

PS: The machine has to go live by the end of the year, including a
period of about 16-24 hours of testing and a period of 24 hours of
close monitoring.
This means I have to have a running system up by tomorrow evening at
the latest. My current plan is to install 4.9 if I cannot see a quick
fix within the next 24 hours. I would really like to track the problem
down further, as it seems I'm one of the few who can currently
reproduce it, but I don't have any other hardware powerful enough to
put in its place for testing at the moment. If someone is willing to go
through all this during the weekend, I can offer IRC chat, a phone call
and maybe even access to the machine.

-- 
| Oliver Brandmueller | Offenbacher Str. 1 | Germany D-14197 Berlin |
| Phone +49-172-3130856 | Fax +49-172-3145027 | WWW: http://the.addict.de/ |
| I am the Internet. So help me God. |
| Commercial use of any of the addresses contained here is not permitted! |
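
On the question of whether rows with the same "DATA" value refer to the same file: as far as pstat(8) documents it, DATA is the address of the vnode (or socket) behind the file entry, so identical values should point at the same underlying object. A quick way to see which objects dominate the table is to group the rows by that column. The following is only a minimal sketch in Python, not part of the original thread. It assumes the seven-column layout shown in the excerpt above, and the function name count_by_data and the SAMPLE text are made up for illustration. Rows with an empty FLG column would have fewer fields and are skipped here.

```python
from collections import Counter


def count_by_data(pstat_text):
    """Count file-table rows per DATA pointer in `pstat -f` style output.

    Assumes each data row has exactly seven whitespace-separated columns:
    LOC TYPE FLG CNT MSG DATA OFFSET (the layout in the excerpt above).
    """
    counts = Counter()
    for line in pstat_text.splitlines():
        fields = line.split()
        # Skip the "N/M open files" summary, the header row, "[...]"
        # markers, and any row that does not have all seven columns.
        if len(fields) != 7 or fields[0] == "LOC":
            continue
        counts[fields[5]] += 1  # column 6 is DATA
    return counts


# The rows quoted in the mail, used as sample input.
SAMPLE = """\
4333/262144 open files
LOC      TYPE  FLG CNT MSG DATA     OFFSET
c7757154 inode RW    5   0 c7540000 474d
c75a2e14 inode W     1   0 c861f820 0
c8606198 inode RW    1   0 c7540000 0
c8594220 inode RW    1   0 c72ebb2c 0
c85a1b28 inode RW    1   0 c72ebb2c 0
c84b2dd0 inode RW    1   0 c72ebb2c 0
c852d908 inode RW    1   0 c72ebb2c 0
"""

if __name__ == "__main__":
    # Print DATA pointers, most frequently referenced first.
    for data, n in count_by_data(SAMPLE).most_common():
        print(data, n)
```

Run against the full pstat-patched.txt, the pointer at the top of the list would be the first candidate to look up when trying to identify the leaked file.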