From owner-freebsd-questions@FreeBSD.ORG Fri Feb 8 09:32:50 2008
Message-ID: <26921.137.153.0.25.1202463164.squirrel@sm.lkla.org>
In-Reply-To: <47A9BCB0.8020309@dial.pipex.com>
References: <1153.137.153.0.37.1202210274.squirrel@sm.lkla.org>
 <69739C80-0639-4808-B5EB-0D9553826559@dpcsys.com>
 <30396.137.153.0.36.1202264253.squirrel@sm.lkla.org>
 <47A99B4E.1080707@dial.pipex.com>
 <28742.137.153.0.25.1202301936.squirrel@sm.lkla.org>
 <47A9BCB0.8020309@dial.pipex.com>
Date: Fri, 8 Feb 2008 18:32:44 +0900 (JST)
From: "Lachlan Michael"
To: "Alex Zbyslaw"
Cc: freebsd-questions@freebsd.org, mark@msapiro.net
Reply-To: lachlan@lkla.org
Subject: Re: Memory Error using Mailman on FreeBSD. How to debug?

>>>How big does the mailman process actually get? top will tell you.
>>>
>>Mailman values don't budge. None of the mailman processes go over about
>>8.5M, which is what they are during idle time.
>>
> Real puzzler. I'm surprised not to have at least one process growing,
> though. Maybe it's not using much CPU and you're not spotting it.
>
> Try running top, then sorting on size (o size inside top) then try your
> mailman email again. Make sure the top refresh rate is fast enough. s
> 1 inside top would do that, or even s 0 if desperate.

Following your advice, as far as I can tell, the mailman qrunner process

/usr/local/bin/python2.5 /usr/local/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s

is the one that crashes; all other mailman processes are unaffected. I
couldn't see it grow much in size (maybe it went from 8.5M to 12.5M)
before it just bombed and a new process was spawned (easy to tell by the
large increase in PID).

> Other things to try: Up the stack size
> ulimit -s 262144
>
> inside the mailman startup. Again, I've had processes in the past which
> needed this.
>
> You'd have to check that from a shell (/bin/sh) and first to see that
> your system will allow a bigger value. If not, I believe that there is
> a sysctl to do that these days but don't have a modern enough system to
> look it up. A search for MAXSSIZ on google or mail archives may turn it
> up - that's the kernel option but requires a recompile.

OK, I am going to try different limits gradually. It seems that setting

kern.maxssiz="256M"

and so on in /boot/loader.conf will allow me to increase the limits.
Having to reboot is a pain, though. How far can I go? 512M? (Physical
memory is 1GB.)

> Of course, limits may not be the issue at all. They are a likely
> suspect given your error message, but maybe it's worth checking other
> bits of the mail system. Can you email a file of the size you are
> trying not through mailman? Maybe your MTA (sendmail/postfix etc) has a
> limit that somehow causes mailman to get this error.

This is definitely not the case.
Users can receive (and send) similarly sized large attachments
individually, so the MTA (sendmail in this case) is not the cause.

> The final suggestion is to try to trace (ktrace, strace from ports) the
> process that is dying, but I suspect mailman forks a new process to deal
> with the email so how you catch it, I don't know. Many daemons have a
> "run in foreground without forking" option which can be helpful to
> debugging, but I don't know if anything like that is possible in
> mailman. If you can figure out what mailman actually runs to process
> the email, you could ktrace that from the command line. Maybe the
> mailman mailing list could give you an incantation to try.

I'll admit this is my first time trying ktrace. After noting which
process crashed, I identified the newly spawned PID and captured a
ktrace.out (binary) and a kdump (called mailman_process_log.txt) while
reproducing the problem by sending another large mail attachment.
I'll leave the files up for a couple of days. (Both files are about 2MB
in size.)

http://lachlan.lkla.org/tmp/mailman_memory_error/

Not that I can properly interpret the results, but it seems the mail file
is read completely, and whatever happens next causes the memory error.

 52506 python2.5 RET   read 354/0x162
 52506 python2.5 CALL  break(0x8add000)
 52506 python2.5 RET   break 0
 52506 python2.5 CALL  break(0x8cc3000)
 52506 python2.5 RET   break -1 errno 12 Cannot allocate memory

Thanks again for your time and suggestions.
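
For anyone reading the kdump excerpt above, here is a minimal sketch of
how the failing break() request could be pulled out of the dump and
sized. The sample lines are copied from the excerpt; the regular
expressions and the helper name break_calls are assumptions for
illustration, not part of kdump or Mailman, and the pairing logic assumes
a single process in the dump.

```python
import re

# Sample lines copied from the kdump excerpt in this message.
SAMPLE = """\
 52506 python2.5 RET   read 354/0x162
 52506 python2.5 CALL  break(0x8add000)
 52506 python2.5 RET   break 0
 52506 python2.5 CALL  break(0x8cc3000)
 52506 python2.5 RET   break -1 errno 12 Cannot allocate memory
"""

# Hypothetical patterns for the CALL/RET lines shown above.
CALL_RE = re.compile(r"CALL\s+break\((0x[0-9a-fA-F]+)\)")
RET_RE = re.compile(r"RET\s+break\s+(-?\d+)")


def break_calls(text):
    """Return (requested_address, succeeded) for each break() CALL/RET pair."""
    results = []
    pending = None
    for line in text.splitlines():
        call = CALL_RE.search(line)
        if call:
            pending = int(call.group(1), 16)
            continue
        ret = RET_RE.search(line)
        if ret and pending is not None:
            results.append((pending, ret.group(1) == "0"))
            pending = None
    return results


calls = break_calls(SAMPLE)
last_ok = max(addr for addr, ok in calls if ok)
failed = [addr for addr, ok in calls if not ok]
growth = failed[0] - last_ok

print("break() failed at 0x%x, asking for %d more bytes than the last "
      "successful break" % (failed[0], growth))
```

In this excerpt the rejected request is only about 1.9MB past the last
successful one. break() grows the data segment rather than the stack, so
one possibility worth checking (an assumption, not something the trace
proves) is the data size limit (ulimit -d) rather than the stack limit.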