From: Matt Juszczak
To: Greg Maruszeczka
Cc: freebsd-questions@freebsd.org
Date: Wed, 15 Jun 2005 13:42:27 -0400 (EDT)
Subject: Re: procmail keeps dieing on freebsd 5.4 with postfix
Message-ID: <20050615133120.X87922@neptune.atopia.net>
In-Reply-To: <42B064F5.8020500@grokking.org>
References: <20050615120621.S85701@neptune.atopia.net> <42B064F5.8020500@grokking.org>

Appreciate the response :) Here's my message the way it should have been originally....

> Seriously though, you need to provide some more detailed information if
> you want anyone here to be able to help you. Start with explaining why
> you decided to change MDAs in the first place since I'm sure I'm not the
> only one thinking you must be nuts to make such a major change on a
> production system with a potential 3000-user lynch mob waiting in the
> wings.
> What were you using for local delivery before this? Was there a
> problem with it or were you looking for new features, etc.?

We are currently moving to a new mail server that is FreeBSD-based. Our old mail server is a chrooted Slackware box that hasn't been upgraded in years because no one even had access to it for a while (the management of the company I work for used to stink; it's better now).

Our new mail server has 3000 active accounts on it, but only about 50 of them (one of our virtual domains) are actually functioning. We haven't switched the MX record for our main ISP yet; we're waiting to make sure the box is stable first.

So to answer your question, there is only about a 50-user lynch mob, and most of those users are internal to our ISP (employees, etc.). I would not make a change on something that had more live users, especially paying customers.

Our current mail server supports procmail, and we have about 50 users who use it. That's why I was turning it on on the new server. We're working on basically mirroring the old server to the new one and making sure that our change will be swift and efficient.

I've considered using Postfix's internal LDA and just calling procmail from inside a .forward file for those users who need or want it. This might end up fixing the problems.

> If you're not around to see the console messages how do you know
> "procmail is always the error causer"? Perhaps this is conveyed to you
> by your co-workers but if so, why don't they tell you the complete error
> message so you can convey it to us? Leaving that aside, however, what
> about the logs? Certainly /var/log/maillog should provide some clues if
> the problem is really your MDA (more on this below). Also we'd need to
> know something about your configuration (i.e. contents of main.cf and
> master.cf for starters) to help you with a MTA/MDA problem.

It's happened twice now.
The first time this problem happened was late at night, about 2 days after I made the change to the LDA. The machine would not respond to ping, and Nagios was alerting us like crazy that the box was down. The machine was non-responsive to the keyboard, and the console had a "dump" on it, about 15 lines long, with procmail written all over it. After rebooting the machine, running fsck, and restoring Postfix to a functioning state, I turned procmail off. Procmail remained disabled for about three weeks, during which the box ran fine.

Yesterday afternoon we switched the LDA back to procmail, and the machine ran fine overnight. On my way into work today, I got paged by Nagios that the box was down, and I called in. The tech who was here rebooted the machine, but before he did, he said, in his own words: "There was a bunch of crap on the screen with procmail this and procmail that, and the machine was locked hard." I've disabled procmail again and the box seems to be running stable.

As far as logs, nothing.... the maillog cuts out at 11:14 AM and cuts back in at 11:21 AM, with no "errors" in between.

> FWIW this doesn't sound like a software issue (except maybe a massive
> memory leak(??)) but then again, I'm saying this with very little useful
> information provided by you. Have you done any basic hardware checks
> (e.g. memtest, case and cpu cooling, power supply integrity, etc.)?

Yes, the machine has been checked. We ran memtest on it, etc., with no problems. The machine is only about 2 months old, however, so although it has passed its burn-in test it could still have issues. I doubt that's the problem, though.

> You've stated that these lock-ups occur every week at the beginning of
> your post then you say later it's every couple of days. Which is it?
> Also, please try to precisely define "locking up" and "crashes". It's
> unclear to me based on your description and the (possibly misleading)
> subject line what portions of the system are affected. Precision matters
> IMHO.

See above.
It's occurred twice in a one-month span, but for most of that time procmail was not running. It usually occurs within 24-48 hours of switching procmail back on.

Thanks, hope this helps a little more!

-Matt