From owner-freebsd-questions@FreeBSD.ORG Tue Dec 4 03:45:49 2007
Message-ID: <4754CD5B.90605@daleco.biz>
Date: Mon, 03 Dec 2007 21:45:31 -0600
From: Kevin Kinsey
To: Rudy
Cc: freebsd-questions@freebsd.org
References: <4754C19E.5060708@monkeybrains.net>
In-Reply-To: <4754C19E.5060708@monkeybrains.net>
Subject: Re: cron pile up! Lot's of "cron: running job (cron)"
List-Id: User questions

Rudy wrote:
>
> cron jobs seem to get stuck. Not always, but within a day, there are at
> least 20 stuck. It is not always the same cronjob that does the
> sticking. :) When this occurs, I can run ps ax| grep cron and get a
> bunch of lines like this:
>
> 51921  ??  D      0:00.00 cron: running job (cron)
> 51922  ??  IVs    0:00.00 cron: running job (cron)
> 52544  ??  D      0:00.00 cron: running job (cron)
> 52545  ??  IVs    0:00.00 cron: running job (cron)
> 54418  ??  D      0:00.00 cron: running job (cron)
> 54419  ??  IVs    0:00.00 cron: running job (cron)
> 54667  ??  D      0:00.00 cron: running job (cron)
> 54668  ??  IVs    0:00.00 cron: running job (cron)
> 55835  ??  D      0:00.00 cron: running job (cron)
> 55836  ??  IVs    0:00.00 cron: running job (cron)
>
> What is going on? Please help me remedy this situation.
>
> The PID numbers next to cron's with a STATE of "IVs" show up in
> /var/log/cron, for example:
>
> # grep 54668 /var/log/cron
> Dec 2 22:32:00 pita /usr/sbin/cron[54668]: (root) CMD (/root/bin/raid-status.sh CRON)
> # grep 55836 /var/log/cron
> Dec 2 22:40:00 pita /usr/sbin/cron[55836]: (root) CMD (/root/bin/10minutes.mail.sh | mail -E -s "[ERROR] mail.monkeybrains.net" example@example.com)
>
>
> If I run 'lsof' I can find these open handles:
>
> cron 54668 root cwd VDIR       0,80    512 471040 /var/cron
> cron 54668 root rtd VDIR       0,77    512      2 /
> cron 54668 root txt VREG       0,82  32496 122864 /usr/sbin/cron
> cron 54668 root txt VREG       0,77 162712  49929 /libexec/ld-elf.so.1
> cron 54668 root txt VREG       0,77  44788  49922 /lib/libutil.so.5
> cron 54668 root txt VREG       0,77 941952  49923 /lib/libc.so.6
> cron 54668 root txt VREG       0,82  19277 826439 /usr/local/lib/nss_mysql.so.1
> cron 54668 root txt VREG       0,82 413626 826986 /usr/local/lib/mysql/libmysqlclient.so.15
> cron 54668 root txt VREG       0,77  64604  49928 /lib/libz.so.3
> cron 54668 root txt VREG       0,77 107432  49918 /lib/libm.so.4
> cron 54668 root txt VREG       0,77  28648  49916 /lib/libcrypt.so.3
> cron 54668 root  0u PIPE 0xca02c660  16384        ->0xca02c718
> cron 54668 root  1u PIPE 0xcc473250      0        ->0xcc473198
> cron 54668 root  2u PIPE 0xcc473250      0        ->0xcc473198
> cron 54668 root  5u unix 0xc6665858    0t0        ->0xc67e89bc
> cron 54667 root cwd VDIR       0,80    512 471040 /var/cron
> cron 54667 root rtd VDIR       0,77    512      2 /
> cron 54667 root txt VREG       0,82  32496 122864 /usr/sbin/cron
> cron 54667 root txt VREG       0,77 162712  49929 /libexec/ld-elf.so.1
> cron 54667 root txt VREG       0,77  44788  49922 /lib/libutil.so.5
> cron 54667 root txt VREG       0,77 941952  49923 /lib/libc.so.6
> cron 54667 root txt VREG       0,82  19277 826439 /usr/local/lib/nss_mysql.so.1
> cron 54667 root txt VREG       0,82 413626 826986 /usr/local/lib/mysql/libmysqlclient.so.15
> cron 54667 root txt VREG       0,77  64604  49928 /lib/libz.so.3
> cron 54667 root txt VREG       0,77 107432  49918 /lib/libm.so.4
> cron 54667 root txt VREG       0,77  28648  49916 /lib/libcrypt.so.3
> cron 54667 root  0u VCHR       0,26    0t0     26 /dev/null
> cron 54667 root  1u VCHR       0,26    0t0     26 /dev/null
> cron 54667 root  2u VCHR       0,26    0t0     26 /dev/null
> cron 54667 root  3u PIPE 0xca02c660  16384        ->0xca02c718
> cron 54667 root  4u PIPE 0xca02c718      0        ->0xca02c660
> cron 54667 root  5u unix 0xc6665858    0t0        ->0xc67e89bc
> cron 54667 root  6u PIPE 0xcc473198  16384        ->0xcc473250
> cron 54667 root  7u unix 0xc67e86f4    0t0        ->(none)
> cron 54667 root  8u PIPE 0xcc473250      0        ->0xcc473198
>
> What is going on? Is my libnss_mysql acting up?

What scripts are running? Care to sanitize the crontab file and
show it as well?

Barring hardware issues (disk errors, etc.), I'd suspect the scripts.
What about server load averages?

KDK
-- 
Law of Continuity:
Experiments should be reproducible. They should all fail the same way.
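
To answer "what scripts are running?" without grepping /var/log/cron one PID at a time, the lookup quoted above could be automated along these lines. This is only a rough sketch in Python: the function names (stuck_cron_pids, commands_for) are made up for illustration, and the two regular expressions assume the `ps ax` and /var/log/cron line formats exactly as they appear in the quoted output, so they may need adjusting on other systems.

```python
import re

# Assumed ps format, taken from the quoted output:
#   "54668 ?? IVs 0:00.00 cron: running job (cron)"
PS_LINE = re.compile(r"^\s*(\d+)\s+\S+\s+(\S+)\s+\S+\s+cron: running job")

# Assumed log format, taken from the quoted output:
#   "... /usr/sbin/cron[54668]: (root) CMD (/root/bin/raid-status.sh CRON)"
LOG_LINE = re.compile(r"cron\[(\d+)\]: \((\S+)\) CMD \((.*)\)\s*$")


def stuck_cron_pids(ps_output):
    """Return {pid: state} for every 'cron: running job' line in ps output."""
    pids = {}
    for line in ps_output.splitlines():
        match = PS_LINE.match(line)
        if match:
            pids[int(match.group(1))] = match.group(2)
    return pids


def commands_for(pids, cron_log):
    """Return {pid: (user, command)} for the given PIDs found in the cron log.

    PIDs get reused over time, so the last matching log line wins.
    """
    found = {}
    for line in cron_log.splitlines():
        match = LOG_LINE.search(line)
        if match and int(match.group(1)) in pids:
            found[int(match.group(1))] = (match.group(2), match.group(3))
    return found


if __name__ == "__main__":
    # Sample input copied from the quoted message.
    sample_ps = (
        "54667 ?? D 0:00.00 cron: running job (cron)\n"
        "54668 ?? IVs 0:00.00 cron: running job (cron)\n"
    )
    sample_log = (
        "Dec 2 22:32:00 pita /usr/sbin/cron[54668]: (root) CMD "
        "(/root/bin/raid-status.sh CRON)\n"
    )
    pids = stuck_cron_pids(sample_ps)
    for pid, (user, cmd) in sorted(commands_for(pids, sample_log).items()):
        print(pid, pids[pid], user, cmd)
```

In practice the two arguments would be the captured output of `ps ax` and the contents of /var/log/cron. Only the PIDs that cron logged (the "IVs" ones in the quoted example) are expected to show up in the result; the "D" processes have no log entry of their own.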