From: "Daniel O'Connor"
To: freebsd-stable@freebsd.org
Date: Wed, 22 Mar 2006 17:55:24 +1030
Subject: Process hanging on 6.0-STABLE
Message-Id: <200603221755.25666.doconnor@gsoft.com.au>

Hi,

I work for a small company that makes radar systems for research organisations, and we use FreeBSD on the PCs for data acquisition and processing.
We have recently shifted to FreeBSD 6/amd64 and one machine in particular is exhibiting a strange problem.

The acquisition process is a Tcl interpreter with a largish chunk of C code which talks to the hardware (via RS485 and a custom PCI card). Once the system is set up it streams data back via the PCI card and runs it through various data processors (e.g. dump raw data to disk, FFT, winds, etc.).

The actual forking of processes is handled in Tcl, and the C code only gets involved to write the data out (to an FD the Tcl layer keeps).

The problem is that every now and then the process gets stuck and becomes unkillable just after forking, i.e.:

eureka:~>ps -axwwwwl | grep Reco
19999 881 1 12 -8 -5 21716 15984 piperd Igdb $GSHOME/libexec/Recorder
...
(gdb) attach 881
...
(gdb) bt
#0 0x00000008009c395c in read () from /lib/libc.so.6
#1 0x000000080072f77f in TclpCreateProcess () from /usr/local/lib/libtcl84.so.1
#2 0x0000000800717d25 in TclCreatePipeline () from /usr/local/lib/libtcl84.so.1
#3 0x00000008007186d0 in Tcl_OpenCommandChannel () from /usr/local/lib/libtcl84.so.1
#4 0x0000000800704af8 in Tcl_ExecObjCmd () from /usr/local/lib/libtcl84.so.1
...

So the original process (881) is sitting in read() inside TclpCreateProcess. However, gdb cannot attach to the newly forked one:

(gdb) attach 80154
Attaching to program: /usr/home/radar/skiymet/libexec/Recorder, process 80154
ptrace: Resource temporarily unavailable.

The original is killable:

eureka:~>kill 881
eureka:~>kill 881
881: No such process

But the new one is not:

eureka:~>kill 80154
eureka:~>kill 80154
eureka:~>kill -9 80154
eureka:~>kill -9 80154

I can fstat the new process and it shows a slew of open FDs (presumably inherited from the original process), but I can't ktrace it:

eureka:~>ktrace -f 80154.ktr -p 80154
ktrace: 80154.ktr: Operation not permitted
eureka:~>sudo ktrace -f 80154.ktr -p 80154
ktrace: 80154.ktr: Operation not permitted

Or get a memory map:
eureka:~>dd if=/proc/80154/map bs=64k
dd: /proc/80154/map: Resource temporarily unavailable
0+0 records in
0+0 records out
0 bytes transferred in 0.000096 secs (0 bytes/sec)

Unfortunately the machine is at a very remote location and I have not been able to replicate the problem locally (and I can't run, say, memtest remotely either).

The custom PCI card has a driver which may be the cause of the problems, but it does not appear to be involved from what I can see.

Does anyone have any suggestions?

The version of FreeBSD is a little after 6.0-RELEASE but not much.

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
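
A note on the backtrace, for anyone reading along: the parent (881) is blocked in read() inside TclpCreateProcess, and ps shows it in the "piperd" wait channel, i.e. sleeping on a pipe read. That would be consistent with the usual Unix "exec error pipe" pattern, in which the parent forks, then reads from a close-on-exec pipe until the child either execs (the parent sees EOF) or reports an exec failure. The sketch below illustrates that pattern in Python. It is an assumption about the general technique, not a copy of Tcl's code, and the name spawn() is made up for illustration.

```python
import errno
import os
import struct

ERR_FMT = "i"  # the child reports a failed exec as one native int (errno)


def spawn(argv):
    """Fork and exec argv. Returns (pid, exec_errno); exec_errno is 0 on success."""
    # Descriptors from os.pipe() are close-on-exec by default (Python 3.4+),
    # so a successful exec silently closes the child's write end.
    err_r, err_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: try to exec; on failure, send errno back through the pipe.
        os.close(err_r)
        try:
            os.execvp(argv[0], argv)
        except OSError as exc:
            os.write(err_w, struct.pack(ERR_FMT, exc.errno or errno.ENOEXEC))
        finally:
            os._exit(127)

    # Parent: drop our copy of the write end, then block in read().
    # The read returns only once the child has exec'd (EOF) or reported
    # an error. If the child wedges between fork and exec, it never returns.
    os.close(err_w)
    want = struct.calcsize(ERR_FMT)
    data = b""
    while len(data) < want:
        chunk = os.read(err_r, want - len(data))
        if not chunk:
            break
        data += chunk
    os.close(err_r)
    exec_errno = struct.unpack(ERR_FMT, data)[0] if data else 0
    return pid, exec_errno


if __name__ == "__main__":
    for cmd in (["true"], ["/nonexistent/program"]):
        child, err = spawn(cmd)
        os.waitpid(child, 0)
        print(cmd[0], "exec ok" if err == 0 else os.strerror(err))
```

If this is what is happening here, the parent's read() is only a symptom. The thing to chase would be why the child (80154) never reaches exec or exit after the fork.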