From: Nick Frampton
Date: Wed, 11 Mar 2015 14:00:41 +1000
To: Mark Johnston, John Baldwin
Cc: freebsd-stable@freebsd.org
Subject: Re: Suspected libkvm infinite loop
Message-ID: <54FFBDE9.5060702@akips.com>
In-Reply-To: <20150310215913.GB52108@charmander.picturesperfect.net>
References: <54FE3803.2000307@akips.com> <4637620.LE11f9AQj7@ralph.baldwin.cx> <20150310215913.GB52108@charmander.picturesperfect.net>

On 11/03/15 07:59, Mark Johnston wrote:
> On Tue, Mar 10, 2015 at 02:10:09PM -0400, John Baldwin wrote:
>> Often loops using libkvm are due to programs using libkvm are trying to read
>> kernel data structures while they are changing.
>> However, if you use sysctls to fetch this data instead, you should be able
>> to get a stable snapshot of the system state without getting stuck in a
>> possible loop. I believe for libkvm to use sysctl instead of /dev/kmem you
>> have to pass a NULL for the kernel and "/dev/null" for the core image.

In our code, we're invoking kvm_openfiles as you suggest:

    kd = kvm_openfiles (NULL, _PATH_DEVNULL, NULL, O_RDONLY, errbuf)

> It sounds like this issue might be the one fixed in r272566: if the
> KERN_PROC_ALL sysctl is read with an insufficiently large buffer, an
> sbuf error return value could bubble up and be treated as ERESTART,
> resulting in a loop.
>
> This can be confirmed with something like
>
>   dtrace -n 'syscall:::entry /pid == $target/{@[probefunc] = count();} tick-3s {exit(0);}' -p
>
> If the output consists solely of __sysctl, this bug is likely the
> culprit.

Unfortunately, I accidentally killed fstat this morning before I could do any
further debugging. I ran truss -p on it yesterday and it was spinning solely
on __sysctl.

I'll try compiling with debug symbols in case it happens again. I haven't been
able to reproduce the problem in a reasonable time frame, so it could be days
or weeks before we see it happen again.

-Nick