From owner-freebsd-hackers@FreeBSD.ORG  Sun Sep 12 13:08:01 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1233)
	id C4F711065675; Sun, 12 Sep 2010 13:08:01 +0000 (UTC)
Date: Sun, 12 Sep 2010 13:08:01 +0000
From: Alexander Best <arundel@freebsd.org>
To: Jilles Tjoelker <jilles@stack.nl>
Message-ID: <20100912130801.GA23538@freebsd.org>
References: <4C8A81D9.5020905@rawbw.com> <20100910194600.GB60815@stack.nl>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20100910194600.GB60815@stack.nl>
Cc: Yuri <yuri@rawbw.com>, freebsd-hackers@freebsd.org
Subject: Re: Why I can't trace linux process's childs with truss?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 12 Sep 2010 13:08:01 -0000

On Fri Sep 10 10, Jilles Tjoelker wrote:
> On Fri, Sep 10, 2010 at 12:07:05PM -0700, Yuri wrote:
> > I am trying to get the log of all system calls that skype makes with 
> > truss -f /usr/local/share/skype/skype
> > For some reason the resulting log only has the leading process calls and 
> > nothing from its 8 children.
> > Truss doesn't show any 'cloned' processes. Is this a bug in truss that 
> > it doesn't follow 'cloned' processes?
> 
> > Is there any workaround or other way I can debug skype? strace doesn't 
> > work on amd64.
> > I am primarily interested why it can't read /dev/video0 device, created 
> > by webcamd.
> 
> Try using ktrace instead of truss. You will need devel/linux_kdump from
> ports to decode the resulting ktrace.out.

there's a PR related to this "issue" [1]. so is truss missing this
functionality or is this in fact a feature, because truss mustn't be used on
any non-freebsd executable?

if that is the case i vote to add a CAVEATS section to the truss(1) manual so
that people use ktrace in combination with linux_kdump instead.

cheers.
alex

[1] http://www.freebsd.org/cgi/query-pr.cgi?pr=150262

> 
> Alternatively, if you're familiar with dtrace, you could try that.
> 
> -- 
> Jilles Tjoelker

-- 
a13x

From owner-freebsd-hackers@FreeBSD.ORG  Sun Sep 12 15:27:03 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4B3F11065673;
	Sun, 12 Sep 2010 15:27:03 +0000 (UTC)
	(envelope-from mjguzik@gmail.com)
Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com
	[209.85.215.182])
	by mx1.freebsd.org (Postfix) with ESMTP id A946C8FC1E;
	Sun, 12 Sep 2010 15:27:02 +0000 (UTC)
Received: by eyx24 with SMTP id 24so2810271eyx.13
	for <multiple recipients>; Sun, 12 Sep 2010 08:27:01 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:in-reply-to
	:references:date:message-id:subject:from:to:cc:content-type;
	bh=m6j26NU88iPwMai1zisxeFGeJ9x6JoUNitGUgojjdNk=;
	b=RdxK/vCAnxNJMQYHKNdcMsz0O1ErxDnkvXOIp9a1fogsEdyEbp10aLbIpYqs2qDo+d
	AtkUFzJl/PC+ARLpIYy1AB7qRKp48YRKGit8uaCkp7JxvVUW+Xo4n7OlgBa2ATGsNVk/
	03J6S+E6Re9Bfr3i3rNdeOSCDWUHQCMZtrAvs=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type;
	b=pNpssS96zWm8ZtSeSqAHpGCzpapvYy/3YMyyZja0U5BhCGL7mQlml92g88pnHzvW7m
	AIjBneP1rp9j4841IcXFMUXPen6RhW/Pnv0EvRnvoT0co+g/QOMAdv6LXlZxCHlhjv6N
	XbXRlC0nZ+/LEvJTvzHyaxmtxn2BKaEPUhO50=
MIME-Version: 1.0
Received: by 10.213.22.139 with SMTP id n11mr1022763ebb.21.1284303669383; Sun,
	12 Sep 2010 08:01:09 -0700 (PDT)
Received: by 10.14.120.146 with HTTP; Sun, 12 Sep 2010 08:01:09 -0700 (PDT)
In-Reply-To: <20100912130801.GA23538@freebsd.org>
References: <4C8A81D9.5020905@rawbw.com> <20100910194600.GB60815@stack.nl>
	<20100912130801.GA23538@freebsd.org>
Date: Sun, 12 Sep 2010 17:01:09 +0200
Message-ID: <AANLkTikiWs9O+8+mwOaE4nVovT0yDQ3GvPO7E9H_MWkW@mail.gmail.com>
From: Mateusz Guzik <mjguzik@gmail.com>
To: Alexander Best <arundel@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Yuri <yuri@rawbw.com>, Jilles Tjoelker <jilles@stack.nl>,
	freebsd-hackers@freebsd.org
Subject: Re: Why I can't trace linux process's childs with truss?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 12 Sep 2010 15:27:03 -0000

On Sun, Sep 12, 2010 at 3:08 PM, Alexander Best <arundel@freebsd.org> wrote:
> there's a PR related to this "issue" [1]. so is truss missing this
> functionality or is this in fact a feature, because truss mustn't be used on
> any non-freebsd executable?
>

Actually truss handles linux processes just fine, except for their children. :)
A Linux process can create a child using the linux_clone syscall, but truss
does not handle that case, and this may be the problem that Yuri reported
(since no log was provided, I can only guess).

This trivial patch should fix this:
http://student.agh.edu.pl/~mjguzik/truss-linux-forks.patch

Tested on this simple program:
http://student.agh.edu.pl/~mjguzik/fork.c

If it still does not work, a log generated by truss would be helpful.
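
For readers who cannot fetch the files above, here is a minimal sketch of the
kind of test such a program might perform. It is NOT the fork.c from the URL;
the function name run_child() and the choice of getppid() as a marker syscall
are assumptions made purely for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the fork.c linked above).
 * Fork a child that issues one easily recognisable syscall and exits with
 * the given status.  Returns the exit status collected by the parent, or
 * -1 if fork()/waitpid() failed or the child did not exit normally.
 */
static int run_child(int status)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        (void)getppid();    /* marker syscall to look for in the trace */
        _exit(status);
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) != pid || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

Called from a main() in a Linux binary run under "truss -f", the child's
getppid() call should show up in the log only if truss follows the child.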

Regards,
--
Mateusz Guzik

From owner-freebsd-hackers@FreeBSD.ORG  Sun Sep 12 15:40:54 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1233)
	id 685B7106566B; Sun, 12 Sep 2010 15:40:54 +0000 (UTC)
Date: Sun, 12 Sep 2010 15:40:54 +0000
From: Alexander Best <arundel@freebsd.org>
To: Mateusz Guzik <mjguzik@gmail.com>
Message-ID: <20100912154054.GA42409@freebsd.org>
References: <4C8A81D9.5020905@rawbw.com> <20100910194600.GB60815@stack.nl>
	<20100912130801.GA23538@freebsd.org>
	<AANLkTikiWs9O+8+mwOaE4nVovT0yDQ3GvPO7E9H_MWkW@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <AANLkTikiWs9O+8+mwOaE4nVovT0yDQ3GvPO7E9H_MWkW@mail.gmail.com>
Cc: Yuri <yuri@rawbw.com>, Jilles Tjoelker <jilles@stack.nl>,
	freebsd-hackers@freebsd.org
Subject: Re: Why I can't trace linux process's childs with truss?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 12 Sep 2010 15:40:54 -0000

On Sun Sep 12 10, Mateusz Guzik wrote:
> On Sun, Sep 12, 2010 at 3:08 PM, Alexander Best <arundel@freebsd.org> wrote:
> > there's a PR related to this "issue" [1]. so is truss missing this
> > functionality or is this in fact a feature, because truss mustn't be used on
> > any non-freebsd executable?
> >
> 
> Actually truss handles linux processes just fine, except for their children. :)
> A Linux process can create a child using the linux_clone syscall, but truss
> does not handle that case, and this may be the problem that Yuri reported
> (since no log was provided, I can only guess).
> 
> This trivial patch should fix this:
> http://student.agh.edu.pl/~mjguzik/truss-linux-forks.patch
> 
> Tested on this simple program:
> http://student.agh.edu.pl/~mjguzik/fork.c
> 
> If it still does not work, a log generated by truss would be helpful.

looking good. could you post that patch as a followup to yuri's PR?

hope it gets committed soon. :)

cheers.
alex

> 
> Regards,
> --
> Mateusz Guzik

-- 
a13x

From owner-freebsd-hackers@FreeBSD.ORG  Mon Sep 13 15:10:46 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2B3D2106584C
	for <freebsd-hackers@freebsd.org>; Mon, 13 Sep 2010 15:10:41 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 1F0418FC15
	for <freebsd-hackers@freebsd.org>; Mon, 13 Sep 2010 15:10:41 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id C3E1346C08;
	Mon, 13 Sep 2010 11:10:40 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 07C168A050;
	Mon, 13 Sep 2010 11:10:40 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Date: Mon, 13 Sep 2010 10:11:18 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <20100911060704.55B611065670@hub.freebsd.org>
In-Reply-To: <20100911060704.55B611065670@hub.freebsd.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201009131011.19089.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Mon, 13 Sep 2010 11:10:40 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: Simon <simon@optinet.com>
Subject: Re: MCE Decoding - MCA: Bank 8,
	Status 0xcc0031800001009f/0xc8000980000200cf
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 13 Sep 2010 15:10:46 -0000

On Saturday, September 11, 2010 1:40:28 am Simon wrote:
> Hello,
> 
> Can someone please help me decode these two errors on FreeBSD 8.1-R:
> 
> MCA: Bank 8, Status 0xcc0031800001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x106a5, APIC ID 16
> MCA: CPU 0 COR (198) OVER RD channel ?? memory error
> MCA: Address 0x1b6188d80
> MCA: Misc 0x72ae242000000084
> 
> MCA: Bank 8, Status 0xc8000980000200cf
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x106a5, APIC ID 16
> MCA: CPU 0 COR (38) OVER MS channel ?? memory error
> MCA: Misc 0x72ae242000000140

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 8 
MISC 72ae242000000084 ADDR 1b6188d80 
MCG status:
MCi status:
Error overflow
MCi_MISC register valid
MCi_ADDR register valid
MCA: MEMORY CONTROLLER RD_CHANNELunspecified_ERR
Transaction: Memory read error
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 198
Memory transaction Tracker ID (RTId): 84
Memory DIMM ID of error: 0
Memory channel ID of error: 0
Memory ECC syndrome: 72ae2420
STATUS cc0031800001009f MCGSTATUS 0
MCGCAP 1c09 APICID 10 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 26
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 BANK 8 
MISC 72ae242000000140 
MCG status:
MCi status:
Error overflow
MCi_MISC register valid
MCA: MEMORY CONTROLLER MS_CHANNELunspecified_ERR
Transaction: Memory scrubbing error
Memory ECC error occurred during scrub
Memory corrected error count (CORE_ERR_CNT): 38
Memory transaction Tracker ID (RTId): 40
Memory DIMM ID of error: 0
Memory channel ID of error: 0
Memory ECC syndrome: 72ae2420
STATUS c8000980000200cf MCGSTATUS 0
MCGCAP 1c09 APICID 10 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 26

You have some corrected memory errors (198+38 = 236) in the first DIMM (on the 
SuperMicro boards we have at work, it would correspond to the DIMM slot 
labeled P1_DIMM1A).  In my experience I would just ignore them unless the 
count gets much higher (say 10000+ per hour).
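
The counts and error types above can be read straight out of the two status
words. What follows is a minimal sketch, not mcelog's actual code: the macro
and function names are made up for illustration, and the bit positions are the
IA32_MCi_STATUS layout as I understand it from the Intel SDM (flag bits in
63:57, corrected error count in bits 52:38, MCA error code in bits 15:0, with
memory controller errors encoded as 0000 0000 1MMM CCCC).

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of IA32_MCi_STATUS (names here are illustrative). */
#define MC_STATUS_VAL   (1ULL << 63)    /* register holds a valid error */
#define MC_STATUS_OVER  (1ULL << 62)    /* an earlier error was overwritten */
#define MC_STATUS_UC    (1ULL << 61)    /* error was NOT corrected */
#define MC_STATUS_MISCV (1ULL << 59)    /* MCi_MISC register valid */
#define MC_STATUS_ADDRV (1ULL << 58)    /* MCi_ADDR register valid */

/* Corrected error count, bits 52:38. */
static unsigned mc_cor_err_cnt(uint64_t status)
{
    return (unsigned)((status >> 38) & 0x7fff);
}

/* Architectural MCA error code, bits 15:0. */
static unsigned mc_error_code(uint64_t status)
{
    return (unsigned)(status & 0xffff);
}

/* Memory controller errors have the form 0000 0000 1MMM CCCC. */
static bool mc_is_memctrl_error(uint64_t status)
{
    return (mc_error_code(status) & 0xff80) == 0x0080;
}

/* MMM field: 1 = memory read (RD), 4 = memory scrubbing (MS). */
static unsigned mc_mem_transaction(uint64_t status)
{
    return (mc_error_code(status) >> 4) & 0x7;
}

/* CCCC field: channel number, 0xf = channel not specified. */
static unsigned mc_mem_channel(uint64_t status)
{
    return mc_error_code(status) & 0xf;
}
```

Fed 0xcc0031800001009f, these helpers give a count of 198, transaction type 1
(read) and channel 0xf (unspecified), with OVER, MISCV and ADDRV set and UC
clear; 0xc8000980000200cf gives 38, transaction type 4 (scrub) and no ADDRV.
That agrees with both the kernel's "COR (198) OVER RD channel ??" line and the
decoded output above.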

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG  Mon Sep 13 21:28:32 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E36AA106566B
	for <freebsd-hackers@freebsd.org>; Mon, 13 Sep 2010 21:28:32 +0000 (UTC)
	(envelope-from cronfy@gmail.com)
Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com
	[209.85.214.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 77FF28FC08
	for <freebsd-hackers@freebsd.org>; Mon, 13 Sep 2010 21:28:32 +0000 (UTC)
Received: by bwz20 with SMTP id 20so323618bwz.13
	for <freebsd-hackers@freebsd.org>; Mon, 13 Sep 2010 14:28:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:mime-version:received:from:date
	:message-id:subject:to:content-type;
	bh=OFS21fCPafcBZA8E2xRo2cgr6nGRr74PMqwUVHh7aXE=;
	b=pLGPdmyuIP2HcmKEzUqd+j8s2kV6rIilRuQ2irVZnHk1gJPO2yVCVnfomQxXV7qebj
	mI0reWWefZXVM7BuU1C4a/jiGn9kz41qafYPC7x6ysdptPhFRvqGTl/FWq31p9OfEsno
	Fz5kXMQTQ3z7vovH+g56hIAlmaou0eLXjwBiU=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:from:date:message-id:subject:to:content-type;
	b=OO55HqOXfZW9IWZcYTGVALKo8fWUxb2qFcWME4jm/RGxyuyZq8TJ4x6RHp6Y8SgPSB
	ktdw7MmV6Dv+eGMZ2oVY8Mp8ZsozgrpXrU9OyDhwKAjOOzH3H0C0FMq6KEY0+g1+sbEC
	Z6YX85rbqGHudqPlWBLMvF3FmRcNLtCY58On0=
Received: by 10.204.85.90 with SMTP id n26mr3623589bkl.109.1284411465116; Mon,
	13 Sep 2010 13:57:45 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.204.99.197 with HTTP; Mon, 13 Sep 2010 13:57:15 -0700 (PDT)
From: cronfy <cronfy@gmail.com>
Date: Tue, 14 Sep 2010 00:57:15 +0400
Message-ID: <AANLkTinEje-+1P1n33YMKAaciaYHQH+dpwgX6UY1dOux@mail.gmail.com>
To: freebsd-hackers@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Subject: is vfs.lookup_shared unsafe in 7.3?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 13 Sep 2010 21:28:33 -0000

Hello,

Trying to overcome high server load (sudden peaks of 15%us/85%sy, LA >
40, very slow lstat() at these moments, looks like some kind of lock
contention) I enabled vfs.lookup_shared=1 on two servers today. One is
FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and other is
FreeBSD-7.3 csup'ed and built Jul 16 2010.

The server with the newer kernel is running nicely and does not show
high load anymore. But on the second server it did not help. Moreover,
after a few hours of work with vfs.lookup_shared=1 I noticed processes
stuck in "ufs" state. I tried to kill them with no luck. Disabling
vfs.lookup_shared froze the whole system.

So, is vfs.lookup_shared=1 unsafe in 7.3? Did it become more stable
between 16 Jul and 9 Sep (is that the reason why the first system is still
running?), or should I expect that it will freeze soon too?

Thanks in advance!

-- 
// cronfy

From owner-freebsd-hackers@FreeBSD.ORG  Tue Sep 14 11:34:10 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6D9B510656A3
	for <freebsd-hackers@freebsd.org>; Tue, 14 Sep 2010 11:34:10 +0000 (UTC)
	(envelope-from freebsd-hackers@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id 290088FC0A
	for <freebsd-hackers@freebsd.org>; Tue, 14 Sep 2010 11:34:09 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-hackers@m.gmane.org>) id 1OvTm1-0005cu-Nf
	for freebsd-hackers@freebsd.org; Tue, 14 Sep 2010 13:34:05 +0200
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-hackers@freebsd.org>; Tue, 14 Sep 2010 13:34:05 +0200
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-hackers@freebsd.org>; Tue, 14 Sep 2010 13:34:05 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-hackers@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Tue, 14 Sep 2010 13:33:58 +0200
Lines: 13
Message-ID: <i6nmj5$l5c$1@dough.gmane.org>
References: <AANLkTinEje-+1P1n33YMKAaciaYHQH+dpwgX6UY1dOux@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.1.9) Gecko/20100518 Thunderbird/3.0.4
In-Reply-To: <AANLkTinEje-+1P1n33YMKAaciaYHQH+dpwgX6UY1dOux@mail.gmail.com>
X-Enigmail-Version: 1.0.1
Subject: Re: is vfs.lookup_shared unsafe in 7.3?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 14 Sep 2010 11:34:10 -0000

On 09/13/10 22:57, cronfy wrote:
> Hello,
>
> Trying to overcome high server load (sudden peaks of 15%us/85%sy, LA>
> 40, very slow lstat() at these moments, looks like some kind of lock
> contention) I enabled vfs.lookup_shared=1 on two servers today. One is
> FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and other is
> FreeBSD-7.3 csup'ed and built Jul 16 2010.

The important thing you missed is *where* the supposed lock 
contention is. If you have lots of processes in "ufs" state, there are 
other things that can help you, such as increasing vfs.ufs.dirhash_maxmem.



From owner-freebsd-hackers@FreeBSD.ORG  Tue Sep 14 12:40:39 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 15E16106564A
	for <freebsd-hackers@freebsd.org>; Tue, 14 Sep 2010 12:40:39 +0000 (UTC)
	(envelope-from cronfy@gmail.com)
Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com
	[209.85.214.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 99F4C8FC13
	for <freebsd-hackers@freebsd.org>; Tue, 14 Sep 2010 12:40:38 +0000 (UTC)
Received: by bwz15 with SMTP id 15so231136bwz.13
	for <freebsd-hackers@freebsd.org>; Tue, 14 Sep 2010 05:40:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:mime-version:received:in-reply-to
	:references:from:date:message-id:subject:to:content-type
	:content-transfer-encoding;
	bh=xIRs/8CLMx8Y6X31HRmvyrj3QNuwH7F6hIElIkaBp0Y=;
	b=bNbc8aCm77SStPdXFmHM/cEfo2mKUeltWVUmQHtf5fZSAsG/HtXkjmNO6yw9FPWMXv
	z1BQBZ4/lRENjNzzp+1LERD0UU/CnClk7rew5Xe3hW4Pno/3d+yi9kMIq+WAtJcCBsM3
	1njqhGIqmcMGYmBFlgf6N6XdMRsceN2c96WuM=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:from:date:message-id:subject:to
	:content-type:content-transfer-encoding;
	b=JHwDqRtikl5IiURjofoeZ+73i/TM9P+/9k/wfdkVDbDr95eNDkv8/FhT8cweXl82Kg
	Milw81+HT8Svb9mKoaIwb6oXbswnZPCi7QR0G7GOIP20nqr/ZKU0I0lr2W9qUgw6XzKu
	1lFSb7eVhhwG01Oq/4D4foSNpm46d4KgwvnhE=
Received: by 10.204.76.140 with SMTP id c12mr4388021bkk.7.1284468037314; Tue,
	14 Sep 2010 05:40:37 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.204.99.197 with HTTP; Tue, 14 Sep 2010 05:40:07 -0700 (PDT)
In-Reply-To: <i6nmj5$l5c$1@dough.gmane.org>
References: <AANLkTinEje-+1P1n33YMKAaciaYHQH+dpwgX6UY1dOux@mail.gmail.com>
	<i6nmj5$l5c$1@dough.gmane.org>
From: cronfy <cronfy@gmail.com>
Date: Tue, 14 Sep 2010 16:40:07 +0400
Message-ID: <AANLkTi=ij_N_-S6imNfLw7Dgf7E5FC-w9wwt89Sg=a1s@mail.gmail.com>
To: freebsd-hackers@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Subject: Re: is vfs.lookup_shared unsafe in 7.3?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 14 Sep 2010 12:40:39 -0000

>> Trying to overcome high server load (sudden peaks of 15%us/85%sy, LA>
>> 40, very slow lstat() at these moments, looks like some kind of lock
>> contention) I enabled vfs.lookup_shared=1 on two servers today. One is
>> FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and other is
>> FreeBSD-7.3 csup'ed and built Jul 16 2010.
>
> The important thing you missed is *where* the supposed lock contention is.
> If you have lots of processes in "ufs" state, there are other things that
> can help you, such as increasing vfs.ufs.dirhash_maxmem.

Before I changed vfs.lookup_shared I did increase
vfs.ufs.dirhash_maxmem to 16M. It filled in ~5 minutes, but even while
it was not full, the server was not running any better.

Usually there is a very small number of processes in ufs state (they do
not even show up in top). The processes I was talking about were, I
suspect, a consequence of enabling vfs.lookup_shared.

I also enabled hwpmc to examine the system at the moments of high load,
but did not have a chance to use it.

What I am afraid of now is that the server that has been running fine so
far may crash, which is why I am asking about the stability of
vfs.lookup_shared in 7.3. At svn.freebsd.org I see a couple of commits in
stable/7/sys/fs/ and ufs/ over the last 2 months that could have changed
the behaviour, and this may be the reason why one system is running
stably and the other was not. But I am not sure about it, so I am
asking experienced people here :)

-- 
// cronfy

From owner-freebsd-hackers@FreeBSD.ORG  Wed Sep 15 14:02:15 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A2BCE106564A
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 14:02:15 +0000 (UTC)
	(envelope-from simon@comsys.ntu-kpi.kiev.ua)
Received: from comsys.kpi.ua (comsys.kpi.ua [77.47.192.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 2E55E8FC14
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 14:02:15 +0000 (UTC)
Received: from pm513-1.comsys.kpi.ua ([10.18.52.101]
	helo=pm513-1.comsys.ntu-kpi.kiev.ua)
	by comsys.kpi.ua with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.63)
	(envelope-from <simon@comsys.ntu-kpi.kiev.ua>) id 1OvsHX-0003Jg-KI
	for freebsd-hackers@freebsd.org; Wed, 15 Sep 2010 16:44:15 +0300
Received: by pm513-1.comsys.ntu-kpi.kiev.ua (Postfix, from userid 1001)
	id 1E5B61CC1E; Wed, 15 Sep 2010 16:44:16 +0300 (EEST)
Date: Wed, 15 Sep 2010 16:44:15 +0300
From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: freebsd-hackers@freebsd.org
Message-ID: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-06-14)
X-Authenticated-User: simon@comsys.ntu-kpi.kiev.ua
X-Authenticator: plain
X-Sender-Verify: SUCCEEDED (sender exists & accepts mail)
X-Exim-Version: 4.63 (build at 06-Jan-2007 23:14:37)
X-Date: 2010-09-15 16:44:15
X-Connected-IP: 10.18.52.101:21881
X-Message-Linecount: 48
X-Body-Linecount: 36
X-Message-Size: 2282
X-Body-Size: 1769
Subject: Questions about mutex implementation in kern/kern_mutex.c
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Sep 2010 14:02:15 -0000

Hello,

I have questions about mutex implementation in kern/kern_mutex.c
and sys/mutex.h files (current versions of these files):

1. Is the following statement correct for a volatile pointer or integer
   variable: if a volatile variable is updated by the compare-and-set
   instruction (e.g. atomic_cmpset_ptr(&val, ...)), then the current
   value of such variable can be read without any special instruction
   (e.g. v = val)?

   I checked Assembler code for a function with "v = val" and "val = v"
   like statements generated for volatile variable and simple variable
   and found differences: on ia64 "v = val" was implemented by ld.acq and
   "val = v" was implemented by st.rel; on mips and sparc64 Assembler code
   can have different order of lines for volatile and simple variable
   (depends on the code of a function).

2. Suppose there is a default (sleep) mutex and adaptive mutexes are enabled.
   A thread tries to obtain the lock quickly and fails, _mtx_lock_sleep()
   is called, it gets the address of the mutex's current owner thread
   and checks whether that owner thread is running (on another CPU).
   How does _mtx_lock_sleep() know that that thread still exists
   (lines 311-337 in kern_mutex.c)?

   When adaptive mutexes were implemented there was explicit locking
   around the adaptive mutex code.  When turnstiles were introduced in the
   mutex code, that locking logic was changed.

3. Why is there no memory barrier in mtx_init()?  If another thread
   (on another CPU) finds that the mutex is initialized using mtx_initialized(),
   then it can mtx_lock() it and mtx_lock() it a second time; as a result the
   mtx_recurse field will be incremented, but its value can still be
   uninitialized on an architecture with a relaxed memory ordering model.
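
The ordering concern in question 3 can be sketched in portable C11. This is
an illustration only and not the FreeBSD code: struct demo_lock, demo_init()
and demo_initialized() are hypothetical names, and <stdatomic.h> stands in for
the kernel's atomic(9) primitives. The release store publishes the earlier
plain write to "recurse"; the acquire load on the reader side is what makes
that write visible before the reader touches the field.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock with a recursion counter. */
struct demo_lock {
    unsigned    recurse;        /* plain field, set up by the initializer */
    atomic_int  initialized;    /* publication flag */
};

/* Initialize the fields, then publish them with release semantics. */
static void demo_init(struct demo_lock *l)
{
    l->recurse = 0;
    atomic_store_explicit(&l->initialized, 1, memory_order_release);
}

/*
 * Acquire load: if this returns true, the write to l->recurse made by
 * demo_init() is guaranteed to be visible to the calling thread.
 */
static bool demo_initialized(struct demo_lock *l)
{
    return atomic_load_explicit(&l->initialized, memory_order_acquire) != 0;
}
```

Without the release/acquire pair (or an equivalent barrier), a reader on a
weakly ordered machine could observe the flag set while still seeing a stale
value in "recurse", which is the scenario described above.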

Thanks.

From owner-freebsd-hackers@FreeBSD.ORG  Wed Sep 15 15:46:03 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CEC83106566C
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 15:46:03 +0000 (UTC)
	(envelope-from mdf356@gmail.com)
Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 96CB28FC14
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 15:46:03 +0000 (UTC)
Received: by iwn34 with SMTP id 34so278483iwn.13
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 08:46:03 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:in-reply-to
	:references:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=4F657KLvZXEkdXaD6MM/T61VWuvtyoffYvYV9GE9Rro=;
	b=Jt+pH0aJUr0mG/5fzLhHMdFx2kKJEfeF4oqyBpE6dkd/eIDd2eEkddm9zsQCuCkKu8
	fTA6doOsoDG90dvNQjOrTtHTxaLfphg0CsDV1wUv/VIKcZAXOvG6MCf4zgY0NtEmZ0sK
	gXCivjhIifG43MYmK1rkzgIWyPWsbk7t72XgU=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=uLvQVTVq1HsnbPdYY5YBmvGvOF8Wsf7FMwTGZ0F1SYvO5mE6zpUp6jqwX9PptiehFk
	zNECRduuIa+CJ7WtCHcFUg1LrtK6HYNQXeEGbTTam7FsmeFvy5Pl5tB+4jHA+JPlfJXm
	+uL8iiBfRc10T3smFov5h8BgHltKwl2PsaSRA=
MIME-Version: 1.0
Received: by 10.231.58.198 with SMTP id i6mr1922724ibh.43.1284565560750; Wed,
	15 Sep 2010 08:46:00 -0700 (PDT)
Received: by 10.231.130.34 with HTTP; Wed, 15 Sep 2010 08:46:00 -0700 (PDT)
In-Reply-To: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
References: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
Date: Wed, 15 Sep 2010 08:46:00 -0700
Message-ID: <AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf@mail.gmail.com>
From: Matthew Fleming <mdf356@gmail.com>
To: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-hackers@freebsd.org
Subject: Re: Questions about mutex implementation in kern/kern_mutex.c
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Sep 2010 15:46:03 -0000

I'll take a stab at answering these...

On Wed, Sep 15, 2010 at 6:44 AM, Andrey Simonenko
<simon@comsys.ntu-kpi.kiev.ua> wrote:
> Hello,
>
> I have questions about mutex implementation in kern/kern_mutex.c
> and sys/mutex.h files (current versions of these files):
>
> 1. Is the following statement correct for a volatile pointer or integer
>   variable: if a volatile variable is updated by the compare-and-set
>   instruction (e.g. atomic_cmpset_ptr(&val, ...)), then the current
>   value of such variable can be read without any special instruction
>   (e.g. v = val)?
>
>   I checked Assembler code for a function with "v = val" and "val = v"
>   like statements generated for volatile variable and simple variable
>   and found differences: on ia64 "v = val" was implemented by ld.acq and
>   "val = v" was implemented by st.rel; on mips and sparc64 Assembler code
>   can have different order of lines for volatile and simple variable
>   (depends on the code of a function).

I think this depends somewhat on the hardware and what you mean by
"current" value.

If you want a value that is not in-flux, then something like
atomic_cmpset_ptr() setting to the current value is needed, so that
you force any other atomic_cmpset to fail.  However, since there is no
explicit lock involved, there is no strong meaning for "current" value
and a read that does not rely on a value cached in a register is
likely sufficient.  While the "volatile" keyword in C has no explicit
hardware meaning, it often means that a load from memory (or,
presumably, L1-L3 cache) is required.


> 2. Let there is a default (sleep) mutex and adaptive mutexes is enabled.
>   A thread tries to obtain lock quickly and fails, _mtx_lock_sleep()
>   is called, it gets the address of the current mutex's owner thread
>   and checks whether that owner thread is running (on another CPU).
>   How does _mtx_lock_sleep() know that that thread still exists
>   (lines 311-337 in kern_mutex.c)?
>
>   When adaptive mutexes was implemented there was explicit locking
>   around adaptive mutexes code.  When turnstile in mutex code was
>   implemented that's locking logic was changed.

It appears that it's possible for the thread pointer to be recycled
between fetching the value of owner and looking at TD_IS_RUNNING.  On
actual hardware, this race is unlikely to occur due to the time it
takes for a thread to release a lock and perform all of the thread exit
code before the struct thread is returned to the uma zone.  However,
even once returned to the uma zone, on many FreeBSD implementations the
access is safe, as the address of the thread is still dereferenceable
due to the implementation of uma zones.

On e.g. AIX this issue was different because the address range for
threads was determined at compile time (one giant table) and the array
only grew, never shrank, so the thread pointer was always valid and
would be recycled at first opportunity.

It appears to me, from a strict correctness standpoint, that the use
of uma_zalloc/uma_zfree for thread objects is not safe.  But from a
practical implementation POV, the unsafe access in kern_mutex.c will
not cause trouble in the absence of a hypervisor controlling when
virtual CPUs get runtime.

> 3. Why there is no any memory barrier in mtx_init()?  If another thread
>   (on another CPU) finds that mutex is initialized using mtx_initialized()
>   then it can mtx_lock() it and mtx_lock() it second time, as a result
>   mtx_recurse field will be increased, but its value still can be
>   uninitialized on architecture with relaxed memory ordering model.

It seems to me that it's generally a programming error to rely on the
return of mtx_initialized(), as there is no serialization with e.g. a
thread calling mtx_destroy().  A fully correct serialization model
would require that a single thread initialize the mtx and then create
any worker threads that will use the mtx.

Cheers,
matthew

From owner-freebsd-hackers@FreeBSD.ORG  Wed Sep 15 18:54:32 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A23DD106566C
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 18:54:32 +0000 (UTC)
	(envelope-from PHeyman@adaranet.com)
Received: from barracuda.adaranet.com (smtp.adaranet.com [72.5.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id 80AAA8FC14
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 18:54:32 +0000 (UTC)
X-ASG-Debug-ID: 1284576871-506119a40001-P5m3U7
Received: from SJ-EXCH-1.adaranet.com ([10.10.1.29]) by barracuda.adaranet.com
	with ESMTP id P3HKNALWk1viV6wp for <freebsd-hackers@freebsd.org>;
	Wed, 15 Sep 2010 11:54:31 -0700 (PDT)
X-Barracuda-Envelope-From: PHeyman@adaranet.com
Received: from SJ-EXCH-1.adaranet.com ([fe80::7042:d8c2:5973:c523]) by
	SJ-EXCH-1.adaranet.com ([fe80::7042:d8c2:5973:c523%14]) with mapi;
	Wed, 15 Sep 2010 11:54:31 -0700
From: Paul Heyman <PHeyman@adaranet.com>
X-Barracuda-BBL-IP: fe80::7042:d8c2:5973:c523
X-Barracuda-RBL-IP: fe80::7042:d8c2:5973:c523
To: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Date: Wed, 15 Sep 2010 11:53:16 -0700
X-ASG-Orig-Subj: Crash dump on HP Proliant G6 broken as of V8.0
Thread-Topic: Crash dump on HP Proliant G6 broken as of V8.0
Thread-Index: AQHLVGJZfuEva5LEmEq7OF4hw9IBSpMTQSawgAAYjsOAAAwKSg==
Message-ID: <32AB5C9615CC494997D9ABB1DB12783C024C8C5A9F@SJ-EXCH-1.adaranet.com>
References: <32AB5C9615CC494997D9ABB1DB12783C024C8C5A95@SJ-EXCH-1.adaranet.com>,
	<32AB5C9615CC494997D9ABB1DB12783C024C8DE83F@SJ-EXCH-1.adaranet.com>,
	<32AB5C9615CC494997D9ABB1DB12783C024C8C5A9C@SJ-EXCH-1.adaranet.com>
In-Reply-To: <32AB5C9615CC494997D9ABB1DB12783C024C8C5A9C@SJ-EXCH-1.adaranet.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Barracuda-Connect: UNKNOWN[10.10.1.29]
X-Barracuda-Start-Time: 1284576871
X-Barracuda-URL: http://172.16.10.203:8000/cgi-mod/mark.cgi
X-Virus-Scanned: by bsmtpd at adaranet.com
Cc: Patrick Mahan <PMahan@adaranet.com>
Subject: Crash dump on HP Proliant G6 broken as of V8.0
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Sep 2010 18:54:32 -0000

ALL,

The crash dump worked fine in V7.3.

I am debugging a crash dump problem on an HP Proliant G6
which uses a SATA drive connected to a CISS RAID controller.

I have tried this on an x86 box using a non-RAID ATA/SATA disk controller
and it works well.

I noticed that in V8.0 there is a new SCSI operating method. In the v7.3
version there was only CISS_TRANSPORT_METHOD_SIMPLE, but in v8.0 the
CISS_TRANSPORT_METHOD_PERF method has been added. These methods have
different function calls in ciss_poll_request.

The dump command starts with a call to dadump.
This function sets up a struct ccb_scsiio structure by calling
scsi_read_write. The meat of the dump then happens when it calls
xpt_polled_action, which manages and simulates interrupt functionality
that is working fine. The disk operations work fine except during a
crash dump.

I have turned debug on for CISS and CAMDEBUG to debug this problem.

In xpt_polled_action (cam_xpt.c) we get past the first polling loop at line
3013, as both devq->send_opening and dev->ccbq.dev_openings are > 0
(256 and 254).

But we do get stuck in the second one at line 3025. We eventually time out,
setting start_ccb->ccb_h.status to CAM_CMD_TIMEOUT. The timeout is set with
DA_DEFAULT_TIMEOUT (scsi_da.c), which is set to 60 and is used in the call
to scsi_read_write.
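
For readers unfamiliar with the pattern, the second loop described above
has roughly the following shape. This is only a minimal userland sketch
and not the code in cam_xpt.c: enum req_status, struct request,
fake_poll() and polled_wait() are hypothetical names made up for
illustration, and a real kernel loop would also delay between polls.

```c
#include <assert.h>

/* Hypothetical status codes, loosely modelled on the CAM ones. */
enum req_status { REQ_INPROG, REQ_DONE, REQ_TIMEOUT };

/* Hypothetical request: completes after polls_needed calls to the poll. */
struct request {
	enum req_status	status;
	int		polls_needed;
};

/* Stand-in for a driver poll routine that would reap completions. */
static void
fake_poll(struct request *req)
{
	if (req->status == REQ_INPROG && --req->polls_needed <= 0)
		req->status = REQ_DONE;
}

/*
 * Poll until the request leaves REQ_INPROG or the budget runs out.
 * If the completion is never attributed to this request, the loop
 * can only end by exhausting the budget and reporting a timeout.
 */
static enum req_status
polled_wait(struct request *req, int budget)
{
	assert(budget >= 0);
	while (budget-- > 0) {
		fake_poll(req);
		if (req->status != REQ_INPROG)
			return (req->status);
	}
	req->status = REQ_TIMEOUT;
	return (req->status);
}
```

In this sketch a request that needs 3 polls finishes well within a
budget of 10, while one that never completes is marked as timed out
once the budget is used up.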

Here is the debug trace:

Dumping 1240 MB:
ciss_cam_action_io: XPT_SCSI_IO 0:0:0
ciss_get_request: called
ciss_start: post command 150 tag 600
ciss_map_request: called
ciss_request_map_helper: called
ciss_cam_poll: called
ciss_perf_done: completed command 150
ciss_perf_done: completed command 150

ciss_complete: called
ciss_unmap_request: called
ciss_cam_complete: called
_ciss_report_request: called
ciss_cam_complete: SCSI_STATUS_OK
ciss_release_request: called
ciss_complete: called
ciss_unmap_request: called
ciss0: WARNING: completing non-busy request
ciss_cam_complete: called
_ciss_report_request: called
ciss_cam_complete: SCSI_STATUS_OK
 .
 .
 .
 .
after about 60 seconds
ciss0: WARNING: completing non-busy request
ciss0: WARNING: completed command with no submitter
ciss_unmap_request: called
.
.
.
This goes on forever

Thanks
Paul


Paul Heyman
pheyman@adaranetworks.com

From owner-freebsd-hackers@FreeBSD.ORG  Wed Sep 15 19:09:57 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6649D1065696
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 19:09:54 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id C3C8B8FC1A
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 19:09:53 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id 4AA5A46C20;
	Wed, 15 Sep 2010 15:09:53 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 4FECA8A04F;
	Wed, 15 Sep 2010 15:09:52 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Date: Wed, 15 Sep 2010 15:09:49 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <AANLkTinEje-+1P1n33YMKAaciaYHQH+dpwgX6UY1dOux@mail.gmail.com>
In-Reply-To: <AANLkTinEje-+1P1n33YMKAaciaYHQH+dpwgX6UY1dOux@mail.gmail.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201009151509.49728.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Wed, 15 Sep 2010 15:09:52 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: cronfy <cronfy@gmail.com>
Subject: Re: is vfs.lookup_shared unsafe in 7.3?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Sep 2010 19:09:57 -0000

On Monday, September 13, 2010 4:57:15 pm cronfy wrote:
> Hello,
> 
> Trying to overtake high server load (sudden peaks of 15%us/85%sy, LA >
> 40, very slow lstat() at these moments, looks like some kind of lock
> contention) I enabled vfs.lookup_shared=1 on two servers today. One is
> FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and other is
> FreeBSD-7.3 csup'ed and built Jul 16 2010.
> 
> The server with more fresh kernel is running nice and does not show
> high load anymore. But on the second server it did not help. More,
> after a few hours of work with vfs.lookup_shared=1 I noticed processes
> stucked in "ufs" state. I tried to kill them with no luck. Disabling
> vfs.lookup_shared freezed the whole system.
> 
> So, is vfs.lookup_shared=1 unsafe in 7.3? Did it become more stable
> between 16 Jul and 9 Sep (is it the reason why first system is still
> running?), or should I expect that it will freeze in a near time too?
> 
> Thanks in advance!

No, 7.3 has a bug that can cause these hangs that is probably made worse by
vfs.lookup_shared=1, but can occur even if it is disabled.  You want
these fixes applied (in order, one of them reverts part of another):

Author: jhb
Date: Fri Jul 16 20:23:24 2010
New Revision: 210173
URL: http://svn.freebsd.org/changeset/base/210173

Log:
  When the MNTK_EXTENDED_SHARED mount option was added, some filesystems were
  changed to defer the setting of VN_LOCK_ASHARE() (which clears LK_NOSHARE
  in the vnode lock's flags) until after they had determined if the vnode was
  a FIFO.  This occurs after the vnode has been inserted into a VFS hash or
  some similar table, so it is possible for another thread to find this vnode
  via vget() on an i-node number and block on the vnode lock.  If the lockmgr
  interlock (vnode interlock for vnode locks) is not held when clearing the
  LK_NOSHARE flag, then the lk_flags field can be clobbered.  As a result
  the thread blocked on the vnode lock may never get woken up.  Fix this by
  holding the vnode interlock while modifying the lock flags in this case.
  
  The softupdates code also toggles LK_NOSHARE in one function to close a
  race with snapshots.  Fix this code to grab the interlock while fiddling
  with lk_flags.

Modified:
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
  stable/7/sys/fs/cd9660/cd9660_vfsops.c
  stable/7/sys/fs/udf/udf_vfsops.c
  stable/7/sys/ufs/ffs/ffs_softdep.c
  stable/7/sys/ufs/ffs/ffs_vfsops.c

Author: jhb
Date: Fri Aug 20 20:33:13 2010
New Revision: 211532
URL: http://svn.freebsd.org/changeset/base/211532

Log:
  MFC: Use VN_LOCK_AREC() and VN_LOCK_ASHARE() rather than manipulating
  lockmgr lock flags directly.

Modified:
  stable/7/sys/fs/nwfs/nwfs_node.c
  stable/7/sys/fs/pseudofs/pseudofs_vncache.c
  stable/7/sys/fs/smbfs/smbfs_node.c
  stable/7/sys/gnu/fs/xfs/FreeBSD/xfs_freebsd_iget.c
  stable/7/sys/kern/vfs_lookup.c

Author: jhb
Date: Fri Aug 20 20:58:57 2010
New Revision: 211533
URL: http://svn.freebsd.org/changeset/base/211533

Log:
  Revert 210173 as it did not properly fix the bug.  It assumed that the
  VI_LOCK() for a given vnode was used as the internal interlock for that
  vnode's v_lock lockmgr lock.  This is not the case.  Instead, add dedicated
  routines to toggle the LK_NOSHARE and LK_CANRECURSE flags.  These routines
  lock the lockmgr lock's internal interlock to synchronize the updates to
  the flags member with other threads attempting to acquire the lock.  The
  VN_LOCK_A*() macros now invoke these routines, and the softupdates code
  uses these routines to temporarily enable recursion on buffer locks.
  
  Reviewed by:  kib

Modified:
  stable/7/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
  stable/7/sys/fs/cd9660/cd9660_vfsops.c
  stable/7/sys/fs/udf/udf_vfsops.c
  stable/7/sys/kern/kern_lock.c
  stable/7/sys/sys/lockmgr.h
  stable/7/sys/sys/vnode.h
  stable/7/sys/ufs/ffs/ffs_softdep.c
  stable/7/sys/ufs/ffs/ffs_vfsops.c

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG  Wed Sep 15 19:22:17 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D89BF106567A
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 19:22:17 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 8ED698FC12
	for <freebsd-hackers@freebsd.org>; Wed, 15 Sep 2010 19:22:17 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id 2D61B46C20;
	Wed, 15 Sep 2010 15:22:17 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 2B0808A03C;
	Wed, 15 Sep 2010 15:22:16 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Date: Wed, 15 Sep 2010 15:22:15 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
	<AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf@mail.gmail.com>
In-Reply-To: <AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf@mail.gmail.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201009151522.15593.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Wed, 15 Sep 2010 15:22:16 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>,
	Matthew Fleming <mdf356@gmail.com>
Subject: Re: Questions about mutex implementation in kern/kern_mutex.c
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Sep 2010 19:22:17 -0000

On Wednesday, September 15, 2010 11:46:00 am Matthew Fleming wrote:
> I'll take a stab at answering these...
> 
> On Wed, Sep 15, 2010 at 6:44 AM, Andrey Simonenko
> <simon@comsys.ntu-kpi.kiev.ua> wrote:
> > Hello,
> >
> > I have questions about mutex implementation in kern/kern_mutex.c
> > and sys/mutex.h files (current versions of these files):
> >
> > 1. Is the following statement correct for a volatile pointer or integer
> >   variable: if a volatile variable is updated by the compare-and-set
> >   instruction (e.g. atomic_cmpset_ptr(&val, ...)), then the current
> >   value of such variable can be read without any special instruction
> >   (e.g. v = val)?
> >
> >   I checked Assembler code for a function with "v = val" and "val = v"
> >   like statements generated for volatile variable and simple variable
> >   and found differences: on ia64 "v = val" was implemented by ld.acq and
> >   "val = v" was implemented by st.rel; on mips and sparc64 Assembler code
> >   can have different order of lines for volatile and simple variable
> >   (depends on the code of a function).
> 
> I think this depends somewhat on the hardware and what you mean by
> "current" value.
> 
> If you want a value that is not in-flux, then something like
> atomic_cmpset_ptr() setting to the current value is needed, so that
> you force any other atomic_cmpset to fail.  However, since there is no
> explicit lock involved, there is no strong meaning for "current" value
> and a read that does not rely on a value cached in a register is
> likely sufficient.  While the "volatile" keyword in C has no explicit
> hardware meaning, it often means that a load from memory (or,
> presumably, L1-L3 cache) is required.

Actually, all we care about is getting a consistent snapshot of the value of 
the lock cookie at some point in time.  For that 'v = val' works fine.  The 
value may certainly be stale, but the mutex code handles these races in two 
ways:

  1) If MTX_CONTESTED is not set, then the lock cookie value can change at
     any time to either be unlocked, locked by another thread, or to
     become contested.  If any of those actions occur, then the attempt to
     set the MTX_CONTESTED bit via atomic_cmpset() in _mtx_lock_sleep() will
     fail, causing the code to retry its loop until it successfully sets
     MTX_CONTESTED or it notices a different lock cookie state.

  2) Once MTX_CONTESTED is set, the value of the lock cookie will not be
     changed unless the associated turnstile chain is locked.  This means that
     once we have locked the turnstile chain and verified that MTX_CONTESTED
     is set (or successfully set the bit), we can call turnstile_wait() to
     block, assured that the owner of the lock will resume this thread
     via turnstile_wakeup() when it releases the lock.
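
To make case 1 a bit more concrete, here is a minimal userland sketch of
the "snapshot, then compare-and-set, retry on failure" shape.  It uses
C11 atomics rather than the kernel's atomic(9) primitives, and
LOCK_UNOWNED, LOCK_CONTESTED and mark_contested() are hypothetical names
for illustration only; this is not the code in kern_mutex.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cookie layout: owner pointer bits plus a low flag bit. */
#define	LOCK_UNOWNED	((uintptr_t)0)
#define	LOCK_CONTESTED	((uintptr_t)1)

/*
 * Try to mark a held lock as contested.  Returns true once the flag is
 * set (by us or by someone else), false if the lock was seen unowned and
 * the caller should simply retry the normal acquire path.
 */
static bool
mark_contested(_Atomic uintptr_t *cookie)
{
	uintptr_t v;

	assert(cookie != NULL);

	/* A plain load gives a consistent, possibly stale, snapshot. */
	v = atomic_load_explicit(cookie, memory_order_relaxed);

	for (;;) {
		if (v == LOCK_UNOWNED)
			return (false);
		if (v & LOCK_CONTESTED)
			return (true);
		/*
		 * If the cookie changed since the snapshot, the CAS fails,
		 * stores the value it found into v, and we look again.
		 */
		if (atomic_compare_exchange_weak(cookie, &v,
		    v | LOCK_CONTESTED))
			return (true);
	}
}
```

The point of the sketch is only that a stale snapshot is harmless: the
compare-and-set either confirms the snapshot was still current or hands
back a fresher value to examine.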

> > 2. Let there is a default (sleep) mutex and adaptive mutexes is enabled.
> >   A thread tries to obtain lock quickly and fails, _mtx_lock_sleep()
> >   is called, it gets the address of the current mutex's owner thread
> >   and checks whether that owner thread is running (on another CPU).
> >   How does _mtx_lock_sleep() know that that thread still exists
> >   (lines 311-337 in kern_mutex.c)?
> >
> >   When adaptive mutexes was implemented there was explicit locking
> >   around adaptive mutexes code.  When turnstile in mutex code was
> >   implemented that's locking logic was changed.
> 
> It appears that it's possible for the thread pointer to be recycled
> between fetching the value of owner and looking at TD_IS_RUNNING.  On
> actual hardware, this race is unlikely to occur due to the time it
> takes for a thread to release a lock and perform all of thread exit
> code before the struct thread is returned to the uma zone.  However,
> even once returned to the uma zone on many FreeBSD implementations the
> access is safe as the address of the thread is still dereferenceable,
> due to the implementation of uma zones.
> 
> On e.g. AIX this issue was different because the address range for
> threads was determined at compile time (one giant table) and the array
> only grew, never shrank, so the thread pointer was always valid and
> would be recycled at first opportunity.
> 
> It appears to me, from a strict correctness standpoint, that the use
> of uma_zalloc/uma_zfree for thread objects is not safe.  But from a
> practical implementation POV, the unsafe access in kern_mutex.c will
> not cause trouble in the absence of a hypervisor controlling when
> virtual CPUs get runtime.

Yes, it is a known "accepted" race.  This does probably warrant a comment
to say as much.  One could perhaps remove the race by using the owning
thread's td_cpu to do a pcpu_find() and comparing pc_curthread against
the cached 'owner' value instead.  I think even in that case you can still
be subject to the same theoretical race however if a HV prevented you from
running in between setting 'owner' and dereferencing 'owner->td_oncpu'.

However, I might actually prefer switching to the 'pc_curthread' approach
only because it does less work on each spin.

> > 3. Why there is no any memory barrier in mtx_init()?  If another thread
> >   (on another CPU) finds that mutex is initialized using mtx_initialized()
> >   then it can mtx_lock() it and mtx_lock() it second time, as a result
> >   mtx_recurse field will be increased, but its value still can be
> >   uninitialized on architecture with relaxed memory ordering model.
> 
> It seems to me that it's generally a programming error to rely on the
> return of mtx_initialized(), as there is no serialization with e.g. a
> thread calling mtx_destroy().  A fully correct serialization model
> would require that a single thread initialize the mtx and then create
> any worker threads that will use the mtx.

Yes, it is the caller's job to not expose a mtx until after it has been 
initialized.  A memory barrier in mtx_init() can't solve all those races.  If 
you put an object containing a mutex on a global queue and only invoke 
mtx_init() after dropping the global lock protecting the global queue, no 
amount of memory barriers in mtx_init() will save you.
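
As an illustration of the ordering the caller is responsible for, here
is a minimal userland sketch using POSIX threads instead of mtx(9).
struct item, list_lock, item_publish() and item_first() are hypothetical
names invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical object carrying its own lock. */
struct item {
	pthread_mutex_t	 lock;
	int		 value;
	struct item	*next;
};

static pthread_mutex_t	 list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct item	*list_head;

/*
 * Fully initialize the item, including its mutex, and only then make it
 * reachable by other threads.  Releasing list_lock publishes the earlier
 * stores to any thread that later takes list_lock to find the item.
 */
static void
item_publish(struct item *it, int value)
{
	int error;

	error = pthread_mutex_init(&it->lock, NULL);
	assert(error == 0);
	(void)error;
	it->value = value;

	pthread_mutex_lock(&list_lock);
	it->next = list_head;
	list_head = it;
	pthread_mutex_unlock(&list_lock);
}

/* Return the most recently published item, or NULL if there is none. */
static struct item *
item_first(void)
{
	struct item *it;

	pthread_mutex_lock(&list_lock);
	it = list_head;
	pthread_mutex_unlock(&list_lock);
	return (it);
}
```

Swapping the two halves of item_publish(), so that the item is linked
onto the list first and its mutex initialized afterwards, recreates the
broken case described above.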

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG  Wed Sep 15 20:24:50 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A305B1065672;
	Wed, 15 Sep 2010 20:24:50 +0000 (UTC)
	(envelope-from nox@jelal.kn-bremen.de)
Received: from smtp.kn-bremen.de (gelbbaer.kn-bremen.de [78.46.108.116])
	by mx1.freebsd.org (Postfix) with ESMTP id 62EC48FC15;
	Wed, 15 Sep 2010 20:24:50 +0000 (UTC)
Received: by smtp.kn-bremen.de (Postfix, from userid 10)
	id 306C21E007A9; Wed, 15 Sep 2010 22:07:18 +0200 (CEST)
Received: from triton8.kn-bremen.de (noident@localhost [127.0.0.1])
	by triton8.kn-bremen.de (8.14.4/8.14.3) with ESMTP id o8FK6ZpK039590;
	Wed, 15 Sep 2010 22:06:35 +0200 (CEST)
	(envelope-from nox@triton8.kn-bremen.de)
Received: (from nox@localhost)
	by triton8.kn-bremen.de (8.14.4/8.14.3/Submit) id o8FK6ZUR039589;
	Wed, 15 Sep 2010 22:06:35 +0200 (CEST) (envelope-from nox)
From: Juergen Lock <nox@jelal.kn-bremen.de>
Date: Wed, 15 Sep 2010 22:06:34 +0200
To: hackers@freebsd.org
Message-ID: <20100915200634.GA38314@triton8.kn-bremen.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-06-14)
X-Mailman-Approved-At: Wed, 15 Sep 2010 20:41:01 +0000
Cc: Doug Rabson <dfr@freebsd.org>
Subject: So I got "The D Programming Language" (and: threaded .xz
 compression)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Sep 2010 20:24:50 -0000

(that's this book:

	http://amazon.com/exec/obidos/ASIN/0321635361/modecdesi-20

Author's homepage:

	http://erdani.com/

) ...and finally played with the language a bit.  I've posted
some notes about getting dmd 2.048 running on FreeBSD (that's the
D 2.0 compiler + runtime + phobos libs), debugging with gdb head and
Doug Rabson's D-aware debugger ngdb [1], and my first (useful) hack,
a threaded .xz compressor, here:

	http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=117243

(more links in there; also read the followups...)

[1] ngdb announce message:

	http://lists.freebsd.org/pipermail/freebsd-current/2009-August/011071.html

 Cheeers,
	Juergen

From owner-freebsd-hackers@FreeBSD.ORG  Wed Sep 15 21:43:32 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EECF3106564A;
	Wed, 15 Sep 2010 21:43:32 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 65A7C8FC08;
	Wed, 15 Sep 2010 21:43:32 +0000 (UTC)
Received: from lurza.secnetix.de (localhost [127.0.0.1])
	by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id o8FLhF2c022234;
	Wed, 15 Sep 2010 23:43:30 +0200 (CEST)
	(envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
	by lurza.secnetix.de (8.14.3/8.14.3/Submit) id o8FLhE9p022233;
	Wed, 15 Sep 2010 23:43:14 +0200 (CEST) (envelope-from olli)
Date: Wed, 15 Sep 2010 23:43:14 +0200 (CEST)
Message-Id: <201009152143.o8FLhE9p022233@lurza.secnetix.de>
From: Oliver Fromme <olli@lurza.secnetix.de>
To: freebsd-hackers@FreeBSD.ORG, wblock@wonkity.com, mav@FreeBSD.ORG
In-Reply-To: <alpine.BSF.2.00.1003051636030.2481@wonkity.com>
X-Newsgroups: list.freebsd-hackers
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX)
	(FreeBSD/6.4-PRERELEASE-20080904 (i386))
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5
	(lurza.secnetix.de [127.0.0.1]);
	Wed, 15 Sep 2010 23:43:31 +0200 (CEST)
Cc: 
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Sep 2010 21:43:33 -0000

Warren Block <wblock@wonkity.com> wrote:
 > [...]
 > 8. Alexander Motin has an updated CAM version of the ATA system which 
 > will eventually replace the existing one.  In -CURRENT, anyway.  He was 
 > kind enough to look at my event handler.  My understanding is that he is 
 > looking at implementing the head parking/standby mechanism in that new 
 > code.

The patch below will work with the new CAM ATA driver
(i.e. ada(4) disks).  It adds a sysctl, so you can switch
the spin-down off if you're going to just reboot:
# sysctl kern.cam.ada.spindown_shutdown=0

This patch applies to stable/8, but I think it should
work with current, too (I haven't tried because I don't
have a machine running HEAD that has ada(4) disks).

Best regards
   Oliver


--- ata_da.c.orig	2010-05-23 18:16:33.000000000 +0200
+++ ata_da.c	2010-09-15 22:48:03.000000000 +0200
@@ -79,7 +79,8 @@
 	ADA_FLAG_CAN_TRIM	= 0x080,
 	ADA_FLAG_OPEN		= 0x100,
 	ADA_FLAG_SCTX_INIT	= 0x200,
-	ADA_FLAG_CAN_CFA        = 0x400
+	ADA_FLAG_CAN_CFA        = 0x400,
+	ADA_FLAG_CAN_POWERMGT   = 0x800
 } ada_flags;
 
 typedef enum {
@@ -180,6 +181,10 @@
 #define	ADA_DEFAULT_SEND_ORDERED	1
 #endif
 
+#ifndef	ADA_DEFAULT_SPINDOWN_SHUTDOWN
+#define	ADA_DEFAULT_SPINDOWN_SHUTDOWN	1
+#endif
+
 /*
  * Most platforms map firmware geometry to actual, but some don't.  If
  * not overridden, default to nothing.
@@ -191,6 +196,7 @@
 static int ada_retry_count = ADA_DEFAULT_RETRY;
 static int ada_default_timeout = ADA_DEFAULT_TIMEOUT;
 static int ada_send_ordered = ADA_DEFAULT_SEND_ORDERED;
+static int ada_spindown_shutdown = ADA_DEFAULT_SPINDOWN_SHUTDOWN;
 
 SYSCTL_NODE(_kern_cam, OID_AUTO, ada, CTLFLAG_RD, 0,
             "CAM Direct Access Disk driver");
@@ -203,6 +209,9 @@
 SYSCTL_INT(_kern_cam_ada, OID_AUTO, ada_send_ordered, CTLFLAG_RW,
            &ada_send_ordered, 0, "Send Ordered Tags");
 TUNABLE_INT("kern.cam.ada.ada_send_ordered", &ada_send_ordered);
+SYSCTL_INT(_kern_cam_ada, OID_AUTO, spindown_shutdown, CTLFLAG_RW,
+           &ada_spindown_shutdown, 0, "Spin down upon shutdown");
+TUNABLE_INT("kern.cam.ada.spindown_shutdown", &ada_spindown_shutdown);
 
 /*
  * ADA_ORDEREDTAG_INTERVAL determines how often, relative
@@ -665,6 +674,8 @@
 		softc->flags |= ADA_FLAG_CAN_48BIT;
 	if (cgd->ident_data.support.command2 & ATA_SUPPORT_FLUSHCACHE)
 		softc->flags |= ADA_FLAG_CAN_FLUSHCACHE;
+	if (cgd->ident_data.support.command2 & ATA_SUPPORT_POWERMGT)
+		softc->flags |= ADA_FLAG_CAN_POWERMGT;
 	if (cgd->ident_data.satacapabilities & ATA_SUPPORT_NCQ &&
 	    cgd->inq_flags & SID_CmdQue)
 		softc->flags |= ADA_FLAG_CAN_NCQ;
@@ -1222,6 +1233,57 @@
 					 /*getcount_only*/0);
 		cam_periph_unlock(periph);
 	}
+
+	if (ada_spindown_shutdown == 0)
+		return;
+
+	DELAY(500000);
+
+	TAILQ_FOREACH(periph, &adadriver.units, unit_links) {
+		union ccb ccb;
+
+		/* If we panicked with the lock held, do not recurse here. */
+		if (cam_periph_owned(periph))
+			continue;
+		cam_periph_lock(periph);
+		softc = (struct ada_softc *)periph->softc;
+		/*
+		 * We only spin down the drive if it is capable of it.
+		 */
+		if ((softc->flags & ADA_FLAG_CAN_POWERMGT) == 0) {
+			cam_periph_unlock(periph);
+			continue;
+		}
+
+		/* XXX Hide this behind bootverbose? */
+		xpt_print(periph->path, "spin-down\n");
+
+		xpt_setup_ccb(&ccb.ccb_h, periph->path, CAM_PRIORITY_NORMAL);
+
+		ccb.ccb_h.ccb_state = ADA_CCB_DUMP;
+		cam_fill_ataio(&ccb.ataio,
+				    1,
+				    adadone,
+				    CAM_DIR_NONE,
+				    0,
+				    NULL,
+				    0,
+				    ada_default_timeout*1000);
+
+		ata_28bit_cmd(&ccb.ataio, ATA_STANDBY_IMMEDIATE, 0, 0, 0);
+		xpt_polled_action(&ccb);
+
+		if ((ccb.ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP)
+			xpt_print(periph->path, "Spin-down disk failed\n");
+
+		if ((ccb.ccb_h.status & CAM_DEV_QFRZN) != 0)
+			cam_release_devq(ccb.ccb_h.path,
+					 /*relsim_flags*/0,
+					 /*reduction*/0,
+					 /*timeout*/0,
+					 /*getcount_only*/0);
+		cam_periph_unlock(periph);
+	}
 }
 
 #endif /* _KERNEL */



-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht
München, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

Python is executable pseudocode.  Perl is executable line noise.

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 00:12:28 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 78F42106564A
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 00:12:28 +0000 (UTC)
	(envelope-from PMahan@adaranet.com)
Received: from barracuda.adaranet.com (smtp.adaranet.com [72.5.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id 5CE288FC16
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 00:12:28 +0000 (UTC)
X-ASG-Debug-ID: 1284595010-50611bbc0001-P5m3U7
Received: from SJ-EXCH-1.adaranet.com ([10.10.1.29]) by barracuda.adaranet.com
	with ESMTP id EWSJLDgI0sj1uFlI for <freebsd-hackers@freebsd.org>;
	Wed, 15 Sep 2010 16:56:50 -0700 (PDT)
X-Barracuda-Envelope-From: PMahan@adaranet.com
Received: from mycroft.adaranet.com (10.10.24.100) by SJ-EXCH-1.adaranet.com
	(10.10.1.29) with Microsoft SMTP Server (TLS) id 8.1.240.5;
	Wed, 15 Sep 2010 16:56:49 -0700
Message-ID: <4C915E4F.9030006@adaranet.com>
X-Barracuda-BBL-IP: nil
Date: Wed, 15 Sep 2010 17:01:19 -0700
From: Patrick Mahan <pmahan@adaranet.com>
User-Agent: Thunderbird 2.0.0.23 (X11/20091021)
MIME-Version: 1.0
To: <freebsd-hackers@freebsd.org>
X-ASG-Orig-Subj: odd issues with DDB vs GDB
Content-Type: multipart/mixed; boundary="------------090702000608020704010105"
X-Barracuda-Connect: UNKNOWN[10.10.1.29]
X-Barracuda-Start-Time: 1284595010
X-Barracuda-URL: http://172.16.10.203:8000/cgi-mod/mark.cgi
X-Virus-Scanned: by bsmtpd at adaranet.com
Subject: odd issues with DDB vs GDB
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 00:12:28 -0000

--------------090702000608020704010105
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit

All,

I am trying to debug a system hang occurring on my HP Proliant G6 running some of our
kernel software.  Under certain test loads, the system hangs completely: no keyboard,
no console, etc.  I suspect some of the kernel code that I have inherited, which does
a lot of locking (lots of data structures, each with its own sleepable mutex lock).

I rebuilt the kernel to include the following:

options KDB
options DDB
options GDB
options MUTEX_NOINLINE
options MUTEX_DEBUG
options WITNESS
options WITNESS_SKIPSPIN

options SW_WATCHDOG  # Enable to force us into the debugger on a hang

This places me in the kernel DDB debugger.  The backtrace shown by DDB
makes a lot of sense: it shows we are blocked in _mtx_lock_flags()+0x6f.

Great, so I go to enable GDB -

db> gdb
Step to enter the remote GDB backend.
db> s
$T0510:a6f86c80fff*";thread:186c0;#62
gdb kernel.debug
Current directory is ~/devel/pm_bz5486/FBSD80REL/amd64/obj/usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/MPATH/
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

(gdb) target remote 10.10.29.111:7028
Remote debugging using 10.10.29.111:7028

0xffffffff806cf8a6 in kdb_init () at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/subr_kdb.c:361
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
warning: shared library handler failed to enable breakpoint

gdb>

So right away I am somewhat suspicious as it is showing me a completely different entry
point.

DDB showed

Tracing pid 0 tid 100032 td 0xffffff0002668390
breakpoint() at breakpoint+0x5
kdb_enter() at kdb_enter+0x52
watchdog_fire() at watchdog_fire+0xda
hardclock() at hardclock+0x73
lapic_handle_timer() at lapic_handle_timer+0x120
Xtimerint() at Xtimerint+0x8c

But GDB is showing the above.

A backtrace (bt) in GDB does not show the same stack signature.

I have attached the complete log for those who are interested.  Is there a reason for the wide
difference between DDB and GDB?  Am I invoking gdb incorrectly?

Thanks for the education, as always!

Patrick

--------------090702000608020704010105
Content-Type: text/plain; name="kernel_debug_prob.txt"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline; filename="kernel_debug_prob.txt"

Debugging a system hang.  Enabled watchdog(4) and built the kernel with KDB, DDB and
GDB.  I am trying to debug this via remote GDB, but what DDB shows for a stack
trace and what GDB shows are two separate animals.

External serial port setup with the following in /boot/loader.conf

console="comconsole vidconsole"
comconsole_speed=9600
hint.uart.0.flags="0x90"

Serial is accessed via a cyclades ACS console server.  'telnet 10.10.29.111 70XX' where XX is the physical port number.

System comes up fine, testing is initiated, eventually the system hangs and
the watchdog fires dropping us into DDB -

DDB output

db> trace
Tracing pid 0 tid 100032 td 0xffffff0002668390
breakpoint() at breakpoint+0x5
kdb_enter() at kdb_enter+0x52
watchdog_fire() at watchdog_fire+0xda
hardclock() at hardclock+0x73
lapic_handle_timer() at lapic_handle_timer+0x120
Xtimerint() at Xtimerint+0x8c
--- interrupt, rip = 0xffffffff80688532, rsp = 0xffffff800011e460, rbp = 0xffffff800011e4c0 ---
_mtx_lock_sleep() at _mtx_lock_sleep+0x92
_mtx_lock_flags() at _mtx_lock_flags+0x6f
VCDgetWithIIFremote() at VCDgetWithIIFremote+0x3f
ProcessDataPkt() at ProcessDataPkt+0x3dc
ip_input() at ip_input+0xa24
netisr_dispatch_src() at netisr_dispatch_src+0xe3
netisr_dispatch() at netisr_dispatch+0x20
gif_input() at gif_input+0x324
in_gif_input() at in_gif_input+0x28f
encap4_input() at encap4_input+0x1b8
ip_input() at ip_input+0xd1a
netisr_dispatch_src() at netisr_dispatch_src+0xe3
netisr_dispatch() at netisr_dispatch+0x20
ether_demux() at ether_demux+0x1f3
ether_input() at ether_input+0x4ab
em_rxeof() at em_rxeof+0x410
em_handle_que() at em_handle_que+0x6f
taskqueue_run() at taskqueue_run+0xbb
taskqueue_thread_loop() at taskqueue_thread_loop+0x33
fork_exit() at fork_exit+0xba
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff800011ed30, rbp = 0 ---

db>gdb
Step to enter the remote GDB backend.
db>s
^]<enter>
telnet> quit
#
# Enter the debugger via remote gdb
#
gdb kernel.debug
Current directory is ~/devel/pm_bz5486/FBSD80REL/amd64/obj/usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/MPATH/
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
(gdb) target remote 10.10.29.111:7028
Remote debugging using 10.10.29.111:7028
0xffffffff806cf8a6 in kdb_init () at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/subr_kdb.c:361
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
warning: shared library handler failed to enable breakpoint
(gdb) bt
#0  0xffffffff806cf8a6 in kdb_init () at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/subr_kdb.c:361
#1  0xffffffff8064c4da in _cv_wait (cvp=0xffffff800011e340, lock=0xffffffff80a9cd1d) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/kern_condvar.c:102
#2  0xffffffff8064bd33 in tvtohz (tv=0x2668390) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/kern_clock.c:371
#3  0xffffffff80988cf0 in lapic_handle_timer (frame=0xffffff800011e3b0) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/amd64/amd64/local_apic.c:792
#4  0xffffffff809816ac in Xinvlpg () at apic_vector.S:146
#5  0xffffff0107bff3a0 in ?? ()
#6  0xffffff0107bff3a0 in ?? ()
#7  0x0000000000000004 in ?? ()
#8  0xffffff0002668390 in ?? ()
#9  0x0000000000000943 in ?? ()
#10 0xffffff800011e5e4 in ?? ()
#11 0x0000000000000004 in ?? ()
#12 0xffffff0002668000 in ?? ()
#13 0xffffff800011e4c0 in ?? ()
#14 0x000000000afe0014 in ?? ()
#15 0x0000000000000006 in ?? ()
#16 0xffffffff806dfd30 in taskqueue_thread_loop (arg=0xffffff0002668000) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/subr_taskqueue.c:359
#17 0xffffffff8068811f in atomic_cmpset_long (dst=0x7bff300, exp=0xffffffff80a9bc70, src=0x9430011e530) at atomic.h:158
#18 0xffffffff8063b6cf in VAagingTimer (dummy=0xffffff0107bff388) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/ipr/virtual_circuits.c:2389
#19 0xffffffff8062998c in ProcessDataPkt (socklyr=0x0, iif=0xffffff010798de00, protocol=0x6, src_addr={s_addr = 0xafe0014}, dst_addr={s_addr = 0xafa001b}, src_port=0x1f90, dst_port=0x402, tcp_flags=0x12, pkt=0xffffff0003ad8700) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/ipr/mpvc_forward.c:227
#20 0xffffffff807aa6c4 in ip_input (m=0xffffff0003ad8700) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/netinet/ip_input.c:1032
#21 0xffffffff80778d43 in netisr_dispatch_src (proto=0x1, source=0x0, m=0xffffff0003ad8700) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/net/netisr.c:934
#22 0xffffffff80779060 in netisr_start_swi (cpuid=0xffffff00, pc=0xffffffff8104eee0) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/net/netisr.c:1034
#23 0xffffffff8076ff14 in gif_ioctl (ifp=0xffffff00026bd800, cmd=0x20011e790, data=0xffffffff8076ff14 "��fff\220ff\220ff\220UH\211�H\201�\220") at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/net/if_gif.c:694
#24 0xffffffff8079af8f in gif_validate4 (ip=0xffffffff807a67f4, sc=0xffffff0003ad8700, ifp=0x1449ba01c0) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/netinet/in_gif.c:396
#25 0xffffffff807a5c38 in encap6_input (mp=0xffffff0002668390, offp=0x1400000002, proto=0x4) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/netinet/ip_encap.c:206
#26 0xffffffff807aa9ba in __bswap16 (_x=0x0) at endian.h:135
#27 0xffffffff80778d43 in netisr_dispatch_src (proto=0x1, source=0x0, m=0xffffff0003ad8700) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/net/netisr.c:934
#28 0xffffffff80779060 in netisr_start_swi (cpuid=0xffffffff, pc=0xffffff800011ea10) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/net/netisr.c:1034
#29 0xffffffff8076bc83 in ether_demux (ifp=0xffffff00026f6800, m=0xffffff0003ad8700) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/net/if_ethersubr.c:911
#30 0xffffffff8076ba4b in ether_demux (ifp=0xffffff0003ad8700, m=0xffffff800011ead0) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/net/if_ethersubr.c:778
#31 0xffffffff8038aa70 in em_rxeof (rxr=0xffffff0002719c00, count=0x63, done=0x0) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/dev/e1000/if_em.c:4188
#32 0xffffffff8038360f in em_handle_que (context=0xffffff80003fc000, pending=0x1) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/dev/e1000/if_em.c:1451
#33 0xffffffff806df78b in taskqueue_drain (queue=0xffffff80004006e0, task=0x100000001) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/subr_taskqueue.c:256
#34 0xffffffff806dfd63 in taskqueue_thread_loop () at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/subr_taskqueue.c:375
#35 0x0000034380d52f40 in ?? ()
#36 0xffffff80004006e0 in ?? ()
#37 0xffffff0002711c00 in ?? ()
#38 0xffffff80004006e0 in ?? ()
#39 0xffffff800011ec70 in ?? ()
#40 0xffffffff8066b08a in fork_exit (callout=0xffffffff806df78b <taskqueue_drain+11>, arg=0xffffff800011ebc0, frame=0xffffff0002711c00) at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/kern_fork.c:856
Previous frame identical to this frame (corrupt stack?)

I also did an "info threads" (output omitted)

Here is thread 100032 as gdb sees it.

  392 Thread 100032  0xffffffff806cf8a6 in kdb_init () at /usr/home/pmahan/devel/pm_bz5486/FBSD80REL/src/sys/kern/subr_kdb.c:361

while ddb saw

Tracing pid 0 tid 100032 td 0xffffff0002668390

Why can I not see the stack correctly in gdb?

--------------090702000608020704010105--

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 00:49:02 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@FreeBSD.ORG
Received: by hub.freebsd.org (Postfix, from userid 1233)
	id 20EF2106567A; Thu, 16 Sep 2010 00:49:02 +0000 (UTC)
Date: Thu, 16 Sep 2010 00:49:02 +0000
From: Alexander Best <arundel@freebsd.org>
To: Oliver Fromme <olli@lurza.secnetix.de>
Message-ID: <20100916004902.GA46401@freebsd.org>
References: <alpine.BSF.2.00.1003051636030.2481@wonkity.com>
	<201009152143.o8FLhE9p022233@lurza.secnetix.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <201009152143.o8FLhE9p022233@lurza.secnetix.de>
Cc: freebsd-hackers@FreeBSD.ORG, mav@FreeBSD.ORG
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 00:49:02 -0000

On Wed Sep 15 10, Oliver Fromme wrote:
> Warren Block <wblock@wonkity.com> wrote:
>  > [...]
>  > 8. Alexander Motin has an updated CAM version of the ATA system which 
>  > will eventually replace the existing one.  In -CURRENT, anyway.  He was 
>  > kind enough to look at my event handler.  My understanding is that he is 
>  > looking at implementing the head parking/standby mechanism in that new 
>  > code.
> 
> The patch below will work with the new CAM ATA driver
> (i.e. ada(4) disks).  It adds a sysctl, so you can switch
> the spin-down off if you're going to just reboot:
> # sysctl kern.cam.ada.spindown_shutdown=0

i haven't tested your patch yet, but i don't think whether to spin down the
hdd should be decided merely from the sysctl value.

the hdd should spindown when a shutdown has been issued and not spindown,
if a reboot has been issued.

either people have the sysctl set to 1 in which case a reboot will cause a
spindown (which isn't healthy for the hdd)
...or people will set it to 0 in which case everything remains just the way it
is.

imo the sysctl should stay, but should have a different meaning. if it is set to
1 (which should be the default) a shutdown will issue a spindown; a reboot
won't.
if for some reason people want back the current behavior (no spindown even
during a shutdown) they need to set it to 0.
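
A minimal sketch of the behaviour proposed above, written as a standalone C
function rather than as a change to ata_da.c. It assumes the shutdown handler
can see a `howto` bitmask that tells a halt or power-off apart from a reboot.
The SK_ constants are stand-ins modelled on the RB_HALT and RB_POWEROFF bits
in <sys/reboot.h>, and sk_should_spindown() is a hypothetical name that does
not appear in the patch:

```c
#include <stdbool.h>

/*
 * Stand-ins for the RB_HALT and RB_POWEROFF bits from <sys/reboot.h>.
 * The SK_ names and values are assumptions made for this illustration.
 */
#define SK_RB_HALT      0x0008
#define SK_RB_POWEROFF  0x4000

/*
 * Decide whether the disks should be spun down.
 *
 * sysctl_value: value of kern.cam.ada.spindown_shutdown (0 = never spin down)
 * howto:        bitmask describing the kind of shutdown in progress
 */
static bool
sk_should_spindown(int sysctl_value, int howto)
{
	/* Knob switched off: keep the current behaviour, never spin down. */
	if (sysctl_value == 0)
		return (false);

	/* A plain reboot sets neither bit, so the disks keep spinning. */
	return ((howto & (SK_RB_HALT | SK_RB_POWEROFF)) != 0);
}
```

With the knob at 1, a halt or power-off yields true and a plain reboot
(howto == 0) yields false; with the knob at 0 the answer is always false.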

deciding whether freebsd reboots or shuts down cannot be done from a script,
since users might use the reboot or halt commands in which case (if i'm not
mistaken) all shutdown scripts get skipped.

cheers.
alex

> 
> This patch applies to stable/8, but I think it should
> work with current, too (I haven't tried because I don't
> have a machine running HEAD that has ada(4) disks).
> 
> Best regards
>    Oliver
> 
> <snip>
> 
> 
> -- 
> Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
> Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
> secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht
> München, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
> 
> FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd
> 
> Python is executable pseudocode.  Perl is executable line noise.

-- 
a13x

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 01:01:20 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1233)
	id C7B9E1065679; Thu, 16 Sep 2010 01:01:20 +0000 (UTC)
Date: Thu, 16 Sep 2010 01:01:20 +0000
From: Alexander Best <arundel@freebsd.org>
To: freebsd-hackers@freebsd.org
Message-ID: <20100916010120.GA49997@freebsd.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="W/nzBZO5zC0uMSeA"
Content-Disposition: inline
Subject: trailing whitespace in CFLAGS if make.conf:CPUTYPE is not
	defined/empty
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 01:01:20 -0000


--W/nzBZO5zC0uMSeA
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

hi there,

after discovering PR #114082 i noticed that with CPUTYPE not being defined in
make.conf, `make -VCFLAGS` reports a trailing whitespace for CFLAGS.
the reason for this is that ${_CPUCFLAGS} gets added to CFLAGS even if it's
empty.

the following patch should take care of the problem. i also added the same
logic to COPTFLAGS. although i wasn't able to trigger the trailing whitespace
there, it should still introduce a cleaner behaviour.
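
The effect described above is typical of append operations that always insert
a separator, even when the appended value is empty. The following C sketch
shows the general idea of skipping empty components; it is an analogy only and
says nothing about how make(1) is implemented. sk_append_flag() is a
hypothetical helper:

```c
#include <stddef.h>
#include <string.h>

/*
 * Append `flag` to the NUL-terminated string in `buf` (capacity `cap`),
 * inserting a single space only when both sides are non-empty.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int
sk_append_flag(char *buf, size_t cap, const char *flag)
{
	size_t used = strlen(buf);
	size_t flen = strlen(flag);
	size_t need = flen;

	if (flen == 0)
		return (0);	/* empty value: add nothing, not even a space */
	if (used > 0)
		need++;		/* room for the separating space */
	if (used + need + 1 > cap)
		return (-1);	/* would overflow, leave buf untouched */
	if (used > 0)
		buf[used++] = ' ';
	memcpy(buf + used, flag, flen + 1);	/* copy including the NUL */
	return (0);
}
```

Appending an empty string leaves the buffer unchanged, so no trailing space
can appear; appending a non-empty flag adds exactly one separating space.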

cheers.
alex

-- 
a13x

--W/nzBZO5zC0uMSeA
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="bsd.cpu.mk-and-kern.pre.mk.diff"

diff --git a/share/mk/bsd.cpu.mk b/share/mk/bsd.cpu.mk
index e3ad18b..fa7fb32 100644
--- a/share/mk/bsd.cpu.mk
+++ b/share/mk/bsd.cpu.mk
@@ -6,6 +6,7 @@
 
 .if !defined(CPUTYPE) || empty(CPUTYPE)
 _CPUCFLAGS =
+NO_CPU_CFLAGS =
 . if ${MACHINE_ARCH} == "i386"
 MACHINE_CPU = i486
 . elif ${MACHINE_ARCH} == "amd64"
diff --git a/sys/conf/kern.pre.mk b/sys/conf/kern.pre.mk
index d4bdc1f..9929176 100644
--- a/sys/conf/kern.pre.mk
+++ b/sys/conf/kern.pre.mk
@@ -23,6 +23,10 @@ NM?=		nm
 OBJCOPY?=	objcopy
 SIZE?=		size
 
+.if !defined(CPUTYPE) || empty(CPUTYPE)
+_CPUCFLAGS =
+NO_CPU_COPTFLAGS =
+.endif
 .if ${CC:T:Micc} == "icc"
 COPTFLAGS?=	-O
 .else

--W/nzBZO5zC0uMSeA--

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 01:12:28 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 60AB8106564A
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 01:12:28 +0000 (UTC)
	(envelope-from PMahan@adaranet.com)
Received: from barracuda.adaranet.com (smtp.adaranet.com [72.5.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id 481278FC08
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 01:12:28 +0000 (UTC)
X-ASG-Debug-ID: 1284595102-50611bbe0001-P5m3U7
Received: from SJ-EXCH-1.adaranet.com ([10.10.1.29]) by barracuda.adaranet.com
	with ESMTP id dvUHndTDQGjm2ENe for <freebsd-hackers@freebsd.org>;
	Wed, 15 Sep 2010 16:58:22 -0700 (PDT)
X-Barracuda-Envelope-From: PMahan@adaranet.com
Received: from mycroft.adaranet.com (10.10.24.100) by SJ-EXCH-1.adaranet.com
	(10.10.1.29) with Microsoft SMTP Server (TLS) id 8.1.240.5;
	Wed, 15 Sep 2010 16:58:22 -0700
Message-ID: <4C915EAC.8020509@adaranet.com>
X-Barracuda-BBL-IP: nil
Date: Wed, 15 Sep 2010 17:02:52 -0700
From: Patrick Mahan <pmahan@adaranet.com>
User-Agent: Thunderbird 2.0.0.23 (X11/20091021)
MIME-Version: 1.0
To: <freebsd-hackers@freebsd.org>
X-ASG-Orig-Subj: [Fwd: Crash dump on HP Proliant G6 broken as of V8.0]
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
X-Barracuda-Connect: UNKNOWN[10.10.1.29]
X-Barracuda-Start-Time: 1284595102
X-Barracuda-URL: http://172.16.10.203:8000/cgi-mod/mark.cgi
X-Virus-Scanned: by bsmtpd at adaranet.com
Subject: [Fwd: Crash dump on HP Proliant G6 broken as of V8.0]
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 01:12:28 -0000

Forwarding for a colleague,

Patrick

-------- Original Message --------
Subject: Crash dump on HP Proliant G6 broken as of V8.0
Date: Wed, 15 Sep 2010 11:53:16 -0700
From: Paul Heyman <PHeyman@adaranet.com>
To: freebsd-hackers@freebsd.org <freebsd-hackers@freebsd.org>
CC: Patrick Mahan <PMahan@adaranet.com>
References: 
<32AB5C9615CC494997D9ABB1DB12783C024C8C5A95@SJ-EXCH-1.adaranet.com>,<32AB5C9615CC494997D9ABB1DB12783C024C8DE83F@SJ-EXCH-1.adaranet.com>,<32AB5C9615CC494997D9ABB1DB12783C024C8C5A9C@SJ-EXCH-1.adaranet.com>

ALL,

The crash dump worked fine in V7.3.

I am debugging a crash dump problem on an HP Proliant G6
which uses a SATA drive connected to a CISS RAID controller.

I have tried this on an x86 box using a non-RAID ATA/SATA disk controller
and it works well.

I noticed that in V8.0 there is a new SCSI operating method. In V7.3 there was only
CISS_TRANSPORT_METHOD_SIMPLE, but in V8.0 the CISS_TRANSPORT_METHOD_PERF
method has been added. These methods have different function calls in
ciss_poll_request.

The dump command starts with a call to dadump.
This function sets up a struct ccb_scsiio by calling scsi_read_write.
Then the meat of the dump happens when it calls xpt_polled_action, which manages and simulates
interrupt functionality. Disk operations work fine except during a
crash dump.

I have turned debug on for CISS and CAMDEBUG to debug this problem.

In xpt_polled_action (cam_xpt.c) we get past the first polling loop at line 3013, as
both devq->send_opening and dev->ccbq.dev_openings are > 0  ( 256 and 254 ).

But we do get stuck in the second one at line 3025. We eventually time out,
setting start_ccb->ccb_h.status to CAM_CMD_TIMEOUT. The timeout comes from
DA_DEFAULT_TIMEOUT (scsi_da.c), which is set to 60 and is used in the call to scsi_read_write.
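
For readers unfamiliar with this pattern, the second loop can be pictured
roughly as follows. This is a simplified standalone model, not the code in
cam_xpt.c: all sk_ names, the status enum and the tick-based budget are
assumptions made for illustration. The caller repeatedly invokes the
controller's poll routine and gives up once the budget is exhausted while the
request is still marked in progress:

```c
#include <stddef.h>

/* Hypothetical status values; the real ones live in the CAM headers. */
enum sk_status { SK_REQ_INPROG, SK_REQ_CMP, SK_CMD_TIMEOUT };

struct sk_request {
	enum sk_status status;
	int timeout_ticks;	/* number of polls to attempt before giving up */
};

/* Callback standing in for the controller's poll routine. */
typedef void (*sk_poll_fn)(struct sk_request *req, void *ctx);

/*
 * Poll until the request leaves the in-progress state or the tick
 * budget runs out; on exhaustion, mark the request as timed out.
 */
static enum sk_status
sk_polled_wait(struct sk_request *req, sk_poll_fn poll, void *ctx)
{
	int ticks = req->timeout_ticks;

	while (ticks-- > 0) {
		poll(req, ctx);
		if (req->status != SK_REQ_INPROG)
			return (req->status);
	}
	req->status = SK_CMD_TIMEOUT;
	return (req->status);
}

/* Example poll routine: completes the request after *ctx calls. */
static void
sk_poll_countdown(struct sk_request *req, void *ctx)
{
	int *left = ctx;

	if (--*left == 0)
		req->status = SK_REQ_CMP;
}

/* Example poll routine that never updates the request status. */
static void
sk_poll_never(struct sk_request *req, void *ctx)
{
	(void)req;
	(void)ctx;
}
```

In this model, a poll routine that does its work but never changes the status
the caller is watching behaves like sk_poll_never(): the loop runs out its
budget and reports a timeout.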

Here is the debug trace:

Dumping 1240 MB:
ciss_cam_action_io: XPT_SCSI_IO 0:0:0
ciss_get_request: called
ciss_start: post command 150 tag 600
ciss_map_request: called
ciss_request_map_helper: called
ciss_cam_poll: called
ciss_perf_done: completed command 150
ciss_perf_done: completed command 150

ciss_complete: called
ciss_unmap_request: called
ciss_cam_complete: called
_ciss_report_request: called
ciss_cam_complete: SCSI_STATUS_OK
ciss_release_request: called
ciss_complete: called
ciss_unmap_request: called
ciss0: WARNING: completing non-busy request
ciss_cam_complete: called
_ciss_report_request: called
ciss_cam_complete: SCSI_STATUS_OK
  .
  .
  .
  .
after about 60 seconds
ciss0: WARNING: completing non-busy request
ciss0: WARNING: completed command with no submitter
ciss_unmap_request: called
.
.
.
This goes on forever

Thanks
Paul


Paul Heyman
pheyman@adaranetworks.com

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 02:37:36 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DEEBF106566C;
	Thu, 16 Sep 2010 02:37:36 +0000 (UTC)
	(envelope-from yanegomi@gmail.com)
Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 880148FC14;
	Thu, 16 Sep 2010 02:37:36 +0000 (UTC)
Received: by iwn34 with SMTP id 34so774086iwn.13
	for <multiple recipients>; Wed, 15 Sep 2010 19:37:35 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:sender:received
	:in-reply-to:references:date:x-google-sender-auth:message-id:subject
	:from:to:cc:content-type:content-transfer-encoding;
	bh=9IksdNRsHV2L49om2obd0n78exIv00WjA9NH9KFrdhI=;
	b=K2iRRGLdVIM3JfuU29yySB+pxQc6UNVJ5CKnuOkpBqibln5vakfX0Ujd0z4cC0+IMg
	eV26FplHn2VSvSwGs7k0kccwzk6iow534HPYrVGV+4tMVSwdFil5zvxY5MFVFMRWHin3
	g9RXfdtPFOC47kxIf7nLzcxwdtFS5LNXWQ338=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	b=VsHJ6HVq3oeWiO6F+ax7oy9E5OJNI8gAadm/lSlU5Vn3xryfpHczj6GN39RTezuo4P
	ccOS/O6dr0KprtclUjJG0lZ9NZUBDrbP59sMuXdBwl8pLafvGQUtcKPy39av9SF9eM1i
	swKx4m9NE8EoehvPkXeEN0VnSXOROiCl9RDEw=
MIME-Version: 1.0
Received: by 10.231.152.143 with SMTP id g15mr2684794ibw.76.1284604655652;
	Wed, 15 Sep 2010 19:37:35 -0700 (PDT)
Sender: yanegomi@gmail.com
Received: by 10.231.11.133 with HTTP; Wed, 15 Sep 2010 19:37:35 -0700 (PDT)
In-Reply-To: <20100916004902.GA46401@freebsd.org>
References: <alpine.BSF.2.00.1003051636030.2481@wonkity.com>
	<201009152143.o8FLhE9p022233@lurza.secnetix.de>
	<20100916004902.GA46401@freebsd.org>
Date: Wed, 15 Sep 2010 19:37:35 -0700
X-Google-Sender-Auth: jIrTdal_NsRjXlp-o-lqsIx6KQU
Message-ID: <AANLkTik9dCT60KDn5gVAsLi8-LRD5KFK1kKJO_9j=x-Z@mail.gmail.com>
From: Garrett Cooper <gcooper@FreeBSD.org>
To: Alexander Best <arundel@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-hackers@freebsd.org, mav@freebsd.org,
	Oliver Fromme <olli@lurza.secnetix.de>
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 02:37:37 -0000

On Wed, Sep 15, 2010 at 5:49 PM, Alexander Best <arundel@freebsd.org> wrote:
> On Wed Sep 15 10, Oliver Fromme wrote:
>> Warren Block <wblock@wonkity.com> wrote:
>>  > [...]
>>  > 8. Alexander Motin has an updated CAM version of the ATA system which
>>  > will eventually replace the existing one.  In -CURRENT, anyway.  He was
>>  > kind enough to look at my event handler.  My understanding is that he is
>>  > looking at implementing the head parking/standby mechanism in that new
>>  > code.
>>
>> The patch below will work with the new CAM ATA driver
>> (i.e. ada(4) disks).  It adds a sysctl, so you can switch
>> the spin-down off if you're going to just reboot:
>> # sysctl kern.cam.ada.spindown_shutdown=0
>
> i haven't tested your patch yet, but i don't think whether to spin down the
> hdd should be decided merely from the sysctl value.
>
> the hdd should spindown when a shutdown has been issued and not spindown,
> if a reboot has been issued.
>
> either people have the sysctl set to 1 in which case a reboot will cause a
> spindown (which isn't healthy for the hdd)
> ...or people will set it to 0 in which case everything remains just the way it
> is.
>
> imo the sysctl should stay, but should have a different meaning. if it is set to
> 1 (which should be the default) a shutdown will issue a spindown; a reboot
> won't.
> if for some reason people want back the current behavior (no spindown even
> during a shutdown) they need to set it to 0.

Agreed. Spinning down at reboot isn't smart and seems like a good way
to kill a disk quicker.

> deciding whether freebsd reboots or shuts down cannot be done from a script,
> since users might use the reboot or halt commands in which case (if i'm not
> mistaken) all shutdown scripts get skipped.

I'm not so sure of that statement, in particular because halt(8),
reboot(8), and shutdown(8) send SIGTERM to processes (unless you use
halt -q / reboot -q ... there might be some other scenarios I'm not
envisioning here).

Thanks,
-Garrett

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 07:17:54 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9015E1065670;
	Thu, 16 Sep 2010 07:17:54 +0000 (UTC) (envelope-from des@des.no)
Received: from smtp.des.no (smtp.des.no [194.63.250.102])
	by mx1.freebsd.org (Postfix) with ESMTP id 4EC888FC19;
	Thu, 16 Sep 2010 07:17:53 +0000 (UTC)
Received: from ds4.des.no (des.no [84.49.246.2])
	by smtp.des.no (Postfix) with ESMTP id 9F7D91FFC34;
	Thu, 16 Sep 2010 07:17:52 +0000 (UTC)
Received: by ds4.des.no (Postfix, from userid 1001)
	id 70D5884550; Thu, 16 Sep 2010 09:17:52 +0200 (CEST)
From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= <des@des.no>
To: Garrett Cooper <gcooper@FreeBSD.org>
References: <alpine.BSF.2.00.1003051636030.2481@wonkity.com>
	<201009152143.o8FLhE9p022233@lurza.secnetix.de>
	<20100916004902.GA46401@freebsd.org>
	<AANLkTik9dCT60KDn5gVAsLi8-LRD5KFK1kKJO_9j=x-Z@mail.gmail.com>
Date: Thu, 16 Sep 2010 09:17:52 +0200
In-Reply-To: <AANLkTik9dCT60KDn5gVAsLi8-LRD5KFK1kKJO_9j=x-Z@mail.gmail.com>
	(Garrett Cooper's message of "Wed, 15 Sep 2010 19:37:35 -0700")
Message-ID: <86mxri17j3.fsf@ds4.des.no>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Cc: Alexander Best <arundel@freebsd.org>, mav@freebsd.org,
	Oliver Fromme <olli@lurza.secnetix.de>, freebsd-hackers@freebsd.org
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 07:17:54 -0000

Garrett Cooper <gcooper@FreeBSD.org> writes:
> Agreed. Spinning down at reboot isn't smart and seems like a good way
> to kill a disk quicker.

*not* spinning down at halt is far worse.  Most modern disks are rated
for hundreds of thousands of load-unload cycles, but far fewer emergency
unloads (which is what happens when the drive loses power while still
spinning).

DES
-- 
Dag-Erling Smørgrav - des@des.no

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 07:54:19 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 292FA1065673
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 07:54:19 +0000 (UTC)
	(envelope-from cronfy@gmail.com)
Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com
	[209.85.214.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 828AB8FC1B
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 07:54:18 +0000 (UTC)
Received: by bwz15 with SMTP id 15so1692964bwz.13
	for <multiple recipients>; Thu, 16 Sep 2010 00:54:17 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:mime-version:received:in-reply-to
	:references:from:date:message-id:subject:to:cc:content-type
	:content-transfer-encoding;
	bh=OYHEdS863mPnBUZrwgDQFTXSraAuIWi8EB+7DDTN0Lo=;
	b=jnVYTQYFL7fiuHbzo84ckt3cSxArUrYUejMpKRxHuWQmT1MlH/12wCFkFw7lSKxIj9
	To/tcLuUC4fB1yIKjobXZasTcKllZ8sh9ItqZl+LlkHIWWBxgfXETxcPxU8Mlx1wayTb
	ZEvt6yjhs/CPilWBlOm29SgIpqiUIsdsUJx6s=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:from:date:message-id:subject:to
	:cc:content-type:content-transfer-encoding;
	b=ILpALBYbfO1m0sLFVOjl6RMBzr8bFUFBjsBIu04Q+tbR4yNa0fjX/2u5BIcARLzbKo
	qlJD/1uK4DbgmVT2FHI9VGlv9VBFCMn/I0r8Ubcg+LBYl4KsZknA068T4lEm0uUu7up3
	/ET5wEXpdpPwGifwlZJRx9oJeUP5F68t01sPs=
Received: by 10.204.82.18 with SMTP id z18mr2231732bkk.125.1284623657408; Thu,
	16 Sep 2010 00:54:17 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.204.99.197 with HTTP; Thu, 16 Sep 2010 00:53:47 -0700 (PDT)
In-Reply-To: <201009151509.49728.jhb@freebsd.org>
References: <AANLkTinEje-+1P1n33YMKAaciaYHQH+dpwgX6UY1dOux@mail.gmail.com>
	<201009151509.49728.jhb@freebsd.org>
From: cronfy <cronfy@gmail.com>
Date: Thu, 16 Sep 2010 11:53:47 +0400
Message-ID: <AANLkTincNbk3wdcgRPsoeLLkKkSrFe_xcY6K-KPkGH-7@mail.gmail.com>
To: freebsd-hackers@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: 
Subject: Re: is vfs.lookup_shared unsafe in 7.3?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 07:54:19 -0000

>> Hello,
>>
>> Trying to overcome high server load (sudden peaks of 15%us/85%sy, LA >
>> 40, very slow lstat() at these moments, looks like some kind of lock
>> contention) I enabled vfs.lookup_shared=1 on two servers today. One is
>> FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and the other is
>> FreeBSD-7.3 csup'ed and built Jul 16 2010.
>>
>> The server with the newer kernel is running nicely and does not show
>> high load anymore. But on the second server it did not help. Moreover,
>> after a few hours of work with vfs.lookup_shared=1 I noticed processes
>> stuck in the "ufs" state. I tried to kill them with no luck. Disabling
>> vfs.lookup_shared froze the whole system.
>>
>> So, is vfs.lookup_shared=1 unsafe in 7.3? Did it become more stable
>> between 16 Jul and 9 Sep (is that the reason why the first system is still
>> running?), or should I expect that it will freeze soon too?
>>
>> Thanks in advance!
>
> No, 7.3 has a bug that can cause these hangs that is probably made worse by
> vfs.lookup_shared=1, but can occur even if it is disabled.  You want
> these fixes applied (in order, one of them reverts part of another):

Thank you for the fix and for the explanation, that's exactly what I
wanted to know. Just to be sure: do these patches completely fix the
bug with hangs (even without vfs.lookup_shared=1)?

-- 
// cronfy

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 08:41:24 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CEBC11065673;
	Thu, 16 Sep 2010 08:41:24 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 4DE158FC08;
	Thu, 16 Sep 2010 08:41:24 +0000 (UTC)
Received: from lurza.secnetix.de (localhost [127.0.0.1])
	by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id o8G8f7CZ047727;
	Thu, 16 Sep 2010 10:41:23 +0200 (CEST)
	(envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
	by lurza.secnetix.de (8.14.3/8.14.3/Submit) id o8G8f7Q2047725;
	Thu, 16 Sep 2010 10:41:07 +0200 (CEST) (envelope-from olli)
From: Oliver Fromme <olli@lurza.secnetix.de>
Message-Id: <201009160841.o8G8f7Q2047725@lurza.secnetix.de>
To: arundel@FreeBSD.ORG (Alexander Best)
Date: Thu, 16 Sep 2010 10:41:07 +0200 (CEST)
In-Reply-To: <20100916004902.GA46401@freebsd.org>
X-Mailer: ELM [version 2.5 PL8]
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5
	(lurza.secnetix.de [127.0.0.1]);
	Thu, 16 Sep 2010 10:41:23 +0200 (CEST)
Cc: freebsd-hackers@FreeBSD.ORG, mav@FreeBSD.ORG
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 08:41:25 -0000


Alexander Best wrote:
 > On Wed Sep 15 10, Oliver Fromme wrote:
 > > Warren Block <wblock@wonkity.com> wrote:
 > > > [...]
 > > > 8. Alexander Motin has an updated CAM version of the ATA system which 
 > > > will eventually replace the existing one.  In -CURRENT, anyway.  He was 
 > > > kind enough to look at my event handler.  My understanding is that he is 
 > > > looking at implementing the head parking/standby mechanism in that new 
 > > > code.
 > > 
 > > The patch below will work with the new CAM ATA driver
 > > (i.e. ada(4) disks).  It adds a sysctl, so you can switch
 > > the spin-down off if you're going to just reboot:
 > > # sysctl kern.cam.ada.spindown_shutdown=0
 > 
 > i haven't tested your patch yet, but i don't think deciding whether to spin
 > down the hdd should be decided merely from the sysctl value.

It was the most simple and least intrusive way to introduce
some means to switch it on and off.  Of course there might
be better ways to do it.  You're welcome to submit your own
patch.

 > the hdd should spindown when a shutdown has been issued and not spindown,
 > if a reboot has been issued.

Right.  That's why my shutdown wrapper script sets the sysctl
to 0 when the -r option is present (I've had that wrapper
script for ages, for different reasons).

Also, there are cases where it is completely impossible to
decide automatically whether the disks should be spun down
or not.  For example, if the admin issues a shutdown -h
(halt), there's no way for the OS to know in advance whether
the admin is going to switch the machine off or reboot to
multi-user.  So there must be a way for the user to forcibly
enable/disable the spindown feature.  I think a sysctl is
the most appropriate way to do that, isn't it?

Actually, my plan is to have a mask of two bits for the
sysctl (the default value would be 3):

 - bit 0: enable (1) or disable (0) spindown
 - bit 1: automatic (1) or manual (0) setting

With the default setting (i.e. bit 1 == 1), at shutdown time
some facility would look at the reboot(2) "howto" flags and
then set bit 0 to either 0 or 1.
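
The two-bit mask described above could be resolved roughly as in the
following C sketch.  It is only an illustration under stated assumptions:
the names SPINDOWN_ENABLE, SPINDOWN_AUTO and resolve_spindown() are
hypothetical, the SK_RB_* constants are local stand-ins assumed to mirror
the reboot(2) "howto" flags, and the policy "spin down on halt or
power-off, not on a plain reboot" is one possible choice, not something
the patch defines.

```c
/*
 * Local stand-ins for the reboot(2) howto flags.  The values are assumed
 * to mirror <sys/reboot.h>; check the header before relying on them.
 */
#define SK_RB_HALT       0x0008
#define SK_RB_POWEROFF   0x4000

/* Hypothetical bit layout of the proposed sysctl (default value 3). */
#define SPINDOWN_ENABLE  0x1	/* bit 0: spin down (1) or not (0) */
#define SPINDOWN_AUTO    0x2	/* bit 1: automatic (1) or manual (0) */

/*
 * Return the mask that would be in effect at shutdown time.
 * In automatic mode bit 0 is recomputed from the howto flags;
 * in manual mode the administrator's setting is left untouched.
 */
static int
resolve_spindown(int mask, int howto)
{
	if (mask & SPINDOWN_AUTO) {
		if (howto & (SK_RB_HALT | SK_RB_POWEROFF))
			mask |= SPINDOWN_ENABLE;	/* power is likely to go away */
		else
			mask &= ~SPINDOWN_ENABLE;	/* plain reboot: keep spinning */
	}
	return (mask);
}
```

With the default value 3, a plain reboot (howto == 0) would clear bit 0,
while a halt or power-off would keep it set.  A manual setting of 1 or 0
is returned unchanged regardless of the howto flags.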

There are several ways to handle that.  For example,
init(8) could be modified to pass the "howto" value to
rc.shutdown (which could be useful for other purposes, too).
Then a standard rc.d script could handle the spindown sysctl.
The advantage of that solution would be maximum flexibility,
because the actual logic is implemented in an rc.d script.

 > deciding whether freebsd reboots or shuts down cannot be done from a script,
 > since users might use the reboot or halt commands in which case (if i'm not
 > mistaken) all shutdown scripts get skipped.

Right, which is why it is a rather bad idea to use halt(8)
or reboot(8), except in an emergency.  Actually I think the
manpages and handbook should strongly discourage it, and
recommend using shutdown(8) or init(8) instead, both of
which send a signal to PID 1 by default, so rc.shutdown is
executed properly.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht
München, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"If Java had true garbage collection, most programs
would delete themselves upon execution."
        -- Robert Sewell

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 11:59:52 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6576F1065672
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 11:59:52 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 32C408FC1E
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 11:59:52 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id C17A246B8A;
	Thu, 16 Sep 2010 07:59:51 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 6EE038A03C;
	Thu, 16 Sep 2010 07:59:50 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: cronfy <cronfy@gmail.com>
Date: Thu, 16 Sep 2010 07:59:49 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <AANLkTinEje-+1P1n33YMKAaciaYHQH+dpwgX6UY1dOux@mail.gmail.com>
	<201009151509.49728.jhb@freebsd.org>
	<AANLkTincNbk3wdcgRPsoeLLkKkSrFe_xcY6K-KPkGH-7@mail.gmail.com>
In-Reply-To: <AANLkTincNbk3wdcgRPsoeLLkKkSrFe_xcY6K-KPkGH-7@mail.gmail.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201009160759.49179.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Thu, 16 Sep 2010 07:59:50 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: freebsd-hackers@freebsd.org
Subject: Re: is vfs.lookup_shared unsafe in 7.3?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 11:59:52 -0000

On Thursday, September 16, 2010 3:53:47 am cronfy wrote:
> >> Hello,
> >>
> >> Trying to overcome high server load (sudden peaks of 15%us/85%sy, LA >
> >> 40, very slow lstat() at these moments, looks like some kind of lock
> >> contention) I enabled vfs.lookup_shared=1 on two servers today. One is
> >> FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and the other is
> >> FreeBSD-7.3 csup'ed and built Jul 16 2010.
> >>
> >> The server with the newer kernel is running nicely and does not show
> >> high load anymore. But on the second server it did not help. Moreover,
> >> after a few hours of work with vfs.lookup_shared=1 I noticed processes
> >> stuck in the "ufs" state. I tried to kill them with no luck. Disabling
> >> vfs.lookup_shared froze the whole system.
> >>
> >> So, is vfs.lookup_shared=1 unsafe in 7.3? Did it become more stable
> >> between 16 Jul and 9 Sep (is that the reason why the first system is still
> >> running?), or should I expect that it will freeze soon too?
> >>
> >> Thanks in advance!
> >
> > No, 7.3 has a bug that can cause these hangs that is probably made worse by
> > vfs.lookup_shared=1, but can occur even if it is disabled.  You want
> > these fixes applied (in order, one of them reverts part of another):
> 
> Thank you for the fix and for the explanation, that's exactly what I
> wanted to know. Just to be sure: do these patches completely fix the
> bug with hangs (even without vfs.lookup_shared=1)?

Yes.

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 12:38:02 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A128210656A5
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 12:38:02 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 72E6F8FC19
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 12:38:02 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id 0B43B46B5C;
	Thu, 16 Sep 2010 08:38:02 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 323D38A03C;
	Thu, 16 Sep 2010 08:38:01 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Date: Thu, 16 Sep 2010 08:15:18 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <4C915E4F.9030006@adaranet.com>
In-Reply-To: <4C915E4F.9030006@adaranet.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
Message-Id: <201009160815.18679.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Thu, 16 Sep 2010 08:38:01 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: Patrick Mahan <pmahan@adaranet.com>
Subject: Re: odd issues with DDB vs GDB
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 12:38:02 -0000

On Wednesday, September 15, 2010 8:01:19 pm Patrick Mahan wrote:
> All,
> 
> I am trying to debug a system hang occurring on my HP Proliant G6 running some of our
> kernel software.  I am seeing that under certain test loads, the system will hang
> completely: no keyboard, no console, etc.  I suspect it is some of the kernel code that
> I have inherited that contains a lot of locking (lots of data structures, each having
> its own (sleepable) mutex lock).

You need to use 'kgdb' rather than 'gdb' on kernel.debug.

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 12:38:07 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5DA661065670;
	Thu, 16 Sep 2010 12:38:06 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 5F75C8FC0C;
	Thu, 16 Sep 2010 12:38:06 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id 01BE346B7E;
	Thu, 16 Sep 2010 08:38:06 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 85BDF8A04F;
	Thu, 16 Sep 2010 08:38:04 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Date: Thu, 16 Sep 2010 08:22:24 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <20100916010120.GA49997@freebsd.org>
In-Reply-To: <20100916010120.GA49997@freebsd.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
Message-Id: <201009160822.24460.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Thu, 16 Sep 2010 08:38:05 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: Alexander Best <arundel@freebsd.org>
Subject: Re: traling whitespace in CFLAGS if make.conf:CPUTYPE is not
	defined/empty
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 12:38:07 -0000

On Wednesday, September 15, 2010 9:01:20 pm Alexander Best wrote:
> hi there,
> 
> after discovering PR #114082 i noticed that with CPUTYPE not being defined in
> make.conf, `make -VCFLAGS` reports a trailing whitespace for CFLAGS.
> the reason for this is that ${_CPUCFLAGS} gets added to CFLAGS even if it's
> empty.
> 
> the following patch should take care of the problem. i also added the same
> logic to COPTFLAGS. although i wasn't able to trigger the trailing whitespace,
> it should still introduce a cleaner behaviour.

Does the trailing whitespace break anything?  In the past we have had a
non-empty default CPU CFLAGS (e.g. using '-mtune=pentiumpro' on i386 at one
point IIRC) which this change would break.  Unless the trailing whitespace
is causing non-cosmetic problems I'd probably just leave it as it is.

Also, if we were to go with this approach, I would not have changed
kern.pre.mk at all, but set both NO_CPU_CFLAGS and NO_CPU_COPTFLAGS in
bsd.cpu.mk when CPUTYPE was empty.

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 12:41:24 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1233)
	id 305861065698; Thu, 16 Sep 2010 12:41:24 +0000 (UTC)
Date: Thu, 16 Sep 2010 12:41:24 +0000
From: Alexander Best <arundel@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Message-ID: <20100916124124.GA52106@freebsd.org>
References: <20100916010120.GA49997@freebsd.org>
	<201009160822.24460.jhb@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201009160822.24460.jhb@freebsd.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: traling whitespace in CFLAGS if make.conf:CPUTYPE is not
	defined/empty
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 12:41:24 -0000

On Thu Sep 16 10, John Baldwin wrote:
> On Wednesday, September 15, 2010 9:01:20 pm Alexander Best wrote:
> > hi there,
> > 
> > after discovering PR #114082 i noticed that with CPUTYPE not being defined in
> > make.conf, `make -VCFLAGS` reports a trailing whitespace for CFLAGS.
> > the reason for this is that ${_CPUCFLAGS} gets added to CFLAGS even if it's
> > empty.
> > 
> > the following patch should take care of the problem. i also added the same
> > logic to COPTFLAGS. although i wasn't able to trigger the trailing whitespace,
> > it should still introduce a cleaner behaviour.
> 
> Does the trailing whitespace break anything?  In the past we have had a
> non-empty default CPU CFLAGS (e.g. using '-mtune=pentiumpro' on i386 at one
> point IIRC) which this change would break.  Unless the trailing whitespace
> is causing non-cosmetic problems I'd probably just leave it as it is.

the PR claims that a few ports are having problems with trailing whitespace
during ./configure, but personally i haven't experienced any problems.
however i don't use the ports system a lot so i'm not really able to comment on
that.

cheers.
alex

> 
> Also, if we were to go with this approach, I would not have changed
> kern.pre.mk at all, but set both NO_CPU_CFLAGS and NO_CPU_COPTFLAGS in
> bsd.cpu.mk when CPUTYPE was empty.
> 
> -- 
> John Baldwin

-- 
a13x

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 14:10:00 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 148871065672;
	Thu, 16 Sep 2010 14:10:00 +0000 (UTC)
	(envelope-from tijl@coosemans.org)
Received: from mailrelay001.isp.belgacom.be (mailrelay001.isp.belgacom.be
	[195.238.6.51])
	by mx1.freebsd.org (Postfix) with ESMTP id 7B5728FC08;
	Thu, 16 Sep 2010 14:09:59 +0000 (UTC)
X-Belgacom-Dynamic: yes
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Aj4FAIe7kUxbsVM9/2dsb2JhbACUMY1icsFXhUEE
Received: from 61.83-177-91.adsl-dyn.isp.belgacom.be (HELO
	kalimero.tijl.coosemans.org) ([91.177.83.61])
	by relay.skynet.be with ESMTP; 16 Sep 2010 15:40:09 +0200
Received: from kalimero.tijl.coosemans.org (kalimero.tijl.coosemans.org
	[127.0.0.1])
	by kalimero.tijl.coosemans.org (8.14.4/8.14.4) with ESMTP id
	o8GDe8sH004587; Thu, 16 Sep 2010 15:40:08 +0200 (CEST)
	(envelope-from tijl@coosemans.org)
From: Tijl Coosemans <tijl@coosemans.org>
To: freebsd-hackers@freebsd.org
Date: Thu, 16 Sep 2010 15:40:01 +0200
User-Agent: KMail/1.13.5 (FreeBSD/8.1-PRERELEASE; KDE/4.4.5; i386; ; )
References: <201009160841.o8G8f7Q2047725@lurza.secnetix.de>
In-Reply-To: <201009160841.o8G8f7Q2047725@lurza.secnetix.de>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="nextPart10116848.qj6mstj0Su";
	protocol="application/pgp-signature"; micalg=pgp-sha256
Content-Transfer-Encoding: 7bit
Message-Id: <201009161540.08029.tijl@coosemans.org>
Cc: Alexander Best <arundel@freebsd.org>, mav@freebsd.org,
	Oliver Fromme <olli@lurza.secnetix.de>
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 14:10:00 -0000

--nextPart10116848.qj6mstj0Su
Content-Type: Text/Plain;
  charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

On Thursday 16 September 2010 10:41:07 Oliver Fromme wrote:
> Alexander Best wrote:
>> On Wed Sep 15 10, Oliver Fromme wrote:
>>> The patch below will work with the new CAM ATA driver
>>> (i.e. ada(4) disks).  It adds a sysctl, so you can switch
>>> the spin-down off if you're going to just reboot:
>>> # sysctl kern.cam.ada.spindown_shutdown=0
>>
>> the hdd should spindown when a shutdown has been issued and not spindown,
>> if a reboot has been issued.
>
> Right.  That's why my shutdown wrapper script sets the sysctl
> to 0 when the -r option is present (I've got that wrapper
> script for ages, for different reasons).
>
> Also, there are cases where it is completely impossible to
> decide automatically whether the disks should be spun down
> or not.  For example, if the admin issues a shutdown -h
> (halt), there's no way for the OS to know in advance whether
> the admin is going to switch the machine off or reboot to
> multi-user.  So there must be a way for the user to forcibly
> enable/disable the spindown feature.  I think a sysctl is
> the most appropriate way to do that, isn't it?

I would just spin down the disk in case of a halt. An unwanted spin
down is harmless compared to an emergency shutdown and usually the
intention is to power off rather than reboot.

Part of your patch modifies ada_shutdown. That function already gets
the reboot(2) howto flags passed to it, so you could test for
(howto & (RB_HALT | RB_POWEROFF)) != 0 before issuing the STANDBY
command. There's no need to make this more complicated with a sysctl
that can override this in my opinion.

Also command2 should be command1 in this line:

+       if (cgd->ident_data.support.command2 & ATA_SUPPORT_POWERMGT)
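
To illustrate the two points above (test the howto flags, and look for
the power management bit in command1 rather than command2), here is a
small stand-alone C sketch.  It is not the actual ada_shutdown code: the
struct sk_ata_support type and sk_should_standby() are hypothetical, and
the SK_* constants are local stand-ins whose values are assumed to mirror
<sys/reboot.h> and <sys/ata.h>.

```c
#include <stdint.h>

/* Local stand-ins; values are assumptions, verify against the real headers. */
#define SK_RB_HALT               0x0008
#define SK_RB_POWEROFF           0x4000
#define SK_ATA_SUPPORT_POWERMGT  0x0008

/* Hypothetical, cut-down view of the drive's IDENTIFY support words. */
struct sk_ata_support {
	uint16_t command1;	/* word that carries the power management bit */
	uint16_t command2;
};

/*
 * Return nonzero when a STANDBY command would be issued at shutdown:
 * the drive must advertise power management in command1, and the
 * system must be halting or powering off rather than rebooting.
 */
static int
sk_should_standby(const struct sk_ata_support *sup, int howto)
{
	if ((sup->command1 & SK_ATA_SUPPORT_POWERMGT) == 0)
		return (0);	/* no power management feature set */
	return ((howto & (SK_RB_HALT | SK_RB_POWEROFF)) != 0);
}
```

A drive that sets the bit only in command2 is treated as unsupported
here, which is the behaviour the command1 correction is meant to ensure.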

--nextPart10116848.qj6mstj0Su
Content-Type: application/pgp-signature; name=signature.asc 
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)

iF4EABEIAAYFAkySHjcACgkQfoCS2CCgtispsgD+LN0j62if3uUa43YFwYM0CeQv
NPOutTmV6xb7ynDC3JsA/2abG7cabPUjYNCbXzQWwjjvOwSM3eDDS9aq/RA9R0Ov
=tIod
-----END PGP SIGNATURE-----

--nextPart10116848.qj6mstj0Su--

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 14:10:42 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0A7B81065670;
	Thu, 16 Sep 2010 14:10:42 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 764198FC15;
	Thu, 16 Sep 2010 14:10:41 +0000 (UTC)
Received: from lurza.secnetix.de (localhost [127.0.0.1])
	by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id o8GEAMuA029068;
	Thu, 16 Sep 2010 16:10:37 +0200 (CEST)
	(envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
	by lurza.secnetix.de (8.14.3/8.14.3/Submit) id o8GEAM1n029066;
	Thu, 16 Sep 2010 16:10:22 +0200 (CEST) (envelope-from olli)
From: Oliver Fromme <olli@lurza.secnetix.de>
Message-Id: <201009161410.o8GEAM1n029066@lurza.secnetix.de>
To: tijl@coosemans.org (Tijl Coosemans)
Date: Thu, 16 Sep 2010 16:10:22 +0200 (CEST)
In-Reply-To: <201009161540.08029.tijl@coosemans.org>
X-Mailer: ELM [version 2.5 PL8]
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5
	(lurza.secnetix.de [127.0.0.1]);
	Thu, 16 Sep 2010 16:10:37 +0200 (CEST)
Cc: freebsd-hackers@freebsd.org, mav@freebsd.org,
	Alexander Best <arundel@freebsd.org>
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-List-Received-Date: Thu, 16 Sep 2010 14:10:42 -0000


Tijl Coosemans wrote:
 > On Thursday 16 September 2010 10:41:07 Oliver Fromme wrote:
 > > Also, there are cases where it is completely impossible to
 > > decide automatically whether the disks should be spun down
 > > or not.  For example, if the admin issues a shutdown -h
 > > (halt), there's no way for the OS to know in advance whether
 > > the admin is going to switch the machine off or reboot to
 > > multi-user.  So there must be a way for the user to forcibly
 > > enable/disable the spindown feature.  I think a sysctl is
 > > the most appropriate way to do that, isn't it?
 > 
 > I would just spin down the disk in case of a halt. An unwanted spin
 > down is harmless compared to an emergency shutdown and usually the
 > intention is to power off rather than reboot.

Is it?  When I intend to power-off, I use shutdown -p, not
shutdown -h.  Quite often (but not always) when I halt a
machine, I'm going to reboot to multi-user, not power off.

In that case I certainly wouldn't want to spin the drives
down and have them spun up immediately afterwards.  I don't
think that wear&tear caused by that procedure is completely
insignificant (although it's certainly less of a problem
than emergency unloads).

For that reason I definitely want to have a way to disable
the spindown function manually.

 > Part of your patch modifies ada_shutdown. That function already gets
 > the reboot(2) howto flags passed to it, so you could test for
 > (howto & (RB_HALT | RB_POWEROFF)) != 0 before issuing the STANDBY
 > command.

Right, good point.  I didn't notice because the shutdown
function in ad(4) doesn't get the howto flag, so I assumed
(without checking) that ada(4) doesn't get it either.

 > There's no need to make this more complicated with a sysctl
 > that can override this in my opinion.

I'm afraid I have to disagree (see above).  Apart from that,
there's nothing complicated at all about a sysctl.

 > Also command2 should be command1 in this line:
 > 
 > +       if (cgd->ident_data.support.command2 & ATA_SUPPORT_POWERMGT)

Oops ...  You're right.  Thanks for pointing that out.
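
To make the difference concrete, here is a small userland sketch, not
the driver code.  The struct and function names are invented for the
example, and the bit values are what I believe sys/ata.h uses (0x0008
is power management in command1 but APM in command2), so please verify
them against your tree:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two IDENTIFY "supported feature" words. */
struct demo_support {
	uint16_t command1;	/* IDENTIFY word 82 */
	uint16_t command2;	/* IDENTIFY word 83 */
};

/* Assumed bit values, modelled on sys/ata.h; verify before relying on them. */
#define	DEMO_SUPPORT_POWERMGT	0x0008	/* bit 3 of command1 */
#define	DEMO_SUPPORT_APM	0x0008	/* bit 3 of command2 */

/* Correct test: the power management bit lives in command1. */
static bool
demo_can_powermgt(const struct demo_support *s)
{
	return ((s->command1 & DEMO_SUPPORT_POWERMGT) != 0);
}

/* The mistaken test: the same mask applied to command2 really checks APM. */
static bool
demo_can_powermgt_buggy(const struct demo_support *s)
{
	return ((s->command2 & DEMO_SUPPORT_POWERMGT) != 0);
}
```

A drive that reports only the power management feature set passes the
first test and fails the second, which is the case the original patch
line would have missed.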

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht München,
HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"I made up the term 'object-oriented', and I can tell you
I didn't have C++ in mind."
        -- Alan Kay, OOPSLA '97

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 14:57:22 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AF75A1065679;
	Thu, 16 Sep 2010 14:57:22 +0000 (UTC)
	(envelope-from wblock@wonkity.com)
Received: from wonkity.com (wonkity.com [67.158.26.137])
	by mx1.freebsd.org (Postfix) with ESMTP id 4E7EB8FC21;
	Thu, 16 Sep 2010 14:57:22 +0000 (UTC)
Received: from wonkity.com (localhost [127.0.0.1])
	by wonkity.com (8.14.4/8.14.4) with ESMTP id o8GEg04S068467;
	Thu, 16 Sep 2010 08:42:00 -0600 (MDT)
	(envelope-from wblock@wonkity.com)
Received: from localhost (wblock@localhost)
	by wonkity.com (8.14.4/8.14.4/Submit) with ESMTP id o8GEg0Fi068464;
	Thu, 16 Sep 2010 08:42:00 -0600 (MDT)
	(envelope-from wblock@wonkity.com)
Date: Thu, 16 Sep 2010 08:42:00 -0600 (MDT)
From: Warren Block <wblock@wonkity.com>
To: Alexander Best <arundel@FreeBSD.ORG>
In-Reply-To: <20100916004902.GA46401@freebsd.org>
Message-ID: <alpine.BSF.2.00.1009160831560.68314@wonkity.com>
References: <alpine.BSF.2.00.1003051636030.2481@wonkity.com>
	<201009152143.o8FLhE9p022233@lurza.secnetix.de>
	<20100916004902.GA46401@freebsd.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.6
	(wonkity.com [127.0.0.1]); Thu, 16 Sep 2010 08:42:00 -0600 (MDT)
Cc: freebsd-hackers@FreeBSD.ORG, mav@FreeBSD.ORG,
	Oliver Fromme <olli@lurza.secnetix.de>
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-List-Received-Date: Thu, 16 Sep 2010 14:57:22 -0000

On Thu, 16 Sep 2010, Alexander Best wrote:

> On Wed Sep 15 10, Oliver Fromme wrote:
>> Warren Block <wblock@wonkity.com> wrote:
>> > [...]
>> > 8. Alexander Motin has an updated CAM version of the ATA system which
>> > will eventually replace the existing one.  In -CURRENT, anyway.  He was
>> > kind enough to look at my event handler.  My understanding is that he is
>> > looking at implementing the head parking/standby mechanism in that new
>> > code.
>>
>> The patch below will work with the new CAM ATA driver
>> (i.e. ada(4) disks).  It adds a sysctl, so you can switch
>> the spin-down off if you're going to just reboot:
>> # sysctl kern.cam.ada.spindown_shutdown=0
>
> I haven't tested your patch yet, but I don't think the decision whether to
> spin down the HDD should be based merely on the sysctl value.
>
> The HDD should spin down when a shutdown has been issued, and not spin down
> if a reboot has been issued.

It's been a while, but the problem I found when comparing the NetBSD 
code was that there didn't appear to be a way to tell from within the 
FreeBSD driver whether it was a shutdown or reboot.

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 15:06:46 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B3072106567A
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 15:06:46 +0000 (UTC)
	(envelope-from nwhitehorn@freebsd.org)
Received: from agogare.doit.wisc.edu (agogare.doit.wisc.edu [144.92.197.211])
	by mx1.freebsd.org (Postfix) with ESMTP id 7AFD28FC22
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 15:06:46 +0000 (UTC)
MIME-version: 1.0
Content-transfer-encoding: 7BIT
Content-type: text/plain; CHARSET=US-ASCII
Received: from avs-daemon.smtpauth2.wiscmail.wisc.edu by
	smtpauth2.wiscmail.wisc.edu
	(Sun Java(tm) System Messaging Server 7u2-7.05 32bit (built Jul 30
	2009)) id <0L8U00600HZ93V00@smtpauth2.wiscmail.wisc.edu> for
	freebsd-hackers@freebsd.org; Thu, 16 Sep 2010 10:06:45 -0500 (CDT)
Received: from comporellon.tachypleus.net ([unknown] [76.210.68.10])
	by smtpauth2.wiscmail.wisc.edu
	(Sun Java(tm) System Messaging Server 7u2-7.05 32bit (built Jul 30
	2009))
	with ESMTPSA id <0L8U00HVUHZ8TI40@smtpauth2.wiscmail.wisc.edu> for
	freebsd-hackers@freebsd.org; Thu, 16 Sep 2010 10:06:45 -0500 (CDT)
Date: Thu, 16 Sep 2010 10:06:44 -0500
From: Nathan Whitehorn <nwhitehorn@freebsd.org>
In-reply-to: <alpine.BSF.2.00.1009160831560.68314@wonkity.com>
To: freebsd-hackers@freebsd.org
Message-id: <4C923284.20304@freebsd.org>
X-Spam-Report: AuthenticatedSender=yes, SenderIP=76.210.68.10
X-Spam-PmxInfo: Server=avs-9, Version=5.6.0.2009776,
	Antispam-Engine: 2.7.2.376379, Antispam-Data: 2010.9.16.145715,
	SenderIP=76.210.68.10
X-Enigmail-Version: 1.0.1
References: <alpine.BSF.2.00.1003051636030.2481@wonkity.com>
	<201009152143.o8FLhE9p022233@lurza.secnetix.de>
	<20100916004902.GA46401@freebsd.org>
	<alpine.BSF.2.00.1009160831560.68314@wonkity.com>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.12)
	Gecko/20100909 Thunderbird/3.0.7
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-List-Received-Date: Thu, 16 Sep 2010 15:06:46 -0000

On 09/16/10 09:42, Warren Block wrote:
> On Thu, 16 Sep 2010, Alexander Best wrote:
>
>> On Wed Sep 15 10, Oliver Fromme wrote:
>>> Warren Block <wblock@wonkity.com> wrote:
>>> > [...]
>>> > 8. Alexander Motin has an updated CAM version of the ATA system which
>>> > will eventually replace the existing one.  In -CURRENT, anyway.  He was
>>> > kind enough to look at my event handler.  My understanding is that he is
>>> > looking at implementing the head parking/standby mechanism in that new
>>> > code.
>>>
>>> The patch below will work with the new CAM ATA driver
>>> (i.e. ada(4) disks).  It adds a sysctl, so you can switch
>>> the spin-down off if you're going to just reboot:
>>> # sysctl kern.cam.ada.spindown_shutdown=0
>>
>> I haven't tested your patch yet, but I don't think the decision whether to
>> spin down the HDD should be based merely on the sysctl value.
>>
>> The HDD should spin down when a shutdown has been issued, and not spin down
>> if a reboot has been issued.
>
> It's been a while, but the problem I found when comparing the NetBSD
> code was that there didn't appear to be a way to tell from within the
> FreeBSD driver whether it was a shutdown or reboot.

Register a shutdown event handler? The second argument can be tested
against RB_HALT to determine what is happening.
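
Something along these lines, as a userland sketch rather than real
kernel code.  The DEMO_ names and flag values are placeholders (the
kernel takes RB_HALT and RB_POWEROFF from <sys/reboot.h>), and the
variable stands in for the proposed sysctl:

```c
#include <stdbool.h>

/*
 * Illustrative flag values only; in the kernel they come from
 * <sys/reboot.h> as RB_HALT and RB_POWEROFF.
 */
#define	DEMO_RB_HALT		0x0008
#define	DEMO_RB_POWEROFF	0x4000

/* Hypothetical stand-in for a kern.cam.ada.spindown_shutdown style knob. */
static int demo_spindown_shutdown = 1;

/*
 * Decide whether a shutdown handler should issue the standby command.
 * A plain reboot has neither flag set, so the disks keep spinning.
 */
static bool
demo_should_spin_down(int howto)
{
	if (demo_spindown_shutdown == 0)
		return (false);
	return ((howto & (DEMO_RB_HALT | DEMO_RB_POWEROFF)) != 0);
}
```

A handler registered for the shutdown event would call this with its
howto argument and skip the STANDBY command whenever it returns false.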
-Nathan

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 15:42:28 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2DD421065781;
	Thu, 16 Sep 2010 15:42:28 +0000 (UTC)
	(envelope-from tijl@coosemans.org)
Received: from mailrelay004.isp.belgacom.be (mailrelay004.isp.belgacom.be
	[195.238.6.170])
	by mx1.freebsd.org (Postfix) with ESMTP id 35FB58FC16;
	Thu, 16 Sep 2010 15:42:26 +0000 (UTC)
X-Belgacom-Dynamic: yes
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Aj4FADnXkUxbsVM9/2dsb2JhbACUJI1icsJuhUEE
Received: from 61.83-177-91.adsl-dyn.isp.belgacom.be (HELO
	kalimero.tijl.coosemans.org) ([91.177.83.61])
	by relay.skynet.be with ESMTP; 16 Sep 2010 17:42:25 +0200
Received: from kalimero.tijl.coosemans.org (kalimero.tijl.coosemans.org
	[127.0.0.1])
	by kalimero.tijl.coosemans.org (8.14.4/8.14.4) with ESMTP id
	o8GFgOQm005501; Thu, 16 Sep 2010 17:42:24 +0200 (CEST)
	(envelope-from tijl@coosemans.org)
From: Tijl Coosemans <tijl@coosemans.org>
To: freebsd-hackers@freebsd.org
Date: Thu, 16 Sep 2010 17:42:18 +0200
User-Agent: KMail/1.13.5 (FreeBSD/8.1-PRERELEASE; KDE/4.4.5; i386; ; )
References: <201009161410.o8GEAM1n029066@lurza.secnetix.de>
In-Reply-To: <201009161410.o8GEAM1n029066@lurza.secnetix.de>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="nextPart1394869.hKoHA25AgX";
	protocol="application/pgp-signature"; micalg=pgp-sha256
Content-Transfer-Encoding: 7bit
Message-Id: <201009161742.24228.tijl@coosemans.org>
Cc: Alexander Best <arundel@freebsd.org>, mav@freebsd.org,
	Oliver Fromme <olli@lurza.secnetix.de>
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-List-Received-Date: Thu, 16 Sep 2010 15:42:28 -0000

--nextPart1394869.hKoHA25AgX
Content-Type: Text/Plain;
  charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

On Thursday 16 September 2010 16:10:22 Oliver Fromme wrote:
> Tijl Coosemans wrote:
>> I would just spin down the disk in case of a halt. An unwanted spin
>> down is harmless compared to an emergency shutdown and usually the
>> intention is to power off rather than reboot.
>
> Is it?  When I intend to power-off, I use shutdown -p, not
> shutdown -h.  Quite often (but not always) when I halt a
> machine, I'm going to reboot to multi-user, not power off.

Hmm, I suppose support for power off is ubiquitous nowadays. It used to
be that halt meant: bring the system into a state where we can safely cut
the power. In that case it makes sense to let halt spin down the disks.
If you intend to reboot why not explicitly reboot rather than halt?
Also, to go from single to multi user mode you can just exit(1) the
shell.

> In that case I certainly wouldn't want to spin the drives
> down and have them spun up immediately afterwards.  I don't
> think that wear&tear caused by that procedure is completely
> insignificant (although it's certainly less of a problem
> than emergency unloads).
>
> For that reason I definitely want to have a way to disable
> the spindown function manually.

Ok, I'm soft on the sysctl really, it wouldn't hurt anyone. Although,
if the intention is to just override the default behaviour at the time
of shutdown you might as well just add an option to halt(8). A "don't
spin down disks" option would fit in with the other options there.

--nextPart1394869.hKoHA25AgX
Content-Type: application/pgp-signature; name=signature.asc 
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)

iF4EABEIAAYFAkySOt8ACgkQfoCS2CCgtivkLwD/cjNQVg2WjEC0GxsxQBQZZdLW
tGouE291l49ypQZ4DGIA/j9rGCo+idLc+CeGLeYhG7X1ES9Z8d4zSZqwg3Nl5mpp
=XBCi
-----END PGP SIGNATURE-----

--nextPart1394869.hKoHA25AgX--

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 16:19:29 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F1777106566C;
	Thu, 16 Sep 2010 16:19:29 +0000 (UTC)
	(envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (lurza.secnetix.de [IPv6:2a01:170:102f::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 2BAA38FC08;
	Thu, 16 Sep 2010 16:19:28 +0000 (UTC)
Received: from lurza.secnetix.de (localhost [127.0.0.1])
	by lurza.secnetix.de (8.14.3/8.14.3) with ESMTP id o8GGJA1T035380;
	Thu, 16 Sep 2010 18:19:26 +0200 (CEST)
	(envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
	by lurza.secnetix.de (8.14.3/8.14.3/Submit) id o8GGJAmv035378;
	Thu, 16 Sep 2010 18:19:10 +0200 (CEST) (envelope-from olli)
From: Oliver Fromme <olli@lurza.secnetix.de>
Message-Id: <201009161619.o8GGJAmv035378@lurza.secnetix.de>
To: tijl@coosemans.org (Tijl Coosemans)
Date: Thu, 16 Sep 2010 18:19:10 +0200 (CEST)
In-Reply-To: <201009161742.24228.tijl@coosemans.org>
X-Mailer: ELM [version 2.5 PL8]
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.3.5
	(lurza.secnetix.de [127.0.0.1]);
	Thu, 16 Sep 2010 18:19:26 +0200 (CEST)
Cc: freebsd-hackers@freebsd.org, mav@freebsd.org,
	Alexander Best <arundel@freebsd.org>
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-List-Received-Date: Thu, 16 Sep 2010 16:19:30 -0000


Tijl Coosemans wrote:
 > On Thursday 16 September 2010 16:10:22 Oliver Fromme wrote:
 > > Tijl Coosemans wrote:
 > > > I would just spin down the disk in case of a halt. An unwanted spin
 > > > down is harmless compared to an emergency shutdown and usually the
 > > > intention is to power off rather than reboot.
 > > 
 > > Is it?  When I intend to power-off, I use shutdown -p, not
 > > shutdown -h.  Quite often (but not always) when I halt a
 > > machine, I'm going to reboot to multi-user, not power off.
 > 
 > Hmm, I suppose support for power off is ubiquitous nowadays. It used to
 > be that halt meant: bring the system into a state where we can safely cut
 > the power. In that case it makes sense to let halt spin down the disks.
 > If you intend to reboot why not explicitly reboot rather than halt?

For example, I use shutdown -h in order to swap disks that
are not hot-swappable, or other kind of hardware work that
can be done while the machine is switched on.

Of course, in that particular case the disk which is about
to be swapped out should be spun down, while the others
should not.  But that's not a problem because I can use
atacontrol(8) and camcontrol(8) to spin down a specific
disk drive manually.

 > Also, to go from single to multi user mode you can just exit(1) the
 > shell.

Yes, of course, that's a different matter.

I've updated the patch for ada(4).  It includes a bug fix
(command1 vs. command2) and uses the howto flags passed to
the shutdown function.  Thanks again for pointing these out.

Best regards
   Oliver


--- ata_da.c.orig	2010-05-23 18:16:33.000000000 +0200
+++ ata_da.c	2010-09-16 17:21:10.000000000 +0200
@@ -42,6 +42,7 @@
 #include <sys/eventhandler.h>
 #include <sys/malloc.h>
 #include <sys/cons.h>
+#include <sys/reboot.h>
 #include <geom/geom_disk.h>
 #endif /* _KERNEL */
 
@@ -79,7 +80,8 @@
 	ADA_FLAG_CAN_TRIM	= 0x080,
 	ADA_FLAG_OPEN		= 0x100,
 	ADA_FLAG_SCTX_INIT	= 0x200,
-	ADA_FLAG_CAN_CFA        = 0x400
+	ADA_FLAG_CAN_CFA        = 0x400,
+	ADA_FLAG_CAN_POWERMGT   = 0x800
 } ada_flags;
 
 typedef enum {
@@ -180,6 +182,10 @@
 #define	ADA_DEFAULT_SEND_ORDERED	1
 #endif
 
+#ifndef	ADA_DEFAULT_SPINDOWN_SHUTDOWN
+#define	ADA_DEFAULT_SPINDOWN_SHUTDOWN	1
+#endif
+
 /*
  * Most platforms map firmware geometry to actual, but some don't.  If
  * not overridden, default to nothing.
@@ -191,6 +197,7 @@
 static int ada_retry_count = ADA_DEFAULT_RETRY;
 static int ada_default_timeout = ADA_DEFAULT_TIMEOUT;
 static int ada_send_ordered = ADA_DEFAULT_SEND_ORDERED;
+static int ada_spindown_shutdown = ADA_DEFAULT_SPINDOWN_SHUTDOWN;
 
 SYSCTL_NODE(_kern_cam, OID_AUTO, ada, CTLFLAG_RD, 0,
             "CAM Direct Access Disk driver");
@@ -203,6 +210,9 @@
 SYSCTL_INT(_kern_cam_ada, OID_AUTO, ada_send_ordered, CTLFLAG_RW,
            &ada_send_ordered, 0, "Send Ordered Tags");
 TUNABLE_INT("kern.cam.ada.ada_send_ordered", &ada_send_ordered);
+SYSCTL_INT(_kern_cam_ada, OID_AUTO, spindown_shutdown, CTLFLAG_RW,
+           &ada_spindown_shutdown, 0, "Spin down upon shutdown");
+TUNABLE_INT("kern.cam.ada.spindown_shutdown", &ada_spindown_shutdown);
 
 /*
  * ADA_ORDEREDTAG_INTERVAL determines how often, relative
@@ -665,6 +675,8 @@
 		softc->flags |= ADA_FLAG_CAN_48BIT;
 	if (cgd->ident_data.support.command2 & ATA_SUPPORT_FLUSHCACHE)
 		softc->flags |= ADA_FLAG_CAN_FLUSHCACHE;
+	if (cgd->ident_data.support.command1 & ATA_SUPPORT_POWERMGT)
+		softc->flags |= ADA_FLAG_CAN_POWERMGT;
 	if (cgd->ident_data.satacapabilities & ATA_SUPPORT_NCQ &&
 	    cgd->inq_flags & SID_CmdQue)
 		softc->flags |= ADA_FLAG_CAN_NCQ;
@@ -1222,6 +1234,58 @@
 					 /*getcount_only*/0);
 		cam_periph_unlock(periph);
 	}
+
+	if (ada_spindown_shutdown == 0 ||
+	    (howto & (RB_HALT | RB_POWEROFF)) == 0)
+		return;
+
+	DELAY(500000);
+
+	TAILQ_FOREACH(periph, &adadriver.units, unit_links) {
+		union ccb ccb;
+
+		/* If we panicked with the lock held, do not recurse here. */
+		if (cam_periph_owned(periph))
+			continue;
+		cam_periph_lock(periph);
+		softc = (struct ada_softc *)periph->softc;
+		/*
+		 * We only spin down the drive if it is capable of it.
+		 */
+		if ((softc->flags & ADA_FLAG_CAN_POWERMGT) == 0) {
+			cam_periph_unlock(periph);
+			continue;
+		}
+
+		/* XXX Hide this behind bootverbose? */
+		xpt_print(periph->path, "spin-down\n");
+
+		xpt_setup_ccb(&ccb.ccb_h, periph->path, CAM_PRIORITY_NORMAL);
+
+		ccb.ccb_h.ccb_state = ADA_CCB_DUMP;
+		cam_fill_ataio(&ccb.ataio,
+				    1,
+				    adadone,
+				    CAM_DIR_NONE,
+				    0,
+				    NULL,
+				    0,
+				    ada_default_timeout*1000);
+
+		ata_28bit_cmd(&ccb.ataio, ATA_STANDBY_IMMEDIATE, 0, 0, 0);
+		xpt_polled_action(&ccb);
+
+		if ((ccb.ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP)
+			xpt_print(periph->path, "Spin-down disk failed\n");
+
+		if ((ccb.ccb_h.status & CAM_DEV_QFRZN) != 0)
+			cam_release_devq(ccb.ccb_h.path,
+					 /*relsim_flags*/0,
+					 /*reduction*/0,
+					 /*timeout*/0,
+					 /*getcount_only*/0);
+		cam_periph_unlock(periph);
+	}
 }
 
 #endif /* _KERNEL */



-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht München,
HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"I have stopped reading Stephen King novels.
Now I just read C code instead."
        -- Richard A. O'Keefe

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 17:32:28 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AC3FE106567A
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 17:32:28 +0000 (UTC)
	(envelope-from PMahan@adaranet.com)
Received: from barracuda.adaranet.com (smtp.adaranet.com [72.5.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id 7C3618FC18
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 17:32:28 +0000 (UTC)
X-ASG-Debug-ID: 1284657190-506121d90001-P5m3U7
Received: from SJ-EXCH-1.adaranet.com ([10.10.1.29]) by barracuda.adaranet.com
	with ESMTP id uITb2gqaTP51pqht; Thu, 16 Sep 2010 10:13:10 -0700 (PDT)
X-Barracuda-Envelope-From: PMahan@adaranet.com
Received: from mycroft.adaranet.com (10.10.24.100) by SJ-EXCH-1.adaranet.com
	(10.10.1.29) with Microsoft SMTP Server (TLS) id 8.1.240.5;
	Thu, 16 Sep 2010 10:13:10 -0700
Message-ID: <4C925133.4060309@adaranet.com>
X-Barracuda-BBL-IP: nil
Date: Thu, 16 Sep 2010 10:17:39 -0700
From: Patrick Mahan <pmahan@adaranet.com>
User-Agent: Thunderbird 2.0.0.23 (X11/20091021)
MIME-Version: 1.0
To: John Baldwin <jhb@freebsd.org>
X-ASG-Orig-Subj: Re: odd issues with DDB vs GDB
References: <4C915E4F.9030006@adaranet.com>
	<201009160815.18679.jhb@freebsd.org>
In-Reply-To: <201009160815.18679.jhb@freebsd.org>
Content-Type: text/plain; charset="iso-8859-15"; format=flowed
Content-Transfer-Encoding: 7bit
X-Barracuda-Connect: UNKNOWN[10.10.1.29]
X-Barracuda-Start-Time: 1284657190
X-Barracuda-URL: http://172.16.10.203:8000/cgi-mod/mark.cgi
X-Virus-Scanned: by bsmtpd at adaranet.com
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: odd issues with DDB vs GDB
X-List-Received-Date: Thu, 16 Sep 2010 17:32:28 -0000



John Baldwin wrote:
> On Wednesday, September 15, 2010 8:01:19 pm Patrick Mahan wrote:
>> All,
>>
>> I am trying to debug a system hang occurring on my HP Proliant G6 running some of our
>> kernel software.  I am seeing that under certain test loads, the system will hang
>> completely: no keyboard, no console, etc.  I suspect it is some of the kernel code that
>> I have inherited that contains a lot of locking (lots of data structures, each having
>> its own (sleepable) mutex lock).
> 
> You need to use 'kgdb' rather than 'gdb' on kernel.debug.
>

Doh! *-(

I'm so used to gdb even though I use kgdb for looking at crash dumps.

Thanks,

Patrick

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 17:33:09 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 52D7F1065673
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 17:33:09 +0000 (UTC)
	(envelope-from simon@comsys.ntu-kpi.kiev.ua)
Received: from comsys.kpi.ua (comsys.kpi.ua [77.47.192.42])
	by mx1.freebsd.org (Postfix) with ESMTP id CC2858FC16
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 17:33:08 +0000 (UTC)
Received: from pm513-1.comsys.kpi.ua ([10.18.52.101]
	helo=pm513-1.comsys.ntu-kpi.kiev.ua)
	by comsys.kpi.ua with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.63)
	(envelope-from <simon@comsys.ntu-kpi.kiev.ua>)
	id 1OwIKZ-00052m-8B; Thu, 16 Sep 2010 20:33:07 +0300
Received: by pm513-1.comsys.ntu-kpi.kiev.ua (Postfix, from userid 1001)
	id 9A4B81CC1E; Thu, 16 Sep 2010 20:33:07 +0300 (EEST)
Date: Thu, 16 Sep 2010 20:33:07 +0300
From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: Matthew Fleming <mdf356@gmail.com>
Message-ID: <20100916173307.GA1994@pm513-1.comsys.ntu-kpi.kiev.ua>
References: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
	<AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf@mail.gmail.com>
User-Agent: Mutt/1.5.20 (2009-06-14)
X-Authenticated-User: simon@comsys.ntu-kpi.kiev.ua
X-Authenticator: plain
X-Sender-Verify: SUCCEEDED (sender exists & accepts mail)
X-Exim-Version: 4.63 (build at 06-Jan-2007 23:14:37)
X-Date: 2010-09-16 20:33:07
X-Connected-IP: 10.18.52.101:59547
X-Message-Linecount: 121
X-Body-Linecount: 105
X-Message-Size: 6165
X-Body-Size: 5370
Cc: freebsd-hackers@freebsd.org
Subject: Re: Questions about mutex implementation in kern/kern_mutex.c
X-List-Received-Date: Thu, 16 Sep 2010 17:33:09 -0000

On Wed, Sep 15, 2010 at 08:46:00AM -0700, Matthew Fleming wrote:
> I'll take a stab at answering these...
> 
> On Wed, Sep 15, 2010 at 6:44 AM, Andrey Simonenko
> <simon@comsys.ntu-kpi.kiev.ua> wrote:
> > Hello,
> >
> > I have questions about mutex implementation in kern/kern_mutex.c
> > and sys/mutex.h files (current versions of these files):
> >
> > 1. Is the following statement correct for a volatile pointer or integer
> >  variable: if a volatile variable is updated by the compare-and-set
> >  instruction (e.g. atomic_cmpset_ptr(&val, ...)), then the current
> >  value of such variable can be read without any special instruction
> >  (e.g. v = val)?
> >
> >  I checked Assembler code for a function with "v = val" and "val = v"
> >  like statements generated for volatile variable and simple variable
> >  and found differences: on ia64 "v = val" was implemented by ld.acq and
> >  "val = v" was implemented by st.rel; on mips and sparc64 Assembler code
> >  can have different order of lines for volatile and simple variable
> >  (depends on the code of a function).
> 
> I think this depends somewhat on the hardware and what you mean by
> "current" value.

"Current" value means that the value of a variable read by one thread
is equal to the value most recently stored into this variable by another
thread with a successful compare-and-set instruction.  As I understand from
the kernel source code, atomic_cmpset_ptr() updates a variable in such a way
that all other CPUs invalidate the cache lines that contain the value of
this variable.

The mtx_owned(9) macro relies on this property: mtx_owned() does not use
anything special to compare the value of m->mtx_lock (volatile) with the
current thread pointer, whereas all other functions that update m->mtx_lock
of an unowned mutex use compare-and-set.  Also I cannot find anything special in
generated Assembler code for volatile variables (except for ia64 where
acquire loads and release stores are used).
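
A minimal sketch of that property, assuming C11 atomics in place of the
kernel's atomic(9) functions.  The demo_ names are hypothetical, and the
flag bits that the real mtx_lock word carries are ignored:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define	DEMO_UNOWNED	((uintptr_t)0)

/* Hypothetical lock: the word holds the owner's id, or DEMO_UNOWNED. */
struct demo_mtx {
	_Atomic uintptr_t lock_word;
};

/* Take the lock only if it is unowned; taking ownership always uses CAS. */
static bool
demo_mtx_trylock(struct demo_mtx *m, uintptr_t self)
{
	uintptr_t expected = DEMO_UNOWNED;

	return (atomic_compare_exchange_strong_explicit(&m->lock_word,
	    &expected, self, memory_order_acquire, memory_order_relaxed));
}

/* Release: publish the protected data, then mark the word unowned. */
static void
demo_mtx_unlock(struct demo_mtx *m)
{
	atomic_store_explicit(&m->lock_word, DEMO_UNOWNED,
	    memory_order_release);
}

/*
 * "Do I own it?" needs only a plain load: the word can equal self
 * only if this thread stored self there, and a thread always sees
 * its own most recent store.
 */
static bool
demo_mtx_owned(struct demo_mtx *m, uintptr_t self)
{
	return (atomic_load_explicit(&m->lock_word,
	    memory_order_relaxed) == self);
}
```

If my reading is right, this is why the ownership test can get away
with an ordinary read while acquiring the lock cannot.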

> 
> If you want a value that is not in-flux, then something like
> atomic_cmpset_ptr() setting to the current value is needed, so that
> you force any other atomic_cmpset to fail.  However, since there is no
> explicit lock involved, there is no strong meaning for "current" value
> and a read that does not rely on a value cached in a register is
> likely sufficient.  While the "volatile" keyword in C has no explicit
> hardware meaning, it often means that a load from memory (or,
> presumably, L1-L3 cache) is required.

The "volatile" keyword here and all questions are related to the base C
compiler, current version and currently supported architectures in FreeBSD.
Yes, by "volatile" I mean that the value of a variable is not cached in a
register and is referenced by its address in all instructions.

There are some places in the kernel where a variable is updated in
something like "do { v = value; } while (!atomic_cmpset_int(&value, ...));"
and that variable is not "volatile", but the compiler generates correct
Assembler code.  So "volatile" is not a requirement for all cases.

> 
> > 2. Let there is a default (sleep) mutex and adaptive mutexes is enabled.
> >  A thread tries to obtain lock quickly and fails, _mtx_lock_sleep()
> >  is called, it gets the address of the current mutex's owner thread
> >  and checks whether that owner thread is running (on another CPU).
> >  How does _mtx_lock_sleep() know that that thread still exists
> >  (lines 311-337 in kern_mutex.c)?
> >
> >  When adaptive mutexes was implemented there was explicit locking
> >  around adaptive mutexes code. When turnstile in mutex code was
> >  implemented that's locking logic was changed.
> 
> It appears that it's possible for the thread pointer to be recycled
> between fetching the value of owner and looking at TD_IS_RUNNING.  On
> actual hardware, this race is unlikely to occur due to the time it
> takes for a thread to release a lock and perform all of thread exit
> code before the struct thread is returned to the uma zone.  However,
> even once returned to the uma zone on many FreeBSD implementations the
> access is safe as the address of the thread is still dereferenceable,
> due to the implementation of uma zones.

I checked exactly this scenario; that's why I asked this question, to
verify my understanding.

> 
> > 3. Why there is no any memory barrier in mtx_init()? If another thread
> >  (on another CPU) finds that mutex is initialized using mtx_initialized()
> >  then it can mtx_lock() it and mtx_lock() it second time, as a result
> >  mtx_recurse field will be increased, but its value still can be
> >  uninitialized on architecture with relaxed memory ordering model.
> 
> It seems to me that it's generally a programming error to rely on the
> return of mtx_initialized(), as there is no serialization with e.g. a
> thread calling mtx_destroy().  A fully correct serialization model
> would require that a single thread initialize the mtx and then create
> any worker threads that will use the mtx.

I agree that this should not happen in practice.  Another thread can get
a pointer to a just-initialized mutex and begin to work with it, so
mtx_initialized() is not a requirement.  I just want to say that when
mtx_init() has finished, it does not mean that a mutex just initialized by
one thread is ready to be used by another thread.
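
A minimal user-space sketch of this point, assuming C11 <stdatomic.h> in
place of the kernel's atomic(9) operations.  The names struct obj,
obj_init(), obj_publish() and obj_lookup() are hypothetical and are not
kernel APIs.  It only illustrates that plain initializing stores need a
release/acquire pair around the hand-off of the pointer before another
thread can rely on them:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object standing in for a structure that embeds a mutex. */
struct obj {
	int recurse;	/* plays the role of mtx_recurse */
	int ready;	/* set by obj_init() */
};

/* Shared pointer through which other threads discover the object. */
static _Atomic(struct obj *) published;

/* Plain stores only: nothing here orders them against later stores. */
static void
obj_init(struct obj *o)
{
	o->recurse = 0;
	o->ready = 1;
}

/* Release store: the stores made by obj_init() are ordered before it. */
static void
obj_publish(struct obj *o)
{
	atomic_store_explicit(&published, o, memory_order_release);
}

/* Acquire load: pairs with the release store in obj_publish(). */
static struct obj *
obj_lookup(void)
{
	return (atomic_load_explicit(&published, memory_order_acquire));
}
```

A consumer that obtains a non-NULL pointer from obj_lookup() may then read
the fields written by obj_init(); with a plain pointer store and load there
would be no such guarantee on a relaxed memory model.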

Thank you for the answers.

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 18:02:36 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5A78D1065672
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 18:02:36 +0000 (UTC)
	(envelope-from rwmaillists@googlemail.com)
Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com
	[74.125.82.182])
	by mx1.freebsd.org (Postfix) with ESMTP id DC9ED8FC17
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 18:02:35 +0000 (UTC)
Received: by wyb33 with SMTP id 33so2135362wyb.13
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 11:02:34 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=googlemail.com; s=gamma;
	h=domainkey-signature:received:received:date:from:to:subject
	:message-id:in-reply-to:references:x-mailer:mime-version
	:content-type:content-transfer-encoding;
	bh=KabAr2bqs7bh6gCNdZI1LiSGjqw1JKfmkzEmIKaVSBY=;
	b=Z1LuLy/zTiK2km6AA6WTrzC5LAUAYZEK6QVtsMeziuCDOJav8j6UYnxnRhb1nSf4Q1
	8FX5PjhUDAnNJHS7TGNnZ0gzgHKoi6yUE3VkJpGu8IaJrz7JVSVjr0PFwI4D/pMdoMF3
	x5ldIo1NM5oDkLiUiFgouATiKAfN/mUSa1ptc=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma;
	h=date:from:to:subject:message-id:in-reply-to:references:x-mailer
	:mime-version:content-type:content-transfer-encoding;
	b=BYx1Iq1QUR9JFR4J/vFEWFNnfVnJhgl50YvjVBR+vDmTA82XgpFQuQs3ZSNhFNgM5g
	boBartZaCkP02iCqpRfEjHAnWLFWmGcj8m7olznk1wG8+8+b5TVDWSeh/TRk2PgG5Htc
	k5ohRKQBactYA3efSvA0HaXQkbgzKf4OoKJOQ=
Received: by 10.216.21.204 with SMTP id r54mr3001019wer.95.1284658266706;
	Thu, 16 Sep 2010 10:31:06 -0700 (PDT)
Received: from gumby.homeunix.com (bb-87-81-140-128.ukonline.co.uk
	[87.81.140.128])
	by mx.google.com with ESMTPS id p82sm2001464weq.3.2010.09.16.10.31.05
	(version=SSLv3 cipher=RC4-MD5); Thu, 16 Sep 2010 10:31:05 -0700 (PDT)
Date: Thu, 16 Sep 2010 18:31:03 +0100
From: RW <rwmaillists@googlemail.com>
To: freebsd-hackers@freebsd.org
Message-ID: <20100916183103.20c70a5a@gumby.homeunix.com>
In-Reply-To: <86mxri17j3.fsf@ds4.des.no>
References: <alpine.BSF.2.00.1003051636030.2481@wonkity.com>
	<201009152143.o8FLhE9p022233@lurza.secnetix.de>
	<20100916004902.GA46401@freebsd.org>
	<AANLkTik9dCT60KDn5gVAsLi8-LRD5KFK1kKJO_9j=x-Z@mail.gmail.com>
	<86mxri17j3.fsf@ds4.des.no>
X-Mailer: Claws Mail 3.7.6 (GTK+ 2.20.1; i386-portbld-freebsd8.1)
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 18:02:36 -0000

On Thu, 16 Sep 2010 09:17:52 +0200
Dag-Erling Sm=F8rgrav <des@des.no> wrote:

> Garrett Cooper <gcooper@FreeBSD.org> writes:
> > Agreed. Spinning down at reboot isn't smart and seems like a good
> > way to kill a disk quicker.
>=20
> *not* spinning down at halt is far worse.  Most modern disks are rated
> for hundreds of thousands of load-unload cycles, but far fewer
> emergency unloads (which is what happens when the drive loses power
> while still spinning).

As I understand it, wear from spinning down used to come from the head
actually scraping the disk surface as it lost lift; parking placed the
head on a disposable area, but modern drives take the head off the disk
altogether.

When Hitachi was specifying 300,000 unloads, they said that in testing
the drives were still working at 1,000,000; someone quoted 600,000 as
the current spec. At these levels you can be spinning the drives
down and up every few minutes for the normal lifetime of the drive.

Even on very old drives I doubt reboots are much of a problem; they're
rare on servers. On laptops and desktops they're rare compared to
shutdowns and suspends. =20

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 18:16:08 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B14BC1065674
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 18:16:08 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 7EFE88FC1C
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 18:16:08 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id D323646B52;
	Thu, 16 Sep 2010 14:16:07 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 8A9128A03C;
	Thu, 16 Sep 2010 14:16:06 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
Date: Thu, 16 Sep 2010 14:16:05 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
	<AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf@mail.gmail.com>
	<20100916173307.GA1994@pm513-1.comsys.ntu-kpi.kiev.ua>
In-Reply-To: <20100916173307.GA1994@pm513-1.comsys.ntu-kpi.kiev.ua>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201009161416.05759.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Thu, 16 Sep 2010 14:16:06 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: freebsd-hackers@freebsd.org, Matthew Fleming <mdf356@gmail.com>
Subject: Re: Questions about mutex implementation in kern/kern_mutex.c
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 18:16:08 -0000

On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
> On Wed, Sep 15, 2010 at 08:46:00AM -0700, Matthew Fleming wrote:
> > I'll take a stab at answering these...
> > 
> > On Wed, Sep 15, 2010 at 6:44 AM, Andrey Simonenko
> > <simon@comsys.ntu-kpi.kiev.ua> wrote:
> > > Hello,
> > >
> > > I have questions about mutex implementation in kern/kern_mutex.c
> > > and sys/mutex.h files (current versions of these files):
> > >
> > > 1. Is the following statement correct for a volatile pointer or integer
> > >  variable: if a volatile variable is updated by the compare-and-set
> > >  instruction (e.g. atomic_cmpset_ptr(&val, ...)), then the current
> > >  value of such variable can be read without any special instruction
> > >  (e.g. v = val)?
> > >
> > >  I checked Assembler code for a function with "v = val" and "val = v"
> > >  like statements generated for volatile variable and simple variable
> > >  and found differences: on ia64 "v = val" was implemented by ld.acq and
> > >  "val = v" was implemented by st.rel; on mips and sparc64 Assembler code
> > >  can have different order of lines for volatile and simple variable
> > >  (depends on the code of a function).
> > 
> > I think this depends somewhat on the hardware and what you mean by
> > "current" value.
> 
> "Current" value means that the value of a variable read by one thread
> is equal to the value of this variable successfully updated by another
> thread by the compare-and-set instruction.  As I understand from the kernel
> source code, atomic_cmpset_ptr() allows to update a variable in a way that
> all other CPUs will invalidate corresponding cache lines that contain
> the value of this variable.

That is not true.  It is likely true on x86, but it is certainly not true on
other architectures such as sparc64 where a write may be held in a store 
buffer for an indeterminate amount of time (and note that some lock releases 
are simple stores with a "rel" memory barrier).  All that we require is that 
if the value is stale, the atomic_cmpset() that attempts to set MTX_CONTESTED 
will fail.

> The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
> special to compare the value of m->mtx_lock (volatile) with current thread
> pointer, all other functions that update m->mtx_lock of unowned mutex use
> compare-and-set instruction.  Also I cannot find anything special in
> generated Assembler code for volatile variables (except for ia64 where
> acquire loads and release stores are used).

No, mtx_owned() is just not harmed by the races it loses.  You can certainly 
read a stale value of mtx_lock in mtx_owned() if some other thread owns the 
lock or has just released the lock.  However, we don't care, because in both 
of those cases, mtx_owned() returns false.  What does matter is that 
mtx_owned() can only return true if we currently hold the mutex.  This works 
because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the 
same time, and 2) even CPUs that hold writes in store buffers will snoop their 
store buffer for local reads on that CPU.  That is, a given CPU will never 
read a stale value of a memory word that is "older" than a write it has 
performed to that word.
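
A minimal user-space sketch of that argument, assuming C11 <stdatomic.h>
rather than the kernel's own primitives.  struct lk, lk_try_lock(),
lk_unlock() and lk_owned() are hypothetical names; this is not the
kern_mutex.c code, only an illustration of why an unsynchronized ownership
check is safe for the calling thread:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define LK_UNOWNED	((uintptr_t)0)	/* hypothetical "no owner" cookie */

/* Hypothetical lock: the word holds the owner's id, or LK_UNOWNED. */
struct lk {
	_Atomic uintptr_t word;
};

/* Acquire only through compare-and-set; fails if someone owns the lock. */
static int
lk_try_lock(struct lk *l, uintptr_t self)
{
	uintptr_t expect = LK_UNOWNED;

	return (atomic_compare_exchange_strong_explicit(&l->word, &expect,
	    self, memory_order_acquire, memory_order_relaxed));
}

/* Release is a simple store with release ordering. */
static void
lk_unlock(struct lk *l)
{
	atomic_store_explicit(&l->word, LK_UNOWNED, memory_order_release);
}

/*
 * Relaxed read, no barrier.  A stale value can only be LK_UNOWNED or
 * another thread's id; both compare unequal to self, so a stale read
 * cannot produce a false "owned" answer.
 */
static int
lk_owned(struct lk *l, uintptr_t self)
{
	return (atomic_load_explicit(&l->word, memory_order_relaxed) == self);
}
```

The only way lk_owned() sees "self" is if this thread stored it, and a
thread always observes its own most recent store to a word.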

> > If you want a value that is not in-flux, then something like
> > atomic_cmpset_ptr() setting to the current value is needed, so that
> > you force any other atomic_cmpset to fail.  However, since there is no
> > explicit lock involved, there is no strong meaning for "current" value
> > and a read that does not rely on a value cached in a register is
> > likely sufficient.  While the "volatile" keyword in C has no explicit
> > hardware meaning, it often means that a load from memory (or,
> > presumably, L1-L3 cache) is required.
> 
> The "volatile" keyword here and all questions are related to the base C
> compiler, current version and currently supported architectures in FreeBSD.
> Yes, here under "volatile" I want to say that the value of a variable is
> not cached in a register and it is referenced by its address in all
> commands.
> 
> There are some places in the kernel where a variable is updated in
> something like "do { v = value; } while (!atomic_cmpset_int(&value, ...));"
> and that variable is not "volatile", but the compiler generates correct
> Assembler code.  So "volatile" is not a requirement for all cases.

Hmm, I suspect that many of those places actually do use volatile.  The 
various lock cookies (mtx_lock, etc.) are declared volatile in the structure.  
Otherwise the compiler would be free to conclude that 'v = value;' is a loop 
invariant and move it out of the loop, which would break the retry.  Given 
that, the construct you referred to does in fact require 'value' to be 
volatile.
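
A minimal user-space sketch of such a retry loop, assuming C11
<stdatomic.h> in place of a volatile field and atomic_cmpset_int().  The
names value and value_add() are hypothetical.  The point it illustrates is
that the load must be repeated on every iteration:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared counter updated without a lock. */
static _Atomic unsigned int value;

/*
 * Add "delta" with a compare-and-set retry loop.  The load sits inside
 * the loop and goes through an atomic object, so the compiler has to
 * perform it on every iteration instead of hoisting it.
 */
static unsigned int
value_add(unsigned int delta)
{
	unsigned int v;

	do {
		v = atomic_load_explicit(&value, memory_order_relaxed);
	} while (!atomic_compare_exchange_weak_explicit(&value, &v,
	    v + delta, memory_order_acq_rel, memory_order_relaxed));
	return (v + delta);
}
```

If the load were a plain read of a non-volatile, non-atomic variable, a
compiler that hoisted it would retry forever with the same stale 'v' after
the first failed compare-and-set.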

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 19:00:37 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CEDB31065670
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 19:00:37 +0000 (UTC)
	(envelope-from mj@feral.com)
Received: from ns1.feral.com (ns1.feral.com [192.67.166.1])
	by mx1.freebsd.org (Postfix) with ESMTP id 9DAAF8FC12
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 19:00:37 +0000 (UTC)
Received: from [192.168.221.2] (remotevpn [192.168.221.2])
	by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o8GJ0ap9029969
	(version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO)
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 12:00:36 -0700 (PDT)
	(envelope-from mj@feral.com)
Message-ID: <4C92694D.1070705@feral.com>
Date: Thu, 16 Sep 2010 12:00:29 -0700
From: Matthew Jacob <mj@feral.com>
Organization: Feral Software
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
	rv:1.9.2.9) Gecko/20100825 Thunderbird/3.1.3
MIME-Version: 1.0
To: freebsd-hackers@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.6
	(ns1.feral.com [192.168.221.1]);
	Thu, 16 Sep 2010 12:00:37 -0700 (PDT)
Subject: race conditions for destroying and opening a dev
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 19:00:37 -0000


Has anyone seen this scenario before? I am seeing it in RELENG_7, but 
the code in question exists through to head.

Thread 1:

(kgdb) where
#0  sched_switch (td=0xffffff003a04ea80, newtd=0xffffff00210b4000, 
flags=Variable "flags" is not available.
) at ../../../kern/sched_ule.c:1944
#1  0xffffffff803b6091 in mi_switch (flags=1, newtd=0x0) at 
../../../kern/kern_synch.c:450
#2  0xffffffff80402399 in sleepq_switch (wchan=0xffffff8413b50b60) at 
../../../kern/subr_sleepqueue.c:497
#3  0xffffffff80402e8c in sleepq_timedwait (wchan=0xffffff8413b50b60) at 
../../../kern/subr_sleepqueue.c:615
#4  0xffffffff803b682d in _sleep (ident=0xffffff8413b50b60, 
lock=0xffffffff80b0ee00, priority=76, wmesg=0xffffffff806583bb "devdrn", 
timo=100) at ../../../kern/kern_synch.c:228
#5  0xffffffff8037640c in destroy_devl (dev=0xffffff003aaf0000) at 
../../../kern/kern_conf.c:874
#6  0xffffffff80376759 in destroy_dev (dev=0xffffff003aaf0000) at 
../../../kern/kern_conf.c:916
#7  0xffffffff8034c939 in g_dev_orphan (cp=0xffffff003a544800) at 
../../../geom/geom_dev.c:438
#8  0xffffffff803506a0 in g_run_events () at ../../../geom/geom_event.c:164
#9  0xffffffff80351f1c in g_event_procbody () at 
../../../geom/geom_kern.c:141
#10 0xffffffff8038a73a in fork_exit (callout=0xffffffff80351eb0 
<g_event_procbody at ../../../geom/geom_kern.c:132>, arg=0x0, 
frame=0xffffff8413b50c80) at ../../../kern/kern_fork.c:829
#11 0xffffffff805a747e in fork_trampoline () at 
../../../amd64/amd64/exception.S:564
#12 0x0000000000000000 in ?? ()

This thread is waiting for the threadcount to drop to zero, i.e., for the 
last close of the device ("da16" in this case) to occur.

Thread 2:

(kgdb) where
#0  sched_switch (td=0xffffff009bb4ca80, newtd=0xffffff003af43380, 
flags=Variable "flags" is not available.
) at ../../../kern/sched_ule.c:1944
#1  0xffffffff803b6091 in mi_switch (flags=1, newtd=0x0) at 
../../../kern/kern_synch.c:450
#2  0xffffffff80402399 in sleepq_switch (wchan=0xffffffff80b0e040) at 
../../../kern/subr_sleepqueue.c:497
#3  0xffffffff80402f84 in sleepq_wait (wchan=0xffffffff80b0e040) at 
../../../kern/subr_sleepqueue.c:580
#4  0xffffffff803b5385 in _sx_xlock_hard (sx=0xffffffff80b0e040, 
tid=18446742976810240640, opts=Variable "opts" is not available.
) at ../../../kern/kern_sx.c:562
#5  0xffffffff803b5731 in _sx_xlock (sx=0xffffffff80b0e040, opts=0, 
file=0xffffffff80652d27 "../../../geom/geom_dev.c", line=196) at sx.h:154
#6  0xffffffff8034d1bc in g_dev_open (dev=0xffffff003aaf0000, flags=1, 
fmt=Variable "fmt" is not available.
) at ../../../geom/geom_dev.c:196
#7  0xffffffff80333741 in devfs_open (ap=0xffffff841dea88b0) at 
../../../fs/devfs/devfs_vnops.c:902
#8  0xffffffff80601daf in VOP_OPEN_APV (vop=0xffffffff8089fb80, 
a=0xffffff841dea88b0) at vnode_if.c:371
#9  0xffffffff80467246 in vn_open_cred (ndp=0xffffff841dea8a00, 
flagp=0xffffff841dea894c, cmode=Variable "cmode" is not available.
) at vnode_if.h:199
#10 0xffffffff80463770 in kern_open (td=0xffffff009bb4ca80, 
path=0x5114a0 <Address 0x5114a0 out of bounds>, pathseg=Variable 
"pathseg" is not available.
) at ../../../kern/vfs_syscalls.c:1054
#11 0xffffffff805c599e in syscall (frame=0xffffff841dea8c80) at 
../../../amd64/amd64/trap.c:911
#12 0xffffffff805a723b in Xfast_syscall () at 
../../../amd64/amd64/exception.S:349
#13 0x00000008009a219c in ?? ()

This thread was opening the device and bumped the refcount, but then wedged 
on the geom topology lock.

The refcount field is protected by devmtx.

Anyone seen this?

I'm half inclined to either add in CDP_SCHED_DTR when one calls 
destroy_dev, or make dev_refthread look at CDP_ACTIVE, leaning more 
toward the latter.
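
A minimal user-space sketch of the second option, assuming a pthread mutex
as a stand-in for devmtx.  struct xdev, its fields, and the xdev_*()
functions are hypothetical names and not the kern_conf.c code; the sketch
only shows a reference function that refuses new references once
destruction has started, and does not establish whether that resolves the
hang in the traces above:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a cdev; field names are illustrative. */
struct xdev {
	int active;		/* cleared once destruction begins */
	int threadcount;	/* threads currently inside the driver */
};

/* Stand-in for devmtx: protects both fields above. */
static pthread_mutex_t xdev_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Take a thread reference; refuse if the device is being destroyed. */
static int
xdev_refthread(struct xdev *d)
{
	int ok;

	pthread_mutex_lock(&xdev_mtx);
	ok = d->active;
	if (ok)
		d->threadcount++;
	pthread_mutex_unlock(&xdev_mtx);
	return (ok);
}

/* Drop a thread reference taken by xdev_refthread(). */
static void
xdev_relthread(struct xdev *d)
{
	pthread_mutex_lock(&xdev_mtx);
	d->threadcount--;
	pthread_mutex_unlock(&xdev_mtx);
}

/* First step of destruction: stop new references from being taken. */
static void
xdev_begin_destroy(struct xdev *d)
{
	pthread_mutex_lock(&xdev_mtx);
	d->active = 0;
	pthread_mutex_unlock(&xdev_mtx);
}
```

Because the flag and the count are read and written under the same mutex,
an open that arrives after xdev_begin_destroy() cannot raise the count the
destroying thread is waiting on.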

Any thoughts on this?




From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 19:11:06 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 68F4F106564A
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 19:11:06 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
	by mx1.freebsd.org (Postfix) with ESMTP id F3B758FC17
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 19:11:05 +0000 (UTC)
Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua
	[10.1.1.148])
	by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o8GJAvVF028136
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Thu, 16 Sep 2010 22:10:57 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id
	o8GJAvVR012684; Thu, 16 Sep 2010 22:10:57 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o8GJAv9U012683; 
	Thu, 16 Sep 2010 22:10:57 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
	kostikbel@gmail.com using -f
Date: Thu, 16 Sep 2010 22:10:57 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Matthew Jacob <mj@feral.com>
Message-ID: <20100916191057.GF2389@deviant.kiev.zoral.com.ua>
References: <4C92694D.1070705@feral.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="8TaQrIeukR7mmbKf"
Content-Disposition: inline
In-Reply-To: <4C92694D.1070705@feral.com>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.1 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50,
	DNS_FROM_OPENWHOIS autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	skuns.kiev.zoral.com.ua
Cc: freebsd-hackers@freebsd.org
Subject: Re: race conditions for destroying and opening a dev
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 19:11:06 -0000


--8TaQrIeukR7mmbKf
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Sep 16, 2010 at 12:00:29PM -0700, Matthew Jacob wrote:
>=20
> Has anyone seen this scenario before? I am seeing it in RELENG_7, but=20
> the code in question exists through to head.
>=20
> Thread 1:
>=20
> (kgdb) where
> #0  sched_switch (td=3D0xffffff003a04ea80, newtd=3D0xffffff00210b4000,=20
> flags=3DVariable "flags" is not available.
> ) at ../../../kern/sched_ule.c:1944
> #1  0xffffffff803b6091 in mi_switch (flags=3D1, newtd=3D0x0) at=20
> ../../../kern/kern_synch.c:450
> #2  0xffffffff80402399 in sleepq_switch (wchan=3D0xffffff8413b50b60) at=
=20
> ../../../kern/subr_sleepqueue.c:497
> #3  0xffffffff80402e8c in sleepq_timedwait (wchan=3D0xffffff8413b50b60) a=
t=20
> ../../../kern/subr_sleepqueue.c:615
> #4  0xffffffff803b682d in _sleep (ident=3D0xffffff8413b50b60,=20
> lock=3D0xffffffff80b0ee00, priority=3D76, wmesg=3D0xffffffff806583bb "dev=
drn",=20
> timo=3D100) at ../../../kern/kern_synch.c:228
> #5  0xffffffff8037640c in destroy_devl (dev=3D0xffffff003aaf0000) at=20
> ../../../kern/kern_conf.c:874
> #6  0xffffffff80376759 in destroy_dev (dev=3D0xffffff003aaf0000) at=20
> ../../../kern/kern_conf.c:916
> #7  0xffffffff8034c939 in g_dev_orphan (cp=3D0xffffff003a544800) at=20
> ../../../geom/geom_dev.c:438
> #8  0xffffffff803506a0 in g_run_events () at ../../../geom/geom_event.c:1=
64
> #9  0xffffffff80351f1c in g_event_procbody () at=20
> ../../../geom/geom_kern.c:141
> #10 0xffffffff8038a73a in fork_exit (callout=3D0xffffffff80351eb0=20
> <g_event_procbody at ../../../geom/geom_kern.c:132>, arg=3D0x0,=20
> frame=3D0xffffff8413b50c80) at ../../../kern/kern_fork.c:829
> #11 0xffffffff805a747e in fork_trampoline () at=20
> ../../../amd64/amd64/exception.S:564
> #12 0x0000000000000000 in ?? ()
>=20
> This thread is waiting on the threadcount to go away- i.e., the last=20
> close of the device to occur ("da16" in this case).
>=20
> Thread 2:
>=20
> (kgdb) where
> #0  sched_switch (td=3D0xffffff009bb4ca80, newtd=3D0xffffff003af43380,=20
> flags=3DVariable "flags" is not available.
> ) at ../../../kern/sched_ule.c:1944
> #1  0xffffffff803b6091 in mi_switch (flags=3D1, newtd=3D0x0) at=20
> ../../../kern/kern_synch.c:450
> #2  0xffffffff80402399 in sleepq_switch (wchan=3D0xffffffff80b0e040) at=
=20
> ../../../kern/subr_sleepqueue.c:497
> #3  0xffffffff80402f84 in sleepq_wait (wchan=3D0xffffffff80b0e040) at=20
> ../../../kern/subr_sleepqueue.c:580
> #4  0xffffffff803b5385 in _sx_xlock_hard (sx=3D0xffffffff80b0e040,=20
> tid=3D18446742976810240640, opts=3DVariable "opts" is not available.
> ) at ../../../kern/kern_sx.c:562
> #5  0xffffffff803b5731 in _sx_xlock (sx=3D0xffffffff80b0e040, opts=3D0,=
=20
> file=3D0xffffffff80652d27 "../../../geom/geom_dev.c", line=3D196) at sx.h=
:154
> #6  0xffffffff8034d1bc in g_dev_open (dev=3D0xffffff003aaf0000, flags=3D1=
,=20
> fmt=3DVariable "fmt" is not available.
> ) at ../../../geom/geom_dev.c:196
> #7  0xffffffff80333741 in devfs_open (ap=3D0xffffff841dea88b0) at=20
> ../../../fs/devfs/devfs_vnops.c:902
> #8  0xffffffff80601daf in VOP_OPEN_APV (vop=3D0xffffffff8089fb80,=20
> a=3D0xffffff841dea88b0) at vnode_if.c:371
> #9  0xffffffff80467246 in vn_open_cred (ndp=3D0xffffff841dea8a00,=20
> flagp=3D0xffffff841dea894c, cmode=3DVariable "cmode" is not available.
> ) at vnode_if.h:199
> #10 0xffffffff80463770 in kern_open (td=3D0xffffff009bb4ca80,=20
> path=3D0x5114a0 <Address 0x5114a0 out of bounds>, pathseg=3DVariable=20
> "pathseg" is not available.
> ) at ../../../kern/vfs_syscalls.c:1054
> #11 0xffffffff805c599e in syscall (frame=3D0xffffff841dea8c80) at=20
> ../../../amd64/amd64/trap.c:911
> #12 0xffffffff805a723b in Xfast_syscall () at=20
> ../../../amd64/amd64/exception.S:349
> #13 0x00000008009a219c in ?? ()
>=20
> This thread was opening the device, bumped the refcount, but then wedged=
=20
> on the geom topology lock .....
>=20
> the refcount field is protected under devmtx....
>=20
> Anyone seen this?
>=20
> I'm half inclined to either add in CDP_SCHED_DTR when one calls=20
> destroy_dev, or make dev_refthread look at CDP_ACTIVE, leaning more=20
> toward the latter.
>=20
> Any thoughts on this?

And who owns the topology lock? Is it thread 1?

destroy_devl() clears si_devsw for the departing cdev, and *refthread()
checks si_devsw against NULL as an indicator of device destruction in
progress.

I think that this situation is what destroy_dev_sched(9) was created
for.

--8TaQrIeukR7mmbKf
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (FreeBSD)

iEYEARECAAYFAkySa8AACgkQC3+MBN1Mb4jKNwCgv30TrKYWhEeXq1KmjAP516a4
AxAAoKkXX9pQeQkkTIxWtC0V8662YWhb
=gNHJ
-----END PGP SIGNATURE-----

--8TaQrIeukR7mmbKf--

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 19:45:50 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0BBDD1065672
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 19:45:50 +0000 (UTC)
	(envelope-from mdf356@gmail.com)
Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 6A5ED8FC08
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 19:45:49 +0000 (UTC)
Received: by iwn34 with SMTP id 34so1514022iwn.13
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 12:45:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:in-reply-to
	:references:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=HOzJqFNga1KhNkzgOYbMTNziS2nPm+kCrDKKKHokiSs=;
	b=IVcXYD+SW90n01lylQwKdIVnY/tUyKlmAx/JFan6hHlhGKhdV6mLP7ITsu43EcyPw7
	irOL1OvDC4ucSc3A9gBtOncar4TUmRjAg40lnUNaCdpzoGVAhKwGLI8gvZFhxoCxhOif
	dtN1sPz5M2LkIUkrzWM7oNzxYHlJyeGCPOA8I=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=QG6o9Txg51rKZGkruImg/68QpGiCohZLee2HGsX05Tru7mhrqkG4eqSGYh5eWA/o+z
	pOnH50WKIWGv4il5GI8HtPORslGWAQqKud1wCEcnN4TFicPlKLLvE1p7hZrR0Wcp7HFM
	XRfof4KZyixodEoCEY4AquOlzEWsneF6/x02E=
MIME-Version: 1.0
Received: by 10.231.31.129 with SMTP id y1mr3938081ibc.45.1284666348448; Thu,
	16 Sep 2010 12:45:48 -0700 (PDT)
Received: by 10.231.187.71 with HTTP; Thu, 16 Sep 2010 12:45:48 -0700 (PDT)
In-Reply-To: <4C92694D.1070705@feral.com>
References: <4C92694D.1070705@feral.com>
Date: Thu, 16 Sep 2010 12:45:48 -0700
Message-ID: <AANLkTimiLCDtHKa22Yjhx2vEJ5Va4pPHOTvfADQo9+w4@mail.gmail.com>
From: Matthew Fleming <mdf356@gmail.com>
To: Matthew Jacob <mj@feral.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org
Subject: Re: race conditions for destroying and opening a dev
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 19:45:50 -0000

On Thu, Sep 16, 2010 at 12:00 PM, Matthew Jacob <mj@feral.com> wrote:
>
> Has anyone seen this scenario before? I am seeing it in RELENG_7, but the
> code in question exists through to head.
>
> Thread 1:
>
> (kgdb) where
> #0  sched_switch (td=0xffffff003a04ea80, newtd=0xffffff00210b4000,
> flags=Variable "flags" is not available.
> ) at ../../../kern/sched_ule.c:1944
> #1  0xffffffff803b6091 in mi_switch (flags=1, newtd=0x0) at
> ../../../kern/kern_synch.c:450
> #2  0xffffffff80402399 in sleepq_switch (wchan=0xffffff8413b50b60) at
> ../../../kern/subr_sleepqueue.c:497
> #3  0xffffffff80402e8c in sleepq_timedwait (wchan=0xffffff8413b50b60) at
> ../../../kern/subr_sleepqueue.c:615
> #4  0xffffffff803b682d in _sleep (ident=0xffffff8413b50b60,
> lock=0xffffffff80b0ee00, priority=76, wmesg=0xffffffff806583bb "devdrn",
> timo=100) at ../../../kern/kern_synch.c:228
> #5  0xffffffff8037640c in destroy_devl (dev=0xffffff003aaf0000) at
> ../../../kern/kern_conf.c:874
> #6  0xffffffff80376759 in destroy_dev (dev=0xffffff003aaf0000) at
> ../../../kern/kern_conf.c:916
> #7  0xffffffff8034c939 in g_dev_orphan (cp=0xffffff003a544800) at
> ../../../geom/geom_dev.c:438
> #8  0xffffffff803506a0 in g_run_events () at ../../../geom/geom_event.c:164
> #9  0xffffffff80351f1c in g_event_procbody () at
> ../../../geom/geom_kern.c:141
> #10 0xffffffff8038a73a in fork_exit (callout=0xffffffff80351eb0
> <g_event_procbody at ../../../geom/geom_kern.c:132>, arg=0x0,
> frame=0xffffff8413b50c80) at ../../../kern/kern_fork.c:829
> #11 0xffffffff805a747e in fork_trampoline () at
> ../../../amd64/amd64/exception.S:564
> #12 0x0000000000000000 in ?? ()
>
> This thread is waiting on the threadcount to go away- i.e., the last close
> of the device to occur ("da16" in this case).
>
> Thread 2:
>
> (kgdb) where
> #0  sched_switch (td=0xffffff009bb4ca80, newtd=0xffffff003af43380,
> flags=Variable "flags" is not available.
> ) at ../../../kern/sched_ule.c:1944
> #1  0xffffffff803b6091 in mi_switch (flags=1, newtd=0x0) at
> ../../../kern/kern_synch.c:450
> #2  0xffffffff80402399 in sleepq_switch (wchan=0xffffffff80b0e040) at
> ../../../kern/subr_sleepqueue.c:497
> #3  0xffffffff80402f84 in sleepq_wait (wchan=0xffffffff80b0e040) at
> ../../../kern/subr_sleepqueue.c:580
> #4  0xffffffff803b5385 in _sx_xlock_hard (sx=0xffffffff80b0e040,
> tid=18446742976810240640, opts=Variable "opts" is not available.
> ) at ../../../kern/kern_sx.c:562
> #5  0xffffffff803b5731 in _sx_xlock (sx=0xffffffff80b0e040, opts=0,
> file=0xffffffff80652d27 "../../../geom/geom_dev.c", line=196) at sx.h:154
> #6  0xffffffff8034d1bc in g_dev_open (dev=0xffffff003aaf0000, flags=1,
> fmt=Variable "fmt" is not available.
> ) at ../../../geom/geom_dev.c:196
> #7  0xffffffff80333741 in devfs_open (ap=0xffffff841dea88b0) at
> ../../../fs/devfs/devfs_vnops.c:902
> #8  0xffffffff80601daf in VOP_OPEN_APV (vop=0xffffffff8089fb80,
> a=0xffffff841dea88b0) at vnode_if.c:371
> #9  0xffffffff80467246 in vn_open_cred (ndp=0xffffff841dea8a00,
> flagp=0xffffff841dea894c, cmode=Variable "cmode" is not available.
> ) at vnode_if.h:199
> #10 0xffffffff80463770 in kern_open (td=0xffffff009bb4ca80, path=0x5114a0
> <Address 0x5114a0 out of bounds>, pathseg=Variable "pathseg" is not
> available.
> ) at ../../../kern/vfs_syscalls.c:1054
> #11 0xffffffff805c599e in syscall (frame=0xffffff841dea8c80) at
> ../../../amd64/amd64/trap.c:911
> #12 0xffffffff805a723b in Xfast_syscall () at
> ../../../amd64/amd64/exception.S:349
> #13 0x00000008009a219c in ?? ()
>
> This thread was opening the device, bumped the refcount, but then wedged on
> the geom topology lock .....
>
> the refcount field is protected under devmtx....
>
> Anyone seen this?
>
> I'm half inclined to either add in CDP_SCHED_DTR when one calls destroy_dev,
> or make dev_refthread look at CDP_ACTIVE, leaning more toward the latter.
>
> Any thoughts on this?

We had a similar bug at Isilon, but in our case it was in
cam/scsi/scsi_pass.c::passcleanup() calling destroy_dev().  We
switched it to destroy_dev_sched() to fix the si_threadcount deadlock.
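
For illustration, the wait-for cycle in the two traces above, and why
deferring the destroy breaks it, can be shown with a small userspace toy
model.  This is only a sketch under stated assumptions: the names
(struct model, simulate, DESTROY_SYNC, DESTROY_DEFERRED) are made up for
the example and none of it is kernel code.  DESTROY_SYNC stands in for
calling destroy_dev() with the topology lock held, DESTROY_DEFERRED for
handing the work off in the way destroy_dev_sched() does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not kernel code. */
enum destroy_mode { DESTROY_SYNC, DESTROY_DEFERRED };

struct model {
	bool	topology_held;	/* event thread holds the topology lock */
	int	threadcount;	/* stands in for si_threadcount */
	bool	destroyed;	/* device finally torn down */
};

/* Returns true if the device gets destroyed, false if nobody can move. */
static bool
simulate(enum destroy_mode mode)
{
	struct model m = { true, 1, false };

	assert(mode == DESTROY_SYNC || mode == DESTROY_DEFERRED);
	for (int step = 0; step < 8 && !m.destroyed; step++) {
		bool progress = false;

		/*
		 * Event thread: a deferred destroy lets it return and
		 * drop the lock; a synchronous destroy keeps the lock
		 * while it waits for threadcount to reach zero.
		 */
		if (m.topology_held && mode == DESTROY_DEFERRED) {
			m.topology_held = false;
			progress = true;
		}
		/* Opener: needs the lock before it can drop its reference. */
		if (!m.topology_held && m.threadcount > 0) {
			m.threadcount--;
			progress = true;
		}
		/* Destroy completes only once no thread references remain. */
		if (m.threadcount == 0) {
			m.destroyed = true;
			progress = true;
		}
		if (!progress)
			return (false);	/* deadlock */
	}
	return (m.destroyed);
}
```

In this model simulate(DESTROY_SYNC) reports a deadlock on the first
step, while simulate(DESTROY_DEFERRED) lets the opener finish and the
destroy complete.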

Cheers,
matthew

From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 20:12:08 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A98DF1065695
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 20:12:08 +0000 (UTC)
	(envelope-from mj@feral.com)
Received: from ns1.feral.com (ns1.feral.com [192.67.166.1])
	by mx1.freebsd.org (Postfix) with ESMTP id 5D6EA8FC12
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 20:12:08 +0000 (UTC)
Received: from [192.168.221.2] (remotevpn [192.168.221.2])
	by ns1.feral.com (8.14.3/8.14.3) with ESMTP id o8GKC6Qm030388
	(version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO)
	for <freebsd-hackers@freebsd.org>; Thu, 16 Sep 2010 13:12:07 -0700 (PDT)
	(envelope-from mj@feral.com)
Message-ID: <4C927A10.1080202@feral.com>
Date: Thu, 16 Sep 2010 13:12:00 -0700
From: Matthew Jacob <mj@feral.com>
Organization: Feral Software
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
	rv:1.9.2.9) Gecko/20100825 Thunderbird/3.1.3
MIME-Version: 1.0
To: freebsd-hackers@freebsd.org
References: <4C92694D.1070705@feral.com>
	<AANLkTimiLCDtHKa22Yjhx2vEJ5Va4pPHOTvfADQo9+w4@mail.gmail.com>
In-Reply-To: <AANLkTimiLCDtHKa22Yjhx2vEJ5Va4pPHOTvfADQo9+w4@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.6
	(ns1.feral.com [192.168.221.1]);
	Thu, 16 Sep 2010 13:12:07 -0700 (PDT)
Subject: Re: race conditions for destroying and opening a dev
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 20:12:08 -0000

  kostik, matthew- thanks mucho!



From owner-freebsd-hackers@FreeBSD.ORG  Thu Sep 16 22:47:09 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3CCA0106564A;
	Thu, 16 Sep 2010 22:47:09 +0000 (UTC)
	(envelope-from wblock@wonkity.com)
Received: from wonkity.com (wonkity.com [67.158.26.137])
	by mx1.freebsd.org (Postfix) with ESMTP id EB5C48FC19;
	Thu, 16 Sep 2010 22:47:08 +0000 (UTC)
Received: from wonkity.com (localhost [127.0.0.1])
	by wonkity.com (8.14.4/8.14.4) with ESMTP id o8GMl5RD070096;
	Thu, 16 Sep 2010 16:47:05 -0600 (MDT)
	(envelope-from wblock@wonkity.com)
Received: from localhost (wblock@localhost)
	by wonkity.com (8.14.4/8.14.4/Submit) with ESMTP id o8GMl4xJ070093;
	Thu, 16 Sep 2010 16:47:05 -0600 (MDT)
	(envelope-from wblock@wonkity.com)
Date: Thu, 16 Sep 2010 16:47:04 -0600 (MDT)
From: Warren Block <wblock@wonkity.com>
To: Oliver Fromme <olli@lurza.secnetix.de>
In-Reply-To: <201009161619.o8GGJAmv035378@lurza.secnetix.de>
Message-ID: <alpine.BSF.2.00.1009161641410.70075@wonkity.com>
References: <201009161619.o8GGJAmv035378@lurza.secnetix.de>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.6
	(wonkity.com [127.0.0.1]); Thu, 16 Sep 2010 16:47:05 -0600 (MDT)
Cc: freebsd-hackers@freebsd.org, mav@freebsd.org,
	Tijl Coosemans <tijl@coosemans.org>, Alexander Best <arundel@freebsd.org>
Subject: Re: Summary: Re: Spin down HDD after disk sync or before power off
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 16 Sep 2010 22:47:09 -0000

On Thu, 16 Sep 2010, Oliver Fromme wrote:

> I've updated the patch for ada(4).  It includes a bug fix
> (command1 vs. command2) and uses the howto flags passed to
> the shutdown function.  Thanks again for pointing these out.

Works perfectly on a system here.  Thanks!

From owner-freebsd-hackers@FreeBSD.ORG  Fri Sep 17 03:24:34 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1A09B106566C;
	Fri, 17 Sep 2010 03:24:34 +0000 (UTC) (envelope-from kaduk@mit.edu)
Received: from dmz-mailsec-scanner-7.mit.edu (DMZ-MAILSEC-SCANNER-7.MIT.EDU
	[18.7.68.36]) by mx1.freebsd.org (Postfix) with ESMTP id B3BF48FC1A;
	Fri, 17 Sep 2010 03:24:33 +0000 (UTC)
X-AuditID: 12074424-b7b2bae000005b3f-87-4c92df56e785
Received: from mailhub-auth-4.mit.edu ( [18.7.62.39])
	by dmz-mailsec-scanner-7.mit.edu (Symantec Brightmail Gateway) with
	SMTP id 23.BA.23359.65FD29C4; Thu, 16 Sep 2010 23:24:06 -0400 (EDT)
Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])
	by mailhub-auth-4.mit.edu (8.13.8/8.9.2) with ESMTP id o8H3OWJk009314; 
	Thu, 16 Sep 2010 23:24:32 -0400
Received: from multics.mit.edu (MULTICS.MIT.EDU [18.187.1.73])
	(authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU)
	by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id o8H3OUor018959
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT);
	Thu, 16 Sep 2010 23:24:32 -0400 (EDT)
Received: (from kaduk@localhost) by multics.mit.edu (8.12.9.20060308)
	id o8H3OThN013379; Thu, 16 Sep 2010 23:24:29 -0400 (EDT)
Date: Thu, 16 Sep 2010 23:24:29 -0400 (EDT)
From: Benjamin Kaduk <kaduk@MIT.EDU>
To: John Baldwin <jhb@freebsd.org>
In-Reply-To: <201009161416.05759.jhb@freebsd.org>
Message-ID: <alpine.GSO.1.10.1009162317430.9337@multics.mit.edu>
References: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
	<AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf@mail.gmail.com>
	<20100916173307.GA1994@pm513-1.comsys.ntu-kpi.kiev.ua>
	<201009161416.05759.jhb@freebsd.org>
User-Agent: Alpine 1.10 (GSO 962 2008-03-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Brightmail-Tracker: AAAAAA==
Cc: freebsd-hackers@freebsd.org
Subject: Re: Questions about mutex implementation in kern/kern_mutex.c
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 17 Sep 2010 03:24:34 -0000

On Thu, 16 Sep 2010, John Baldwin wrote:

> On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
>
>> The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
>> special to compare the value of m->mtx_lock (volatile) with current thread
>> pointer, all other functions that update m->mtx_lock of unowned mutex use
>> compare-and-set instruction.  Also I cannot find anything special in
>> generated Assembler code for volatile variables (except for ia64 where
>> acquire loads and release stores are used).
>
> No, mtx_owned() is just not harmed by the races it loses.  You can certainly
> read a stale value of mtx_lock in mtx_owned() if some other thread owns the
> lock or has just released the lock.  However, we don't care, because in both
> of those cases, mtx_owned() returns false.  What does matter is that
> mtx_owned() can only return true if we currently hold the mutex.  This works
> because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the
> same time, and 2) even CPUs that hold writes in store buffers will snoop their
> store buffer for local reads on that CPU.  That is, a given CPU will never
> read a stale value of a memory word that is "older" than a write it has
> performed to that word.

Sorry for the naive question, but would you mind expounding a bit on what 
keeps the thread from migrating to a different CPU and getting a stale 
value there?  (I can imagine a couple possible mechanisms, but don't know 
enough to know which one(s) are the real ones.)
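
For reference, the property described above (mtx_owned() can only return
true to the thread that currently holds the lock, and a stale read is
harmless) can be modeled in userspace roughly like this.  It is only a
sketch using C11 atomics; toy_mtx, toy_trylock, toy_unlock and toy_owned
are made-up names and not the kern_mutex.c code, and it says nothing
about the migration question.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define	TOY_UNOWNED	((uintptr_t)0)

/* Hypothetical stand-in for a mutex: the lock word is the owner's id. */
struct toy_mtx {
	_Atomic uintptr_t lock;
};

/* Unowned -> owned transitions go through compare-and-set. */
static bool
toy_trylock(struct toy_mtx *m, uintptr_t self)
{
	uintptr_t expected = TOY_UNOWNED;

	return (atomic_compare_exchange_strong_explicit(&m->lock, &expected,
	    self, memory_order_acquire, memory_order_relaxed));
}

static void
toy_unlock(struct toy_mtx *m, uintptr_t self)
{
	assert(atomic_load_explicit(&m->lock, memory_order_relaxed) == self);
	atomic_store_explicit(&m->lock, TOY_UNOWNED, memory_order_release);
}

/*
 * Plain (relaxed) read.  A stale value is either "unowned" or some
 * other thread's id, and both compare unequal to self, so losing the
 * race still yields the right answer (false).
 */
static bool
toy_owned(struct toy_mtx *m, uintptr_t self)
{
	return (atomic_load_explicit(&m->lock, memory_order_relaxed) == self);
}
```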

Thanks,

Ben Kaduk

From owner-freebsd-hackers@FreeBSD.ORG  Fri Sep 17 08:14:41 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A619A1065670;
	Fri, 17 Sep 2010 08:14:41 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 393368FC16;
	Fri, 17 Sep 2010 08:14:39 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id LAA18677;
	Fri, 17 Sep 2010 11:14:37 +0300 (EEST)
	(envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1OwW5c-0008B2-V0; Fri, 17 Sep 2010 11:14:37 +0300
Message-ID: <4C93236B.4050906@freebsd.org>
Date: Fri, 17 Sep 2010 11:14:35 +0300
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.9) Gecko/20100912 Lightning/1.0b2 Thunderbird/3.1.3
MIME-Version: 1.0
To: freebsd-hackers@freebsd.org
X-Enigmail-Version: 1.1.2
Content-Type: multipart/mixed; boundary="------------030602010507080304070903"
Cc: Jeff Roberson <jeff@freebsd.org>
Subject: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 17 Sep 2010 08:14:41 -0000

This is a multi-part message in MIME format.
--------------030602010507080304070903
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit


I've been investigating the interaction between zfs and uma for a while.
You might remember that there is noticeable fragmentation in zfs uma zones
when uma use is not enabled for actual data/metadata buffers.

I also noticed that when uma use is enabled for data/metadata buffers
(zio.use_uma=1) the amount of memory reserved in free items of zfs uma zones
becomes really huge.  And this is despite the fact that the vast majority of
the data/metadata zones have items with sizes that are multiples of the page
size.  This couldn't really be because of fragmentation.

Further checks show that the free items are accumulated in per-cpu cache
buckets.  uz_count for those buckets starts at 1, but over time, during bursts
of activity, it grows up to a maximum of 128.
The problem with those buckets is that they are not drained on low memory
conditions and uz_count never goes down.

So, after a while, I observe about 300 free items (on a mere two-core system)
cached in 4 per-cpu buckets for a single zone with 128KB item size.
That's about 37MB right there (300 x 128KB).
For all data and metadata zones the number goes as high as 500MB on my machine
with 4GB physical RAM.
This seems like a bit too much to me.
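
The per-zone figure can be checked with a few lines of C.  This is just
the arithmetic, not UMA code; cached_bytes and max_cached_items are
made-up helper names, and the "two buckets per CPU" input is inferred
from the 4 buckets seen on a two-core machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes held by `items` cached free items of `item_size` bytes each. */
static size_t
cached_bytes(size_t items, size_t item_size)
{
	/* Guard against overflow in the multiplication. */
	assert(item_size == 0 || items <= SIZE_MAX / item_size);
	return (items * item_size);
}

/* Upper bound on cached items per zone if every per-cpu bucket is full. */
static size_t
max_cached_items(size_t ncpus, size_t buckets_per_cpu, size_t bucket_max)
{
	return (ncpus * buckets_per_cpu * bucket_max);
}
```

With these, cached_bytes(300, 128 * 1024) is 39321600 bytes (37.5 MiB),
and the worst case for that one zone, max_cached_items(2, 2, 128) = 512
items, would be 64 MiB.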

Although keeping free items around improves performance, it does consume memory
too.  And the fact that that memory is not freed on lowmem condition makes the
situation worse.

So, I decided to take a look at how they handle this situation in (Open)Solaris.
There is this good book:
http://books.google.com/books?id=r_cecYD4AKkC&printsec=frontcover
Please see section 6.2.4.5 on page 225 and table 6-11 on page 226.
And also this code:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/kmem.c#971

It makes sense to me to limit the size of per-cpu buckets depending on item
size.  I even wrote a slightly hackish patch [attached].
But I didn't go as far as they did in Solaris, so the minimum bucket size limit
is 4.  But perhaps it would make sense to not use the cache at all starting
with a certain item size.
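
The limit in the attached uma-uz_count_max.diff boils down to the clamp
below.  This is a standalone userspace restatement for illustration:
the three constants are the values used in the patch, while the function
name uz_count_max_for is made up here (the patch does the computation
inline in zone_ctor()).

```c
#include <assert.h>
#include <stddef.h>

#define	BUCKET_SIZE_THRESHOLD	131072	/* value from the attached patch */
#define	BUCKET_MAX		128
#define	BUCKET_SHIFT		2

/* Per-zone cap on uz_count: threshold / item size, clamped to [4, 128]. */
static unsigned
uz_count_max_for(size_t item_size)
{
	size_t n;

	assert(item_size > 0);
	n = BUCKET_SIZE_THRESHOLD / item_size;
	if (n > BUCKET_MAX)
		n = BUCKET_MAX;
	else if (n < (1u << BUCKET_SHIFT))
		n = 1u << BUCKET_SHIFT;
	return ((unsigned)n);
}
```

For a 128KB item size this gives 4 instead of 128, so 4 full buckets
would hold at most 16 items (2MB) for that zone; a 4KB zone gets 32 and
anything of 1KB or less keeps the old maximum of 128.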

Another attached hack removes zio zones that have items larger than the page
size, but not a multiple of the page size.  Internally they would still consume
a multiple of the page size per item, so we potentially can have two zones that
use the same number of pages per item, but with different item sizes.  With the
patch they are collapsed into a single zone.
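
As a small illustration of why such zones are redundant, here is the
rounding involved.  It assumes a 4KB page and uses made-up names
(TOY_PAGE_SIZE, pages_per_item); the item sizes mentioned afterwards are
example values, not a list of the actual zio zones.

```c
#include <assert.h>
#include <stddef.h>

#define	TOY_PAGE_SIZE	4096u	/* assumed page size */

/* Number of whole pages backing one item of the given size. */
static size_t
pages_per_item(size_t item_size)
{
	assert(item_size > 0);
	return ((item_size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE);
}
```

Under this assumption an item of 5KB and an item of 8KB both take two
pages, so a separate 5KB zone saves no memory over the 8KB one.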

-- 
Andriy Gapon

--------------030602010507080304070903
Content-Type: text/plain;
 name="uma-uz_count_max.diff"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="uma-uz_count_max.diff"

ZGlmZiAtLWdpdCBhL3N5cy92bS91bWFfY29yZS5jIGIvc3lzL3ZtL3VtYV9jb3JlLmMKaW5k
ZXggM2ZjNWI4YS4uM2I4Mzg0YiAxMDA2NDQKLS0tIGEvc3lzL3ZtL3VtYV9jb3JlLmMKKysr
IGIvc3lzL3ZtL3VtYV9jb3JlLmMKQEAgLTE3OSw5ICsxNzksMTIgQEAgc3RydWN0IHVtYV9i
dWNrZXRfem9uZSB7CiAJaW50CQl1YnpfZW50cmllczsKIH07CiAKLSNkZWZpbmUJQlVDS0VU
X01BWAkxMjgKKyNkZWZpbmUJQlVDS0VUX1NJWkVfVEhSRVNIT0xECTEzMTA3MgorI2RlZmlu
ZQlCVUNLRVRfTUFYCQkxMjgKIAogc3RydWN0IHVtYV9idWNrZXRfem9uZSBidWNrZXRfem9u
ZXNbXSA9IHsKKwl7IE5VTEwsICI0IEJ1Y2tldCIsIDQgfSwKKwl7IE5VTEwsICI4IEJ1Y2tl
dCIsIDggfSwKIAl7IE5VTEwsICIxNiBCdWNrZXQiLCAxNiB9LAogCXsgTlVMTCwgIjMyIEJ1
Y2tldCIsIDMyIH0sCiAJeyBOVUxMLCAiNjQgQnVja2V0IiwgNjQgfSwKQEAgLTE4OSw3ICsx
OTIsNyBAQCBzdHJ1Y3QgdW1hX2J1Y2tldF96b25lIGJ1Y2tldF96b25lc1tdID0gewogCXsg
TlVMTCwgTlVMTCwgMH0KIH07CiAKLSNkZWZpbmUJQlVDS0VUX1NISUZUCTQKKyNkZWZpbmUJ
QlVDS0VUX1NISUZUCTIKICNkZWZpbmUJQlVDS0VUX1pPTkVTCSgoQlVDS0VUX01BWCA+PiBC
VUNLRVRfU0hJRlQpICsgMSkKIAogLyoKQEAgLTE0NjMsNiArMTQ2NiwxMyBAQCB6b25lX2N0
b3Iodm9pZCAqbWVtLCBpbnQgc2l6ZSwgdm9pZCAqdWRhdGEsIGludCBmbGFncykKIAkJem9u
ZS0+dXpfY291bnQgPSBrZWctPnVrX2lwZXJzOwogCWVsc2UKIAkJem9uZS0+dXpfY291bnQg
PSBCVUNLRVRfTUFYOworCisJem9uZS0+dXpfY291bnRfbWF4ID0gQlVDS0VUX1NJWkVfVEhS
RVNIT0xEIC8gem9uZS0+dXpfc2l6ZTsKKwlpZiAoem9uZS0+dXpfY291bnRfbWF4ID4gQlVD
S0VUX01BWCkKKwkJem9uZS0+dXpfY291bnRfbWF4ID0gQlVDS0VUX01BWDsKKwllbHNlIGlm
ICh6b25lLT51el9jb3VudF9tYXggPCAoMSA8PCBCVUNLRVRfU0hJRlQpKQorCQl6b25lLT51
el9jb3VudF9tYXggPSAxIDw8IEJVQ0tFVF9TSElGVDsKKwogCXJldHVybiAoMCk7CiB9CiAK
QEAgLTIwNzYsNyArMjA4Niw3IEBAIHphbGxvY19zdGFydDoKIAljcml0aWNhbF9leGl0KCk7
CiAKIAkvKiBCdW1wIHVwIG91ciB1el9jb3VudCBzbyB3ZSBnZXQgaGVyZSBsZXNzICovCi0J
aWYgKHpvbmUtPnV6X2NvdW50IDwgQlVDS0VUX01BWCkKKwlpZiAoem9uZS0+dXpfY291bnQg
PCB6b25lLT51el9jb3VudF9tYXgpCiAJCXpvbmUtPnV6X2NvdW50Kys7CiAKIAkvKgpkaWZm
IC0tZ2l0IGEvc3lzL3ZtL3VtYV9pbnQuaCBiL3N5cy92bS91bWFfaW50LmgKaW5kZXggNzcx
MzU5My4uNmQ4MWUzZCAxMDA2NDQKLS0tIGEvc3lzL3ZtL3VtYV9pbnQuaAorKysgYi9zeXMv
dm0vdW1hX2ludC5oCkBAIC0zMzAsNiArMzMwLDcgQEAgc3RydWN0IHVtYV96b25lIHsKIAl1
X2ludDY0X3QJdXpfc2xlZXBzOwkvKiBUb3RhbCBudW1iZXIgb2YgYWxsb2Mgc2xlZXBzICov
CiAJdWludDE2X3QJdXpfZmlsbHM7CS8qIE91dHN0YW5kaW5nIGJ1Y2tldCBmaWxscyAqLwog
CXVpbnQxNl90CXV6X2NvdW50OwkvKiBIaWdoZXN0IHZhbHVlIHViX3B0ciBjYW4gaGF2ZSAq
LworCXVpbnQxNl90CXV6X2NvdW50X21heDsJLyogSGlnaGVzdCB2YWx1ZSB1el9jb3VudCBj
YW4gaGF2ZSAqLwogCiAJLyoKIAkgKiBUaGlzIEhBUyB0byBiZSB0aGUgbGFzdCBpdGVtIGJl
Y2F1c2Ugd2UgYWRqdXN0IHRoZSB6b25lIHNpemUK
--------------030602010507080304070903
Content-Type: text/plain;
 name="zfs-zio-zones.diff"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="zfs-zio-zones.diff"

ZGlmZiAtLWdpdCBhL3N5cy9jZGRsL2NvbnRyaWIvb3BlbnNvbGFyaXMvdXRzL2NvbW1vbi9m
cy96ZnMvemlvLmMgYi9zeXMvY2RkbC9jb250cmliL29wZW5zb2xhcmlzL3V0cy9jb21tb24v
ZnMvemZzL3ppby5jCmluZGV4IDhkZGY3Y2QuLjM0MGY2NzYgMTAwNjQ0Ci0tLSBhL3N5cy9j
ZGRsL2NvbnRyaWIvb3BlbnNvbGFyaXMvdXRzL2NvbW1vbi9mcy96ZnMvemlvLmMKKysrIGIv
c3lzL2NkZGwvY29udHJpYi9vcGVuc29sYXJpcy91dHMvY29tbW9uL2ZzL3pmcy96aW8uYwpA
QCAtMTIxLDEwICsxMjEsMTEgQEAgemlvX2luaXQodm9pZCkKIAkJCWFsaWduID0gU1BBX01J
TkJMT0NLU0laRTsKIAkJfSBlbHNlIGlmIChQMlBIQVNFKHNpemUsIFBBR0VTSVpFKSA9PSAw
KSB7CiAJCQlhbGlnbiA9IFBBR0VTSVpFOworI2lmIDAKIAkJfSBlbHNlIGlmIChQMlBIQVNF
KHNpemUsIHAyID4+IDIpID09IDApIHsKIAkJCWFsaWduID0gcDIgPj4gMjsKKyNlbmRpZgog
CQl9Ci0KIAkJaWYgKGFsaWduICE9IDApIHsKIAkJCWNoYXIgbmFtZVszNl07CiAJCQkodm9p
ZCkgc3ByaW50ZihuYW1lLCAiemlvX2J1Zl8lbHUiLCAodWxvbmdfdClzaXplKTsK
--------------030602010507080304070903--

From owner-freebsd-hackers@FreeBSD.ORG  Fri Sep 17 12:32:35 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1213A106564A;
	Fri, 17 Sep 2010 12:32:35 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id D1EAC8FC08;
	Fri, 17 Sep 2010 12:32:33 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua
	[212.40.38.101])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA21728;
	Fri, 17 Sep 2010 15:32:32 +0300 (EEST)
	(envelope-from avg@freebsd.org)
Message-ID: <4C935FDF.4040909@freebsd.org>
Date: Fri, 17 Sep 2010 15:32:31 +0300
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.9) Gecko/20100909 Lightning/1.0b2 Thunderbird/3.1.3
MIME-Version: 1.0
To: Andre Oppermann <andre@freebsd.org>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
In-Reply-To: <4C935F56.4030903@freebsd.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org, Jeff Roberson <jeff@freebsd.org>
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 17 Sep 2010 12:32:35 -0000

on 17/09/2010 15:30 Andre Oppermann said the following:
> Having a general solution for that would be appreciated.  Maybe the size
> of the free per-cpu buckets should be specified when setting up the
> UMA zone.  We may want to cache more of certain frequently re-used
> elements and fewer of others.

This kind of flexibility seems like a very good idea.

-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG  Fri Sep 17 12:39:46 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 512131065781
	for <freebsd-hackers@freebsd.org>; Fri, 17 Sep 2010 12:39:46 +0000 (UTC)
	(envelope-from alex.coulson@charthouse.co.uk)
Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com
	[74.125.82.182])
	by mx1.freebsd.org (Postfix) with ESMTP id E10BF8FC20
	for <freebsd-hackers@freebsd.org>; Fri, 17 Sep 2010 12:39:45 +0000 (UTC)
Received: by wyb33 with SMTP id 33so3087918wyb.13
	for <freebsd-hackers@freebsd.org>; Fri, 17 Sep 2010 05:39:45 -0700 (PDT)
Received: by 10.216.165.209 with SMTP id e59mr705541wel.58.1284725751134;
	Fri, 17 Sep 2010 05:15:51 -0700 (PDT)
Received: from [192.168.10.127] (host81-149-4-164.in-addr.btopenworld.com
	[81.149.4.164])
	by mx.google.com with ESMTPS id n17sm2606299weq.30.2010.09.17.05.15.49
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Fri, 17 Sep 2010 05:15:50 -0700 (PDT)
From: Alex Coulson <alex.coulson@charthouse.co.uk>
Date: Fri, 17 Sep 2010 13:15:47 +0100
Message-Id: <2E7772A2-D0C2-474F-9101-DC782F58BC4F@charthouse.co.uk>
To: freebsd-hackers@freebsd.org
Mime-Version: 1.0 (Apple Message framework v1081)
X-Mailer: Apple Mail (2.1081)
Content-Type: text/plain;
	charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Nanobsd - Freebsd7.2 - Can't enable core dump
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 17 Sep 2010 12:39:46 -0000

I have a soekris net4801 running nanobsd which is freezing randomly
(between 10min->2hours), and does not create a crash dump file when it
fails (and nothing interesting in the messages log).

The following kernel options are enabled:
> makeoptions	DEBUG=-g
> options 	KDTRACE_HOOKS
> options 	KDB
> options 	DDB
> options	KDB_UNATTENDED
> options	KDB_TRACE


rc.conf
> dumpdev="/dev/da0s1b"
> savecore="YES"


swapinfo
> Device          1K-blocks     Used    Avail Capacity
> /dev/da0s1b        500720        0   500720     0%


db> call doadump
> Physical memory: 247 MB
> Dumping 35 MB:ucom0: ucomreadcb: TIMEOUT
> Aborting dump due to I/O error.
> status == 0x4, scsi status == 0x0
>
> ** DUMP FAILED (ERROR 5) **
> = 0x1d


Any help would be appreciated!

Alex Coulson



From owner-freebsd-hackers@FreeBSD.ORG  Fri Sep 17 12:56:53 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4224E106566C
	for <freebsd-hackers@freebsd.org>; Fri, 17 Sep 2010 12:56:53 +0000 (UTC)
	(envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
	by mx1.freebsd.org (Postfix) with ESMTP id A757D8FC1D
	for <freebsd-hackers@freebsd.org>; Fri, 17 Sep 2010 12:56:52 +0000 (UTC)
Received: (qmail 14323 invoked from network); 17 Sep 2010 12:24:36 -0000
Received: from unknown (HELO [62.48.0.92]) ([62.48.0.92])
	(envelope-sender <andre@freebsd.org>)
	by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
	for <avg@freebsd.org>; 17 Sep 2010 12:24:36 -0000
Message-ID: <4C935F56.4030903@freebsd.org>
Date: Fri, 17 Sep 2010 14:30:14 +0200
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US;
	rv:1.9.2.9) Gecko/20100825 Thunderbird/3.1.3
MIME-Version: 1.0
To: Andriy Gapon <avg@freebsd.org>
References: <4C93236B.4050906@freebsd.org>
In-Reply-To: <4C93236B.4050906@freebsd.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org, Jeff Roberson <jeff@freebsd.org>
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 17 Sep 2010 12:56:53 -0000

On 17.09.2010 10:14, Andriy Gapon wrote:
>
> I've been investigating the interaction between zfs and uma for a while.
> You might remember that there is noticeable fragmentation in zfs uma zones
> when uma use is not enabled for actual data/metadata buffers.
>
> I also noticed that when uma use is enabled for data/metadata buffers
> (zio.use_uma=1) the amount of memory reserved in free items of zfs uma zones
> becomes really huge.  And this is despite the fact that the vast majority of
> the data/metadata zones have items with sizes that are multiples of the page
> size.  This couldn't really be because of fragmentation.
>
> Further checks show that the free items are accumulated in per-cpu cache
> buckets.  uz_count for those buckets starts at 1, but over time, during bursts
> of activity, it grows up to a maximum of 128.
> The problem with those buckets is that they are not drained on low memory
> conditions and uz_count never goes down.
>
> So, after a while, I observe about 300 free items (on a mere two-core system)
> cached in 4 per-cpu buckets for a single zone with 128KB item size.
> That's about 37MB right there (300 x 128KB).
> For all data and metadata zones the number goes as high as 500MB on my machine
> with 4GB physical RAM.
> This seems like a bit too much to me.
>
> Although keeping free items around improves performance, it does consume memory
> too.  And the fact that that memory is not freed on lowmem condition makes the
> situation worse.

Interesting.  We may run into related issues with excessive mbuf
(cluster) caching in the per-cpu buckets as well.

Having a general solution for that would be appreciated.  Maybe the size
of the free per-cpu buckets should be specified when setting up the
UMA zone.  We may want to cache more of certain frequently re-used
elements and fewer of others.

-- 
Andre

From owner-freebsd-hackers@FreeBSD.ORG  Fri Sep 17 15:23:48 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 792801065672
	for <freebsd-hackers@freebsd.org>; Fri, 17 Sep 2010 15:23:48 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 48F188FC12
	for <freebsd-hackers@freebsd.org>; Fri, 17 Sep 2010 15:23:48 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id F210146BA0;
	Fri, 17 Sep 2010 11:23:47 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 1B4BC8A04F;
	Fri, 17 Sep 2010 11:23:47 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Benjamin Kaduk <kaduk@mit.edu>
Date: Fri, 17 Sep 2010 09:02:18 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
	<201009161416.05759.jhb@freebsd.org>
	<alpine.GSO.1.10.1009162317430.9337@multics.mit.edu>
In-Reply-To: <alpine.GSO.1.10.1009162317430.9337@multics.mit.edu>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201009170902.18748.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Fri, 17 Sep 2010 11:23:47 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: freebsd-hackers@freebsd.org
Subject: Re: Questions about mutex implementation in kern/kern_mutex.c
X-List-Received-Date: Fri, 17 Sep 2010 15:23:48 -0000

On Thursday, September 16, 2010 11:24:29 pm Benjamin Kaduk wrote:
> On Thu, 16 Sep 2010, John Baldwin wrote:
> 
> > On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
> >
> >> The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
> >> special to compare the value of m->mtx_lock (volatile) with current thread
> >> pointer, all other functions that update m->mtx_lock of unowned mutex use
> >> compare-and-set instruction.  Also I cannot find anything special in
> >> generated Assembler code for volatile variables (except for ia64 where
> >> acquire loads and release stores are used).
> >
> > No, mtx_owned() is just not harmed by the races it loses.  You can certainly
> > read a stale value of mtx_lock in mtx_owned() if some other thread owns the
> > lock or has just released the lock.  However, we don't care, because in both
> > of those cases, mtx_owned() returns false.  What does matter is that
> > mtx_owned() can only return true if we currently hold the mutex.  This works
> > because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the
> > same time, and 2) even CPUs that hold writes in store buffers will snoop their
> > store buffer for local reads on that CPU.  That is, a given CPU will never
> > read a stale value of a memory word that is "older" than a write it has
> > performed to that word.
> 
> Sorry for the naive question, but would you mind expounding a bit on what 
> keeps the thread from migrating to a different CPU and getting a stale 
> value there?  (I can imagine a couple possible mechanisms, but don't know 
> enough to know which one(s) are the real ones.)

The memory barriers in the thread_lock() / thread_unlock() pair of a context
switch ensure that any writes posted by the thread before it performs a context
switch will be visible on the "new" CPU before the thread resumes execution.

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG  Fri Sep 17 17:42:45 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 858E71065694;
	Fri, 17 Sep 2010 17:42:45 +0000 (UTC)
	(envelope-from simon@comsys.ntu-kpi.kiev.ua)
Received: from comsys.kpi.ua (comsys.kpi.ua [77.47.192.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 1EBBF8FC15;
	Fri, 17 Sep 2010 17:42:45 +0000 (UTC)
Received: from pm513-1.comsys.kpi.ua ([10.18.52.101]
	helo=pm513-1.comsys.ntu-kpi.kiev.ua)
	by comsys.kpi.ua with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.63)
	(envelope-from <simon@comsys.ntu-kpi.kiev.ua>)
	id 1OwexQ-0004Vx-89; Fri, 17 Sep 2010 20:42:44 +0300
Received: by pm513-1.comsys.ntu-kpi.kiev.ua (Postfix, from userid 1001)
	id B27531CC1E; Fri, 17 Sep 2010 20:42:44 +0300 (EEST)
Date: Fri, 17 Sep 2010 20:42:44 +0300
From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: John Baldwin <jhb@freebsd.org>
Message-ID: <20100917174244.GA2570@pm513-1.comsys.ntu-kpi.kiev.ua>
References: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
	<AANLkTimJV-oB_uTSbUTtbSrR5fXgWGk00dEV7L-Gobrf@mail.gmail.com>
	<20100916173307.GA1994@pm513-1.comsys.ntu-kpi.kiev.ua>
	<201009161416.05759.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201009161416.05759.jhb@freebsd.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
X-Authenticated-User: simon@comsys.ntu-kpi.kiev.ua
X-Authenticator: plain
X-Sender-Verify: SUCCEEDED (sender exists & accepts mail)
X-Exim-Version: 4.63 (build at 06-Jan-2007 23:14:37)
X-Date: 2010-09-17 20:42:44
X-Connected-IP: 10.18.52.101:43526
X-Message-Linecount: 105
X-Body-Linecount: 87
X-Message-Size: 5808
X-Body-Size: 4945
Cc: freebsd-hackers@freebsd.org, Matthew Fleming <mdf356@gmail.com>
Subject: Re: Questions about mutex implementation in kern/kern_mutex.c
X-List-Received-Date: Fri, 17 Sep 2010 17:42:45 -0000

On Thu, Sep 16, 2010 at 02:16:05PM -0400, John Baldwin wrote:
> On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
>
> > "Current" value means that the value of a variable read by one thread
> > is equal to the value of this variable successfully updated by another
> > thread by the compare-and-set instruction.  As I understand from the kernel
> > source code, atomic_cmpset_ptr() allows to update a variable in a way that
> > all other CPUs will invalidate corresponding cache lines that contain
> > the value of this variable.
> 
> That is not true.  It is likely true on x86, but it is certainly not true on
> other architectures such as sparc64 where a write may be held in a store 
> buffer for an indeterminate amount of time (and note that some lock releases 
> are simple stores with a "rel" memory barrier).  All that we require is that 
> if the value is stale, the atomic_cmpset() that attempts to set MTX_CONTESTED 
> will fail.

I missed the _release_lock_quick() call in _mtx_unlock_sleep().

> 
> > The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
> > special to compare the value of m->mtx_lock (volatile) with current thread
> > pointer, all other functions that update m->mtx_lock of unowned mutex use
> > compare-and-set instruction.  Also I cannot find anything special in
> > generated Assembler code for volatile variables (except for ia64 where
> > acquire loads and release stores are used).
> 
> No, mtx_owned() is just not harmed by the races it loses.  You can certainly 
> read a stale value of mtx_lock in mtx_owned() if some other thread owns the 
> lock or has just released the lock.  However, we don't care, because in both 
> of those cases, mtx_owned() returns false.  What does matter is that 
> mtx_owned() can only return true if we currently hold the mutex.  This works 
> because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the 
> same time, and 2) even CPUs that hold writes in store buffers will snoop their 
> store buffer for local reads on that CPU.  That is, a given CPU will never 
> read a stale value of a memory word that is "older" than a write it has 
> performed to that word.

I think I now understand why mtx_owned() works correctly whether or not
mtx_lock is present in the CPU cache.  The mtx_lock value can definitely
tell whether the lock is held by the current thread, but it cannot tell
whether the lock is unowned or owned by another thread.

Let me ask one more question about memory barriers and thread migration.

Suppose a thread locked a mutex, modified shared data protected by this
mutex, and was then migrated from CPU1 to CPU2 (with the mutex still locked).
In this scenario the just-migrated thread will not see stale data on CPU2,
either for the mutex itself (the m->mtx_lock value) or for the shared data,
because when it was migrated from CPU1 there was at least one unlock call for
some other mutex that had release semantics, and the appropriate memory
barrier instruction was run implicitly or explicitly.  As a result this "rel"
memory barrier made all modifications from CPU1 visible on other CPUs.  When
CPU2 switched to the just-migrated thread there was at least one lock call
for some other mutex with acquire semantics, so the "rel/acq" memory barrier
pair works together here.  (I also consider the case where CPU2 did not work
with that mutex, but worked with its memory before: some thread on CPU2 could
have allocated some memory, worked with it, and freed it, and later the same
part of memory was allocated by a thread on CPU1 for the mutex.)

Is the above description correct?

This logic of memory barriers is described in detail in the SPARC V9
documentation, in the description of the MEMBAR instruction.  MEMBAR with the
appropriate masks is in fact used in atomic.h for this architecture.  As I
understand it, the same logic for memory barriers (atomic_..._rel and
atomic_..._acq) applies to all other architectures.  Otherwise I do not
understand how the mtx_lock() and mtx_unlock() pair can protect data and
ensure that a thread that has locked a mutex will see correct (not stale)
data protected by this mutex.
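
The property can be sketched in portable C11, under loud assumptions: this
is not the code in kern_mutex.c, the toy_* names are made up, and the lock
word here holds only an owner id (the real mtx_lock also carries flag bits).
Acquisition uses compare-and-set with acquire ordering, release is a plain
store with release ordering, and the ownership test is an ordinary load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define	TOY_UNOWNED	((uintptr_t)0)

struct toy_mtx {
	_Atomic uintptr_t lock;		/* owner id, or TOY_UNOWNED */
};

/* Acquire: succeeds only if the word currently reads as unowned. */
static bool
toy_trylock(struct toy_mtx *m, uintptr_t self)
{
	uintptr_t expected = TOY_UNOWNED;

	assert(self != TOY_UNOWNED);
	return (atomic_compare_exchange_strong_explicit(&m->lock, &expected,
	    self, memory_order_acquire, memory_order_relaxed));
}

/* Release: a simple store with release ordering, no compare-and-set. */
static void
toy_unlock(struct toy_mtx *m)
{
	atomic_store_explicit(&m->lock, TOY_UNOWNED, memory_order_release);
}

/*
 * Ownership test: a relaxed load is enough.  A stale value can only be
 * "unowned" or some other owner's id, and both compare unequal to self,
 * so the answer "true" is only possible when self really holds the lock.
 */
static bool
toy_owned(struct toy_mtx *m, uintptr_t self)
{
	return (atomic_load_explicit(&m->lock, memory_order_relaxed) == self);
}
```

Note that toy_owned() returning false says nothing about whether the lock is
free or held by someone else, which is exactly the limitation stated above.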

> > There are some places in the kernel where a variable is updated in
> > something like "do { v = value; } while (!atomic_cmpset_int(&value, ...));"
> > and that variable is not "volatile", but the compiler generates correct
> > Assembler code.  So "volatile" is not a requirement for all cases.
> 
> Hmm, I suspect that many of those places actually do use volatile.  The 
> various lock cookies (mtx_lock, etc.) are declared volatile in the structure.  
> Otherwise the compiler would be free to conclude that 'v = value;' is a loop 
> invariant and move it out of the loop which would break.  Given that, the 
> construct you referred to does in fact require 'value' to be volatile.

I checked Assembler code for these functions:

kern/subr_msgbuf.c:msgbuf_addchar()
vm/vm_map.c:vmspace_free()

Thank you for your answers.

From owner-freebsd-hackers@FreeBSD.ORG  Fri Sep 17 21:11:23 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8C1441065670
	for <freebsd-hackers@freebsd.org>; Fri, 17 Sep 2010 21:11:23 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 56AD78FC08
	for <freebsd-hackers@freebsd.org>; Fri, 17 Sep 2010 21:11:23 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net
	[66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id D29E946B66;
	Fri, 17 Sep 2010 17:11:22 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id EBA198A03C;
	Fri, 17 Sep 2010 17:11:21 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
Date: Fri, 17 Sep 2010 17:11:21 -0400
User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; )
References: <20100915134415.GA23727@pm513-1.comsys.ntu-kpi.kiev.ua>
	<201009161416.05759.jhb@freebsd.org>
	<20100917174244.GA2570@pm513-1.comsys.ntu-kpi.kiev.ua>
In-Reply-To: <20100917174244.GA2570@pm513-1.comsys.ntu-kpi.kiev.ua>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201009171711.21307.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Fri, 17 Sep 2010 17:11:21 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: freebsd-hackers@freebsd.org, Matthew Fleming <mdf356@gmail.com>
Subject: Re: Questions about mutex implementation in kern/kern_mutex.c
X-List-Received-Date: Fri, 17 Sep 2010 21:11:23 -0000

On Friday, September 17, 2010 1:42:44 pm Andrey Simonenko wrote:
> On Thu, Sep 16, 2010 at 02:16:05PM -0400, John Baldwin wrote:
> > On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
> > > The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
> > > special to compare the value of m->mtx_lock (volatile) with current thread
> > > pointer, all other functions that update m->mtx_lock of unowned mutex use
> > > compare-and-set instruction.  Also I cannot find anything special in
> > > generated Assembler code for volatile variables (except for ia64 where
> > > acquire loads and release stores are used).
> > 
> > No, mtx_owned() is just not harmed by the races it loses.  You can certainly 
> > read a stale value of mtx_lock in mtx_owned() if some other thread owns the 
> > lock or has just released the lock.  However, we don't care, because in both 
> > of those cases, mtx_owned() returns false.  What does matter is that 
> > mtx_owned() can only return true if we currently hold the mutex.  This works 
> > because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the 
> > same time, and 2) even CPUs that hold writes in store buffers will snoop their 
> > store buffer for local reads on that CPU.  That is, a given CPU will never 
> > read a stale value of a memory word that is "older" than a write it has 
> > performed to that word.
> 
> Looks like I understand the logic why mtx_owned() works correctly when
> mtx_lock is present in CPU cache or is absent in CPU cache.  The mtx_lock
> value definitely can say whether lock is held by the current thread, but
> it cannot say whether it is unowned or is owned by another thread.
> 
> Let me ask another one question about memory barriers and thread migration.
> 
> Let a thread locked a mutex, modified shared data protected by this mutex
> and was migrated from CPU1 to CPU2 (mutex is still locked).  In this scenario
> just migrated thread will not see stale data for a mutex itself (the
> m->mtx_lock value) and for shared data on CPU2 because when it was migrated
> from CPU1 there was at least one unlock call for some another mutex that had
> release semantics and appropriate memory barrier instruction was run
> implicitly or explicitly.  As a result this "rel" memory barrier made all
> modifications from CPU1 visible on another CPUs.  When CPU2 switched to just
> migrated thread there was at least on lock call for some another mutex with
> acquire semantics, so "rel/acq" memory barriers pair works here together.
> (Also I consider case when CPU2 did not work with that mutex, but worked
> with its memory before.  Some thread on CPU2 could allocate some memory,
> worked with it and freed it.  Later the same part of memory was allocated
> by a thread on CPU1 for mutex).
> 
> Is the above written description correct?

Yes.

> > > There are some places in the kernel where a variable is updated in
> > > something like "do { v = value; } while (!atomic_cmpset_int(&value, ...));"
> > > and that variable is not "volatile", but the compiler generates correct
> > > Assembler code.  So "volatile" is not a requirement for all cases.
> > 
> > Hmm, I suspect that many of those places actually do use volatile.  The 
> > various lock cookies (mtx_lock, etc.) are declared volatile in the structure.  
> > Otherwise the compiler would be free to conclude that 'v = value;' is a loop 
> > invariant and move it out of the loop which would break.  Given that, the 
> > construct you referred to does in fact require 'value' to be volatile.
> 
> I checked Assembler code for these functions:
> 
> kern/subr_msgbuf.c:msgbuf_addchar()
> vm/vm_map.c:vmspace_free()

They may happen to accidentally work because atomic_cmpset() clobbers all of
memory, but these should be marked volatile.

Index: vm/vm_map.c
===================================================================
--- vm/vm_map.c	(revision 212801)
+++ vm/vm_map.c	(working copy)
@@ -343,10 +343,7 @@
 	if (vm->vm_refcnt == 0)
 		panic("vmspace_free: attempt to free already freed vmspace");
 
-	do
-		refcnt = vm->vm_refcnt;
-	while (!atomic_cmpset_int(&vm->vm_refcnt, refcnt, refcnt - 1));
-	if (refcnt == 1)
+	if (atomic_fetchadd_int(&vm->vm_refcnt, -1) == 1)
 		vmspace_dofree(vm);
 }
 
Index: vm/vm_map.h
===================================================================
--- vm/vm_map.h	(revision 212801)
+++ vm/vm_map.h	(working copy)
@@ -237,7 +237,7 @@
 	caddr_t vm_taddr;	/* (c) user virtual address of text */
 	caddr_t vm_daddr;	/* (c) user virtual address of data */
 	caddr_t vm_maxsaddr;	/* user VA at max stack growth */
-	int	vm_refcnt;	/* number of references */
+	volatile int vm_refcnt;	/* number of references */
 	/*
 	 * Keep the PMAP last, so that CPU-specific variations of that
 	 * structure on a single architecture don't result in offset
Index: sys/msgbuf.h
===================================================================
--- sys/msgbuf.h	(revision 212801)
+++ sys/msgbuf.h	(working copy)
@@ -38,7 +38,7 @@
 #define	MSG_MAGIC	0x063062
 	u_int	msg_magic;
 	u_int	msg_size;		/* size of buffer area */
-	u_int	msg_wseq;		/* write sequence number */
+	volatile u_int msg_wseq;	/* write sequence number */
 	u_int	msg_rseq;		/* read sequence number */
 	u_int	msg_cksum;		/* checksum of contents */
 	u_int	msg_seqmod;		/* range for sequence numbers */
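
For comparison, the two shapes of the reference drop can be sketched in
portable C11.  This is an illustration only: it uses <stdatomic.h> rather
than the kernel's atomic(9) primitives, and the toy_* names are invented.
Both functions report whether the caller dropped the last reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Loop form: the counter must be re-read on every iteration. */
static bool
toy_release_cas(atomic_int *refcnt)
{
	int old;

	assert(atomic_load(refcnt) > 0);
	do {
		old = atomic_load(refcnt);	/* must not be hoisted */
	} while (!atomic_compare_exchange_weak(refcnt, &old, old - 1));
	return (old == 1);
}

/* Single-operation form: fetch-and-subtract returns the previous value. */
static bool
toy_release_fetchadd(atomic_int *refcnt)
{
	assert(atomic_load(refcnt) > 0);
	return (atomic_fetch_sub(refcnt, 1) == 1);
}
```

With an atomic type the reload inside the loop cannot be optimized away; the
fetch-and-add form avoids the loop, and with it the question of hoisting,
altogether.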

-- 
John Baldwin

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 04:01:09 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9F356106564A;
	Sat, 18 Sep 2010 04:01:09 +0000 (UTC) (envelope-from kaduk@mit.edu)
Received: from dmz-mailsec-scanner-4.mit.edu (DMZ-MAILSEC-SCANNER-4.MIT.EDU
	[18.9.25.15]) by mx1.freebsd.org (Postfix) with ESMTP id 304138FC12;
	Sat, 18 Sep 2010 04:01:08 +0000 (UTC)
X-AuditID: 1209190f-b7bf7ae00000628e-91-4c9439882650
Received: from mailhub-auth-3.mit.edu ( [18.9.21.43])
	by dmz-mailsec-scanner-4.mit.edu (Symantec Brightmail Gateway) with
	SMTP id 07.91.25230.889349C4; Sat, 18 Sep 2010 00:01:12 -0400 (EDT)
Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])
	by mailhub-auth-3.mit.edu (8.13.8/8.9.2) with ESMTP id o8I417ZW026321; 
	Sat, 18 Sep 2010 00:01:07 -0400
Received: from multics.mit.edu (MULTICS.MIT.EDU [18.187.1.73])
	(authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU)
	by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id o8I415po019277
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT);
	Sat, 18 Sep 2010 00:01:07 -0400 (EDT)
Received: (from kaduk@localhost) by multics.mit.edu (8.12.9.20060308)
	id o8I414dO023384; Sat, 18 Sep 2010 00:01:04 -0400 (EDT)
Date: Sat, 18 Sep 2010 00:01:04 -0400 (EDT)
From: Benjamin Kaduk <kaduk@MIT.EDU>
To: kientzle@freebsd.org, kaiw@freebsd.org
In-Reply-To: <20100829201050.GA60715@stack.nl>
Message-ID: <alpine.GSO.1.10.1009032036310.9337@multics.mit.edu>
References: <alpine.GSO.1.10.1008281833470.9337@multics.mit.edu>
	<20100829201050.GA60715@stack.nl>
User-Agent: Alpine 1.10 (GSO 962 2008-03-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
X-Brightmail-Tracker: AAAAAA==
Cc: freebsd-hackers@freebsd.org, Jilles Tjoelker <jilles@stack.nl>
Subject: Re: ar(1) format_decimal failure is fatal?
X-List-Received-Date: Sat, 18 Sep 2010 04:01:09 -0000

On Sun, 29 Aug 2010, Jilles Tjoelker wrote:

> On Sat, Aug 28, 2010 at 07:08:34PM -0400, Benjamin Kaduk wrote:
>> [...]
>> building static egacy library
>> ar: fatal: Numeric user ID too large
>> *** Error code 70
>
>> This error appears to be coming from
>> lib/libarchive/archive_write_set_format_ar.c , which seems to only have
>> provisions for outputting a user ID in AR_uid_size = 6 columns.
[...]
>> It looks like this macro was so defined in version 1.1 of that file, with
>> commit message "'ar' format support for libarchive, contributed by Kai
>> Wang.".  This doesn't make it terribly clear whether the 'ar' format
>> mandates this length, or if it is an implementation decision, so I get to
>> ask: what reasoning (if any) was behind this choice?  Would anything break
>> if it was bumped up to a larger size?  Are there other options for a
>> workaround in my AFS environment?
>
> I wonder if the uid/gid fields are useful at all for ar archives. Ar
> archives are usually not extracted, and when they are, the current
> user's values seem good enough. The uid/gid also prevent exactly
> reproducible builds (together with the timestamp).

GNU binutils has recently (well, March 2009) added a -D ("deterministic") 
argument to ar(1) which sets the timestamp, uid, and gid to zero, and the 
mode to 644.  If that argument is not given, Linux's ar(1) happily uses my 
8-digit uid as-is; the manual page seems to imply that it will handle 15 
or 16 digits in that field.
Solaris' ar(1) caps large uids to 600001.
On OS X, the value is wrapped modulo a power of two whose exponent is less 
than 26, showing up in the archive as 271 (33554703 = 271 + 2^25).
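
The arithmetic can be checked with a small helper (toy_wraps_to is a made-up
name, and masking to a bit width is only my assumption about how the wrap
happens).  It shows that this one observation does not pin down the width:
any width from 9 to 25 bits maps 33554703 to 271.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if keeping only the low 'bits' bits of value yields 'seen'. */
static bool
toy_wraps_to(uint32_t value, unsigned bits, uint32_t seen)
{
	assert(bits >= 1 && bits <= 31);
	return ((value & ((UINT32_C(1) << bits) - 1)) == seen);
}
```

Since 271 needs 9 bits and the only other set bit in 33554703 is bit 25, an
8-bit mask loses part of 271 and a 26-bit mask leaves the value unchanged.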

In no cases that I tried was a large uid a fatal error; I'm not really 
convinced that it should be fatal for FreeBSD.


Poking at the source, it seems this stems from usr.bin/ar/write.c's use of 
the AC() macro, defined in ar.h:
#define AC(CALL) do {                                   \
         if ((CALL))                                     \
                 bsdar_errc(bsdar, EX_SOFTWARE, 0, "%s", \
                     archive_error_string(a));           \
} while (0)

archive_write_header() is always called within this macro, and the 
relevant implementation (archive_write_ar_header() in 
libarchive/archive_write_set_format_ar.c) immediately returns 
ARCHIVE_WARN if the format_decimal() call fails.  Other places in the 
libarchive code actually use the distinction between ARCHIVE_OK, 
ARCHIVE_WARN, and ARCHIVE_FATAL (and friends); I think that it would be 
pretty easy to modify format_decimal() (and probably its cousins) to use 
that convention instead of just -1 and 0.  It already does a reasonable 
thing in the case of overflow (write the maximum value), it just does not 
distinguish between the different possible errors.

I propose that format_{decimal,octal}() return ARCHIVE_FAILED for negative 
input, and ARCHIVE_WARN for overflow.  archive_write_ar_header() can then 
catch ARCHIVE_WARN from the format_foo functions and continue on, 
propagating the ARCHIVE_WARN return value at the end of its execution 
instead of bailing immediately.  ar/write.c would also need to be changed, 
calling archive_write_header without the AC macro and dealing with the 
ARCHIVE_WARN return value case, presumably by writing 
archive_error_string(a) to stderr and continuing.
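
A rough sketch of the proposed convention, to make it concrete.  This is not
the libarchive source: the toy_* functions and TOY_* constants are stand-ins
(the numeric values are placeholders ordered so that a smaller value is more
severe), and the 6-column field width is taken from the AR_uid_size
discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for ARCHIVE_OK / ARCHIVE_WARN / ARCHIVE_FAILED. */
enum { TOY_OK = 0, TOY_WARN = -20, TOY_FAILED = -25 };

/*
 * Write v as left-justified, space-padded decimal into exactly s bytes
 * (no NUL terminator, as in an ar header field).
 */
static int
toy_format_decimal(int64_t v, char *p, size_t s)
{
	char digits[20];	/* INT64_MAX has 19 digits */
	size_t n = 0, i;

	assert(p != NULL && s > 0);
	if (v < 0) {
		memset(p, '0', s);	/* negative input: hard failure */
		return (TOY_FAILED);
	}
	do {
		digits[n++] = (char)('0' + v % 10);
		v /= 10;
	} while (v > 0);
	if (n > s) {
		memset(p, '9', s);	/* overflow: saturate and warn */
		return (TOY_WARN);
	}
	for (i = 0; i < n; i++)
		p[i] = digits[n - 1 - i];
	memset(p + n, ' ', s - n);
	return (TOY_OK);
}

/* Format both ids; stop on failure, otherwise carry a warning to the end. */
static int
toy_write_ids(int64_t uid, int64_t gid, char uid_field[6], char gid_field[6])
{
	int r, ret = TOY_OK;

	r = toy_format_decimal(uid, uid_field, 6);
	if (r == TOY_FAILED)
		return (r);
	if (r < ret)
		ret = r;
	r = toy_format_decimal(gid, gid_field, 6);
	if (r == TOY_FAILED)
		return (r);
	if (r < ret)
		ret = r;
	return (ret);
}
```

A caller in the role of ar/write.c would then print a message and continue
on TOY_WARN, and abort only on TOY_FAILED.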

Would (one of) you be willing to review a patch to that effect?

Thanks,

Ben Kaduk

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 08:02:33 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C363D1065670;
	Sat, 18 Sep 2010 08:02:33 +0000 (UTC)
	(envelope-from kientzle@freebsd.org)
Received: from monday.kientzle.com (99-115-135-74.uvs.sntcca.sbcglobal.net
	[99.115.135.74])
	by mx1.freebsd.org (Postfix) with ESMTP id D027D8FC19;
	Sat, 18 Sep 2010 08:02:32 +0000 (UTC)
Received: from [10.123.2.180] (DIR-655 [192.168.1.65])
	by monday.kientzle.com (8.14.3/8.14.3) with ESMTP id o8I7Ops0073710;
	Sat, 18 Sep 2010 07:24:51 GMT (envelope-from kientzle@freebsd.org)
Mime-Version: 1.0 (Apple Message framework v1081)
Content-Type: text/plain; charset=us-ascii
From: Tim Kientzle <kientzle@freebsd.org>
In-Reply-To: <alpine.GSO.1.10.1009032036310.9337@multics.mit.edu>
Date: Sat, 18 Sep 2010 00:24:51 -0700
Content-Transfer-Encoding: quoted-printable
Message-Id: <F56D9CB9-E644-4279-8830-71292C880D9B@freebsd.org>
References: <alpine.GSO.1.10.1008281833470.9337@multics.mit.edu>
	<20100829201050.GA60715@stack.nl>
	<alpine.GSO.1.10.1009032036310.9337@multics.mit.edu>
To: Benjamin Kaduk <kaduk@MIT.EDU>
X-Mailer: Apple Mail (2.1081)
Cc: freebsd-hackers@freebsd.org, kaiw@freebsd.org,
	Jilles Tjoelker <jilles@stack.nl>
Subject: Re: ar(1) format_decimal failure is fatal?
X-List-Received-Date: Sat, 18 Sep 2010 08:02:33 -0000


On Sep 17, 2010, at 9:01 PM, Benjamin Kaduk wrote:

> On Sun, 29 Aug 2010, Jilles Tjoelker wrote:
>
>> On Sat, Aug 28, 2010 at 07:08:34PM -0400, Benjamin Kaduk wrote:
>>> [...]
>>> building static egacy library
>>> ar: fatal: Numeric user ID too large
>>> *** Error code 70
>>
>>> This error appears to be coming from
>>> lib/libarchive/archive_write_set_format_ar.c , which seems to only have
>>> provisions for outputting a user ID in AR_uid_size = 6 columns.
> [...]
>>> It looks like this macro was so defined in version 1.1 of that file, with
>>> commit message "'ar' format support for libarchive, contributed by Kai
>>> Wang.".  This doesn't make it terribly clear whether the 'ar' format
>>> mandates this length, or if it is an implementation decision...

There's no official standard for the ar format, only old
conventions and compatibility with legacy implementations.

>> I wonder if the uid/gid fields are useful at all for ar archives. Ar
>> archives are usually not extracted, and when they are, the current
>> user's values seem good enough. The uid/gid also prevent exactly
>> reproducible builds (together with the timestamp).
>
> GNU binutils has recently (well, March 2009) added a -D ("deterministic")
> argument to ar(1) which sets the timestamp, uid, and gid to zero, and the
> mode to 644.  If that argument is not given, linux's ar(1) happily uses my
> 8-digit uid as-is; the manual page seems to imply that it will handle 15
> or 16 digits in that field.

Please send me a small example file...  I don't think I've seen
this format variant.  Maybe we can extend our ar(1) to support
this variant.

Personally, I wonder if it wouldn't make sense to just always
force the timestamp, uid, and gid to zero.  I find it hard
to believe anyone is using ar(1) as a general-purpose archiving
tool.  Of course, it should be trivial to add -D support to our ar(1).

> I propose that format_{decimal,octal}() return ARCHIVE_FAILED for negative
> input, and ARCHIVE_WARN for overflow.  archive_write_ar_header() can then
> catch ARCHIVE_WARN from the format_foo functions and continue on,
> propagating the ARCHIVE_WARN return value at the end of its execution ...

This sounds entirely reasonable to me.  I personally don't see much
advantage to distinguishing negative versus overflow, but certainly
have no objections to that part.  Definitely ar(1) should not abort on
a simple ARCHIVE_WARN.

> Would (one of) you be willing to review a patch to that effect?

Happy to do so.

Cheers,

Tim


From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 11:23:40 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 559B2106566C
	for <freebsd-hackers@FreeBSD.org>; Sat, 18 Sep 2010 11:23:40 +0000 (UTC)
	(envelope-from avg@icyb.net.ua)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 9ADA68FC08
	for <freebsd-hackers@FreeBSD.org>; Sat, 18 Sep 2010 11:23:39 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA05606
	for <freebsd-hackers@FreeBSD.org>;
	Sat, 18 Sep 2010 14:23:37 +0300 (EEST)
	(envelope-from avg@icyb.net.ua)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1OwvW5-000CwJ-7j
	for freebsd-hackers@FreeBSD.org; Sat, 18 Sep 2010 14:23:37 +0300
Message-ID: <4C94A138.8050905@icyb.net.ua>
Date: Sat, 18 Sep 2010 14:23:36 +0300
From: Andriy Gapon <avg@icyb.net.ua>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.9) Gecko/20100912 Lightning/1.0b2 Thunderbird/3.1.3
MIME-Version: 1.0
To: freebsd-hackers@FreeBSD.org
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: 
Subject: KDB_TRACE and no backend
X-List-Received-Date: Sat, 18 Sep 2010 11:23:40 -0000


Here's a small patch that adds support for printing a stack trace in the form
of frame addresses when KDB_TRACE is enabled but no debugger backend is
configured.  The patch is styled after the "cheap" variant of stack_ktr.

What do you think (useful/useless, correct, etc.)?

--- a/sys/kern/subr_kdb.c
+++ b/sys/kern/subr_kdb.c
@@ -37,6 +37,7 @@
 #include <sys/pcpu.h>
 #include <sys/proc.h>
 #include <sys/smp.h>
+#include <sys/stack.h>
 #include <sys/sysctl.h>

 #include <machine/kdb.h>
@@ -295,10 +296,16 @@
 void
 kdb_backtrace(void)
 {
+	struct stack st;
+	int i;

-	if (kdb_dbbe != NULL && kdb_dbbe->dbbe_trace != NULL) {
-		printf("KDB: stack backtrace:\n");
+	printf("KDB: stack backtrace:\n");
+	if (kdb_dbbe != NULL && kdb_dbbe->dbbe_trace != NULL)
 		kdb_dbbe->dbbe_trace();
+	else {
+		stack_save(&st);
+		for (i = 0; i < st.depth; i++)
+			printf("#%d %p\n", i, (void*)(uintptr_t)st.pcs[i]);
 	}
 }




-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 11:23:48 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EAC9C1065697;
	Sat, 18 Sep 2010 11:23:48 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id C44A18FC1F;
	Sat, 18 Sep 2010 11:23:48 +0000 (UTC)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id 75C8546B2D;
	Sat, 18 Sep 2010 07:23:48 -0400 (EDT)
Date: Sat, 18 Sep 2010 12:23:48 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Andre Oppermann <andre@freebsd.org>
In-Reply-To: <4C935F56.4030903@freebsd.org>
Message-ID: <alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-hackers@freebsd.org, Jeff Roberson <jeff@freebsd.org>,
	Andriy Gapon <avg@freebsd.org>
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 11:23:49 -0000


On Fri, 17 Sep 2010, Andre Oppermann wrote:

>> Although keeping free items around improves performance, it does consume 
>> memory too.  And the fact that that memory is not freed on lowmem condition 
>> makes the situation worse.
>
> Interesting.  We may run into related issues with excessive mbuf (cluster) 
> caching in the per-cpu buckets as well.
>
> Having a general solutions for that is appreciated.  Maybe the size of the 
> free per-cpu buckets should be specified when setting up the UMA zone.  Of 
> certain frequently re-used elements we may want to cache more, other less.

I've been keeping a vague eye out for this over the last few years, and 
haven't spotted many problems in production machines I've inspected.  You can 
use the umastat tool in the tools tree to look at the distribution of memory 
over buckets (etc) in UMA manually.  It would be nice if it had some automated 
statistics on fragmentation however.  Short-lived fragmentation is likely, and 
isn't an issue, so what you want is a tool that monitors over time and reports 
on longer-lived fragmentation.

The main fragmentation issue we've had in the past has been due to 
mbuf+cluster caching, which prevented mbufs from being freed usefully in some 
cases.  Jeff's ongoing work on variable-sized mbufs would entirely eliminate 
that problem...

Robert

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 11:27:46 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7C5BF1065670;
	Sat, 18 Sep 2010 11:27:46 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 8CB628FC17;
	Sat, 18 Sep 2010 11:27:45 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA05660;
	Sat, 18 Sep 2010 14:27:44 +0300 (EEST)
	(envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1Owva3-000Cwf-Oi; Sat, 18 Sep 2010 14:27:43 +0300
Message-ID: <4C94A22F.1070608@freebsd.org>
Date: Sat, 18 Sep 2010 14:27:43 +0300
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.9) Gecko/20100912 Lightning/1.0b2 Thunderbird/3.1.3
MIME-Version: 1.0
To: Robert Watson <rwatson@freebsd.org>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
	<alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
In-Reply-To: <alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 11:27:46 -0000

on 18/09/2010 14:23 Robert Watson said the following:
> I've been keeping a vague eye out for this over the last few years, and haven't
> spotted many problems in production machines I've inspected.  You can use the
> umastat tool in the tools tree to look at the distribution of memory over
> buckets (etc) in UMA manually.  It would be nice if it had some automated
> statistics on fragmentation however.  Short-lived fragmentation is likely, and
> isn't an issue, so what you want is a tool that monitors over time and reports
> on longer-lived fragmentation.
> 
> The main fragmentation issue we've had in the past has been due to mbuf+cluster
> caching, which prevented mbufs from being freed usefully in some cases.  Jeff's
> ongoing work on variable-sized mbufs would entirely eliminate that problem...

Robert,

Just in case: this thread is not about fragmentation; it's about per-cpu
buckets, the number of items in them, and the size of the items.

-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 11:30:44 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D7930106566C;
	Sat, 18 Sep 2010 11:30:44 +0000 (UTC)
	(envelope-from rwatson@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id B11DC8FC08;
	Sat, 18 Sep 2010 11:30:44 +0000 (UTC)
Received: from [127.0.0.1] (rhee.cl.cam.ac.uk [128.232.1.202])
	by cyrus.watson.org (Postfix) with ESMTPSA id BE70246B2D;
	Sat, 18 Sep 2010 07:30:43 -0400 (EDT)
Mime-Version: 1.0 (Apple Message framework v1081)
Content-Type: text/plain; charset=us-ascii
From: "Robert N. M. Watson" <rwatson@freebsd.org>
In-Reply-To: <4C94A22F.1070608@freebsd.org>
Date: Sat, 18 Sep 2010 12:30:41 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <52AE93F3-D15F-40C9-A9CA-07F30C803B81@freebsd.org>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
	<alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
	<4C94A22F.1070608@freebsd.org>
To: Andriy Gapon <avg@freebsd.org>
X-Mailer: Apple Mail (2.1081)
Cc: freebsd-hackers@freebsd.org
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 11:30:45 -0000


On 18 Sep 2010, at 12:27, Andriy Gapon wrote:

> on 18/09/2010 14:23 Robert Watson said the following:
>> I've been keeping a vague eye out for this over the last few years, and haven't
>> spotted many problems in production machines I've inspected.  You can use the
>> umastat tool in the tools tree to look at the distribution of memory over
>> buckets (etc) in UMA manually.  It would be nice if it had some automated
>> statistics on fragmentation however.  Short-lived fragmentation is likely, and
>> isn't an issue, so what you want is a tool that monitors over time and reports
>> on longer-lived fragmentation.
>>
>> The main fragmentation issue we've had in the past has been due to mbuf+cluster
>> caching, which prevented mbufs from being freed usefully in some cases.  Jeff's
>> ongoing work on variable-sized mbufs would entirely eliminate that problem...
>
> just in case, this thread is not about fragmentation, it's about per-cpu
> buckets, number of items in them and size of the items.

Those issues are closely related, and in particular, I wanted to point
Andre at umastat since he's probably not aware of it. :-)

Robert

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 12:49:13 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 39C7F106566B;
	Sat, 18 Sep 2010 12:49:13 +0000 (UTC)
	(envelope-from freebsd-listen@fabiankeil.de)
Received: from smtprelay01.ispgateway.de (smtprelay01.ispgateway.de
	[80.67.29.23]) by mx1.freebsd.org (Postfix) with ESMTP id BC0848FC0C;
	Sat, 18 Sep 2010 12:49:12 +0000 (UTC)
Received: from [87.79.159.189] (helo=r500.local)
	by smtprelay01.ispgateway.de with esmtpsa (TLSv1:AES128-SHA:128)
	(Exim 4.68) (envelope-from <freebsd-listen@fabiankeil.de>)
	id 1Owweq-0004gU-7a; Sat, 18 Sep 2010 14:36:44 +0200
Date: Sat, 18 Sep 2010 14:35:16 +0200
From: Fabian Keil <freebsd-listen@fabiankeil.de>
To: Robert Watson <rwatson@FreeBSD.org>
Message-ID: <20100918143516.3568f40e@r500.local>
In-Reply-To: <alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
	<alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
X-Mailer: Claws Mail 3.7.6 (GTK+ 2.20.1; amd64-portbld-freebsd9.0)
X-PGP-KEY-URL: http://www.fabiankeil.de/gpg-keys/freebsd-listen-2008-08-18.asc
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=PGP-SHA1;
	boundary="Sig_/j0IIO6G0OvbQXQCJQewu8.K";
	protocol="application/pgp-signature"
X-Df-Sender: 775067
Cc: freebsd-hackers@freebsd.org
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 12:49:13 -0000

--Sig_/j0IIO6G0OvbQXQCJQewu8.K
Content-Type: multipart/mixed; boundary="MP_/V45ylbNW9Sv144uke8uKVXp"

--MP_/V45ylbNW9Sv144uke8uKVXp
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Robert Watson <rwatson@FreeBSD.org> wrote:

> On Fri, 17 Sep 2010, Andre Oppermann wrote:
>
> >> Although keeping free items around improves performance, it does consume
> >> memory too.  And the fact that that memory is not freed on lowmem condition
> >> makes the situation worse.
> >
> > Interesting.  We may run into related issues with excessive mbuf (cluster)
> > caching in the per-cpu buckets as well.
> >
> > Having a general solutions for that is appreciated.  Maybe the size of the
> > free per-cpu buckets should be specified when setting up the UMA zone.  Of
> > certain frequently re-used elements we may want to cache more, other less.
>
> I've been keeping a vague eye out for this over the last few years, and
> haven't spotted many problems in production machines I've inspected.  You can
> use the umastat tool in the tools tree to look at the distribution of memory
> over buckets (etc) in UMA manually.

Doesn't build for me on amd64:

fk@r500 /usr/src/tools/tools/umastat $make
Warning: Object directory not changed from original /usr/src/tools/tools/umastat
cc -O2 -pipe  -fno-omit-frame-pointer -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign -c umastat.c
cc1: warnings being treated as errors
umastat.c: In function 'uma_print_bucketlist':
umastat.c:234: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
umastat.c:234: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'uint64_t'
umastat.c: In function 'uma_print_cache':
umastat.c:245: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u_int64_t'
umastat.c:246: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u_int64_t'
umastat.c: In function 'main':
umastat.c:416: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
umastat.c:418: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
umastat.c:420: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
umastat.c:426: warning: dereferencing type-punned pointer will break strict-aliasing rules
umastat.c:429: warning: dereferencing type-punned pointer will break strict-aliasing rules
*** Error code 1

Stop in /usr/src/tools/tools/umastat.

The attached patch seems to work around the problem. I'm not sure whether
the casts to void* are better than decreasing the WARN level, though ...

Fabian

--MP_/V45ylbNW9Sv144uke8uKVXp
Content-Type: text/x-patch
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename=0001-Work-around-umastat-build-failures-on-amd64.patch

=46rom b84b5cf4f24b6886b5db9885f5bea707dcfb11e8 Mon Sep 17 00:00:00 2001
From: Fabian Keil <fk@fabiankeil.de>
Date: Sat, 18 Sep 2010 13:55:54 +0200
Subject: [PATCH] Work around umastat build failures on amd64.

---
 tools/tools/umastat/umastat.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/tools/umastat/umastat.c b/tools/tools/umastat/umastat.c
index 3c9fe0e..639bf80 100644
--- a/tools/tools/umastat/umastat.c
+++ b/tools/tools/umastat/umastat.c
@@ -230,7 +230,7 @@ uma_print_bucketlist(kvm_t *kvm, struct bucketlist *bucketlist,
 	}
 
 	printf("\n");
-	printf("%s};  // total cnt %llu, total entries %llu\n", spaces,
+	printf("%s};  // total cnt %ju, total entries %ju\n", spaces,
 	    total_cnt, total_entries);
 }
 
@@ -242,8 +242,8 @@ uma_print_cache(kvm_t *kvm, struct uma_cache *cache, const char *name,
 	int ret;
 
 	printf("%s%s[%d] = {\n", spaces, name, cpu);
-	printf("%s  uc_frees = %llu;\n", spaces, cache->uc_frees);
-	printf("%s  uc_allocs = %llu;\n", spaces, cache->uc_allocs);
+	printf("%s  uc_frees = %ju;\n", spaces, cache->uc_frees);
+	printf("%s  uc_allocs = %ju;\n", spaces, cache->uc_allocs);
 
 	if (cache->uc_freebucket != NULL) {
 		ret = kread(kvm, cache->uc_freebucket, &ub, sizeof(ub), 0);
@@ -412,20 +412,20 @@ main(int argc, char *argv[])
 			}
 			printf("  Zone {\n");
 			printf("    uz_name = \"%s\";\n", name);
-			printf("    uz_allocs = %llu;\n",
+			printf("    uz_allocs = %ju;\n",
 			    uzp_userspace->uz_allocs);
-			printf("    uz_frees = %llu;\n",
+			printf("    uz_frees = %ju;\n",
 			    uzp_userspace->uz_frees);
-			printf("    uz_fails = %llu;\n",
+			printf("    uz_fails = %ju;\n",
 			    uzp_userspace->uz_fails);
 			printf("    uz_fills = %u;\n",
 			    uzp_userspace->uz_fills);
 			printf("    uz_count = %u;\n",
 			    uzp_userspace->uz_count);
-			uma_print_bucketlist(kvm, (struct bucketlist *)
+			uma_print_bucketlist(kvm, (void *)
 			    &uzp_userspace->uz_full_bucket, "uz_full_bucket",
 			    "    ");
-			uma_print_bucketlist(kvm, (struct bucketlist *)
+			uma_print_bucketlist(kvm, (void *)
 			    &uzp_userspace->uz_free_bucket, "uz_free_bucket",
 			    "    ");
 
-- 
1.7.2.3


--MP_/V45ylbNW9Sv144uke8uKVXp--

--Sig_/j0IIO6G0OvbQXQCJQewu8.K
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)

iEYEARECAAYFAkyUsgkACgkQBYqIVf93VJ16SACfcwYSHrh0IoqMUFODzDrJ9RQZ
9voAoIqzNCiBLm9dpxXbGh0l8WHJEsg2
=MVkL
-----END PGP SIGNATURE-----

--Sig_/j0IIO6G0OvbQXQCJQewu8.K--

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 13:29:14 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 982C61065670;
	Sat, 18 Sep 2010 13:29:14 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id AB1CE8FC15;
	Sat, 18 Sep 2010 13:29:13 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA07099;
	Sat, 18 Sep 2010 16:29:12 +0300 (EEST)
	(envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1OwxTb-000D5K-LP; Sat, 18 Sep 2010 16:29:11 +0300
Message-ID: <4C94BEA7.6040504@freebsd.org>
Date: Sat, 18 Sep 2010 16:29:11 +0300
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.9) Gecko/20100912 Lightning/1.0b2 Thunderbird/3.1.3
MIME-Version: 1.0
To: "Robert N. M. Watson" <rwatson@freebsd.org>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
	<alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
	<4C94A22F.1070608@freebsd.org>
	<52AE93F3-D15F-40C9-A9CA-07F30C803B81@freebsd.org>
In-Reply-To: <52AE93F3-D15F-40C9-A9CA-07F30C803B81@freebsd.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 13:29:14 -0000

on 18/09/2010 14:30 Robert N. M. Watson said the following:
> Those issues are closely related, and in particular, wanted to point Andre at
> umastat since he's probably not aware of it.. :-)

I didn't know about the tool either, so thanks!
But I perceived the issues as quite opposite: small items vs huge items.

-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 13:52:53 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C36281065673
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 13:52:53 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 9B6398FC15
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 13:52:53 +0000 (UTC)
Received: from [127.0.0.1] (rhee.cl.cam.ac.uk [128.232.1.202])
	by cyrus.watson.org (Postfix) with ESMTPSA id 9B05146B09;
	Sat, 18 Sep 2010 09:52:52 -0400 (EDT)
Mime-Version: 1.0 (Apple Message framework v1081)
Content-Type: text/plain; charset=us-ascii
From: "Robert N. M. Watson" <rwatson@FreeBSD.org>
In-Reply-To: <20100918143516.3568f40e@r500.local>
Date: Sat, 18 Sep 2010 14:52:51 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <F100D77A-CE16-40DE-B441-02E702B12686@FreeBSD.org>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
	<alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
	<20100918143516.3568f40e@r500.local>
To: Fabian Keil <freebsd-listen@fabiankeil.de>
X-Mailer: Apple Mail (2.1081)
Cc: freebsd-hackers@freebsd.org
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 13:52:53 -0000


On 18 Sep 2010, at 13:35, Fabian Keil wrote:

> Doesn't build for me on amd64:
>
> fk@r500 /usr/src/tools/tools/umastat $make
> Warning: Object directory not changed from original /usr/src/tools/tools/umastat
> cc -O2 -pipe  -fno-omit-frame-pointer -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign -c umastat.c
> cc1: warnings being treated as errors
> umastat.c: In function 'uma_print_bucketlist':
> umastat.c:234: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
> umastat.c:234: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'uint64_t'
> umastat.c: In function 'uma_print_cache':
> umastat.c:245: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u_int64_t'
> umastat.c:246: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u_int64_t'
> umastat.c: In function 'main':
> umastat.c:416: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
> umastat.c:418: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
> umastat.c:420: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
> umastat.c:426: warning: dereferencing type-punned pointer will break strict-aliasing rules
> umastat.c:429: warning: dereferencing type-punned pointer will break strict-aliasing rules
> *** Error code 1
>
> Stop in /usr/src/tools/tools/umastat.
>
> The attached patch seems to work around the problem, I'm not sure if
> the casts to void* are better than decreasing the WARN level, though ...

This is a 32-bit/64-bit issue. Probably all pointer printing should be
converted to %p, and large integer types to %ju and %jd, perhaps with a
cast first to intmax_t or uintmax_t if required.

Robert

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 15:30:52 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4B44B106566C
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 15:30:52 +0000 (UTC)
	(envelope-from yanegomi@gmail.com)
Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id EDACD8FC17
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 15:30:51 +0000 (UTC)
Received: by iwn34 with SMTP id 34so3412455iwn.13
	for <multiple recipients>; Sat, 18 Sep 2010 08:30:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:sender:received
	:in-reply-to:references:date:x-google-sender-auth:message-id:subject
	:from:to:cc:content-type;
	bh=qEzrj+z/ZxgA+AlwhN+SwIBRyayXTx5qDaFIep448oY=;
	b=s+Oo90rnNJh+oFuhuAhfjBWjfo0nmpLI44fZa1agXtlVS4AFCKSHdTN9yqQKP3+V+l
	bxsdgVOO7ZMzo65O7HlaaKM5E51sfiusKcmXzLudqnQl8cKSDpdkHN9cvB/ZqoMWAJWF
	WyxjWKgPn++VDEBeqfpzSB84LOgD7CcyUS9uI=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type;
	b=GtPl3u1AZTDtIMCNqKL1gU/UMVs53mGQOBv7ihx7wzRbEGiCxt7rwjR/Hd759/00pb
	0nHzyuDIzHQEPJZJ5J4xi+DCEP+E7pe3jIn3p3K3AUWPBBAI79aSysx3q16wyo7bzH4E
	ysYJtTQG7lE8fExMYhjYfVC1aX1qfAf5QbBSA=
MIME-Version: 1.0
Received: by 10.231.193.135 with SMTP id du7mr6087557ibb.176.1284823851253;
	Sat, 18 Sep 2010 08:30:51 -0700 (PDT)
Sender: yanegomi@gmail.com
Received: by 10.231.11.133 with HTTP; Sat, 18 Sep 2010 08:30:51 -0700 (PDT)
In-Reply-To: <F100D77A-CE16-40DE-B441-02E702B12686@FreeBSD.org>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
	<alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
	<20100918143516.3568f40e@r500.local>
	<F100D77A-CE16-40DE-B441-02E702B12686@FreeBSD.org>
Date: Sat, 18 Sep 2010 08:30:51 -0700
X-Google-Sender-Auth: qg-NKjrSM8pHGrs2OZblCgKXerA
Message-ID: <AANLkTimyBMBM1Bfyz3RjQRoUkEeSPuLEZDb6ws0_XQ-o@mail.gmail.com>
From: Garrett Cooper <gcooper@FreeBSD.org>
To: "Robert N. M. Watson" <rwatson@freebsd.org>
Content-Type: multipart/mixed; boundary=00504501751940c0ea04908a5de9
Cc: freebsd-hackers@freebsd.org, Fabian Keil <freebsd-listen@fabiankeil.de>
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 15:30:52 -0000

--00504501751940c0ea04908a5de9
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On Sat, Sep 18, 2010 at 6:52 AM, Robert N. M. Watson
<rwatson@freebsd.org> wrote:
>
> On 18 Sep 2010, at 13:35, Fabian Keil wrote:
>
>> Doesn't build for me on amd64:
>>
>> fk@r500 /usr/src/tools/tools/umastat $make
>> Warning: Object directory not changed from original /usr/src/tools/tools/umastat
>> cc -O2 -pipe  -fno-omit-frame-pointer -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign -c umastat.c
>> cc1: warnings being treated as errors
>> umastat.c: In function 'uma_print_bucketlist':
>> umastat.c:234: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
>> umastat.c:234: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'uint64_t'
>> umastat.c: In function 'uma_print_cache':
>> umastat.c:245: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u_int64_t'
>> umastat.c:246: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u_int64_t'
>> umastat.c: In function 'main':
>> umastat.c:416: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
>> umastat.c:418: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
>> umastat.c:420: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
>> umastat.c:426: warning: dereferencing type-punned pointer will break strict-aliasing rules
>> umastat.c:429: warning: dereferencing type-punned pointer will break strict-aliasing rules
>> *** Error code 1
>>
>> Stop in /usr/src/tools/tools/umastat.
>>
>> The attached patch seems to work around the problem, I'm not sure if
>> the casts to void* are better than decreasing the WARN level, though ...
>
> This is a 32-bit/64-bit issue. Probably all pointers printing should be
> converted to %p, and large integer types to %ju and %jd, perhaps with a
> cast first to intmax_t or uintmax_t if required.

All types were explicitly declared as u_int64_t, so I'd try this
instead with PRIu64. Very few spots in the code today use void * (and
the ones that do interface with kvm_read(3)).

<OT>
FWIW, kvm_read taking the second argument as unsigned long instead of
void* seems a bit inconsistent:

     ssize_t
     kvm_read(kvm_t *kd, unsigned long addr, void *buf, size_t nbytes);

     ssize_t
     kvm_write(kvm_t *kd, unsigned long addr, const void *buf, size_t nbytes);

but that's a different topic to look at later, if it really matters to anyone.
</OT>

Thanks,
-Garrett

--00504501751940c0ea04908a5de9
Content-Type: application/octet-stream; name="umastat-64bit.diff"
Content-Disposition: attachment; filename="umastat-64bit.diff"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_ge8mva1o0

SW5kZXg6IHVtYXN0YXQuYwo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSB1bWFzdGF0LmMJKHJldmlzaW9uIDIxMjIy
MykKKysrIHVtYXN0YXQuYwkod29ya2luZyBjb3B5KQpAQCAtMzYsNiArMzYsNyBAQAogI2luY2x1
ZGUgPHZtL3VtYV9pbnQuaD4KIAogI2luY2x1ZGUgPGVyci5oPgorI2luY2x1ZGUgPGludHR5cGVz
Lmg+CiAjaW5jbHVkZSA8a3ZtLmg+CiAjaW5jbHVkZSA8bWVtc3RhdC5oPgogI2luY2x1ZGUgPHN0
ZGlvLmg+CkBAIC0yMzAsOCArMjMxLDggQEAKIAl9CiAKIAlwcmludGYoIlxuIik7Ci0JcHJpbnRm
KCIlc307ICAvLyB0b3RhbCBjbnQgJWxsdSwgdG90YWwgZW50cmllcyAlbGx1XG4iLCBzcGFjZXMs
Ci0JICAgIHRvdGFsX2NudCwgdG90YWxfZW50cmllcyk7CisJcHJpbnRmKCIlc307ICAvLyB0b3Rh
bCBjbnQgJSJQUkl1NjQiLCB0b3RhbCBlbnRyaWVzICUiUFJJdTY0IlxuIiwKKwkgICAgc3BhY2Vz
LCB0b3RhbF9jbnQsIHRvdGFsX2VudHJpZXMpOwogfQogCiBzdGF0aWMgdm9pZApAQCAtMjQyLDgg
KzI0Myw4IEBACiAJaW50IHJldDsKIAogCXByaW50ZigiJXMlc1slZF0gPSB7XG4iLCBzcGFjZXMs
IG5hbWUsIGNwdSk7Ci0JcHJpbnRmKCIlcyAgdWNfZnJlZXMgPSAlbGx1O1xuIiwgc3BhY2VzLCBj
YWNoZS0+dWNfZnJlZXMpOwotCXByaW50ZigiJXMgIHVjX2FsbG9jcyA9ICVsbHU7XG4iLCBzcGFj
ZXMsIGNhY2hlLT51Y19hbGxvY3MpOworCXByaW50ZigiJXMgIHVjX2ZyZWVzID0gJSJQUkl1NjQi
O1xuIiwgc3BhY2VzLCBjYWNoZS0+dWNfZnJlZXMpOworCXByaW50ZigiJXMgIHVjX2FsbG9jcyA9
ICUiUFJJdTY0IjtcbiIsIHNwYWNlcywgY2FjaGUtPnVjX2FsbG9jcyk7CiAKIAlpZiAoY2FjaGUt
PnVjX2ZyZWVidWNrZXQgIT0gTlVMTCkgewogCQlyZXQgPSBrcmVhZChrdm0sIGNhY2hlLT51Y19m
cmVlYnVja2V0LCAmdWIsIHNpemVvZih1YiksIDApOwpAQCAtNDEyLDExICs0MTMsMTEgQEAKIAkJ
CX0KIAkJCXByaW50ZigiICBab25lIHtcbiIpOwogCQkJcHJpbnRmKCIgICAgdXpfbmFtZSA9IFwi
JXNcIjtcbiIsIG5hbWUpOwotCQkJcHJpbnRmKCIgICAgdXpfYWxsb2NzID0gJWxsdTtcbiIsCisJ
CQlwcmludGYoIiAgICB1el9hbGxvY3MgPSAlIlBSSXU2NCI7XG4iLAogCQkJICAgIHV6cF91c2Vy
c3BhY2UtPnV6X2FsbG9jcyk7Ci0JCQlwcmludGYoIiAgICB1el9mcmVlcyA9ICVsbHU7XG4iLAor
CQkJcHJpbnRmKCIgICAgdXpfZnJlZXMgPSAlIlBSSXU2NCI7XG4iLAogCQkJICAgIHV6cF91c2Vy
c3BhY2UtPnV6X2ZyZWVzKTsKLQkJCXByaW50ZigiICAgIHV6X2ZhaWxzID0gJWxsdTtcbiIsCisJ
CQlwcmludGYoIiAgICB1el9mYWlscyA9ICUiUFJJdTY0IjtcbiIsCiAJCQkgICAgdXpwX3VzZXJz
cGFjZS0+dXpfZmFpbHMpOwogCQkJcHJpbnRmKCIgICAgdXpfZmlsbHMgPSAldTtcbiIsCiAJCQkg
ICAgdXpwX3VzZXJzcGFjZS0+dXpfZmlsbHMpOwo=
--00504501751940c0ea04908a5de9--

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 16:11:28 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 49AC61065672
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 16:11:28 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
	by mx1.freebsd.org (Postfix) with ESMTP id B04698FC08
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 16:11:27 +0000 (UTC)
Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua
	[10.1.1.148])
	by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o8IGBNxk080340
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 18 Sep 2010 19:11:23 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id
	o8IGBNAr037222; Sat, 18 Sep 2010 19:11:23 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o8IGBMjR037221; 
	Sat, 18 Sep 2010 19:11:22 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
	kostikbel@gmail.com using -f
Date: Sat, 18 Sep 2010 19:11:22 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Mateusz Guzik <mjguzik@gmail.com>
Message-ID: <20100918161122.GU2389@deviant.kiev.zoral.com.ua>
References: <4C8A81D9.5020905@rawbw.com> <20100910194600.GB60815@stack.nl>
	<20100912130801.GA23538@freebsd.org>
	<AANLkTikiWs9O+8+mwOaE4nVovT0yDQ3GvPO7E9H_MWkW@mail.gmail.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="Wlbg71WMOPzcvmIn"
Content-Disposition: inline
In-Reply-To: <AANLkTikiWs9O+8+mwOaE4nVovT0yDQ3GvPO7E9H_MWkW@mail.gmail.com>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.5 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_20,
	DNS_FROM_OPENWHOIS autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	skuns.kiev.zoral.com.ua
Cc: Alexander Best <arundel@freebsd.org>, freebsd-hackers@freebsd.org,
	Jilles Tjoelker <jilles@stack.nl>, Yuri <yuri@rawbw.com>
Subject: Re: Why I can't trace linux process's childs with truss?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 16:11:28 -0000


--Wlbg71WMOPzcvmIn
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sun, Sep 12, 2010 at 05:01:09PM +0200, Mateusz Guzik wrote:
> On Sun, Sep 12, 2010 at 3:08 PM, Alexander Best <arundel@freebsd.org> wrote:
> > there's a PR related to this "issue" [1]. so is truss missing this
> > functionality or is this in fact a feature, because truss mustn't be used on
> > any non freebsd executable?
> >
>
> Actually truss handles linux processes just fine, except for their children. :)
> Linux process can create a child using linux_clone syscall, but truss does not
> handle that case and this can be the problem that Yuri reported (since
> no log was provided, I can only guess).
>
> This trivial patch should fix this:
> http://student.agh.edu.pl/~mjguzik/truss-linux-forks.patch
This is too trivial, IMO. linux_clone() does not necessarily cause a new
process to be created, I think.

>
> Tested on this simple program:
> http://student.agh.edu.pl/~mjguzik/fork.c
>
> If it still does not work, log generated by truss would be helpful.

--Wlbg71WMOPzcvmIn
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (FreeBSD)

iEYEARECAAYFAkyU5KoACgkQC3+MBN1Mb4gHLwCgmhPxYKiowkOfNguiKSZ3pY6X
cBAAn2eQ4uOkvtH2s58PkJls7s3SbipN
=VDPi
-----END PGP SIGNATURE-----

--Wlbg71WMOPzcvmIn--

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 17:31:13 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8D8A1106566B;
	Sat, 18 Sep 2010 17:31:13 +0000 (UTC)
	(envelope-from pluknet@gmail.com)
Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com
	[209.85.216.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 2B7D58FC19;
	Sat, 18 Sep 2010 17:31:12 +0000 (UTC)
Received: by qwg5 with SMTP id 5so2980286qwg.13
	for <multiple recipients>; Sat, 18 Sep 2010 10:31:12 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:received:in-reply-to
	:references:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=5DkWPnfo/teCp6QxaDxKazLg/ZQNeSCiESarYWsymPM=;
	b=C5FX06akGjS2VNzJjyvJdpQpIHq/GtSWW3z//1LafHhyR9nVzZA6btFkPgClY5/Pg3
	nPrMIv9B8xWJSpvPPR9WLq4GzrVtB+PIRMf1UK1qcP205ekWpXjSkhWsP2qntaWnWVjd
	CNQ9aOmzyWaKw0yR3YsmTi+wN7sR4N8b+dRnw=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=sUq+BDtpUqjh58s81S5qPnsIaVkpGdID32s38dppqaEbhcsL35kSTnX6tdcCXIdQHX
	MF/Df4IZvgjs2NEwM1Ysd+ksGpe7XHdDHplapZ2k0v99WVXGNif2TfsudzSHooBQPkb4
	MjkDUJOSQXwehnPU7DpPLTHvNgJCQIYcwFh7I=
MIME-Version: 1.0
Received: by 10.229.11.18 with SMTP id r18mr4405897qcr.281.1284829283744; Sat,
	18 Sep 2010 10:01:23 -0700 (PDT)
Received: by 10.229.19.206 with HTTP; Sat, 18 Sep 2010 10:01:23 -0700 (PDT)
In-Reply-To: <F100D77A-CE16-40DE-B441-02E702B12686@FreeBSD.org>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
	<alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
	<20100918143516.3568f40e@r500.local>
	<F100D77A-CE16-40DE-B441-02E702B12686@FreeBSD.org>
Date: Sat, 18 Sep 2010 21:01:23 +0400
Message-ID: <AANLkTikqc3Ya4zsF6zZA248S5CoERHm8a=sE+9Af93te@mail.gmail.com>
From: pluknet <pluknet@gmail.com>
To: "Robert N. M. Watson" <rwatson@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-hackers@freebsd.org, Fabian Keil <freebsd-listen@fabiankeil.de>
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 17:31:13 -0000

On 18 September 2010 17:52, Robert N. M. Watson <rwatson@freebsd.org> wrote:
>
> On 18 Sep 2010, at 13:35, Fabian Keil wrote:
>
>> Doesn't build for me on amd64:
>>
>> fk@r500 /usr/src/tools/tools/umastat $make
>> Warning: Object directory not changed from original /usr/src/tools/tools/umastat
>> cc -O2 -pipe  -fno-omit-frame-pointer -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign -c umastat.c
>> cc1: warnings being treated as errors
>> umastat.c: In function 'uma_print_bucketlist':
>> umastat.c:234: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
>> umastat.c:234: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'uint64_t'
>> umastat.c: In function 'uma_print_cache':
>> umastat.c:245: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u_int64_t'
>> umastat.c:246: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u_int64_t'
>> umastat.c: In function 'main':
>> umastat.c:416: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
>> umastat.c:418: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
>> umastat.c:420: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'u_int64_t'
>> umastat.c:426: warning: dereferencing type-punned pointer will break strict-aliasing rules
>> umastat.c:429: warning: dereferencing type-punned pointer will break strict-aliasing rules
>> *** Error code 1
>>
>> Stop in /usr/src/tools/tools/umastat.
>>
>> The attached patch seems to work around the problem, I'm not sure if
>> the casts to void* are better than decreasing the WARN level, though ...
>
> This is a 32-bit/64-bit issue. Probably all pointers printing should be
> converted to %p, and large integer types to %ju and %jd, perhaps with a cast
> first to intmax_t or uintmax_t if required.
>

FYI, there is PR 146119, which is about sort of fixing these issues.

-- 
wbr,
pluknet

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 18:26:38 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0798E1065673
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 18:26:38 +0000 (UTC)
	(envelope-from asmrookie@gmail.com)
Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com
	[209.85.216.54])
	by mx1.freebsd.org (Postfix) with ESMTP id ACB4B8FC0A
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 18:26:37 +0000 (UTC)
Received: by qwg5 with SMTP id 5so2998504qwg.13
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 11:26:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:sender:received
	:in-reply-to:references:date:x-google-sender-auth:message-id:subject
	:from:to:cc:content-type:content-transfer-encoding;
	bh=O06q6jKsHAn9hWr71J0lIm64cO/vjmlV00nTX8p5rRM=;
	b=cMUvxEpR7zZ0LIEGv5VpyhTjhEvIn9QvMQtVSw1pUxE4grBqU+JAtTSUjUi5rHRuZB
	OQrWKKlTIGreV9IPZSq1EriYQ2jXexq4vFMKWyUIfMxau8ThbphTa4JcUPklMI7BaTMw
	CAwFyAAoQgAiBEMRNKoCTv17LuPz0yOWpXyLE=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	b=mOUDoK9yWEwiBaiyMTK+qMdpA/SObvX6JujEIsQOEu3wXf0IBVUE9DTamSLH43RgFk
	XPyUV/pKQ9/LlqGZiBWmWEg78+VjVSadScvUshSvhLMAlqlK3ppvhhzEYios8qy9xENe
	IONrda3j8MUjFjj+QtzWJ3LZUIEOkapHzIf90=
MIME-Version: 1.0
Received: by 10.224.28.145 with SMTP id m17mr4477091qac.196.1284834396771;
	Sat, 18 Sep 2010 11:26:36 -0700 (PDT)
Sender: asmrookie@gmail.com
Received: by 10.229.235.143 with HTTP; Sat, 18 Sep 2010 11:26:36 -0700 (PDT)
In-Reply-To: <4C94A138.8050905@icyb.net.ua>
References: <4C94A138.8050905@icyb.net.ua>
Date: Sat, 18 Sep 2010 20:26:36 +0200
X-Google-Sender-Auth: 6GFItXLAkcVaIihHlDlIKx2a62U
Message-ID: <AANLkTingR6k6xdQJ3cZH8EkJeCWnq5vzeEjGHNaDv8AT@mail.gmail.com>
From: Attilio Rao <attilio@freebsd.org>
To: Andriy Gapon <avg@icyb.net.ua>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-hackers@freebsd.org
Subject: Re: KDB_TRACE and no backend
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 18:26:38 -0000

2010/9/18 Andriy Gapon <avg@icyb.net.ua>:
>
> Here's a small patch that adds support for printing stack trace in form of frame
> addresses when KDB_TRACE is enabled, but there is no debugger backend configured.
> The patch is styled after "cheap" variant of stack_ktr.
>
> What do you think (useful/useless, correct, etc) ?
>
> --- a/sys/kern/subr_kdb.c
> +++ b/sys/kern/subr_kdb.c
> @@ -37,6 +37,7 @@
>  #include <sys/pcpu.h>
>  #include <sys/proc.h>
>  #include <sys/smp.h>
> +#include <sys/stack.h>
>  #include <sys/sysctl.h>
>
>  #include <machine/kdb.h>
> @@ -295,10 +296,16 @@
>  void
>  kdb_backtrace(void)
>  {
> +       struct stack st;
> +       int i;
>
> -       if (kdb_dbbe != NULL && kdb_dbbe->dbbe_trace != NULL) {
> -               printf("KDB: stack backtrace:\n");
> +       printf("KDB: stack backtrace:\n");
> +       if (kdb_dbbe != NULL && kdb_dbbe->dbbe_trace != NULL)
>                kdb_dbbe->dbbe_trace();
> +       else {
> +               stack_save(&st);
> +               for (i = 0; i < st.depth; i++)
> +                       printf("#%d %p\n", i, (void*)(uintptr_t)st.pcs[i]);
>        }
>  }

You have to eventually wrap this logic within the 'STACK' option
(opt_stack.h for the check) because stack_save() will be ineffective
otherwise. STACK should be mandatory for DDB I guess, but it is not
for KDB.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 18:41:25 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 63793106566B
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 18:41:25 +0000 (UTC)
	(envelope-from avg@icyb.net.ua)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 7A4D88FC15
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 18:41:24 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA10346;
	Sat, 18 Sep 2010 21:41:23 +0300 (EEST)
	(envelope-from avg@icyb.net.ua)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1Ox2Li-000DRY-MV; Sat, 18 Sep 2010 21:41:22 +0300
Message-ID: <4C9507D1.3010008@icyb.net.ua>
Date: Sat, 18 Sep 2010 21:41:21 +0300
From: Andriy Gapon <avg@icyb.net.ua>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.9) Gecko/20100912 Lightning/1.0b2 Thunderbird/3.1.3
MIME-Version: 1.0
To: Attilio Rao <attilio@freebsd.org>
References: <4C94A138.8050905@icyb.net.ua>
	<AANLkTingR6k6xdQJ3cZH8EkJeCWnq5vzeEjGHNaDv8AT@mail.gmail.com>
In-Reply-To: <AANLkTingR6k6xdQJ3cZH8EkJeCWnq5vzeEjGHNaDv8AT@mail.gmail.com>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org
Subject: Re: KDB_TRACE and no backend
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 18:41:25 -0000

on 18/09/2010 21:26 Attilio Rao said the following:
> 
> You have to eventually wrap this logic within the 'STACK' option
> (opt_stack.h for the check) because stack_save() will be ineffective
> otherwise. STACK should be mandatory for DDB I guess, but it is not
> for KDB.

Thank you for the tip!
BTW, why is this under an option?
It seems like something like this won't add much to kernel size and won't affect
performance at all.

-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 18:55:08 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2D1771065679
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 18:55:08 +0000 (UTC)
	(envelope-from freebsd-hackers@m.gmane.org)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by mx1.freebsd.org (Postfix) with ESMTP id D9D1A8FC08
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 18:55:07 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-hackers@m.gmane.org>) id 1Ox2Yz-000830-3E
	for freebsd-hackers@freebsd.org; Sat, 18 Sep 2010 20:55:05 +0200
Received: from k.saper.info ([91.121.151.35])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 20:55:05 +0200
Received: from saper by k.saper.info with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-hackers@freebsd.org>; Sat, 18 Sep 2010 20:55:05 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-hackers@freebsd.org
From: Marcin Cieslak <saper@saper.info>
Date: Sat, 18 Sep 2010 18:03:42 +0000 (UTC)
Organization: http://saper.info
Lines: 9
Message-ID: <slrni99vnu.23jn.saper@saper.info>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
	<alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
	<20100918143516.3568f40e@r500.local>
	<F100D77A-CE16-40DE-B441-02E702B12686@FreeBSD.org>
	<AANLkTimyBMBM1Bfyz3RjQRoUkEeSPuLEZDb6ws0_XQ-o@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: k.saper.info
User-Agent: slrn/0.9.9p1 (FreeBSD)
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 18:55:08 -0000

> FWIW, kvm_read taking the second argument as unsigned long instead of
> void* seems a bit inconsistent:

I think it was done on purpose, since an address in the kernel address space
has nothing to do with pointers for mere userland mortals. We shouldn't
bother the compiler with aliasing and other stuff in the case of kernel
addresses.

//Marcin



From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 19:00:27 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7E1A21065670;
	Sat, 18 Sep 2010 19:00:27 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 8DB778FC1B;
	Sat, 18 Sep 2010 19:00:26 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id WAA10537;
	Sat, 18 Sep 2010 22:00:24 +0300 (EEST)
	(envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1Ox2e8-000DT7-Ly; Sat, 18 Sep 2010 22:00:24 +0300
Message-ID: <4C950C48.6020600@freebsd.org>
Date: Sat, 18 Sep 2010 22:00:24 +0300
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.9) Gecko/20100912 Lightning/1.0b2 Thunderbird/3.1.3
MIME-Version: 1.0
To: Attilio Rao <attilio@freebsd.org>
References: <4C94A138.8050905@icyb.net.ua>
	<AANLkTingR6k6xdQJ3cZH8EkJeCWnq5vzeEjGHNaDv8AT@mail.gmail.com>
	<4C9507D1.3010008@icyb.net.ua>
In-Reply-To: <4C9507D1.3010008@icyb.net.ua>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org
Subject: Re: KDB_TRACE and no backend
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 19:00:27 -0000

on 18/09/2010 21:41 Andriy Gapon said the following:
> on 18/09/2010 21:26 Attilio Rao said the following:
>>
>> You have to eventually wrap this logic within the 'STACK' option
>> (opt_stack.h for the check) because stack_save() will be ineffective
>> otherwise. STACK should be mandatory for DDB I guess, but it is not
>> for KDB.
> 
> Thank you for the tip!
> BTW, why is this under an option?
> It seems like something like this won't add much to kernel size and won't affect
> performance at all.
> 

Oh, wow, and I totally overlooked stack_print().
Should have read stack(9) from the start.

-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 20:30:06 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7BDDF1065679;
	Sat, 18 Sep 2010 20:30:06 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 956078FC18;
	Sat, 18 Sep 2010 20:30:05 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA11471;
	Sat, 18 Sep 2010 23:30:03 +0300 (EEST)
	(envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1Ox42t-000DZ7-8B; Sat, 18 Sep 2010 23:30:03 +0300
Message-ID: <4C95214A.3070600@freebsd.org>
Date: Sat, 18 Sep 2010 23:30:02 +0300
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.9) Gecko/20100912 Lightning/1.0b2 Thunderbird/3.1.3
MIME-Version: 1.0
To: Attilio Rao <attilio@freebsd.org>
References: <4C94A138.8050905@icyb.net.ua>
	<AANLkTingR6k6xdQJ3cZH8EkJeCWnq5vzeEjGHNaDv8AT@mail.gmail.com>
	<4C9507D1.3010008@icyb.net.ua> <4C950C48.6020600@freebsd.org>
In-Reply-To: <4C950C48.6020600@freebsd.org>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org
Subject: Re: KDB_TRACE and no backend
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 20:30:06 -0000

on 18/09/2010 22:00 Andriy Gapon said the following:
> Oh, wow, and I totally overlooked stack_print().
> Should have read stack(9) from the start.

New patch.  Hope this is better.
I don't like that the printf is duplicated, but couldn't figure out a way to
combine pre-processor and C conditions.

--- a/sys/kern/subr_kdb.c
+++ b/sys/kern/subr_kdb.c
@@ -37,6 +37,7 @@ __FBSDID("$FreeBSD$");
 #include <sys/pcpu.h>
 #include <sys/proc.h>
 #include <sys/smp.h>
+#include <sys/stack.h>
 #include <sys/sysctl.h>

 #include <machine/kdb.h>
@@ -300,6 +301,15 @@ kdb_backtrace(void)
 		printf("KDB: stack backtrace:\n");
 		kdb_dbbe->dbbe_trace();
 	}
+#ifdef STACK
+	else {
+		struct stack st;
+
+		printf("KDB: stack backtrace:\n");
+		stack_save(&st);
+		stack_print(&st);
+	}
+#endif
 }

 /*


-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 20:35:49 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6CA0E106566C;
	Sat, 18 Sep 2010 20:35:49 +0000 (UTC)
	(envelope-from asmrookie@gmail.com)
Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com
	[209.85.216.175])
	by mx1.freebsd.org (Postfix) with ESMTP id 04D6A8FC0C;
	Sat, 18 Sep 2010 20:35:48 +0000 (UTC)
Received: by qyk31 with SMTP id 31so1950493qyk.13
	for <multiple recipients>; Sat, 18 Sep 2010 13:35:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:sender:received
	:in-reply-to:references:date:x-google-sender-auth:message-id:subject
	:from:to:cc:content-type:content-transfer-encoding;
	bh=5RFx50P4F4C6tu0HojTsa89DCFh42GPVILYc75KuQHs=;
	b=w0/0x/ih92CRV8qeHO/45gBHwYKV2pYhhJJ3LpADcjsvIbDgZaDcEtDcTPR8kPZ1fK
	/1MTA+oMozoFjA0v5K9bLB4jh8uHWG7iN6S9axbuobAKPmcbTuQ3/brsZsQ4eUDq4KrK
	HqDZ1npW4n6y3xafOb1VbUyHmVx5JRLu8V4m8=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	b=lmnBJERnwcmnZKtNrXuUbt7+0eYlVQ6WqFQnVFJOohi9Ic4OkxNpeFpDlzBhdmXxLx
	+Yho5469QPb5nxmHjaisxsjxrM+hudJVqZ89q4wAEMcogp/LLfKwnxyLStakl3zk1eAL
	kTIu+U7U2coTp4YUuG9HijZrcSshWPEWEDHUk=
MIME-Version: 1.0
Received: by 10.224.54.13 with SMTP id o13mr4609286qag.9.1284842148123; Sat,
	18 Sep 2010 13:35:48 -0700 (PDT)
Sender: asmrookie@gmail.com
Received: by 10.229.235.143 with HTTP; Sat, 18 Sep 2010 13:35:48 -0700 (PDT)
In-Reply-To: <4C95214A.3070600@freebsd.org>
References: <4C94A138.8050905@icyb.net.ua>
	<AANLkTingR6k6xdQJ3cZH8EkJeCWnq5vzeEjGHNaDv8AT@mail.gmail.com>
	<4C9507D1.3010008@icyb.net.ua> <4C950C48.6020600@freebsd.org>
	<4C95214A.3070600@freebsd.org>
Date: Sat, 18 Sep 2010 22:35:48 +0200
X-Google-Sender-Auth: IZi5OPjEjC5t-Tks2jaEQwJa2gI
Message-ID: <AANLkTimuiBG4nL4o+J+KM6+9QdkRwwEsWnbC-tsoNO54@mail.gmail.com>
From: Attilio Rao <attilio@freebsd.org>
To: Andriy Gapon <avg@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-hackers@freebsd.org
Subject: Re: KDB_TRACE and no backend
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 20:35:49 -0000

2010/9/18 Andriy Gapon <avg@freebsd.org>:
> on 18/09/2010 22:00 Andriy Gapon said the following:
>> Oh, wow, and I totally overlooked stack_print().
>> Should have read stack(9) from the start.
>
> New patch.  Hope this is better.
> I don't like that the printf is duplicated, but couldn't figure out a way to
> combine pre-processor and C conditions.
>
> --- a/sys/kern/subr_kdb.c
> +++ b/sys/kern/subr_kdb.c
> @@ -37,6 +37,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/pcpu.h>
>  #include <sys/proc.h>
>  #include <sys/smp.h>
> +#include <sys/stack.h>
>  #include <sys/sysctl.h>
>
>  #include <machine/kdb.h>
> @@ -300,6 +301,15 @@ kdb_backtrace(void)
>                printf("KDB: stack backtrace:\n");
>                kdb_dbbe->dbbe_trace();
>        }
> +#ifdef STACK
> +       else {
> +               struct stack st;
> +
> +               printf("KDB: stack backtrace:\n");
> +               stack_save(&st);
> +               stack_print(&st);
> +       }
> +#endif
>  }
>
>  /*
>

It still lacks the opt_stack.h include that the STACK check needs.
Besides, I'd reconsider keeping the KDB_TRACE explanation in the ddb(4)
manpage (right now it is rightly there because it is DDB-specific, as
long as it offers the backend, but with your change it becomes a global
functionality). Not sure if it is worth changing, but you may have
other opinions.
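
For readers following along, here is a rough userland sketch of the
control flow the patch is after. It is not the kernel code: struct
backend, backend_trace() and generic_trace() are hypothetical stand-ins
for kdb_dbbe, its dbbe_trace hook and the stack_save()/stack_print()
pair, and STACK is defined by hand where the kernel would pick it up
from opt_stack.h.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * In the kernel, STACK would come from the generated opt_stack.h
 * (#include "opt_stack.h"); it is defined by hand so the sketch builds.
 */
#define STACK 1

/* Hypothetical stand-in for a debugger backend with a trace hook. */
struct backend {
	void (*trace)(void);
};

static int traces_printed;	/* counts traces, for illustration only */

/* Hypothetical stand-in for a backend's own trace routine. */
static void
backend_trace(void)
{
	traces_printed++;
	printf("backend trace\n");
}

#ifdef STACK
/* Hypothetical stand-in for stack_save() followed by stack_print(). */
static void
generic_trace(void)
{
	traces_printed++;
	printf("generic trace\n");
}
#endif

/* Returns 1 if a backtrace was printed, 0 if none was available. */
static int
backtrace_sketch(const struct backend *be)
{
	if (be != NULL && be->trace != NULL) {
		printf("KDB: stack backtrace:\n");
		be->trace();
		return (1);
	}
#ifdef STACK
	/* No backend: fall back to the generic stack walker. */
	printf("KDB: stack backtrace:\n");
	generic_trace();
	return (1);
#else
	return (0);
#endif
}
```

Without the option header the #ifdef branch silently compiles away,
which is why the include matters.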

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 20:49:46 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A4C5A106564A;
	Sat, 18 Sep 2010 20:49:46 +0000 (UTC) (envelope-from avg@freebsd.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id BACC48FC15;
	Sat, 18 Sep 2010 20:49:45 +0000 (UTC)
Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA11682;
	Sat, 18 Sep 2010 23:49:44 +0300 (EEST)
	(envelope-from avg@freebsd.org)
Received: from localhost.topspin.kiev.ua ([127.0.0.1])
	by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1Ox4Lv-000DaR-VM; Sat, 18 Sep 2010 23:49:44 +0300
Message-ID: <4C9525E7.3030804@freebsd.org>
Date: Sat, 18 Sep 2010 23:49:43 +0300
From: Andriy Gapon <avg@freebsd.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US;
	rv:1.9.2.9) Gecko/20100912 Lightning/1.0b2 Thunderbird/3.1.3
MIME-Version: 1.0
To: Attilio Rao <attilio@freebsd.org>
References: <4C94A138.8050905@icyb.net.ua>	<AANLkTingR6k6xdQJ3cZH8EkJeCWnq5vzeEjGHNaDv8AT@mail.gmail.com>	<4C9507D1.3010008@icyb.net.ua>	<4C950C48.6020600@freebsd.org>	<4C95214A.3070600@freebsd.org>
	<AANLkTimuiBG4nL4o+J+KM6+9QdkRwwEsWnbC-tsoNO54@mail.gmail.com>
In-Reply-To: <AANLkTimuiBG4nL4o+J+KM6+9QdkRwwEsWnbC-tsoNO54@mail.gmail.com>
X-Enigmail-Version: 1.1.2
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org
Subject: Re: KDB_TRACE and no backend
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 20:49:46 -0000

on 18/09/2010 23:35 Attilio Rao said the following:
> It still lacks the opt_stack.h include that the STACK check needs.

Yes, thanks, fixed it in my tree.

> Besides, I'd reconsider keeping the KDB_TRACE explanation in the ddb(4)
> manpage (right now it is rightly there because it is DDB-specific, as
> long as it offers the backend, but with your change it becomes a global
> functionality). Not sure if it is worth changing, but you may have
> other opinions.

It seems that we don't have kdb(4)?

-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 20:51:11 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 83177106566C;
	Sat, 18 Sep 2010 20:51:11 +0000 (UTC)
	(envelope-from asmrookie@gmail.com)
Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com
	[209.85.216.175])
	by mx1.freebsd.org (Postfix) with ESMTP id 1B57A8FC2C;
	Sat, 18 Sep 2010 20:51:10 +0000 (UTC)
Received: by qyk31 with SMTP id 31so1956175qyk.13
	for <multiple recipients>; Sat, 18 Sep 2010 13:51:10 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:sender:received
	:in-reply-to:references:date:x-google-sender-auth:message-id:subject
	:from:to:cc:content-type;
	bh=ExSZDlKSbl6vAcqzIjhguElOQXDg1wxncMi1z93+Zd4=;
	b=thvwQZec1fNAui/3tO5ZLmwHy6vALiraLea6idT26CvQJJaG5wCBR0WHkLFT+7DW+H
	G2ebHn96oZ+cJALId/Cdw1m94kNJ0tLjwknmaQJ411LMFk9R8fknZiz/x6ldqBF5DJ0C
	CDRmeu2mgp5ufn9Z1KW/N/Yi0elsQEQccntNI=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type;
	b=lI+v0WQPgxxCii/u2KySEaMQONlrFoWfSNs+o0A7+gH3tFN3YaNZ9uSVnDgzGuRQ5e
	hWuaRONwF3cjqA0pSuQcjAm4hZvw+F3Zp+JeBy9DQHgq8M7mbkmi2lxOvkqx0fmBu033
	L8offdM3t3FM5r/Cl2PHJzT84crGfuuo39Qog=
MIME-Version: 1.0
Received: by 10.229.65.159 with SMTP id j31mr4307475qci.212.1284843070303;
	Sat, 18 Sep 2010 13:51:10 -0700 (PDT)
Sender: asmrookie@gmail.com
Received: by 10.229.235.143 with HTTP; Sat, 18 Sep 2010 13:51:10 -0700 (PDT)
In-Reply-To: <4C9525E7.3030804@freebsd.org>
References: <4C94A138.8050905@icyb.net.ua>
	<AANLkTingR6k6xdQJ3cZH8EkJeCWnq5vzeEjGHNaDv8AT@mail.gmail.com>
	<4C9507D1.3010008@icyb.net.ua> <4C950C48.6020600@freebsd.org>
	<4C95214A.3070600@freebsd.org>
	<AANLkTimuiBG4nL4o+J+KM6+9QdkRwwEsWnbC-tsoNO54@mail.gmail.com>
	<4C9525E7.3030804@freebsd.org>
Date: Sat, 18 Sep 2010 22:51:10 +0200
X-Google-Sender-Auth: CBHq4fhJDb-0P_PwGC6Pi7xl26k
Message-ID: <AANLkTi=2hc4a0CnSiCAuCeHoisZ25WK7RRWxfrDpWkez@mail.gmail.com>
From: Attilio Rao <attilio@freebsd.org>
To: Andriy Gapon <avg@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-hackers@freebsd.org
Subject: Re: KDB_TRACE and no backend
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 20:51:11 -0000

2010/9/18 Andriy Gapon <avg@freebsd.org>:
> on 18/09/2010 23:35 Attilio Rao said the following:
>> It still lacks the opt_stack.h include that the STACK check needs.
>
> Yes, thanks, fixed it in my tree.
>
>> Besides, I'd reconsider keeping the KDB_TRACE explanation in the ddb(4)
>> manpage (right now it is rightly there because it is DDB-specific, as
>> long as it offers the backend, but with your change it becomes a global
>> functionality). Not sure if it is worth changing, but you may have
>> other opinions.
>
> It seems that we don't have kdb(4)?
>

We don't, and we really should have one. I'd really like a kernel section
describing the whole kdb infrastructure and kdbe hooks.
That could actually be listed as a janitor task, if someone wants to
take over and document the whole layer.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 22:13:24 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 75DCE1065672;
	Sat, 18 Sep 2010 22:13:24 +0000 (UTC)
	(envelope-from perryh@pluto.rain.com)
Received: from agora.rdrop.com (agora.rdrop.com [IPv6:2607:f678:1010::34])
	by mx1.freebsd.org (Postfix) with ESMTP id 541688FC1B;
	Sat, 18 Sep 2010 22:13:24 +0000 (UTC)
Received: from agora.rdrop.com (66@localhost [127.0.0.1])
	by agora.rdrop.com (8.13.1/8.12.7) with ESMTP id o8IMDNtr026215
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT);
	Sat, 18 Sep 2010 15:13:23 -0700 (PDT)
	(envelope-from perryh@pluto.rain.com)
Received: (from uucp@localhost)
	by agora.rdrop.com (8.13.1/8.12.9/Submit) with UUCP id o8IMDNbH026214; 
	Sat, 18 Sep 2010 15:13:23 -0700 (PDT)
Received: from fbsd61 by pluto.rain.com (4.1/SMI-4.1-pluto-M2060407)
	id AA15753; Sat, 18 Sep 10 15:09:25 PDT
Date: Sat, 18 Sep 2010 15:09:19 -0700
From: perryh@pluto.rain.com
To: kientzle@freebsd.org
Message-Id: <4c95388f.vSPICvvA6A5bgvDR%perryh@pluto.rain.com>
References: <alpine.GSO.1.10.1008281833470.9337@multics.mit.edu>
	<20100829201050.GA60715@stack.nl>
	<alpine.GSO.1.10.1009032036310.9337@multics.mit.edu>
	<F56D9CB9-E644-4279-8830-71292C880D9B@freebsd.org>
In-Reply-To: <F56D9CB9-E644-4279-8830-71292C880D9B@freebsd.org>
User-Agent: nail 11.25 7/29/05
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: freebsd-hackers@freebsd.org
Subject: Re: ar(1) format_decimal failure is fatal?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 22:13:24 -0000

Tim Kientzle <kientzle@freebsd.org> wrote:

> Personally, I wonder if it wouldn't make sense to just always
> force the timestamp, uid, and gid to zero ..

uid and gid, OK.  Timestamp, no.  It is not that rare to need
to find out which version of some .o is in a particular .a file,
usually in connection with debugging some obscure failure.

For that matter, aren't there some versions of make(1) that can
check whether an archive member is up to date by examining the
timestamp in the archive?
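
As an illustration of what would be lost, here is a minimal sketch of
pulling the timestamp out of one member header. It assumes the common
60-byte ASCII header layout described in ar(5); the function name
ar_member_mtime() and the offset macros are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Field layout of the common ar(5) member header, 60 bytes of ASCII:
 * name[16] date[12] uid[6] gid[6] mode[8] size[10] fmag[2]
 */
#define AR_HDR_LEN	60
#define AR_DATE_OFF	16
#define AR_DATE_LEN	12

/*
 * Parse the decimal modification time from one member header.
 * Returns 0 on success, -1 if the terminator or date field is malformed.
 */
static int
ar_member_mtime(const char hdr[AR_HDR_LEN], long long *mtime)
{
	char buf[AR_DATE_LEN + 1];
	char *end;

	assert(hdr != NULL && mtime != NULL);

	/* Every member header ends with the two bytes "`\n". */
	if (hdr[58] != '`' || hdr[59] != '\n')
		return (-1);

	memcpy(buf, hdr + AR_DATE_OFF, AR_DATE_LEN);
	buf[AR_DATE_LEN] = '\0';

	*mtime = strtoll(buf, &end, 10);
	if (end == buf)
		return (-1);	/* no digits at all */
	/* The field is space padded on the right; anything else is junk. */
	while (*end == ' ')
		end++;
	return (*end == '\0' ? 0 : -1);
}
```

If the date field were always forced to zero, this would return 0 for
every member and could no longer tell two builds of the same .o apart.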

From owner-freebsd-hackers@FreeBSD.ORG  Sat Sep 18 22:42:12 2010
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E9C8C106566B;
	Sat, 18 Sep 2010 22:42:11 +0000 (UTC)
	(envelope-from jroberson@jroberson.net)
Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com
	[209.85.210.54])
	by mx1.freebsd.org (Postfix) with ESMTP id AF8728FC12;
	Sat, 18 Sep 2010 22:42:11 +0000 (UTC)
Received: by pzk7 with SMTP id 7so1109817pzk.13
	for <multiple recipients>; Sat, 18 Sep 2010 15:42:11 -0700 (PDT)
Received: by 10.142.232.19 with SMTP id e19mr4134137wfh.254.1284848144219;
	Sat, 18 Sep 2010 15:15:44 -0700 (PDT)
Received: from [10.0.1.198] (udp022762uds.hawaiiantel.net [72.234.79.107])
	by mx.google.com with ESMTPS id o17sm9676882wal.21.2010.09.18.15.15.40
	(version=SSLv3 cipher=RC4-MD5); Sat, 18 Sep 2010 15:15:42 -0700 (PDT)
Date: Sat, 18 Sep 2010 12:16:49 -1000 (HST)
From: Jeff Roberson <jroberson@jroberson.net>
X-X-Sender: jroberson@desktop
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
Message-ID: <alpine.BSF.2.00.1009181135430.23448@desktop>
References: <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org>
	<alpine.BSF.2.00.1009181221560.86826@fledge.watson.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Mailman-Approved-At: Sat, 18 Sep 2010 22:47:56 +0000
Cc: freebsd-hackers@freebsd.org, Jeff Roberson <jeff@freebsd.org>,
	Andre Oppermann <andre@freebsd.org>, Andriy Gapon <avg@freebsd.org>
Subject: Re: zfs + uma
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 18 Sep 2010 22:42:12 -0000

On Sat, 18 Sep 2010, Robert Watson wrote:

>
> On Fri, 17 Sep 2010, Andre Oppermann wrote:
>
>>> Although keeping free items around improves performance, it does consume 
>>> memory too.  And the fact that that memory is not freed on lowmem 
>>> condition makes the situation worse.
>> 
>> Interesting.  We may run into related issues with excessive mbuf (cluster) 
>> caching in the per-cpu buckets as well.
>> 
>> Having a general solution for that would be appreciated.  Maybe the size 
>> of the free per-cpu buckets should be specified when setting up the UMA 
>> zone.  We may want to cache more of certain frequently re-used elements 
>> and fewer of others.
>
> I've been keeping a vague eye out for this over the last few years, and 
> haven't spotted many problems in production machines I've inspected.  You can 
> use the umastat tool in the tools tree to look at the distribution of memory 
> over buckets (etc) in UMA manually.  It would be nice if it had some 
> automated statistics on fragmentation however.  Short-lived fragmentation is 
> likely, and isn't an issue, so what you want is a tool that monitors over 
> time and reports on longer-lived fragmentation.

Not specifically in reaction to Robert's comment but I would like to add 
my thoughts to this notion of resource balancing in buckets.  I really 
prefer not to do any specific per-zone tuning except in extreme cases. 
This is because quite often the decisions we make don't apply to some 
class of machines or workloads.  I would instead prefer to keep the 
algorithm adaptable.

I like the idea of weighting the bucket decisions by the size of the item. 
Obviously this has some flaws with compound objects but in the general 
case it is good.  We should consider increasing the cost of bucket 
expansion based on the size of the item.  Right now buckets are expanded 
fairly readily.
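
To make the size weighting concrete, here is a rough sketch, not UMA 
code.  The struct, the constants and the function names are all 
hypothetical; the only idea taken from the text above is that the number 
of slab-layer trips needed before a bucket grows scales with item size.

```c
#include <stddef.h>

#define BASE_MISSES	4	/* hypothetical: misses needed for small items */
#define SMALL_ITEM	256	/* hypothetical: size treated as "cheap" */
#define BUCKET_MAX	128	/* hypothetical cap on entries per bucket */

/* Hypothetical per-zone state; not the real UMA zone structure. */
struct zone_sketch {
	size_t item_size;	/* bytes per cached item */
	unsigned bucket_count;	/* current entries per per-CPU bucket */
	unsigned misses;	/* slab-layer trips since the last growth */
};

/* Misses required before the bucket may grow; larger items cost more. */
static unsigned
grow_threshold(size_t item_size)
{
	size_t scale = (item_size + SMALL_ITEM - 1) / SMALL_ITEM;

	if (scale == 0)
		scale = 1;
	return (BASE_MISSES * (unsigned)scale);
}

/* Record one slab-layer trip; returns 1 if the bucket grew. */
static int
zone_miss(struct zone_sketch *z)
{
	z->misses++;
	if (z->misses < grow_threshold(z->item_size) ||
	    z->bucket_count >= BUCKET_MAX)
		return (0);
	z->bucket_count++;
	z->misses = 0;
	return (1);
}
```

With these made-up numbers a 64 byte item grows its bucket after 4 
misses, while a 4096 byte item needs 64.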

We could also consider decreasing the default bucket size for a zone based 
on vm pressure and use.  Right now there is no downward pressure on bucket 
size, only upward based on trips to the slab layer.
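
A sketch of what such downward pressure might look like, under the 
assumption that some low-memory event calls into each zone.  The struct, 
the halving policy and the busy threshold are illustrative guesses, not 
anything UMA does today.

```c
#define BUCKET_MIN	1	/* hypothetical floor on entries per bucket */

/* Hypothetical per-zone state; not the real UMA zone structure. */
struct shrink_zone {
	unsigned bucket_count;	/* current entries per per-CPU bucket */
	unsigned recent_allocs;	/* allocations seen since the last check */
};

/*
 * Called on a (hypothetical) low-memory event.  Zones that were busy
 * since the last check keep their bucket size; idle ones are halved,
 * never dropping below BUCKET_MIN.  Returns the new bucket size.
 */
static unsigned
zone_lowmem(struct shrink_zone *z, unsigned busy_threshold)
{
	if (z->recent_allocs < busy_threshold) {
		z->bucket_count /= 2;
		if (z->bucket_count < BUCKET_MIN)
			z->bucket_count = BUCKET_MIN;
	}
	z->recent_allocs = 0;	/* start a fresh usage window */
	return (z->bucket_count);
}
```

Resetting the counter on every event keeps the decision based on recent 
use only, which fits the goal of staying adaptable.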

Additionally we could make a last ditch flush mechanism that runs on each 
cpu in turn and flushes some or all of the buckets in per-cpu caches. 
Presently that is not done due to synchronization issues.  It can't be 
done from a central place.  It could be done with a callout mechanism or a 
for loop that binds to each core in succession.
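
The for loop variant could look roughly like the userland simulation 
below.  Everything here is a stand-in: bind_to_cpu() only records which 
CPU we pretend to run on (in the kernel this would be something along 
the lines of sched_bind()), and the per-cpu cache is a plain array.

```c
#define NCPU	4	/* hypothetical CPU count for the simulation */

/* Hypothetical per-CPU cache: just a count of cached items. */
struct pcpu_cache {
	unsigned cached;
};

static struct pcpu_cache caches[NCPU];
static int current_cpu;		/* CPU the simulated thread runs on */

/* Stand-in for binding the running thread to one CPU. */
static void
bind_to_cpu(int cpu)
{
	current_cpu = cpu;
}

/*
 * Flush only the cache owned by the CPU we are running on, so no
 * cross-CPU synchronization is needed.  Returns the items released.
 */
static unsigned
flush_local(void)
{
	unsigned n = caches[current_cpu].cached;

	caches[current_cpu].cached = 0;
	return (n);
}

/* Visit each CPU in turn and flush its cache from there. */
static unsigned
flush_all_cpus(void)
{
	unsigned total = 0;
	int cpu;

	for (cpu = 0; cpu < NCPU; cpu++) {
		bind_to_cpu(cpu);
		total += flush_local();
	}
	return (total);
}
```

The point of binding first is that each cache is only ever touched from 
its own CPU, which sidesteps the synchronization issue mentioned above.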

I believe the combination of these approaches would go a long way toward 
solving the problem and should require relatively little new code.  It 
should also preserve the adaptable nature of the system without penalizing 
resource-heavy systems.  I would be happy to review patches from anyone who 
wishes to undertake it.


>
> The main fragmentation issue we've had in the past has been due to 
> mbuf+cluster caching, which prevented mbufs from being freed usefully in some 
> cases.  Jeff's ongoing work on variable-sized mbufs would entirely eliminate 
> that problem...

I'm going to get back to this as soon as infiniband gets to a useful state 
for doing high performance network testing.  This is only because I have 
no 10gigE but do have ib and have funding to cover working on it.  I hope 
to have some results and activity on this front by the end of the year. 
I know it has been a long time coming.

Thanks,
Jeff

>
> Robert
>