From owner-freebsd-current@FreeBSD.ORG  Wed Aug  5 23:17:12 2009
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 71FAB106564A;
	Wed,  5 Aug 2009 23:17:12 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 496038FC08;
	Wed,  5 Aug 2009 23:17:12 +0000 (UTC)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id D888746B03;
	Wed,  5 Aug 2009 19:17:11 -0400 (EDT)
Date: Thu, 6 Aug 2009 00:17:11 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Navdeep Parhar <nparhar@gmail.com>
In-Reply-To: <20090805063417.GA10969@doormat.home>
Message-ID: <alpine.BSF.2.00.0908060011490.59996@fledge.watson.org>
References: <20090804225806.GA54680@hub.freebsd.org>
	<20090805054115.O93661@maildrop.int.zabbadoz.net>
	<20090805063417.GA10969@doormat.home>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-current@freebsd.org, jeff@FreeBSD.org,
	"Bjoern A. Zeeb" <bz@FreeBSD.org>, kib@FreeBSD.org,
	Navdeep Parhar <np@FreeBSD.org>, lstewart@FreeBSD.org
Subject: Re: reproducible panic in netisr
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 05 Aug 2009 23:17:12 -0000

On Tue, 4 Aug 2009, Navdeep Parhar wrote:

>>> This occurs on today's HEAD + some unrelated patches.  That makes it 
>>> 8.0BETA2+ code.  I haven't tried older builds.
>>
>> We have finally been able to reproduce this ourselves yesterday and
>
> Well, it happens every single time on all of my amd64 machines. After I'd 
> already sent my email I noticed that the netisr mutex has an odd address 
> (pun intended :-))
>
> m=0xffffffff8144d867

Heh, indeed.  We just spotted the same result here.  Because the mutex is 
misaligned, mtx_lock spans a cache line boundary, so reads of it are not 
atomic.  The panic follows shortly afterwards: the read yields a fractional 
pointer, which is not a valid thread pointer when it's dereferenced.
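
As a rough illustration (not kernel code), the check can be done by hand on 
the address you quoted.  The 64-byte cache line and the 24-byte offset of 
mtx_lock within struct mtx are assumptions about this amd64 build, and the 
helper names are made up for the sketch:

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE   64u  /* assumed cache line size in bytes */
#define MTX_LOCK_OFF 24u  /* assumed offset of mtx_lock in struct mtx */

/* Nonzero if addr is a multiple of align (align must be a power of two). */
static int
is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr & (align - 1)) == 0);
}

/* Nonzero if a len-byte access starting at addr crosses a cache line. */
static int
spans_line(uint64_t addr, uint64_t len)
{
	return ((addr % CACHE_LINE) + len > CACHE_LINE);
}
```

Under those assumptions, is_aligned(0xffffffff8144d867, 8) is false, and the 
8-byte lock word at 0xffffffff8144d867 + 24 = 0xffffffff8144d87f begins on 
the last byte of one line and finishes in the next.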

> It's a bit unusual for the mutex struct to start at a completely unaligned 
> address.  I hope things are better on sparc64 etc., not everyone is as 
> forgiving as amd64.

amd64 isn't that forgiving either, it turns out. :-)

> The mutex led me to some DPCPU stuff that I didn't quite get.
>
> (kgdb) p/x dpcpu_off
> $2 = {0x8407d7, 0xffffff807f4037d7, 0x0 <repeats 30 times>}
> (kgdb) p dpcpu
> $3 = (void *) 0xffffff8000010000
> (kgdb) p &__start_set_pcpu
> $4 = (uintptr_t **) 0xffffffff80c0c829
> (kgdb) p/x 0xffffff8000010000 - 0xffffffff80c0c829
> $5 = 0xffffff807f4037d7
>
> It's not clear why we prefer to store offsets from DPCPU_START, instead of 
> the base address of the dpcpu area directly.  On amd64, the dpcpu area for 
> cpu 0 is above kernbase (immediately after kernbase + thread0's stack). 
> For the other CPUs it's below kernbase.  This makes the pointer arithmetic 
> that calculates offsets more "interesting."
>
> Why have a dpcpu_off[] instead of a dpcpu_base[]?

Each field in DPCPU is named with respect to the start of a "master" dpcpu 
copy, which holds the static initialization.  This makes the address of the 
per-CPU instance:

    (&master_name_for_variable - DPCPU_START) + per-cpu-base

What Jeff has done is factor out the DPCPU_START subtraction, since it's the 
same constant across all DPCPU use, and do it once when calculating 
dpcpu_off.  This should all be fine; the question is why we're losing the 
alignment during linking of the kernel.  netisr is linked into the base 
kernel, so I guess it's some problem with the way the linker set is being laid 
out at compile-time.  I expect we may have a similar issue with the run-time 
allocation of DPCPU space as well.
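
A minimal sketch of that factoring, using the values from your kgdb session. 
The function and macro names are hypothetical, plain 64-bit integers stand in 
for pointers, and none of this is the kernel's actual code:

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the kgdb session quoted above. */
#define SET_START 0xffffffff80c0c829ULL  /* &__start_set_pcpu */
#define PCPU_BASE 0xffffff8000010000ULL  /* dpcpu */

/* Unfactored form: subtract the set start on every access. */
static uint64_t
pcpu_addr_unfactored(uint64_t master, uint64_t base)
{
	return ((master - SET_START) + base);
}

/* Done once per CPU: off = base - start, wrapping modulo 2^64. */
static uint64_t
pcpu_off(uint64_t base)
{
	return (base - SET_START);
}

/* Factored form: one addition per access. */
static uint64_t
pcpu_addr_factored(uint64_t master, uint64_t off)
{
	return (master + off);
}
```

pcpu_off(PCPU_BASE) gives 0xffffff807f4037d7, matching $5, and both forms 
return the same address because the arithmetic wraps modulo 2^64, so it does 
not matter whether the base lies above or below kernbase.  If I read the 
numbers right, dpcpu_off[0] = 0x8407d7 puts CPU 0's area at 
0xffffffff8144d000, so the mutex sits at offset 0x867 in it and its master 
copy is at 0xffffffff80c0d090, which is aligned.  That would mean the 
alignment is lost only because the set itself starts at an odd address.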

Robert N M Watson
Computer Laboratory
University of Cambridge