From owner-freebsd-current@freebsd.org Thu Dec 15 00:44:12 2016 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4885AC750E6 for ; Thu, 15 Dec 2016 00:44:12 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-qk0-x243.google.com (mail-qk0-x243.google.com [IPv6:2607:f8b0:400d:c09::243]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 01D2C819; Thu, 15 Dec 2016 00:44:11 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: by mail-qk0-x243.google.com with SMTP id h201so5277579qke.3; Wed, 14 Dec 2016 16:44:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=c9HpRDt4lPR0av7gju99Obtne0ygN0Y0/yXZYnFp5ds=; b=TrD8WwxXto6IiMBLA5TEAOBzaggMwYyvdrRrxB5YkhOY7o9ZErOIHsEsrM7p+vVwTB EcxdYOf0tUPIVI9Vayf3Nz8yuceX2q1hn03SjnUKy+pov5j6le+oI+Ubb2MTs6ibzRbA v1Du3WA0HTLB7fqnf1t5r7PGIVCM0qGYdv7J8Z1Orncmc1h149ZjZfyA2McIaMPWaIwi biByo7OZ7OpMHgbcyK45/iog3RadIag2p4rSl3GrG5T/8amO5bU+m36ri7A4tHGp3now 6CZ3+ohICrvK3Jwzmm90nIzf0vaGk5mbYKqEGGxmAcTcOvpJjgwHDMbgY97pUSsNvoOM Nj2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to:user-agent; bh=c9HpRDt4lPR0av7gju99Obtne0ygN0Y0/yXZYnFp5ds=; b=gFrsB3Bhlzm2SiLXEHeznipko+XEzWxEYgEO3PzGHqhNQZEAUtTztKW/XSfWUIboK2 E5vo90d1/1BI1cbOb0dGxaTcbGJkJUhP3gwP8b4ydmRUYZNgdWsO3CXc7vEC6aAyx62/ McDL1elPpdYrDhpowp7u2AQ11ITVoXQPisGsNcbCKiwhgbSGgMkeCYbnAK9B5G2MPhw5 h7pYShVLXCv9GAyb27MmxefDWsp8vMQ57Y2Cl1SDNHF44+SFrioi+B6Ivee8BLOyKdaf 8uI47lRqSD3S4clgQmsMRHxPb9MBJDGdxutCi5alj9G3P5y+R1gVOrczizhJZaqPnWWi Y8gg== X-Gm-Message-State: AKaTC003jViInfdWGXqynOyyDCKrWqm1nPC+a3zenb132NsEALB51+hk3yWE0hSIkdx7eQ== X-Received: by 10.55.127.129 with SMTP id a123mr87230973qkd.129.1481762650668; Wed, 14 Dec 2016 16:44:10 -0800 (PST) Received: from wkstn-mjohnston.west.isilon.com (c-76-104-201-218.hsd1.wa.comcast.net. [76.104.201.218]) by smtp.gmail.com with ESMTPSA id h47sm33077386qtc.27.2016.12.14.16.44.09 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 14 Dec 2016 16:44:10 -0800 (PST) Sender: Mark Johnston Date: Wed, 14 Dec 2016 16:50:21 -0800 From: Mark Johnston To: kargl@uw.edu Cc: freebsd-current@freebsd.org, kib@freebsd.org, jhb@FreeBSD.org Subject: Re: Revision 309657 to stack_machdep.c renders unbootable system Message-ID: <20161215005012.GA84222@wkstn-mjohnston.west.isilon.com> References: <20161214194848.GA881@troutmask.apl.washington.edu> <20161214201416.GA64767@wkstn-mjohnston.west.isilon.com> <20161214221048.GB64767@wkstn-mjohnston.west.isilon.com> <20161214234804.GA26443@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20161214234804.GA26443@troutmask.apl.washington.edu> User-Agent: Mutt/1.7.1 (2016-10-04) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Dec 2016 00:44:12 -0000 On Wed, Dec 14, 2016 at 03:48:04PM -0800, Steven G. Kargl wrote: > On Wed, Dec 14, 2016 at 02:10:48PM -0800, Mark Johnston wrote: > > On Wed, Dec 14, 2016 at 12:14:16PM -0800, Mark Johnston wrote: > > > On Wed, Dec 14, 2016 at 11:49:26AM -0800, Steven G. Kargl wrote: > > > > Well, after 3 days of bisection, I finally found the commit > > > > that renders my system unbootable. The system does not panic. > > > > It simply gets stuck in some state. Nonfunctional keyboard, > > > > so can't break into debugger. No serial console available. > > > > The verbose dmesg.boot for a working kernel from revision > > > > 309656 is at > > > > > > > > http://troutmask.apl.washington.edu/~kargl/freebsd/dmesg.309656.txt > > > > > > > > The kernel config file is at > > > > > > > > http://troutmask.apl.washington.edu/~kargl/freebsd/SPEW.txt > > > > > > > > In looking at /usr/src/UPDATING, there is no warning that one > > > > can create a boat anchor by upgrading to 309657. If compiling > > > > a kernel with 'options DDB' is no longer supported, this should > > > > be stated in UPDATING. Or, UPDATING should state that 'options > > > > DDB' requires 'options STACK'. Or, 'options DDB' should simply > > > > to the right thing and pull in whatever 'option STACK' does. > > > > > > It is supported though - the point of that change was to fix a problem > > > that occurred when DDB is configured but STACK isn't. While testing I > > > tried every combination of the two options, and I just tried and > > > successfully booted a kernel with DDB and !STACK. > > > > > > Does the kernel boot successfully if STACK is added to your > > > configuration? > > > > I tried your config (plus virtio drivers) and was able to reproduce the > > hang in bhyve. Adding STACK "fixed" the hang, as did reverting part of > > my change to re-add dead code into the kernel. My VM was always hanging > > after printing > > > > 000.000050 [ 426] vtnet_netmap_attach virtio attached txq=1, txd=1024 rxq=1, rxd=1024 > > > > Sure enough, removing "device netmap" from your config also fixes the > > hang. When the hang occurs, I can see with "bhyvectl --get-rip" that > > we're stuck in DELAY(), but I can't get a stack at that point. I think > > my change is an innocent bystander - it just happened to expose a latent > > issue elsewhere. > > > > I don't have much more time to look at this right now, but I'll look > > into it more tonight. > > Yes, adding STACK got me to a booting kernel. I can't remember > why I added netmap to my config file. Re-adding dead code seems to > point to some memory corruption issue or a rogue pointer. :( It's not quite that bad, as it turns out. The key is that adding/removing the dead code changes the ordering of the items in the sysinit linker set. I discovered that if the ctl(4) module is initialized before the vtnet driver attaches, the hang occurs, and reverting my commit results in a sysinit order where vtnet comes _before_ ctl(4). So my change triggers the problem just because it happens to perturb something in the compile-time linker. > > BTW, I think it would be prudent to add something like > > 20161206: > At revision 309657, 'options STACK' was introduced into > sys/x86/x86/mstack_machdep.c. Old kernel configuration files > that included 'options DDB' are now required to include also > 'options STACK'. > > to UPDATING or some such wording. I was jumping from circ Oct 10th world > to top of tree, and got caught by ~3000 commits. The issue actually seems to be in 4BSD, and more specifically in r308564 and r308565. Switching to ULE or reverting either of those two commits fixes the hang. Here's what happens: 1. ctl_init() runs and creates ctl_thresh_thread. This thread's main loop cause pause(9) when it has no work to do. During boot, pause(9) just calls DELAY() and does not yield the CPU. 2. thread0 attaches the vtnet driver. As part of this, it creates and starts some high-priority taskqueue threads in _taskqueue_thread_start(). They're added to the scheduler with: thread_lock(); sched_pri(...); sched_add(...); thread_unlock(); 4BSD's sched_add() will call maybe_preempt() in this case, which as of r308564 will unconditionally set td_owepreempt in the current thread. 3. thread_unlock() will release the critical section held by the current thread and because td_owepreempt is set, we'll yield the CPU. The taskqueue threads have nothing to do, but ctl_thresh_thread runs and ends up busy-waiting in pause() forever. r308565 removes a check in maybe_preempt() that would have stopped td_owepreempt from being set. Before r308564, maybe_preempt() would have switched directly to the new thread and apparently always switched back immediately. I'm not sure what the correct fix is - jhb might have an idea. I wonder if pause() should try to yield periodically when called during boot.