From owner-freebsd-current@FreeBSD.ORG  Mon Mar 10 21:18:48 2014
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 696834C1;
 Mon, 10 Mar 2014 21:18:48 +0000 (UTC)
Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1])
 (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 216C690;
 Mon, 10 Mar 2014 21:18:48 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id E08FBB918;
 Mon, 10 Mar 2014 17:18:46 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Bruce Evans <brde@optusnet.com.au>
Subject: Re: signal 8 (floating point exception) upon resume
Date: Mon, 10 Mar 2014 17:17:13 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20130906; KDE/4.5.5; amd64; ; )
References: <CAJ-Vmo=29BUP7JREe+K+8zOZNVt2LttjTB2byB0AA2jnCKnPSw@mail.gmail.com>
 <201403041124.04122.jhb@freebsd.org> <20140305054906.L2176@besplex.bde.org>
In-Reply-To: <20140305054906.L2176@besplex.bde.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201403101717.13131.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Mon, 10 Mar 2014 17:18:47 -0400 (EDT)
Cc: Adrian Chadd <adrian@freebsd.org>,
 "freebsd-mobile@freebsd.org" <freebsd-mobile@freebsd.org>,
 freebsd-current <freebsd-current@freebsd.org>,
 Konstantin Belousov <kib@freebsd.org>, Bruce Evans <bde@freebsd.org>,
 Jung-uk Kim <jkim@freebsd.org>
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
 <freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current/>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
 <mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Mar 2014 21:18:48 -0000

On Tuesday, March 04, 2014 4:50:01 pm Bruce Evans wrote:
> On Tue, 4 Mar 2014, John Baldwin wrote:
> % Index: i386/i386/swtch.s
> % ===================================================================
> % --- i386/i386/swtch.s	(revision 262711)
> % +++ i386/i386/swtch.s	(working copy)
> % @@ -417,42 +417,9 @@
> %  	str	PCB_TR(%ecx)
> % 
> %  #ifdef DEV_NPX
> % -	/*
> % -	 * If fpcurthread == NULL, then the npx h/w state is irrelevant and the
> % -	 * state had better already be in the pcb.  This is true for forks
> % -	 * but not for dumps (the old book-keeping with FP flags in the pcb
> % -	 * always lost for dumps because the dump pcb has 0 flags).
> % -	 *
> % -	 * If fpcurthread != NULL, then we have to save the npx h/w state to
> % -	 * fpcurthread's pcb and copy it to the requested pcb, or save to the
> % -	 * requested pcb and reload.  Copying is easier because we would
> % -	 * have to handle h/w bugs for reloading.  We used to lose the
> % -	 * parent's npx state for forks by forgetting to reload.
> % -	 */
> 
> This function is mostly bogus (see old mails).

I was going off of the commit logs for amd64 that removed this code as savectx()
is not used for fork(), only for IPI_STOP and suspend/resume.

> Without fxsave, npxsuspend() cannot be atomic without locking, since
> fnsave destroys the state in the FPU and you either need a lock to
> reload the old state atomically enough, or a lock to modify FPCURTHREAD
> atomically enough.

save_ctx() is now only called from IPI handlers or when doing suspend in
which case we shouldn't have to worry about being preempted.

> % 
> %  	movl	$1,%eax
> % ...
> % @@ -520,7 +490,16 @@
> %  	movl	%eax,%dr7
> % 
> %  #ifdef DEV_NPX
> % -	/* XXX FIX ME */
> % +	/* Restore FPU state */
> 
> Is the problem just this missing functionality?

Possibly.  I think on amd64 there was also the desire to have the pcb
state be meaningful in dumps (since we IPI_STOP before a dump).  OTOH,
the current approach used by amd64 (and this patch for i386) is to not
dirty fpcurthread's state during save_ctx(), but to instead leave
fpcurthread alone and explicitly save whatever state the FPU is in
in the PCB used for IPI_STOP or suspend.

> % @@ -761,7 +761,34 @@
> %  	PCPU_SET(fpcurthread, NULL);
> %  }
> % 
> % +/*
> % + * Unconditionally save the current co-processor state across suspend and
> % + * resume.
> % + */
> %  void
> % +npxsuspend(union safefpu *addr)
> % +{
> % +	register_t cr0;
> % +
> % +	if (!hw_float)
> % +		return;
> % +	cr0 = rcr(0);
> % +	clts();
> % +	fpusave(addr);
> % +	load_cr(0, cr0);
> % +}
> 
> In the !fxsave case, this destroys the state in the npx, leaving
> fpcurthread invalid.  It also does the save when the state in the
> npx is inactive.  I think jkim intentionally this state so that
> resume can load it unconditionally.  It must be arranged that there
> are no interactions with fpcurthread.

Given the single-threaded nature of suspend/resume and IPI_STOP /
restart_cpus(), those requirements are met, so it should be safe
to resume whatever state was in the FPU and leave fpcurthread
unchanged.

> This doesn't work so well
> without fxsave.  When fpcurthread != NULL, reloading CR0 keeps
> CR0_TS and thus ensures that inconsistent state lives for longer.
> Things will only be OK if fpcurthread isn't changed until resume.

After the save_ctx() the CPU is going to either resume without
doing a resume_ctx (IPI_STOP case) leaving fpcurthread unchanged
(so save_ctx() just grabbed a snapshot of the FPU state for
debugging purposes) or the CPU is going to power off for suspend.
During resume it will invoke resume_ctx() which will restore the
FPU state (whatever state it was in) and fpcurthread and only
after those are true is the CPU able to run other threads which
will modify or use the FPU state.

> You can probably fix this by using the old code here.  The old code
> doesn't need the hw_float test, since fpcurthread != NULL implies
> hw_float != 0.
> 
> Actually, I don't see any need to change anything on i386 -- after
> storing the state for the thread, there should be no need to store it
> anywhere else across suspend/resume.  We intentionally use this method
> (even on amd64 IIRC), although it is suboptimal, to reduce complications
> for context switchres and signal handling.  npxsave() takes an address,
> but savectx() didn't abuse this to store directly in the special save
> area.  It made npxsave() store in the pcb, and then copied to the special
> area.

So I guess that is one option is to always clear fpcurthread during
suspend and just do an fninit on resume.  However, I chose to match
what amd64 does for now.  I did make one change locally which was to
not bother saving the FPU state if fpcurthread was NULL during save_ctx,
but to instead store a copy of 'npx_initial_state' in the pcb instead.
This is then loaded into the FPU on resume.  Is that sufficient for
the !fxsave case?

-- 
John Baldwin