From owner-freebsd-arch  Thu Mar  7 14:40:21 2002
Delivered-To: freebsd-arch@freebsd.org
Received: from rwcrmhc51.attbi.com (rwcrmhc51.attbi.com [204.127.198.38])
	by hub.freebsd.org (Postfix) with ESMTP id EA64237B400
	for <arch@FreeBSD.ORG>; Thu,  7 Mar 2002 14:40:16 -0800 (PST)
Received: from InterJet.elischer.org ([12.232.206.8])
          by rwcrmhc51.attbi.com
          (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP
          id <20020307224016.BUEC2626.rwcrmhc51.attbi.com@InterJet.elischer.org>;
          Thu, 7 Mar 2002 22:40:16 +0000
Received: from localhost (localhost.elischer.org [127.0.0.1])
	by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id OAA44396;
	Thu, 7 Mar 2002 14:37:20 -0800 (PST)
Date: Thu, 7 Mar 2002 14:37:17 -0800 (PST)
From: Julian Elischer <julian@elischer.org>
To: Poul-Henning Kamp <phk@critter.freebsd.dk>
Cc: arch@FreeBSD.ORG
Subject: Re: Contemplating THIS change to signals. (fwd)
In-Reply-To: <4410.1015538902@critter.freebsd.dk>
Message-ID: <Pine.BSF.4.21.0203071410570.37321-100000@InterJet.elischer.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG


On Thu, 7 Mar 2002, Poul-Henning Kamp wrote:

> In message <Pine.BSF.4.21.0203071342160.37321-100000@InterJet.elischer.org>, Ju
> lian Elischer writes:
> 
> >I would argue that a process can be considered to be suspended even while
> >it is running in kernel space.
> 
> Since this would affect not only SIGSTOP but actually all signals,
> and since we have long-running syscalls like sendfile I'm not sure
> this assumption is a good idea.

That's the whole point.
Only STOP type signals are affected.

The signal delivery process goes as follows:

Generally, senders call psignal().
 psignal() checks whether there are IMMEDIATE actions to take,
	and then puts the signal into the "pending signals" array.
	One of the things it does is check whether it should wake the
	process (along with a few other similar checks).

In msleep(), CURSIG is called.
CURSIG calls issignal (indirectly), which checks to see if there
are any signals that need to be delivered. If there are, it checks to see
if they include a STOP type signal. If so, it suspends the process
immediately. In all other cases, it just returns the fact that a signal is
ready, and the system call proceeds to the user boundary.

At the user boundary (in userret()), CURSIG is called again, but AFTER it
is called, postsig() is called. This actually carries out the work needed
to DO the signal.

In the msleep() case, postsig() is not called after CURSIG(), so no
signal is acted on at that point; that happens only when the process gets
back to the user boundary.
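
To make the flow described above concrete, here is a minimal user-space
sketch of it. It is only a model under my own assumptions: every name
prefixed with model_ (and the MSIG_/SITE_ constants) is hypothetical and
merely stands in for psignal(), CURSIG, postsig(), msleep() and userret();
none of this is the kernel's actual code.

```c
#include <stdbool.h>

/* Hypothetical user-space model; these are NOT the kernel's definitions. */
enum model_sig { MSIG_NONE = 0, MSIG_INT = 1, MSIG_STOP = 2 };
enum stop_site { SITE_NONE, SITE_SLEEP, SITE_USER_BOUNDARY };

struct model_proc {
    unsigned pending;          /* bitmask of pending model signals */
    enum stop_site stopped_at; /* where the process was suspended */
    int acted_on;              /* signals acted on by model_postsig() */
};

/* Sender side, standing in for psignal(): record the signal as pending. */
static void model_psignal(struct model_proc *p, enum model_sig sig)
{
    p->pending |= 1u << sig;
}

/*
 * Stand-in for the CURSIG check. A STOP type signal suspends the process
 * right here; anything else is only reported, not acted on.
 */
static enum model_sig model_cursig(struct model_proc *p, enum stop_site site)
{
    if (p->pending & (1u << MSIG_STOP)) {
        p->pending &= ~(1u << MSIG_STOP);
        p->stopped_at = site;  /* the real kernel would suspend here */
    }
    return (p->pending & (1u << MSIG_INT)) ? MSIG_INT : MSIG_NONE;
}

/* Stand-in for postsig(): actually act on the signal. */
static void model_postsig(struct model_proc *p, enum model_sig sig)
{
    p->pending &= ~(1u << sig);
    p->acted_on++;
}

/* Sleep path: the check runs, but model_postsig() is not called. */
static bool model_msleep(struct model_proc *p)
{
    return model_cursig(p, SITE_SLEEP) != MSIG_NONE; /* true: interrupted */
}

/* User boundary: the check runs again, and then the signal is acted on. */
static void model_userret(struct model_proc *p)
{
    enum model_sig sig = model_cursig(p, SITE_USER_BOUNDARY);
    if (sig != MSIG_NONE)
        model_postsig(p, sig);
}

/* One system call that may or may not have to block. */
static void model_syscall(struct model_proc *p, bool must_block)
{
    if (must_block)
        (void)model_msleep(p);
    model_userret(p);
}
```

In this model, posting MSIG_STOP and then running model_syscall() with
must_block set leaves stopped_at at SITE_SLEEP, while the same call
without blocking leaves it at SITE_USER_BOUNDARY. A non-STOP signal is
only counted in acted_on once model_userret() has run.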

My suggestion is to stop making STOP type signals an exception,
because it should not be necessary to stop the process in the middle of a
syscall, just to stop it from getting back to userspace.
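
As a rough illustration of the suggestion, the sketch below contrasts the
two policies for a single pending STOP type signal. It is a toy model
under my own assumptions: sketch_proc, sleep_check_current(),
sleep_check_proposed(), user_boundary() and run_syscall() are all
hypothetical names, not kernel functions, and the real change would of
course live in the existing signal-checking code.

```c
#include <stdbool.h>

/* Hypothetical model of the suggestion; not actual kernel code. */
enum where { NOT_STOPPED, IN_SLEEP, AT_USER_BOUNDARY };

struct sketch_proc {
    bool stop_pending;   /* a STOP type signal has been posted */
    enum where stopped;  /* where the process ended up suspended */
};

/* Current behaviour: the check made while sleeping suspends on the spot. */
static bool sleep_check_current(struct sketch_proc *p)
{
    if (p->stop_pending) {
        p->stop_pending = false;
        p->stopped = IN_SLEEP;   /* suspended in the middle of the syscall */
        return true;             /* the sleep is cut short once continued */
    }
    return false;
}

/* Suggested behaviour: STOP is no exception; just report it and unwind. */
static bool sleep_check_proposed(struct sketch_proc *p)
{
    return p->stop_pending;      /* left pending for the user boundary */
}

/* The user boundary suspends the process under either policy. */
static void user_boundary(struct sketch_proc *p)
{
    if (p->stop_pending) {
        p->stop_pending = false;
        p->stopped = AT_USER_BOUNDARY;
    }
}

/* Run one syscall with a STOP already pending; report where it stopped. */
static enum where run_syscall(bool proposed, bool must_block)
{
    struct sketch_proc p = { .stop_pending = true, .stopped = NOT_STOPPED };

    if (must_block) {
        if (proposed)
            (void)sleep_check_proposed(&p);
        else
            (void)sleep_check_current(&p);
    }
    user_boundary(&p);
    return p.stopped;
}
```

With the current policy the stop site depends on whether the call had to
block; with the suggested one, run_syscall() reports AT_USER_BOUNDARY in
both cases, which is the consistency I am after.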

I mean, if you are debugging a program and you have stopped it, does it
matter to you if it stopped in an msleep in the middle of the system call
you are in, or stopped before coming back to you, having completed the
system call? There is no difference except that if it didn't have to block
it will stop at the user boundary, and if it did block it will stop in a
different place. It would be more consistent to have it stop in the same
place on each call. In the case of ^Z you are even less interested
in exactly WHERE in the syscall you stopped, as long as you actually
stopped.

In the case of sendfile, the signal will cut short the syscall in both
cases. The difference is that as it is now it stops in the msleep, and when
you release it, it returns to the user boundary and then to the user,
and in the new case, it immediately returns to the user boundary,
and on being allowed to continue, goes back to the user.

From the user's perspective the two actions are almost indistinguishable.


Please let me know if I'm blowing smoke on this!

(BTW I say "almost indistinguishable" because there is one minor difference
from the user perspective.)
In a read from a disk file, in the current situation, if the block is in
core, you don't block, so you stop at the user boundary after having done
the copyout(). In the case where it's only on disk, you stop in msleep
having NOT YET done the copyout(). In my case, you stop at the user
boundary both times, though you haven't done the copyout in the second
case either. But in some cases involving multiple reads it is possible
that you may find a small difference in how much of the aborted read
gets copied out before the stop is put into effect. If there has been a
partial read, returning to the user boundary may decide to copy out that
partial read in some cases (a read from a tty, maybe?), where it would
not have been copied out if you had stopped in the msleep.
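
The disk-read cases above can be tabulated in a few lines of C. This is
only a summary of the four simple cases as I described them, with
hypothetical names (read_outcome, read_outcome_model() and the READ_STOP_
constants); it deliberately does not model the partial-read corner case,
since I am not sure how that one falls out.

```c
#include <stdbool.h>

/* Hypothetical summary of the simple disk-read cases; not kernel code. */
enum read_stop_site { READ_STOP_IN_MSLEEP, READ_STOP_AT_USER_BOUNDARY };

struct read_outcome {
    enum read_stop_site site; /* where the process is suspended */
    bool copyout_done;        /* whether copyout() ran before the stop */
};

/*
 * proposed:      false for today's behaviour, true for the suggestion.
 * block_in_core: true if the read can be satisfied without sleeping.
 */
static struct read_outcome read_outcome_model(bool proposed, bool block_in_core)
{
    struct read_outcome r;

    if (block_in_core) {
        /* No sleep needed: data is copied out, then we stop at the boundary. */
        r.site = READ_STOP_AT_USER_BOUNDARY;
        r.copyout_done = true;
    } else {
        /* The read had to sleep for the disk and the STOP arrived meanwhile. */
        r.site = proposed ? READ_STOP_AT_USER_BOUNDARY : READ_STOP_IN_MSLEEP;
        r.copyout_done = false;
    }
    return r;
}
```

The only entry that changes between the two policies is the stop site for
the read that had to sleep; whether copyout() has run is the same in this
simplified table.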

> 
> -- 
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe    
> Never attribute to malice what can adequately be explained by incompetence.
> 

