Date: Thu, 7 Mar 2002 14:37:17 -0800 (PST)
From: Julian Elischer
To: Poul-Henning Kamp
Cc: arch@FreeBSD.ORG
Subject: Re: Contemplating THIS change to signals. (fwd)
In-Reply-To: <4410.1015538902@critter.freebsd.dk>

On Thu, 7 Mar 2002, Poul-Henning Kamp wrote:

> In message , Julian Elischer writes:
>
> >I would argue that a process can be considered to be suspended even while
> >it is running in kernel space.
>
> Since this would affect not only SIGSTOP but actually all signals,
> and since we have long-running syscalls like sendfile I'm not sure
> this assumption is a good idea.

That's the whole point. Only STOP type signals are affected.
The signal delivery process goes as follows:

Generally, senders call psignal(). psignal() checks whether there are IMMEDIATE actions to take and then puts the signal into the "pending signals" array. One of the things it does is check whether it should wake the process (along with a few other similar checks).

In msleep(), CURSIG is called.
CURSIG calls issignal() (indirectly), which checks whether there are any signals that need to be delivered. If there are, it checks whether they include a STOP type signal. If so, it suspends the process immediately. In all other cases, it just returns the fact that a signal is ready, and the system call proceeds to the user boundary.

At the user boundary (in userret()), CURSIG is called again, but AFTER it is called, postsig() is called. This actually carries out the work needed to DO the signal. In the msleep() case, postsig() is not called after CURSIG(), so the signal is not acted on at that point, only when the process gets back to the user boundary.

My suggestion is to stop making STOP type signals an exception, because it should not be necessary to stop the process in the middle of a syscall, only to stop it from getting back to userspace.

I mean, if you are debugging a program and you have stopped it, does it matter to you whether it stopped in an msleep in the middle of the system call it is in, or stopped before coming back to you, having completed the system call? There is no difference, except that today, if it didn't have to block it stops at the user boundary, and if it did block it stops in a different place. It would be more consistent to have it stop in the same place on each call.

In the case of ^Z you are even less interested in exactly WHERE in the syscall you stopped, as long as you actually stopped.

In the case of sendfile, the signal will cut short the syscall in both cases. The difference is that as it is now, it stops in the msleep, and when you release it, it returns to the user boundary and then to the user; in the new case, it immediately returns to the user boundary and, on being allowed to continue, goes back to the user. From the user's perspective the two actions are almost indistinguishable.

Please let me know if I'm blowing smoke on this!
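To make the comparison concrete, here is a toy user-space model of the two behaviours described above. It is only a sketch: the struct, the enums, and every model_* function are hypothetical names invented for illustration and are not the kernel's real psignal(), CURSIG, msleep(), userret() or postsig(). It assumes a single STOP type signal that is already pending when the syscall starts, and it records only where the process ends up suspended.

```c
#include <stdbool.h>

/* Hypothetical model types; these are NOT the kernel's data structures. */
enum stop_policy { STOP_IN_SLEEP, STOP_AT_BOUNDARY };
enum stop_site   { NOT_STOPPED, STOPPED_IN_MSLEEP, STOPPED_AT_USER_BOUNDARY };

struct model_proc {
    bool stop_pending;       /* a STOP type signal is in the pending set */
    enum stop_site stopped;  /* where the process was suspended, if at all */
};

/* Stand-in for psignal(): just record the signal as pending. */
void model_psignal(struct model_proc *p)
{
    p->stop_pending = true;
}

/* Stand-in for CURSIG: report whether a signal is waiting. */
bool model_cursig(const struct model_proc *p)
{
    return p->stop_pending;
}

/*
 * Stand-in for msleep(). Returns true if the sleep is cut short by a signal.
 * Under STOP_IN_SLEEP (the behaviour described as current) the process is
 * suspended right here. Under STOP_AT_BOUNDARY (the suggestion) the signal
 * stays pending and the syscall heads for the user boundary.
 */
bool model_msleep(struct model_proc *p, enum stop_policy policy)
{
    if (!model_cursig(p))
        return false;
    if (policy == STOP_IN_SLEEP) {
        p->stopped = STOPPED_IN_MSLEEP;
        p->stop_pending = false;   /* the stop has been acted on */
    }
    return true;
}

/* Stand-in for userret(): CURSIG, then act on the signal as postsig() would. */
void model_userret(struct model_proc *p)
{
    if (model_cursig(p)) {
        p->stopped = STOPPED_AT_USER_BOUNDARY;
        p->stop_pending = false;
    }
}

/* Run one modelled syscall with a STOP type signal already pending. */
enum stop_site model_syscall(enum stop_policy policy, bool must_block)
{
    struct model_proc p = { false, NOT_STOPPED };

    model_psignal(&p);
    if (must_block)
        (void)model_msleep(&p, policy);
    model_userret(&p);
    return p.stopped;
}
```

In this model, model_syscall(STOP_IN_SLEEP, ...) stops in a different place depending on whether the call had to block, while model_syscall(STOP_AT_BOUNDARY, ...) always stops at the user boundary, which is the consistency argument made above.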
(BTW I say "almost indistinguishable" because there is one minor difference from the user's perspective.)

In a read from a disk file, in the current situation, if the block is in core you don't block, so you stop at the user boundary after having done the copyout(). In the case where it's only on disk, you stop in msleep having NOT YET done the copyout().

In my case, you stop at the user boundary both times, though you haven't done the copyout in the second case either. But in some cases involving multiple reads, you may find a small difference in how much of the aborted read gets copied out before the stop is put into effect. If there has been a partial read, the return to the user boundary may copy out that partial read in some cases (a read from a tty, maybe?), whereas it would not have been copied out if you had stopped in the msleep.

> --
> Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG | TCP/IP since RFC 956
> FreeBSD committer | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.