From: Joe Greco
Message-Id: <200308271408.h7RE8ghW073779@aurora.sol.net>
To: freebsd-current@freebsd.org
Date: Wed, 27 Aug 2003 09:08:42 -0500 (CDT)
Subject: Someone help me understand this...?

I've got a weirdness with kill(2).

This code is out of Diablo, the news package, and has been working fine
for some years. It apparently works fine on other OS's.

In the Diablo model, the parent process may choose to tell its children
to update status via a signal. The loop basically consists of going
through and issuing a SIGALRM.

This stopped working a while ago, don't know precisely when. I was in
the process of debugging it today and ran into this. The specific OS
below is 5.1-RELEASE but apparently this happens on 4.8 as well.

%echo $$
29047
%ps -O ruid,uid | egrep '28949|29045|29047'
28949     8     8  p0  I      0:00.00 diablo: ihav=0 chk=0 rec=0 ent=0
29045     8     8  p0  I      0:00.00 sleep 999999
29047     8     8  p0  D      0:00.01 -su (csh)
%kill -ALRM 28949
28949: Operation not permitted
%kill -ALRM 29045
%ps -O ruid,uid | egrep '28949|29045'
28949     8     8  p0  I      0:00.00 diablo: ihav=0 chk=0 rec=0 ent=0
%

Wot? Why can't I send it a signal?

I've read kill(2) rather carefully and cannot find the reason. It says,

     For a process to have permission to send a signal to a process
     designated by pid, the real or effective user ID of the receiving
     process must match that of the sending process or the user must
     have appropriate privileges (such as given by a set-user-ID program
     or the user is the super-user).

Well, the sending and receiving processes both clearly have equal
uid/euid.

We're not running in a jail, so I don't expect any issues there.

The parent process did actually start as root and then shed privilege
with

        struct passwd *pw = getpwnam("news");
        struct group *gr = getgrnam("news");
        gid_t gid;

        if (pw == NULL) {
            perror("getpwnam('news')");
            exit(1);
        }
        if (gr == NULL) {
            perror("getgrnam('news')");
            exit(1);
        }
        gid = gr->gr_gid;
        setgroups(1, &gid);
        setgid(gr->gr_gid);
        setuid(pw->pw_uid);

so that looks all well and fine... so why can't it kill its own
children, and why can't I kill one of its children from a shell with
equivalent uid/euid?

I know there's been some paranoia about signal delivery and all that,
but my searching hasn't turned up anything that would explain this.
Certainly the manual page ought to be updated if this is a new expected
behaviour or something... at least some clue as to why it might fail
would be helpful.

... JG
--
Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net
"We call it the 'one bite at the apple' rule. Give me one chance [and] then I
won't contact you again."
- Direct Marketing Ass'n position on e-mail spam (CNN)
With 24 million small businesses in the US alone, that's way too many apples.
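
One way to narrow down a failure like the one in the transcript above is to ask the kernel directly whether a given signal may be sent, and to print the caller's own IDs next to the answer. The following is only a minimal sketch, not code from Diablo: the names probe_signal, print_ids and report are hypothetical helpers made up for illustration. It relies on the documented behaviour of kill(2) that signal number 0 performs the existence and permission checks without delivering anything.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Diablo): try to send `sig` to `pid`.
 * With sig == 0, kill(2) only performs the existence and permission
 * checks and delivers nothing.  With any other value the signal really
 * is sent when the call succeeds.
 * Returns 0 on success, otherwise the errno value (EPERM, ESRCH, ...).
 */
int probe_signal(pid_t pid, int sig)
{
    if (kill(pid, sig) == 0)
        return 0;
    return errno;
}

/* Print the caller's real and effective user and group IDs. */
void print_ids(FILE *out)
{
    fprintf(out, "ruid=%ld euid=%ld rgid=%ld egid=%ld\n",
        (long)getuid(), (long)geteuid(),
        (long)getgid(), (long)getegid());
}

/* Probe one (pid, sig) pair and print the outcome in readable form. */
void report(FILE *out, pid_t pid, int sig)
{
    int err = probe_signal(pid, sig);

    fprintf(out, "kill(%ld, %d): %s\n", (long)pid, sig,
        err == 0 ? "ok" : strerror(err));
}
```

Called from a small main() running as the news user, print_ids(stdout) followed by report(stdout, pid, 0) and report(stdout, pid, SIGALRM) for the child's PID would show whether the kernel refuses every signal to that process or only particular ones. Keep in mind that the second call actually delivers SIGALRM if it is permitted.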