Date:      Wed, 13 Jul 2016 06:30:36 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Mark Johnston <markj@FreeBSD.org>
Cc:        freebsd-current@FreeBSD.org
Subject:   Re: ptrace attach in multi-threaded processes
Message-ID:  <20160713033036.GR38613@kib.kiev.ua>
In-Reply-To: <20160712182414.GC71220@wkstn-mjohnston.west.isilon.com>
References:  <20160712011938.GA51319@wkstn-mjohnston.west.isilon.com> <20160712055753.GI38613@kib.kiev.ua> <20160712170502.GA71220@wkstn-mjohnston.west.isilon.com> <20160712175150.GP38613@kib.kiev.ua> <20160712182414.GC71220@wkstn-mjohnston.west.isilon.com>

On Tue, Jul 12, 2016 at 11:24:14AM -0700, Mark Johnston wrote:
> On Tue, Jul 12, 2016 at 08:51:50PM +0300, Konstantin Belousov wrote:
> > On Tue, Jul 12, 2016 at 10:05:02AM -0700, Mark Johnston wrote:
> > > On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> > > I suppose it is not strictly incorrect. I find it surprising that a
> > > PT_ATTACH followed by a PT_DETACH may leave the process in a different
> > > state than it was in before the attach. This means that it is not
> > > possible to gcore a process without potentially leaving it stopped, for
> > > instance. This result may occur in a single-threaded process
> > > as well, since a signal may already be queued when the PT_ATTACH handler
> > > sends SIGSTOP.
> > I am still missing something.  Isn't this the expected outcome of
> > sending a signal with a STOP action?
> 
> It is. But I also expect a PT_DETACH operation to resume a stopped
> process, assuming that a second SIGSTOP was not posted while the
> process was suspended.
But in the situation as discussed, it seems that a real SIGSTOP raced
with PT_ATTACH.  The offered interpretation, that the SIGSTOP was delivered
'a bit later' than the PT_ATTACH, would fit the description.

> 
> > 
> > > Indeed, I somehow missed that. I had assumed that the leaked TDB_XSIG
> > > represented a bug in ptracestop().
> > It could; I did not make any statement that denies the bug:
> 
> To be clear, the root of my issue comes from the following: the SIGSTOP
> from PT_ATTACH may be handled concurrently with a second signal
> delivered to a second thread in the same process. Then, the resulting
> behaviour depends on the order in which the recipient threads suspend in
> ptracestop(). If the thread that received SIGSTOP suspends last, its
> td_xsig will be overwritten with the userland-provided value in the
> PT_DETACH handler. If it suspends first, its td_xsig will be preserved,
> and upon PT_DETACH the process will be suspended again in issignal().
> 
> I'm not sure if this is considered a bug. ptracestop() is handling all
> signals (including the SIGSTOP generated by the PT_ATTACH handler) in a
> consistent way, but this results in inconsistent behaviour from the
> perspective of a ptrace(2) consumer.

I still do not understand what is inconsistent.

Let us look at it from the other side (so far we discussed the implementation
in the kernel).  Does this happen in gcore(1)?  If yes, gcore's interaction
with ptrace(2) looks like this:
	ptrace(PT_ATTACH, g_pid, 0, 0);
	waitpid(g_pid, &g_status, 0);
	...
	if (sig == SIGSTOP)
		sig = 0;
	ptrace(PT_DETACH, g_pid, (caddr_t)1, sig);
It sounds as if you want to modify gcore(1) to consume all signals, or
at least all STOP signals, before PT_DETACH.  I do not understand why
you want it, but that would probably give you the behaviour you want:
	ptrace(PT_ATTACH, g_pid, 0, 0);
	waitpid(g_pid, &g_status, 0);
	...
	/* still consume implicit SIGSTOP from attach */
	if (sig == SIGSTOP)
		sig = 0;
	/* reap any further stop reports that are already pending */
	do {
		error = waitpid(g_pid, &g_status, WNOHANG | WSTOPPED);
	} while (error > 0);
	ptrace(PT_DETACH, g_pid, (caddr_t)1, sig);
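
For illustration only, the "if (sig == SIGSTOP) sig = 0" step from both
snippets can be isolated into a small function.  This is a sketch, not code
from gcore(1): the name detach_signal() is hypothetical, and it assumes, as
the snippets do, that a SIGSTOP seen after the attach is the one generated by
PT_ATTACH.

```c
#include <signal.h>

/*
 * Hypothetical helper, not part of gcore(1): choose the value to pass as
 * the data argument of PT_DETACH, given the signal that stopped the
 * traced process.
 */
int
detach_signal(int stopsig)
{

	/*
	 * Assumption: a SIGSTOP here is the implicit one from PT_ATTACH,
	 * so it is suppressed instead of being delivered on detach.
	 */
	if (stopsig == SIGSTOP)
		return (0);
	/* Any other signal is handed back to the process unchanged. */
	return (stopsig);
}
```

Note that the function cannot tell an attach-generated SIGSTOP from a real
one sent by another process, which is the ambiguity discussed above.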


