Date: Fri, 17 Mar 2017 10:23:33 +0200
From: Konstantin Belousov
To: Steven Hartland
Cc: "K. Macy", "freebsd-hackers@freebsd.org"
Subject: Re: Help needed to identify golang fork / memory corruption issue on FreeBSD
Message-ID: <20170317082333.GP16105@kib.kiev.ua>
References: <27e1a828-5cd9-0755-50ca-d7143e7df117@multiplay.co.uk>
 <20161206125919.GQ54029@kib.kiev.ua>
 <8b502580-4d2d-1e1f-9e05-61d46d5ac3b1@multiplay.co.uk>
 <20161206143532.GR54029@kib.kiev.ua>
 <18b40a69-4460-faf2-c0ce-7491eca92782@multiplay.co.uk>
In-Reply-To: <18b40a69-4460-faf2-c0ce-7491eca92782@multiplay.co.uk>

On Fri, Mar 17, 2017 at 06:30:49AM +0000, Steven Hartland wrote:
> Ok I think I've identified the cause.
>
> If an alternative signal stack is applied to a non-main thread and that
> thread calls execve then the signal stack is not cleared.
>
> This results in all sorts of badness.
>
> Full details, including a small C reproduction case can be found here:
> https://github.com/golang/go/issues/15658#issuecomment-287276856
>
> So looks like its kernel bug. If anyone has an ideas about that before I
> look tomorrow that would be appreciated.

Yes, there is definitely a kernel bug, which should be fixed by the patch
below. Still, what I saw when I looked at the issue does not quite
resemble the potential consequences of this bug.
Using the wrong memory for the signal stack would result either in much
more significant memory corruption, if the alt stack range is mapped and
used for something unrelated, or in the process being killed on signal
delivery, if the range is not mapped. What I saw instead was a systematic
'off by 0x10' in some gc structures.

Anyway, patch for the issue you identified:

diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c
index 29d5dd4b132..9bf3ba66f5c 100644
--- a/sys/kern/kern_sig.c
+++ b/sys/kern/kern_sig.c
@@ -976,7 +976,6 @@ execsigs(struct proc *p)
 	 * and are now ignored by default).
 	 */
 	PROC_LOCK_ASSERT(p, MA_OWNED);
-	td = FIRST_THREAD_IN_PROC(p);
 	ps = p->p_sigacts;
 	mtx_lock(&ps->ps_mtx);
 	while (SIGNOTEMPTY(ps->ps_sigcatch)) {
@@ -1007,6 +1006,8 @@ execsigs(struct proc *p)
 	 * Reset stack state to the user stack.
 	 * Clear set of signals caught on the signal stack.
 	 */
+	td = curthread;
+	MPASS(td->td_proc == p);
 	td->td_sigstk.ss_flags = SS_DISABLE;
 	td->td_sigstk.ss_size = 0;
 	td->td_sigstk.ss_sp = 0;
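
For readers following the patch: the alternate signal stack is per-thread
state (td_sigstk in the kernel), which is why resetting only the first
thread of the process misses a stack installed by another thread. Below is
a minimal userland sketch of that per-thread behaviour. It is not the
reproduction case linked above and it does not call execve; the helper
names and the 64 KiB stack size are made up for illustration, and it
assumes the calling thread has not installed an alternate stack itself.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_SIZE	(64 * 1024)	/* illustrative size only */

/* Return the calling thread's alternate-stack flags, or -1 on error. */
static int
current_altstack_flags(void)
{
	stack_t cur;

	if (sigaltstack(NULL, &cur) != 0)
		return (-1);
	return (cur.ss_flags);
}

/* Thread body: install an alternate stack, then record this thread's flags. */
static void *
install_and_query(void *arg)
{
	int *flags = arg;
	stack_t ss;

	assert(flags != NULL);
	memset(&ss, 0, sizeof(ss));
	ss.ss_size = ALT_STACK_SIZE;
	ss.ss_sp = malloc(ss.ss_size);
	if (ss.ss_sp == NULL || sigaltstack(&ss, NULL) != 0) {
		free(ss.ss_sp);
		*flags = -1;
		return (NULL);
	}
	*flags = current_altstack_flags();

	/* Disable the alternate stack again before releasing its memory. */
	ss.ss_flags = SS_DISABLE;
	(void)sigaltstack(&ss, NULL);
	free(ss.ss_sp);
	return (NULL);
}

/*
 * Return 1 if the worker thread saw an enabled alternate stack while the
 * calling thread's alternate stack stayed disabled, 0 if not, -1 on error.
 */
static int
altstack_is_per_thread(void)
{
	pthread_t worker;
	int worker_flags = -1;
	int caller_flags;

	if (pthread_create(&worker, NULL, install_and_query,
	    &worker_flags) != 0)
		return (-1);
	if (pthread_join(worker, NULL) != 0)
		return (-1);
	caller_flags = current_altstack_flags();
	if (worker_flags == -1 || caller_flags == -1)
		return (-1);
	return ((worker_flags & SS_DISABLE) == 0 &&
	    (caller_flags & SS_DISABLE) != 0);
}
```

Calling altstack_is_per_thread() from main() of a program linked with the
threading library should return 1 under the assumption stated above: the
stack installed by the worker is never visible through the calling
thread's sigaltstack() state, which is the userland counterpart of using
curthread rather than FIRST_THREAD_IN_PROC(p) in execsigs().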