From owner-freebsd-arm@freebsd.org Fri Jan 4 06:56:44 2019
From: Michal Meloun
Reply-To: mmel@freebsd.org
Subject: Re: A reliable port cross-build failure (hangup) in my context (amd64->armv7 cross build, with native-tool speedup involved)
To: Dennis Clarke, freebsd-arm@freebsd.org, FreeBSD Current, Mark Millard
Date: Fri, 4 Jan 2019 07:56:42 +0100

On 29.12.2018 18:47, Dennis Clarke wrote:
> On 12/28/18 9:56 PM, Mark Millard via freebsd-arm wrote:
>>
>> On 2018-Dec-28, at 12:12, Mark Millard wrote:
>>
>>> On 2018-Dec-28, at 05:13, Michal Meloun
>>> wrote:
>>>
>>>> Mark,
>>>> this is known problem with qemu-user-static.
>>>> Emulation of every single interruptible syscall is broken by design (it
>>>> have signal related races). Theses races cannot be solved without major
>>>> rewrite of syscall emulation code.
>>>> Unfortunately, nobody actively works on this, I think.
>>>>
>
> Following along here quietly and I had to blink at this a few times.
> Is there a bug report somewhere within the qemu world related to this
> 'broken by design' qemu feature?

First, I apologize for the late answer. Writing a technically accurate
but still comprehensible report is extremely difficult for me.

The major design issue with qemu-user is that guest (blocking /
interruptible) syscalls must be emulated atomically, including the
delivery of asynchronous signals (including signals originating from
other threads). A user-mode program cannot emulate this precisely
without specific kernel support.

Let me explain this in a little more detail.
Assume that we have the following trivial code:

    void sig_alarm_handler(…)
    {
        if (!done) {
            do some work;
            alarm(10);
        }
    }

    void foo(void)
    {
        install_signal_handler(SIGALRM, sig_alarm_handler);
        alarm(10);
        do some work;
        while (true) {
            rv = select(…, NULL);
            if (rv > 0)
                do some work;
            else if (rv == -1 && errno != EINTR)
                report error and exit;
        }
    }

In a native environment, this code works well. It calls the alarm signal
handler every 10 s, regardless of whether the signal fires in the
program code, in the libc implementation of select(), or while the
program is waiting in the kernel part of the select() syscall.

In the qemu-user environment, things get significantly harder. Qemu can
deliver signals to the guest only on an instruction boundary, because
the guest signal handler must see the emulated CPU context in a
consistent state. But the kernel can deliver a signal to qemu at any
time. Because of this, qemu must store delivered signals in a queue and
emit them later, when the emulator steps over the next instruction
boundary.

Assume that qemu is just emulating the 'syscall' instruction from the
guest select() call. Also assume that no other signals (except SIGALRM)
are generated, and that the socket used in select() never receives or
transmits any data.
The first version of the qemu-user code emulating select() was:

    abi_long do_freebsd_select(..)
    {
        convert input guest arguments to host;
        rv = select(…);
        convert output host arguments to guest;
        return(rv);
    }

But this is very racy. If the alarm signal fires before select(…)
enters the kernel, qemu queues it (but does not deliver it to the guest,
because it isn't on an instruction boundary) and continues the
emulation.
And because (in our case) select() waits indefinitely, the alarm signal
is never delivered to the guest and the whole program hangs.

The actual qemu code emulating select() looks like:

    abi_long do_freebsd_select(..)
    {
        convert input guest arguments to host;

        sigfillset(&mask);
        sigprocmask(SIG_BLOCK, &mask, &omask);
        if (ts->signal_pending) {
            sigprocmask(SIG_SETMASK, &omask, NULL);
            /* We have a signal pending so just poll select() and return. */
            tv2.tv_sec = tv2.tv_usec = 0;
            ret = select(…, &tv2);
            if (ret == 0)
                ret = TARGET_EINTR;
        } else {
            ret = pselect(…, &omask);
            sigprocmask(SIG_SETMASK, &omask, NULL);
        }
        convert output host arguments to guest;
        return(ret);
    }

This looks much better. The code blocks all signals first, then checks
whether any signal is pending. If so, it does a non-blocking select()
(because the timeout is zero) and correctly returns EINTR immediately.
Otherwise, it uses another variant of select(), pselect(), which
adjusts the signal mask itself. That means the syscall is called with
signal delivery blocked, but the kernel installs the right sigmask
before it waits for an event.

This looks like a perfect solution that closes all races from the first
version, but it isn't. pselect() has different semantics than select():
it doesn't update the timeout argument. So this solution is also
inappropriate. Moreover, I think we don't have p-equivalents for all
blocking syscalls.

Mark, I hope that this also answers your question posted to hackers@
and explains why you see the hang.

Linux (the linux-user part of qemu) uses a different approach to
overcome this issue, safe_syscall ->
https://gitlab.collabora.com/tomeu/qemu/commit/4d330cee37a21aabfc619a1948953559e66951a4
It looks like a workable workaround, but I'm not sure about ERESTART
versus EINTR return values. Imho, this can be a problem.

I have a list of other qemu-user problems (I mean mainly the bsd-user
part of the qemu code here), not counting normal coding bugs:
- the code is not thread safe but is used in a threaded environment
  (rw locks, for example),
- emulating some sysctls and resource limits / usage behavior is very
  hard (mainly if we emulate a 32-bit guest on a 64-bit host),
- if a host syscall returns ERESTART, we should do a full unroll and
  pass it to the guest,
- the syscall emulation should not use the libc functions, but the
  syscall instruction directly. Libc shims can have side effects, so we
  should not execute them twice: once in the guest, a second time in
  the host,
- and the last major one. At this time, all guest structures are
  maintained by hand. Due to the huge number of these structures, this
  is an extremely error-prone approach. We should convert this to
  script-generated code, including the guest syscall definitions.

Again, my apologies for the slightly (or very) chaotic report, but this
is the best I am capable of.

Michal