From owner-freebsd-arch@FreeBSD.ORG Sun Dec 29 08:14:44 2013 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8D79E563 for ; Sun, 29 Dec 2013 08:14:44 +0000 (UTC) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id 39124126A for ; Sun, 29 Dec 2013 08:14:43 +0000 (UTC) Received: from c122-106-144-87.carlnfd1.nsw.optusnet.com.au (c122-106-144-87.carlnfd1.nsw.optusnet.com.au [122.106.144.87]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 38527421D3D; Sun, 29 Dec 2013 19:14:34 +1100 (EST) Date: Sun, 29 Dec 2013 19:14:32 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Warner Losh Subject: Re: UART grab patch In-Reply-To: <79C659B9-F402-42B6-8240-C4AB0A0EB92B@bsdimp.com> Message-ID: <20131229173041.E979@besplex.bde.org> References: <79C659B9-F402-42B6-8240-C4AB0A0EB92B@bsdimp.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=bpB1Wiqi c=1 sm=1 tr=0 a=p/w0leo876FR0WNmYI1KeA==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=fyKI3GbUD0oA:10 a=6I5d2MoRAAAA:8 a=IGeycOlmXiouukMLaQEA:9 a=CjuIK1q_8ugA:10 Cc: "freebsd-arch@freebsd.org Arch" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Dec 2013 08:14:44 -0000 On Sat, 28 Dec 2013, Warner Losh wrote: > OK. Looks like my initial assessment of mountroot was too optimistic. With the exception of the imx driver, all the uart drivers in the tree fell victim to this bug. I hit a second one with the Raspberry Pi, and went looking. Please use the Unix newline character in mail. > I've uploaded a patch to http://people.freebsd.org/~imp/uart-grab.diff that should solve the problem. I don't have all these systems to test on, so I'm hoping people can report back to me what works and what doesn't. Boot -a with and without the patch for all serial consoles will tell you if it is working. Type stuff. If it appears exactly as you type it, then the patch is working. If not, please let me know. > > The basic strategy for all these is to disable RX interrupts when the console is grabbed, and enable them when it is ungrabbed. This has to be done in the hardware, since masking the interrupt at the CPU or PIC level will cause many UARTs to misbehave and/or prevent other interrupts from happening that are shared which can cause problems on some platforms. This is very broken on xx50 (3 copies which differ only in style). There it doesn't actually disable interrupts using the interrupt enable register, but hacks on the fifo control register to try to reduce the rx fifo trigger level to 1. It has to hard-code the latter register since it is not supposed to be available at this level. You previously pointed out that the bug is smaller when the rx fifo is large, so this change makes the problem worse when its attempt succeeds. It is not so bad when either the fifo doesn't exist (then it has no effect) or when the trigger level doesn't get reduced all the way to 1 (this happens in some non-default xx50 modes). This is broken or fragile on most or all device types. It "restores" to a hard-coded state instead of to the previous state. Maybe uart fixates the state of the console device sufficently for the final state to be constant, but I doubt this. Indeed, just for xx50, buggy hardware needs special handling for tx interrupts, and uart has this. It sets the tx interrupt enable bit while transmitting and keeps it clear it otherwise. Thus the initial state of the interrupt enable register varies and the state restored should vary to match. The correct state to restore for this register can possibly be determined by looking at the global state of the driver, but this would be complicated (it needs transient locking of software state all over; only hardware state is locked). In general, hardware state should be saved in *sc by grab and restored by ungrab. Locking only this part of *sc across grab/ungrab is still difficult. Parts of the driver that might invalidate the saved state must be locked out. The patch also "sets" the state to a hard-coded state instead of modifying the current state. If this is done right, then it is correct since it amounts to an initialization of the driver to console mode. But the parts of the patch that set the correct (interrupt control) register enable all the non-tx interrupts even if none were enabled before (these are later "restored"). This asks for interrupts that might cause problems. On xx50, this has a chance of being harmless only due to the implementation detail that the hardware has an interrupt identification register and the software uses this register so as to not see hardware state changes unless their interrupt has happened. As I mentioned previously, grab/ungrab were supposed to handle both directions of i/o, but are currently only called from higher levels for input. Forcing rx interrupts off and tx interrupts on in grab is very input-centric. The locking is certainly incomplete. I don't remember even locking against recursion in earlier changes in uart grab/ungrab. (Note that there is no anti-recursion code at the top level, except for some deadlocking in cnputs()). syscons grab has the buggy locking: if (scp->grabbed++ > 0) return; Bugs in this start with the access being non-atomic. With atomic accesses, locking like this can work, but you might have to sprinkle too many atomic tests of the variable, using code like: uart_lock(...); if (scp->grabbed == 0) normal_operation(); else operation_reduced_to_avoid_confict_with_grabbed_mode(); uart_unlock(...); I changed this to use a mutex because atomic accesses are too hard to use. They would have to be spammed through normal_operation(). The lock must be held throughout. The !normal_operation() case may still be complicated. E.g., suppose you have a buggy grab routine that somehow allows an interrupt. Normal operation is to service the interrupt so that it doesn't repeat. Abnormal operation can't be to simply return from the interrupt handler, since then the interrupt might repeat endlessly or never occur again. Here it would be better to do the normal operation and break the console operation. Make this a feature and let invalid (but not very harmful) states occur occasionally and fix up the invalid states using polling. Bruce From owner-freebsd-arch@FreeBSD.ORG Sun Dec 29 14:14:15 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2773939C; Sun, 29 Dec 2013 14:14:15 +0000 (UTC) Received: from mta05.bitpro.no (mta05.bitpro.no [92.42.64.202]) by mx1.freebsd.org (Postfix) with ESMTP id CEC531713; Sun, 29 Dec 2013 14:14:14 +0000 (UTC) Received: from mail.lockless.no (mail.lockless.no [46.29.221.38]) by mta05.bitpro.no (Postfix) with ESMTPS id 392B017FC60; Sun, 29 Dec 2013 15:14:01 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by mail.lockless.no (Postfix) with ESMTP id A29CE8FAF7E; Sun, 29 Dec 2013 15:14:45 +0100 (CET) X-Virus-Scanned: by amavisd-new-2.6.4 (20090625) (Debian) at lockless.no Received: from mail.lockless.no ([127.0.0.1]) by localhost (mail.lockless.no [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id nva5j9jeI3jp; Sun, 29 Dec 2013 15:14:44 +0100 (CET) Received: from laptop015.home.selasky.org (cm-176.74.213.204.customer.telag.net [176.74.213.204]) by mail.lockless.no (Postfix) with ESMTPSA id 8A6798F77C5; Sun, 29 Dec 2013 15:14:44 +0100 (CET) Message-ID: <52C02E6C.2080704@bitfrost.no> Date: Sun, 29 Dec 2013 15:15:08 +0100 From: Hans Petter Selasky Organization: Bitfrost A/S User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0 MIME-Version: 1.0 To: freebsd-arch@freebsd.org Subject: [RFC] Should CUSE4BSD move to the FreeBSD source tree? Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Ed Maste X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Dec 2013 14:14:15 -0000 Hi, The CUSE4BSD, character device in userspace, library and kernel module, has become an extremely popular driver choice for supporting a whole lot of USB devices under FreeBSD. There are now many clients using CUSE4BSD, cx88, uhidd, webcamd, and someone even mailed me a python wrapper for CUSE4BSD, so actually you can now implement a character device driver in a high level scripting language! When upgrading the kernel and stuff like that, it would be more easy for people if CUSE4BSD was part of the kernel. Of course this puts more restrictions on the API, and CUSE4BSD sometime has API additions. Also that might mean that CUSE4BSD will go through some API changes, as input from other FreeBSD developers. One change might be to change lengths from "int" to "ssize_t" although, it is very unlikely to read more than a few megabytes at a time from a character device. Also I wonder if putting CUSE4BSD in src, means that it is then also accepted as new *BSD standard? Some other questions are: Where should CUSE4BSD reside in svn? Should it have it's own vendor branch or simply just @ head + MFC to xxx-stable? Input and comments are appreciated! --HPS Reference: http://www.selasky.org/hans_petter/cuse4bsd /usr/ports/multimedia/cuse4bsd-kmod +++ Happy new year to all of you +++ From owner-freebsd-arch@FreeBSD.ORG Sun Dec 29 15:38:21 2013 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D39DE315 for ; Sun, 29 Dec 2013 15:38:21 +0000 (UTC) Received: from vps.hungerhost.com (vps.hungerhost.com [216.38.53.176]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A04941D29 for ; Sun, 29 Dec 2013 15:38:21 +0000 (UTC) Received: from 65-123-255-137.dia.static.qwest.net ([65.123.255.137]:58028 helo=[172.26.21.190]) by vps.hungerhost.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.80.1) (envelope-from ) id 1VxIRX-0008HP-D1; Sun, 29 Dec 2013 10:38:19 -0500 Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: IPSEC From: George Neville-Neil In-Reply-To: Date: Sun, 29 Dec 2013 07:38:16 -0800 Content-Transfer-Encoding: quoted-printable Message-Id: <15DFE76D-40B7-4F56-82EC-26EB9F1D9824@neville-neil.com> References: <523457A1.3090606@debian.org> To: Eitan Adler X-Mailer: Apple Mail (2.1827) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - vps.hungerhost.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - neville-neil.com X-Get-Message-Sender-Via: vps.hungerhost.com: authenticated_id: gnn@neville-neil.com Cc: =?windows-1252?Q?Olivier_Cochard-Labb=E9?= , "freebsd-arch@freebsd.org" , Robert Millan , "debian-bsd@lists.debian.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Dec 2013 15:38:22 -0000 On Dec 14, 2013, at 11:28 , Eitan Adler wrote: > Hi arch@, >=20 > The question below has been unanswered since Sat, Sep 14, 2013. >=20 > Are there any known concerns with enabling IPSEC? Is there any reason > to not do so in GENERIC? >=20 Certainly there is always a risk of reduced stability when you mix more = code into the system. I do not know, off hand, of any bugs that would prevent us from = turning this on in GENERIC. It would be nice to know what kind of user/customer = demand you=92re seeing so we could evaluate whether or not we should turn IPSec = on by default in GENERIC in the base FreeBSD. Best, George > On Sun, Dec 8, 2013 at 2:02 PM, Olivier Cochard-Labb=E9 > wrote: >> On Sun, Dec 8, 2013 at 12:16 AM, Eitan Adler = wrote: >>> Hi all, >>>=20 >>> I understand this is an old thread but I do not see an answer here. >>> Can anyone answer the question below? >>>=20 >>> On Sat, Sep 14, 2013 at 8:33 AM, Robert Millan = wrote: >>>>=20 >>>> Hi! >>>>=20 >>>> Is there any particular reason (performance, stability concerns...) >>>> IPSEC support is not enabled in GENERIC? >>>>=20 >>>> In Debian GNU/kFreeBSD we're considering enabling it in our default >>>> builds, due to increased user demand and as it is already enabled = for >>>> our Linux-based flavours. >>>>=20 >>>> However we're concerned about diverging from FreeBSD as there might = be >>>> unforeseen consequences. Is there any specific concern on your = side? >>>>=20 >>>> If not, perhaps it could be considered for HEAD after 10.0 release? >>>=20 >>>=20 >>=20 >> Here are my own bench result regarding forwarding speed = (paquet-per-second) >> with a kernel compiled without-ipsec and with ipsec (ipsec is not = enabled >> during the tests, just present on the kernel) of FreeBSD = 10.0-PRERELEASE: >>=20 >> ministat -s without-ipsec ipsec >> x without-ipsec >> + ipsec >> = +-------------------------------------------------------------------------= -------+ >> |x + x + +x x x + >> +| >> | |__________________A_____M____________| >> | >> | = |_______________M_________A__________________________| >> | >> = +-------------------------------------------------------------------------= -------+ >> N Min Max Median Avg = Stddev >> x 5 1646075 1764528 1725461 1713080 = 44560.059 >> + 5 1685034 1833206 1724461 1748666.8 = 62356.218 >> No difference proven at 95.0% confidence >>=20 >> I didn't see negative impact of enabling ipsec (it's even a little = bit >> better with it). >>=20 >> Regards, >>=20 >> Olivier >=20 >=20 >=20 > --=20 > Eitan Adler > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to = "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Sun Dec 29 18:19:36 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 866755E6 for ; Sun, 29 Dec 2013 18:19:36 +0000 (UTC) Received: from mail-ie0-f172.google.com (mail-ie0-f172.google.com [209.85.223.172]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4D18C1771 for ; Sun, 29 Dec 2013 18:19:35 +0000 (UTC) Received: by mail-ie0-f172.google.com with SMTP id qd12so11373479ieb.17 for ; Sun, 29 Dec 2013 10:19:29 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:subject:mime-version:content-type:from :in-reply-to:date:cc:content-transfer-encoding:message-id:references :to; bh=wCoyTE4VUKE84HMLkyeMvkGvdyqzrexz4Jh6DowUBUA=; b=bK/Z3H4Z7bIBa46qOldpKI/2sFoN5rTtb/dEW1MQ59amykmafSFPFzPznZCvM2vloo SNkUTHrUMXqF+hShafUxo4Blxdg2ZSA1f3VAAEJF6jDzANEW403AskPRERHbAvNY1rjU GtLiE78BhkwRCPDX8S/w2zEdDciZFJ11Ai4tIqiiYploVgjlie8ZCo2im8Xvp9PnWPx5 4TcC+4WbbCseGqXb0TbuihUgx5g/L6KLhUiYy8+ah5UeaXfRuK3znJ3UBqbll717UySf MUXz/TdgCxJ31O5/5v/+9ohEAmbCN2+EzjL0s02Rf3PBziYOBNYoFtcpKwrhLiRtF9ZF 1meg== X-Gm-Message-State: ALoCoQkm8Pkw+I5KIpHDf3VPcINkGjFrT010wD4FkiUUAHUuQsmI8XONKlDDjBKOGpX1KJRWKO+m X-Received: by 10.50.43.170 with SMTP id x10mr52346779igl.16.1388341169564; Sun, 29 Dec 2013 10:19:29 -0800 (PST) Received: from fusion-mac.bsdimp.com (50-78-194-198-static.hfc.comcastbusiness.net. [50.78.194.198]) by mx.google.com with ESMTPSA id lp9sm55597855igb.2.2013.12.29.10.19.26 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sun, 29 Dec 2013 10:19:28 -0800 (PST) Sender: Warner Losh Subject: Re: UART grab patch Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii From: Warner Losh In-Reply-To: <20131229173041.E979@besplex.bde.org> Date: Sun, 29 Dec 2013 11:19:25 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <7C0A2FDD-7531-4C9F-B89B-004682E05F80@bsdimp.com> References: <79C659B9-F402-42B6-8240-C4AB0A0EB92B@bsdimp.com> <20131229173041.E979@besplex.bde.org> To: Bruce Evans X-Mailer: Apple Mail (2.1085) Cc: "freebsd-arch@freebsd.org Arch" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Dec 2013 18:19:36 -0000 On Dec 29, 2013, at 1:14 AM, Bruce Evans wrote: > On Sat, 28 Dec 2013, Warner Losh wrote: >=20 >> OK. Looks like my initial assessment of mountroot was too optimistic. = With the exception of the imx driver, all the uart drivers in the tree = fell victim to this bug. I hit a second one with the Raspberry Pi, and = went looking. >=20 > Please use the Unix newline character in mail. Apple's Mail.app hates me. I don't see how to change it. >> I've uploaded a patch to = http://people.freebsd.org/~imp/uart-grab.diff that should solve the = problem. I don't have all these systems to test on, so I'm hoping people = can report back to me what works and what doesn't. Boot -a with and = without the patch for all serial consoles will tell you if it is = working. Type stuff. If it appears exactly as you type it, then the = patch is working. If not, please let me know. >>=20 >> The basic strategy for all these is to disable RX interrupts when the = console is grabbed, and enable them when it is ungrabbed. This has to be = done in the hardware, since masking the interrupt at the CPU or PIC = level will cause many UARTs to misbehave and/or prevent other interrupts = from happening that are shared which can cause problems on some = platforms. >=20 > This is very broken on xx50 (3 copies which differ only in style). Yea, I think we need to remerge them. > There it doesn't actually disable interrupts using the interrupt = enable > register, but hacks on the fifo control register to try to reduce the > rx fifo trigger level to 1. It has to hard-code the latter register > since it is not supposed to be available at this level. You = previously > pointed out that the bug is smaller when the rx fifo is large, so this > change makes the problem worse when its attempt succeeds. It is not > so bad when either the fifo doesn't exist (then it has no effect) or > when the trigger level doesn't get reduced all the way to 1 (this > happens in some non-default xx50 modes). Doh! That's what I get for trying to do this without properly consulting the data sheets. > This is broken or fragile on most or all device types. It "restores" = to a > hard-coded state instead of to the previous state. Maybe uart fixates > the state of the console device sufficently for the final state to be > constant, but I doubt this. =20 For many of the UARTs, the state is fixed in attached, so blasting out a fixed state is sufficient. > Indeed, just for xx50, buggy hardware needs > special handling for tx interrupts, and uart has this. It sets the tx > interrupt enable bit while transmitting and keeps it clear it = otherwise. > Thus the initial state of the interrupt enable register varies and the > state restored should vary to match. The correct state to restore > for this register can possibly be determined by looking at the global > state of the driver, but this would be complicated (it needs transient > locking of software state all over; only hardware state is locked). Yes. That's a flaw in the current implementation. We don't have the software state passed around, and we should. > In general, hardware state should be saved in *sc by grab and restored > by ungrab. Locking only this part of *sc across grab/ungrab is still > difficult. Parts of the driver that might invalidate the saved state > must be locked out. Yes. "sc" isn't saved in uart_devinfo for the console, so we can't do = this today. I'll work things up so that we can do that... > The patch also "sets" the state to a hard-coded state instead of = modifying > the current state. If this is done right, then it is correct since it > amounts to an initialization of the driver to console mode. But the > parts of the patch that set the correct (interrupt control) register > enable all the non-tx interrupts even if none were enabled before = (these > are later "restored"). This asks for interrupts that might cause = problems. > On xx50, this has a chance of being harmless only due to the = implementation > detail that the hardware has an interrupt identification register and = the > software uses this register so as to not see hardware state changes = unless > their interrupt has happened. Good point... I'd forgotten just how picky the nsXX50 parts are. > As I mentioned previously, grab/ungrab were supposed to handle both > directions of i/o, but are currently only called from higher levels > for input. Forcing rx interrupts off and tx interrupts on in grab > is very input-centric. For now, grab/ungrab is rx centric. Fixing that is beyond the scope of what I'm trying to achieve. > The locking is certainly incomplete. I don't remember even locking > against recursion in earlier changes in uart grab/ungrab. (Note that > there is no anti-recursion code at the top level, except for some > deadlocking in cnputs()). The proper fix to this is at the top level. I'll look into this as a = separate issue. > syscons grab has the buggy locking: >=20 > if (scp->grabbed++ > 0) > return; >=20 > Bugs in this start with the access being non-atomic. With atomic > accesses, locking like this can work, but you might have to sprinkle > too many atomic tests of the variable, using code like: >=20 > uart_lock(...); > if (scp->grabbed =3D=3D 0) > normal_operation(); > else > operation_reduced_to_avoid_confict_with_grabbed_mode(); > uart_unlock(...); Concurrent access is verboten, so I'm not sure how useful this would = be... I'll have to think about this.. > I changed this to use a mutex because atomic accesses are too hard to > use. They would have to be spammed through normal_operation(). The > lock must be held throughout. The !normal_operation() case may still > be complicated. E.g., suppose you have a buggy grab routine that > somehow allows an interrupt. Normal operation is to service the > interrupt so that it doesn't repeat. Abnormal operation can't be to > simply return from the interrupt handler, since then the interrupt > might repeat endlessly or never occur again. Here it would be better > to do the normal operation and break the console operation. Make this > a feature and let invalid (but not very harmful) states occur > occasionally and fix up the invalid states using polling. True. This would make it more robust. However, given the extent of grabs today, I'm not sure that this is worth fixing. I'll have to think about = it. Thanks for the review. Warner= From owner-freebsd-arch@FreeBSD.ORG Sun Dec 29 22:27:21 2013 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7DB92553 for ; Sun, 29 Dec 2013 22:27:21 +0000 (UTC) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id 264F61DB7 for ; Sun, 29 Dec 2013 22:27:20 +0000 (UTC) Received: from c122-106-144-87.carlnfd1.nsw.optusnet.com.au (c122-106-144-87.carlnfd1.nsw.optusnet.com.au [122.106.144.87]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id A19383C21FC; Mon, 30 Dec 2013 08:59:14 +1100 (EST) Date: Mon, 30 Dec 2013 08:59:13 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Warner Losh Subject: Re: UART grab patch In-Reply-To: <7C0A2FDD-7531-4C9F-B89B-004682E05F80@bsdimp.com> Message-ID: <20131230072758.F3485@besplex.bde.org> References: <79C659B9-F402-42B6-8240-C4AB0A0EB92B@bsdimp.com> <20131229173041.E979@besplex.bde.org> <7C0A2FDD-7531-4C9F-B89B-004682E05F80@bsdimp.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=HZAtEE08 c=1 sm=1 tr=0 a=p/w0leo876FR0WNmYI1KeA==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=fyKI3GbUD0oA:10 a=6I5d2MoRAAAA:8 a=bh-WMf8O3G-kWHpSbkIA:9 a=CjuIK1q_8ugA:10 Cc: "freebsd-arch@freebsd.org Arch" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Dec 2013 22:27:21 -0000 On Sun, 29 Dec 2013, Warner Losh wrote: > > On Dec 29, 2013, at 1:14 AM, Bruce Evans wrote: > >> On Sat, 28 Dec 2013, Warner Losh wrote: >> >>> OK. Looks like my initial assessment of mountroot was too optimistic. With the exception of the imx driver, all the uart drivers in the tree fell victim to this bug. I hit a second one with the Raspberry Pi, and went looking. >> >> Please use the Unix newline character in mail. > > Apple's Mail.app hates me. I don't see how to change it. Most mail I get from apple mail clients is misformatted. >>> I've uploaded a patch to http://people.freebsd.org/~imp/uart-grab.diff > ... >> This is broken or fragile on most or all device types. It "restores" to a >> hard-coded state instead of to the previous state. Maybe uart fixates >> the state of the console device sufficently for the final state to be >> constant, but I doubt this. > > For many of the UARTs, the state is fixed in attached, so blasting > out a fixed state is sufficient. Well, that is fragile. What is "in attached"? Normal device attachment is too late for console drivers. The state must be set at console attach time and never changed. This is not easy to arrange. First you have to ensure that normal probe/attach after console attach never changes it. Then you have to ensure that normal higher-level device operations never change it. For uart, you have to do this for many different types of hardware. I forgot to mention a problem with kdb activity again. This _always_ gives an unfixed state. The mutex is ignored in kdb mode, so locking using the mutex alone is insuffient. The unfixed state and races from it occur as follows: someone does non-kdb console i/o that grabs properly grabbing consists of: acquire hwmtx disable device interrupts (note that this unfixes the state) suppose this is interrupted at this point or later by a kdb trap doing console i/o --> this grabs too, but igores the mutex. Grabbing consists of: disable interrupts. This then does the i/o and ungrabs. Ungrabbing consists of: enable interrupts (blindly in your version). --> back to interrupted context Continue doing the i/o with corrupted state. uart has a much more serious bug from ignoring the mutex (old versions that didn't ignore the mutex had deadlock instead). The easiest way to hit this bug on xx50 is when ns8250_param() deselects the data register: caller acquires hwmtx. This locks for everything except kdb ns8250_param() selects the divisor latch registers and deselects the data register and the interrupt enable register. (Note that this is reachable for console devices. Even though uart's fixation breaks the initial state device's control over part of the state and the lock state device's control over the fixation, it only does this for the line speed and a couple of software parameters; ns8250_param() is still reached to set things like the parity, and in fact it deselects the data register, etc. even for null changes.) suppose a kdb trap doing console i/o occurs while the data register is deselected. When this waited for hwmtx, it deadlocked. Now it obviously cannot write any data since the data register is deselected. It actually writes garbage to the low half of the divisor latch. This may or may not be overwritten with the correct value for the divisor latch, depending on the ordering of the writes. Blind grabbing and ungrabbing increase this bug slightly. The critical interrupt enable register is the other one that is deselected. Grabbing writes garbage to the upper half of the divisor latch. Blind ungrabbing writes different garbage. Restoring the previous value would fix up the garbage in this half. Some xx50's have modes in which a larger set of registers are deselected and these modes are especially needed for changing parameters, but uart doesn't support them. >> Indeed, just for xx50, buggy hardware needs >> special handling for tx interrupts, and uart has this. It sets the tx >> interrupt enable bit while transmitting and keeps it clear it otherwise. >> Thus the initial state of the interrupt enable register varies and the >> state restored should vary to match. The correct state to restore >> for this register can possibly be determined by looking at the global >> state of the driver, but this would be complicated (it needs transient >> locking of software state all over; only hardware state is locked). > > Yes. That's a flaw in the current implementation. We don't have the > software state passed around, and we should. Urk. cn_arg is passed around, but it is always &uart_console. This seems to be a placeholder for doing it better someday. >> In general, hardware state should be saved in *sc by grab and restored >> by ungrab. Locking only this part of *sc across grab/ungrab is still >> difficult. Parts of the driver that might invalidate the saved state >> must be locked out. > > Yes. "sc" isn't saved in uart_devinfo for the console, so we can't do this > today. I'll work things up so that we can do that... Maybe the state can be stored in the devinfo. It is good to decouple the console state from the normal softc state as much as possible, but with no hints from the softc this seems harder. >> The patch also "sets" the state to a hard-coded state instead of modifying >> the current state. If this is done right, then it is correct since it >> amounts to an initialization of the driver to console mode. But the >> parts of the patch that set the correct (interrupt control) register >> enable all the non-tx interrupts even if none were enabled before (these >> are later "restored"). This asks for interrupts that might cause problems. >> On xx50, this has a chance of being harmless only due to the implementation >> detail that the hardware has an interrupt identification register and the >> software uses this register so as to not see hardware state changes unless >> their interrupt has happened. > > Good point... I'd forgotten just how picky the nsXX50 parts are. They are fairly orthogonal and thus less picky than some. Only simpler devices like sy6551 are much easier to program. z8530 is much harder. >> The locking is certainly incomplete. I don't remember even locking >> against recursion in earlier changes in uart grab/ungrab. (Note that >> there is no anti-recursion code at the top level, except for some >> deadlocking in cnputs()). > > The proper fix to this is at the top level. I'll look into this as a separate issue. No, the details are too device-specific to handle at the top level. >> ... > Concurrent access is verboten, so I'm not sure how useful this would be... > I'll have to think about this.. > >> I changed this to use a mutex because atomic accesses are too hard to >> use. They would have to be spammed through normal_operation(). The >> lock must be held throughout. The !normal_operation() case may still >> be complicated. E.g., suppose you have a buggy grab routine that >> somehow allows an interrupt. Normal operation is to service the >> interrupt so that it doesn't repeat. Abnormal operation can't be to >> simply return from the interrupt handler, since then the interrupt >> might repeat endlessly or never occur again. Here it would be better >> to do the normal operation and break the console operation. Make this >> a feature and let invalid (but not very harmful) states occur >> occasionally and fix up the invalid states using polling. > > True. This would make it more robust. However, given the extent of grabs > today, I'm not sure that this is worth fixing. I'll have to think about it. Sloppy grabs aren't worth having. You are only using them to work around the multiple consoles breakage. The former is easy to fix quickly by dropping support for multiple consoles. Or don't drop it completely, but hack on it for mountroot and not much else. Bruce From owner-freebsd-arch@FreeBSD.ORG Sun Dec 29 23:23:27 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 701CEF8F; Sun, 29 Dec 2013 23:23:27 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 44A471119; Sun, 29 Dec 2013 23:23:26 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id rBTNNPvq042576 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 29 Dec 2013 15:23:26 -0800 (PST) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id rBTNNPEo042575; Sun, 29 Dec 2013 15:23:25 -0800 (PST) (envelope-from jmg) Date: Sun, 29 Dec 2013 15:23:25 -0800 From: John-Mark Gurney To: Adrian Chadd Subject: Re: review request: sendfile kqueue notification Message-ID: <20131229232325.GG99167@funkthat.com> Mail-Followup-To: Adrian Chadd , "freebsd-arch@freebsd.org" , freebsd-current References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Sun, 29 Dec 2013 15:23:26 -0800 (PST) Cc: freebsd-current , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 29 Dec 2013 23:23:27 -0000 Adrian Chadd wrote this message on Thu, Dec 12, 2013 at 12:41 -0800: > I'd like to start committing this to FreeBSD-HEAD: > > http://people.freebsd.org/~adrian/netflix/20131211-sendfile-kqueue-11.diff > > It implements kqueue notifications for sendfile so users can get an > asynchronous notification that the underlying mbufs have been freed. > This allows userland users of sendfile to know that the underlying > memory / file object can be recycled or overwritten. Right now the > only way to do this is to set SF_SYNC and this causes sendfile() to > sleep until the transaction is complete and the mbufs have been freed. > > I've been testing this out locally in my lab environment and it's > running flawlessly at 30gbit/sec of TCP across 32,768 active > transmitting sockets. > > I'd like to start merging this into -HEAD in small pieces to make it > easier to MFC to -10. I have some questions about this. In filt_sfsync_detach, you have: + if (!knlist_empty(knl)) + knlist_remove(knl, kn, 1); Have you had a panic where you were in _detach but the list was empty? a better check instead of knlist_empty is: if (!(kn->kn_status & KN_DETACHED)) You probably copied code from vfs_aio.c, which should probably also use this construct instead. I've noticed that you're manually locking the knotelist, and I should look at this further, but if we really are going to have external people doing lock/unlock, we should make macros for this instead of hard coding it everywhere. Currently only vfs_aio.c is doing this outside of kern_event.c. In all cases, calls to f_event (filt_sfsync) will have the list locked, so turning that into an assert would be correct. We should probably report sendfile errors via the knote. It shouldn't be hard as the kevent data field can be used for this and setting the EV_ERROR. You shouldn't set data to be the sfs pointer in _setup as that can be leaked to userland, and you never use it (maybe you do in your userland side). You also might want to set sfs to NULL after calling sf_sync_free to make sure you don't acidentally use it after you've [possibly] free'd it. I didn't look too closely at the sf logic but what I did see looks fine. Overall I think the patch looks fine, though there are a few XXX that still should be addressed on your side. P.S. When generating a diff, using svn -x -up is useful to show the function code blocks are in. -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not."