From owner-freebsd-drivers@FreeBSD.ORG Mon Apr 11 02:09:36 2011
From: dieterbsd@engineer.com
To: freebsd-hackers@freebsd.org, freebsd-drivers@freebsd.org
Date: Sun, 10 Apr 2011 21:59:13 -0400
Subject: Need an alternative to DELAY()
Message-Id: <8CDC60330838A41-18FC-694F@web-mmc-m02.sysops.aol.com>
List-Id: Writing device drivers for FreeBSD

FreeBSD 8.2 amd64 uniprocessor

kernel: siisch1: DISCONNECT requested
kernel: siisch1: SIIS reset...
kernel: siisch1: siis_sata_connect() calling DELAY(1000)
last message repeated 59 times
kernel: siisch1: SATA connect time=60ms status=00000123
kernel: siisch1: SIIS reset done: devices=00000001
kernel: siisch1: DISCONNECT requested
kernel: siisch1: SIIS reset...
kernel: siisch1: siis_sata_connect() calling DELAY(1000)
last message repeated 58 times
kernel: siisch1: SATA connect time=59ms status=00000123
...
kernel: siisch0: siis_wait_ready() calling DELAY(1000)
last message repeated 1300 times
kernel: siisch0: port is not ready (timeout 10000ms) status = 001f2000

Meanwhile, *everything* comes to a screeching halt.  Device
drivers are locked out, and thus incoming data is lost.
Losing incoming data is unacceptable.

Need an alternative to DELAY() that does not lock out
other device drivers.  There must be a way to reset one
bit of hardware without locking down the entire machine.

From owner-freebsd-drivers@FreeBSD.ORG Mon Apr 11 07:36:21 2011
From: Hans Petter Selasky
To: freebsd-hackers@freebsd.org
Cc: freebsd-drivers@freebsd.org, dieterbsd@engineer.com
Date: Mon, 11 Apr 2011 09:25:16 +0200
Subject: Re: Need an alternative to DELAY()
Message-Id: <201104110925.16882.hselasky@c2i.net>
In-Reply-To: <8CDC60330838A41-18FC-694F@web-mmc-m02.sysops.aol.com>

On Monday 11 April 2011 03:59:13 dieterbsd@engineer.com wrote:
> FreeBSD 8.2 amd64 uniprocessor
>
> kernel: siisch1: DISCONNECT requested
> kernel: siisch1: SIIS reset...
> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
> last message repeated 59 times
> kernel: siisch1: SATA connect time=60ms status=00000123
> kernel: siisch1: SIIS reset done: devices=00000001
> kernel: siisch1: DISCONNECT requested
> kernel: siisch1: SIIS reset...
> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
> last message repeated 58 times
> kernel: siisch1: SATA connect time=59ms status=00000123
> ...
> kernel: siisch0: siis_wait_ready() calling DELAY(1000)
> last message repeated 1300 times
> kernel: siisch0: port is not ready (timeout 10000ms) status = 001f2000
>
> Meanwhile, *everything* comes to a screeching halt.  Device
> drivers are locked out, and thus incoming data is lost.
> Losing incoming data is unacceptable.
>
> Need an alternative to DELAY() that does not lock out
> other device drivers.  There must be a way to reset one
> bit of hardware without locking down the entire machine.

Hi,

An alternative to DELAY() is the simplest solution. You probably need
to do some redesign in the SCSI layer to find a better solution.

--HPS

From owner-freebsd-drivers@FreeBSD.ORG Mon Apr 11 19:43:30 2011
From: dieterbsd@engineer.com
To: freebsd-hackers@freebsd.org, freebsd-drivers@freebsd.org
Date: Mon, 11 Apr 2011 15:43:00 -0400
Subject: Re: Need an alternative to DELAY()
Message-Id: <8CDC697CCCE3652-124C-1B2@web-mmc-m04.sysops.aol.com>

>> FreeBSD 8.2 amd64 uniprocessor
>>
>> kernel: siisch1: DISCONNECT requested
>> kernel: siisch1: SIIS reset...
>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>> last message repeated 59 times
>> kernel: siisch1: SATA connect time=60ms status=00000123
>> kernel: siisch1: SIIS reset done: devices=00000001
>> kernel: siisch1: DISCONNECT requested
>> kernel: siisch1: SIIS reset...
>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>> last message repeated 58 times
>> kernel: siisch1: SATA connect time=59ms status=00000123
>> ...
>> kernel: siisch0: siis_wait_ready() calling DELAY(1000)
>> last message repeated 1300 times
>> kernel: siisch0: port is not ready (timeout 10000ms) status = 001f2000
>>
>> Meanwhile, *everything* comes to a screeching halt.  Device
>> drivers are locked out, and thus incoming data is lost.
>> Losing incoming data is unacceptable.
>>
>> Need an alternative to DELAY() that does not lock out
>> other device drivers.  There must be a way to reset one
>> bit of hardware without locking down the entire machine.

Hans Petter Selasky writes:
> An alternative to DELAY() is the simplest solution. You probably need
> to do some redesign in the SCSI layer to find a better solution.

I keep coming back to the idea that a device driver for one
controller should not have to lock out *all* the hardware.
RS-232 locks out Ethernet.  Disk drivers lock out Ethernet.
And so on.  Why?  Is there some fundamental reason that this
*has* to be?  I thought the conversion from spl() to mutex()
was supposed to fix this?

I'm making progress on my project converting printf(9) calls
to log(9), and fixing some bugs along the way.  Eventually I'll
have patches to submit.  But this is really a workaround, not
a fix to the underlying problem.

Redesigning the SCSI layer sounds like a job for someone who took
a lot more CS classes than I did.
/dev/brain returns ENOCLUE.  :-(

From owner-freebsd-drivers@FreeBSD.ORG Mon Apr 11 23:27:38 2011
From: Warner Losh
To: dieterbsd@engineer.com
Cc: freebsd-hackers@freebsd.org, freebsd-drivers@freebsd.org
Date: Mon, 11 Apr 2011 17:22:34 -0600
Subject: Re: Need an alternative to DELAY()
Message-Id: <2BD9089E-874C-41BB-80B1-25B0DDE489C4@bsdimp.com>
In-Reply-To: <8CDC697CCCE3652-124C-1B2@web-mmc-m04.sysops.aol.com>

I don't suppose that your driver could cause the hardware to interrupt
after a little time?  That would be more resource friendly...
Otherwise, 1ms is long enough that a msleep or tsleep would likely work
quite nicely.

Warner

On Apr 11, 2011, at 1:43 PM, dieterbsd@engineer.com wrote:

>>> FreeBSD 8.2 amd64 uniprocessor
>>>
>>> kernel: siisch1: DISCONNECT requested
>>> kernel: siisch1: SIIS reset...
>>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>>> last message repeated 59 times
>>> kernel: siisch1: SATA connect time=60ms status=00000123
>>> kernel: siisch1: SIIS reset done: devices=00000001
>>> kernel: siisch1: DISCONNECT requested
>>> kernel: siisch1: SIIS reset...
>>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>>> last message repeated 58 times
>>> kernel: siisch1: SATA connect time=59ms status=00000123
>>> ...
>>> kernel: siisch0: siis_wait_ready() calling DELAY(1000)
>>> last message repeated 1300 times
>>> kernel: siisch0: port is not ready (timeout 10000ms) status = 001f2000
>>>
>>> Meanwhile, *everything* comes to a screeching halt.  Device
>>> drivers are locked out, and thus incoming data is lost.
>>> Losing incoming data is unacceptable.
>>>
>>> Need an alternative to DELAY() that does not lock out
>>> other device drivers.  There must be a way to reset one
>>> bit of hardware without locking down the entire machine.
>
> Hans Petter Selasky writes:
>> An alternative to DELAY() is the simplest solution. You probably need
>> to do some redesign in the SCSI layer to find a better solution.
>
> I keep coming back to the idea that a device driver for one
> controller should not have to lock out *all* the hardware.
> RS-232 locks out Ethernet.  Disk drivers lock out Ethernet.
> And so on.  Why?  Is there some fundamental reason that this
> *has* to be?  I thought the conversion from spl() to mutex()
> was supposed to fix this?
>
> I'm making progress on my project converting printf(9) calls
> to log(9), and fixing some bugs along the way.  Eventually I'll
> have patches to submit.  But this is really a workaround, not
> a fix to the underlying problem.
>
> Redesigning the SCSI layer sounds like a job for someone who took
> a lot more CS classes than I did.  /dev/brain returns ENOCLUE.  :-(

From owner-freebsd-drivers@FreeBSD.ORG Tue Apr 12 13:43:45 2011
From: Bret Ketchum
To: freebsd-drivers@freebsd.org
Date: Tue, 12 Apr 2011 08:16:16 -0500
Content-Type:
text/plain; charset=ISO-8859-1
Subject: MSI interrupts.

    I've a roll-your-own driver for FreeBSD 8.x that uses MSI interrupts for
PCI-E HBAs where one or more will be installed in an off-the-shelf amd64
pizza box. The driver is using bus_setup_intr() and depending upon the slots
the HBAs are installed in I see log messages from apic_alloc_vectors(), for
example:

Apr 12 06:44:15 mfsbsd kernel: xxxpci10: attempting to allocate 1 MSI vectors (16 supported)
Apr 12 06:44:15 mfsbsd kernel: APIC: Couldn't find APIC vectors for 1 IRQs
Apr 12 06:44:15 mfsbsd kernel: ioapic1: routing intpin 13 (PCI IRQ 37) to lapic 0 vector 59

    Using vmstat -ia:

interrupt                          total       rate
irq37: xxxpci10                       74          0

    The problem appears to be that HBA interrupts are not being delivered to
the driver. If I swap cards around in slots I can eliminate the message and:

Apr 12 06:44:15 mfsbsd kernel: msi: routing MSI IRQ 266 to local APIC 0 vector 80

    And interrupts appear to be delivered properly. Before I dive in, can
anyone explain this behavior?

    Thanks in advance.

    Dr.

From owner-freebsd-drivers@FreeBSD.ORG Tue Apr 12 14:21:22 2011
From: Brandon Gooch
To: Bret Ketchum
Cc: freebsd-drivers@freebsd.org
Date: Tue, 12 Apr 2011 08:59:31 -0500
Subject: Re: MSI interrupts.

On Tue, Apr 12, 2011 at 8:16 AM, Bret Ketchum wrote:
>    I've a roll-your-own driver for FreeBSD 8.x that uses MSI interrupts for
> PCI-E HBAs where one or more will be installed in an off-the-shelf amd64
> pizza box. The driver is using bus_setup_intr() and depending upon the slots
> the HBAs are installed in I see log messages from apic_alloc_vectors(), for
> example:
>
> Apr 12 06:44:15 mfsbsd kernel: xxxpci10: attempting to allocate 1 MSI vectors (16 supported)
> Apr 12 06:44:15 mfsbsd kernel: APIC: Couldn't find APIC vectors for 1 IRQs
> Apr 12 06:44:15 mfsbsd kernel: ioapic1: routing intpin 13 (PCI IRQ 37) to lapic 0 vector 59
>
>    Using vmstat -ia:
>
> interrupt                          total       rate
> irq37: xxxpci10                       74          0
>
>    The problem appears to be that HBA interrupts are not being delivered to
> the driver. If I swap cards around in slots I can eliminate the message and:
>
> Apr 12 06:44:15 mfsbsd kernel: msi: routing MSI IRQ 266 to local APIC 0 vector 80
>
>    And interrupts appear to be delivered properly. Before I dive in, can
> anyone explain this behavior?
>
>    Thanks in advance.
>
>    Dr.

Can you provide output from a verbose boot of the system (perhaps one
for each variation of card installation in the slots)?

That may shed some light for the developers...

-Brandon

From owner-freebsd-drivers@FreeBSD.ORG Thu Apr 14 18:15:23 2011
From: dieterbsd@engineer.com
To: mav@freebsd.org
Cc: freebsd-hackers@freebsd.org, freebsd-drivers@freebsd.org
Date: Thu, 14 Apr 2011 14:14:49 -0400
Subject: (no subject)
Message-Id: <8CDC8E6FA136231-29B0-2128@web-mmc-m04.sysops.aol.com>

[ Email attempt #3 and counting... ]

Alexander Motin wrote:
>> Warner Losh wrote:
>>> I don't suppose that your driver could cause the hardware to interrupt
>>> after a little time?  That would be more resource friendly...
>>> Otherwise, 1ms is long enough that a msleep or tsleep would likely
>>> work quite nicely.
>>
>> It's not his driver, it's mine. Actually, unlike AHCI, this hardware
>> even has interrupt for ready transition (second, biggest of sleeps). But
>> it is not used in present situation.
>>
>>> On Apr 11, 2011, at 1:43 PM, dieterbsd@engineer.com wrote:
>>>>>> FreeBSD 8.2 amd64 uniprocessor
>>>>>>
>>>>>> kernel: siisch1: DISCONNECT requested
>>>>>> kernel: siisch1: SIIS reset...
>>>>>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>>>>>> last message repeated 59 times
>>>>>> kernel: siisch1: SATA connect time=60ms status=00000123
>>>>>> kernel: siisch1: SIIS reset done: devices=00000001
>>>>>> kernel: siisch1: DISCONNECT requested
>>>>>> kernel: siisch1: SIIS reset...
>>>>>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>>>>>> last message repeated 58 times
>>>>>> kernel: siisch1: SATA connect time=59ms status=00000123
>>>>>> ...
>>>>>> kernel: siisch0: siis_wait_ready() calling DELAY(1000)
>>>>>> last message repeated 1300 times
>>>>>> kernel: siisch0: port is not ready (timeout 10000ms) status = 001f2000
>>>>>> Meanwhile, *everything* comes to a screeching halt.  Device
>>>>>> drivers are locked out, and thus incoming data is lost.
>>>>>> Losing incoming data is unacceptable.
>>>>>>
>>>>>> Need an alternative to DELAY() that does not lock out
>>>>>> other device drivers.  There must be a way to reset one
>>>>>> bit of hardware without locking down the entire machine.
>>>> Hans Petter Selasky writes:
>>>>> An alternative to DELAY() is the simplest solution.
>>>>> You probably need
>>>>> to do some redesign in the SCSI layer to find a better solution.
>>>> I keep coming back to the idea that a device driver for one
>>>> controller should not have to lock out *all* the hardware.
>>>> RS-232 locks out Ethernet.  Disk drivers lock out Ethernet.
>>>> And so on.  Why?  Is there some fundamental reason that this
>>>> *has* to be?  I thought the conversion from spl() to mutex()
>>>> was supposed to fix this?
>>>>
>>>> I'm making progress on my project converting printf(9) calls
>>>> to log(9), and fixing some bugs along the way.  Eventually I'll
>>>> have patches to submit.  But this is really a workaround, not
>>>> a fix to the underlying problem.
>>>>
>>>> Redesigning the SCSI layer sounds like a job for someone who took
>>>> a lot more CS classes than I did.  /dev/brain returns ENOCLUE.  :-(
>>
>> CAM is not completely innocent in this situation indeed. CAM defines
>> XPT_RESET_BUS request as synchronous. It is not queued, and called under
>> the SIM mutex lock. I don't think lock can be safely dropped in the
>> middle there.
>>
>> Now I think that I could try to move readiness waiting out of the
>> siis_reset() to do it asynchronously. I'll think about it.
>
> I've fixed this problem for ahci(4) in HEAD, there should be no sleeps
> longer than 100ms now (typical 1-2ms).
>
> With siis(4) the situation is different. There by default should be no
> sleeps longer than 100ms (typical 1-2ms). Longer sleep means that either
> controller is not responding, or it can't establish link to device it
> sees. I've reduced waiting timeout from 10s to 1s. It should improve
> situation a bit, but I would look for the original problem cause. Have
> you done something specific to trigger it? Are your drive/cables OK?

Thank you for your prompt attention to this problem, it is very much
appreciated.  (losing data sucks)

However, 100 ms is still way too long.
(assuming ms = milliseconds)
1 millisecond is dangerous, if Ethernet is locked out for approx
4 milliseconds there is guaranteed data loss.  I'd like to see
something more like 100 microseconds worst case (for TCP).

Closed source closed hardware black box generates data, has a
very small output buffer, cannot be changed.  In some cases it
insists on using UDP rather than TCP so dropping even a single
packet screws up the data.

I have cranked the TCP and UDP receive buffer sizes way up, I'm
reading the ports at rtprio into a large buffer locked into main
memory, etc. etc.  Most of the time it works.  But if a device
driver takes too long, incoming Ethernet packets do not get
serviced in time, and I lose data.

A device driver doing printf(9) to the RS-232 console is too slow.
Changing printf to log(9) works around this.

If a disk controller, port multiplier, or disk has a hiccup,
I lose data.  Siis(4) is the current problem, but IIRC I've had
problems from ahci(4) and ata(4) in the past.  I'm currently
using all three drivers.

Is there any way I can keep the Ethernet from being locked out
by other drivers?
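
A rough userland sketch of the sleep-between-polls idea raised earlier
in this thread (msleep or tsleep instead of spinning in DELAY()).
Everything named here (wait_ready_sleeping, ready_fn,
ready_after_n_polls) is hypothetical and is not taken from siis(4).
nanosleep(2) only stands in for the kernel sleep primitives, and the
sketch ignores the constraint noted above that the reset path is called
under the SIM mutex lock, where sleeping may not be safe.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical readiness probe; stands in for reading a port status register. */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll `ready` until it returns true or `timeout_ms` has nominally elapsed,
 * sleeping `interval_ms` between polls so the CPU is free for other work.
 * Returns the nominal number of milliseconds slept before success,
 * or -1 on timeout.
 */
static int
wait_ready_sleeping(ready_fn ready, void *ctx, int timeout_ms, int interval_ms)
{
	struct timespec ts;
	int waited;

	/* Keep tv_nsec below one second. */
	assert(interval_ms > 0 && interval_ms < 1000);
	ts.tv_sec = 0;
	ts.tv_nsec = (long)interval_ms * 1000000L;

	for (waited = 0; waited < timeout_ms; waited += interval_ms) {
		if (ready(ctx))
			return (waited);
		nanosleep(&ts, NULL);	/* gives up the CPU, unlike a busy-wait */
	}
	/* One last look after the final sleep. */
	return (ready(ctx) ? waited : -1);
}

/* Example probe: reports not-ready for the first *ctx calls, then ready. */
static bool
ready_after_n_polls(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return (false);
	}
	return (true);
}
```

With a probe that becomes ready on its fourth call and a 1 ms interval,
wait_ready_sleeping() returns 3; with a probe that never becomes ready
inside a 5 ms timeout it returns -1.  The point of the shape is only
that the waiting thread is off the CPU between polls.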

From owner-freebsd-drivers@FreeBSD.ORG Fri Apr 15 16:42:41 2011
From: John Baldwin
To: freebsd-drivers@freebsd.org
Cc: Bret Ketchum
Date: Fri, 15 Apr 2011 12:11:49 -0400
Subject: Re: MSI interrupts.
Message-Id: <201104151211.49148.jhb@freebsd.org>

On Tuesday, April 12, 2011 9:16:16 am Bret Ketchum wrote:
>     I've a roll-your-own driver for FreeBSD 8.x that uses MSI interrupts for
> PCI-E HBAs where one or more will be installed in an off-the-shelf amd64
> pizza box.
> The driver is using bus_setup_intr() and depending upon the slots
> the HBAs are installed in I see log messages from apic_alloc_vectors(), for
> example:
>
> Apr 12 06:44:15 mfsbsd kernel: xxxpci10: attempting to allocate 1 MSI vectors (16 supported)
> Apr 12 06:44:15 mfsbsd kernel: APIC: Couldn't find APIC vectors for 1 IRQs
> Apr 12 06:44:15 mfsbsd kernel: ioapic1: routing intpin 13 (PCI IRQ 37) to lapic 0 vector 59
>
>     Using vmstat -ia:
>
> interrupt                          total       rate
> irq37: xxxpci10                       74          0
>
>     The problem appears to be that HBA interrupts are not being delivered to
> the driver. If I swap cards around in slots I can eliminate the message and:
>
> Apr 12 06:44:15 mfsbsd kernel: msi: routing MSI IRQ 266 to local APIC 0 vector 80
>
>     And interrupts appear to be delivered properly. Before I dive in, can
> anyone explain this behavior?

Hmm, can you capture messages with bootverbose enabled?  A full boot -v dmesg
might be useful as well.

--
John Baldwin