From: Andreas Tobler
Date: Mon, 20 Apr 2009 20:36:38 +0200
To: Sean Bruno
Cc: freebsd-firewire, scottl, Marius Strobl
Subject: Re: fwochi.c and bus_space_barrier()
Message-ID: <49ECC0B6.5000804@fgznet.ch>
In-Reply-To: <1240248579.29756.4.camel@localhost.localdomain>

Sean Bruno wrote:
> On Thu, 2009-04-16 at 22:20 +0200, Andreas Tobler wrote:
>> Sean Bruno wrote:
>>>>> You may want to retry several times. Like you pointed out in earlier
>>>>> posts, this issue seems to be a race condition.
>>>> Heh, now I remember, I did not speak about a race condition, but about a
>>>> timing issue.
>>>>
>>>> If I leave the printfs away, it panics here.
>>>>
>>>> for (lps = 0, lps_counter = 0; !lps && lps_counter < 3; lps_counter++) {
>>>>         lps = (OREAD(sc, OHCI_HCCCTL) & OHCI_HCC_LPS);
>>>>         if (!lps) {
>>>>                 pause("fwlps", (50 * hz + 999) / 1000);
>>>>                 device_printf(dev, "lps not set, attempt(%d)\n", lps_counter);
>>>>         } /* else
>>>>                 device_printf(dev, "lps(%0x) set\n", lps);*/
>>>> }
>>>>
>>>> In my case the lps is not NULL, so we print something in the first run
>>>> of the loop, this print statement is enough 'time' for the card to come
>>>> up. If we leave the printf away, it is not enough time to come up for
>>>> the card. Panic.
>>>>
>>>> This was the same thing I reported, adding a printf statement at the
>>>> beginning of fwphy_rddata cures my panic.
>>>>
>>>> So I'd suggest to leave the lps test away and add always a pause(9), or
>>>> does this cause headache on other archs?
>>>>
>>>> Thanks,
>>>> Andreas
>>>>
>>>
>>> Ok, I think I've finally caught up to Marius (at least in this
>>> situation).
>>>
>>> The *ACTUAL* issue is that fwochi_probe_phy() code isn't properly
>>> handling the transition state from LPS==0 to LPS==1. In this period of
>>> time, the internal SCLK on the firewire board may have not started yet.
>>> There can be a period of time between the value of LPS==1 and the SCLK
>>> actually starting.
>>>
>>> fwphy_rddata() appears to be *trying* to deal with this, but is
>>> obviously failing.
>>>
>>> So "lps" has been set, but the PHY is not up yet. In order to access
>>> PHY resources, we have to wait for SCLK to start (OHCI spec v1.1 table
>>> 6.1).
>>>
>>> I believe your error is defined in the OHCI spec, Appendix A.6, PCI Bus
>>> Errors. The bus error is supposed to happen! :) The driver just isn't
>>> handling the error case properly.
>>>
>>> The proper fix is to handle the ERROR according to spec. I will work on
>>> a proper solution this weekend. In the meantime, here is a patch to get
>>> you by based on the pause() mechanism.
>>>
>
>
> Here is a different, more generic patch that seems to work for me on my
> machines.
>
> Essentially, make fwphy_rddata() responsible for catching the error and
> implementing the pause(). This *should* have the same effect, unless I
> don't understand what I'm doing.
>
> I have eliminated the LPS check for now (see ifdef's) in
> fwohci_probe_phy().
>
> If this works, I will cleanup the ifdef's before I commit.

Unfortunately it doesn't. To me it looks the same as the original trace.

Sorry for the bad news.

Andreas

u60# kldload firewire
fwohci0: mem 0x4008000-0x40087ff,0x400c000-0x400ffff irq 2008 at device 4.0 on pci0
fwohci0: latency timer 24 -> 32.
fwohci0: cache size 16 -> 16.
fwohci0: [ITHREAD]
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:10:74:60:00:00:ee:a9
fwohci0: resetting OHCI...done (loop=0)
panic: pcib: PCI bus B error AFAR 0x1ff840080ec AFSR 0x4000f00000000000
cpuid = 0
KDB: stack backtrace:
panic() at 0xc03316e8 = panic+0x1c8
psycho_pci_bus() at 0xc060aac8 = psycho_pci_bus+0x88
intr_event_handle() at 0xc0309b3c = intr_event_handle+0x5c
intr_execute_handlers() at 0xc061aab4 = intr_execute_handlers+0x14
intr_fast() at 0xc00812f0 = intr_fast+0x50
-- interrupt level=0xd pil=0 %o7=0xc0f9cccc --
fwphy_rddata() at 0xc0f9cd0c = fwphy_rddata+0x12c
fwohci_reset() at 0xc0fa113c = fwohci_reset+0x1fc
fwohci_init() at 0xc0fa22b4 = fwohci_init+0x9d4
fwohci_pci_attach() at 0xc0fa2d58 = fwohci_pci_attach+0x278
device_attach() at 0xc03600e4 = device_attach+0x4a4
device_probe_and_attach() at 0xc0361648 = device_probe_and_attach+0x28
pci_driver_added() at 0xc021c7f4 = pci_driver_added+0x154
devclass_driver_added() at 0xc035df34 = devclass_driver_added+0x74
devclass_add_driver() at 0xc035eb3c = devclass_add_driver+0x7c
driver_module_handler() at 0xc035fb78 = driver_module_handler+0x58
module_register_init() at 0xc031e934 = module_register_init+0x154
linker_load_module() at 0xc0315478 = linker_load_module+0xb38
kern_kldload() at 0xc0315a28 = kern_kldload+0xc8
kldload() at 0xc0315c20 = kldload+0x60
syscall() at 0xc062c1e8 = syscall+0x2e8
-- syscall (304, FreeBSD ELF64, kldload) %o7=0x1008e0 --
userland() at 0x4045dd68
user trace: trap %o7=0x1008e0
pc 0x4045dd68, sp 0x7fdffffe1e1
pc 0x1006f0, sp 0x7fdffffe2a1
pc 0x40206954, sp 0x7fdffffe361
done
KDB: enter: panic
[thread pid 1128 tid 100081 ]
Stopped at 0xc0367b40 = kdb_enter+0x80: ta %xcc, 1
db>
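
For reference, the timing idea discussed in the quoted text (poll for the LPS bit a bounded number of times, then always wait once more so the clock has a chance to start before the PHY is touched) can be illustrated in user space. This is only a sketch, not the patch from this thread and not code from fwohci.c. The names fake_reg, fake_read, fake_delay, FAKE_READY_BIT and poll_then_settle are all hypothetical, and the register is simulated. In the driver, the delay hook is where a pause(9) call would sit. Whether such a delay avoids the panic shown above is not established here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ready bit; it does not claim to match any real register layout. */
#define FAKE_READY_BIT 0x1u

/*
 * Hypothetical stand-in for a device register: it reads as 0 until it has
 * been read more than `ready_after` times, then reads as FAKE_READY_BIT.
 */
struct fake_reg {
	int reads;
	int ready_after;
};

static uint32_t
fake_read(struct fake_reg *r)
{
	r->reads++;
	return (r->reads > r->ready_after) ? FAKE_READY_BIT : 0;
}

/* Hypothetical delay hook; it only counts calls so the sketch is testable. */
static int delay_calls;

static void
fake_delay(void)
{
	delay_calls++;
}

/*
 * Poll the register up to max_attempts times, delaying after each miss.
 * When the bit is seen, delay once more unconditionally (the "settle"
 * step) before reporting success. Returns false if the bit never shows up.
 * *attempts receives the number of reads performed.
 */
static bool
poll_then_settle(struct fake_reg *r, int max_attempts, int *attempts)
{
	assert(max_attempts > 0);

	for (int i = 0; i < max_attempts; i++) {
		if (fake_read(r) & FAKE_READY_BIT) {
			*attempts = i + 1;
			fake_delay();	/* settle delay, always taken */
			return true;
		}
		fake_delay();		/* bit not set yet, wait and retry */
	}
	*attempts = max_attempts;
	return false;
}
```

A caller would treat a false return as "hardware not ready" and bail out of the attach path instead of going on to read PHY registers. The unconditional settle delay corresponds to the suggestion of always pausing, independent of what the LPS test reports.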