Date:      Mon, 20 Apr 2009 20:36:38 +0200
From:      Andreas Tobler <andreast-list@fgznet.ch>
To:        Sean Bruno <sean.bruno@dsl-only.net>
Cc:        freebsd-firewire <freebsd-firewire@freebsd.org>, scottl <scottl@freebsd.org>, Marius Strobl <marius@alchemy.franken.de>
Subject:   Re: fwochi.c and bus_space_barrier()
Message-ID:  <49ECC0B6.5000804@fgznet.ch>
In-Reply-To: <1240248579.29756.4.camel@localhost.localdomain>
References:  <1239382529.21481.7.camel@localhost.localdomain>	 <20090411154000.GG8143@alchemy.franken.de>	 <1239600457.24831.8.camel@localhost.localdomain>	 <49E2F2FA.6000204@fgznet.ch>	 <1239639423.24831.85.camel@localhost.localdomain>	 <20090413170537.GI8143@alchemy.franken.de>	 <1239643406.24831.95.camel@localhost.localdomain>	 <20090413173528.GJ8143@alchemy.franken.de>	 <1239646889.24831.135.camel@localhost.localdomain>	 <20090414184741.GK8143@alchemy.franken.de> <49E4DF9F.1090804@fgznet.ch>	 <1239814413.15474.2.camel@localhost.localdomain>	 <49E61B4D.1050209@fgznet.ch>	 <1239819547.15474.5.camel@localhost.localdomain>	 <49E633C7.9030909@fgznet.ch>	 <1239826803.15474.48.camel@localhost.localdomain>	 <49E7931C.8050603@fgznet.ch> <1240248579.29756.4.camel@localhost.localdomain>

Sean Bruno wrote:
> On Thu, 2009-04-16 at 22:20 +0200, Andreas Tobler wrote:
>> Sean Bruno wrote:
>>>>> You may want to retry several times.  Like you pointed out in earlier
>>>>> posts, this issue seems to be a race condition.
>>>> Heh, now I remember, I did not speak about a race condition, but about a 
>>>> timing issue.
>>>>
>>>> If I remove the printfs, it panics here.
>>>>
>>>> for (lps = 0, lps_counter = 0; !lps && lps_counter < 3; lps_counter++) {
>>>>                  lps = (OREAD(sc, OHCI_HCCCTL) & OHCI_HCC_LPS);
>>>>                  if (!lps) {
>>>>                          pause("fwlps", (50 * hz + 999) / 1000);
>>>>                          device_printf(dev, "lps not set, attempt(%d)\n", lps_counter);
>>>>                  } /* else
>>>>                          device_printf(dev, "lps(%0x) set\n", lps);*/
>>>>          }
>>>>
>>>> In my case lps is not set on the first pass, so we print something in
>>>> the first run of the loop, and this print statement gives the card
>>>> enough 'time' to come up. Without the printf, the card does not have
>>>> enough time to come up. Panic.
>>>>
>>>> This was the same thing I reported, adding a printf statement at the 
>>>> beginning of fwphy_rddata cures my panic.
>>>>
>>>> So I'd suggest dropping the lps test and always adding a pause(9), or
>>>> does this cause headaches on other archs?
>>>>
>>>> Thanks,
>>>> Andreas
>>>>
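For reference, the pause() argument in the quoted loop, (50 * hz + 999) / 1000, converts 50 ms into ticks and rounds up, so the wait is never shorter than requested. A small user-space sketch of that arithmetic follows. The helper name ms_to_ticks is made up for illustration and is not part of the driver, and hz is passed as a parameter here instead of being read from the kernel.

```c
#include <assert.h>

/*
 * Convert milliseconds to scheduler ticks, rounding up.
 * hz is the number of ticks per second.
 * ms_to_ticks is a hypothetical helper, not a kernel function.
 */
static int
ms_to_ticks(int ms, int hz)
{
	assert(ms >= 0 && hz > 0);
	/* Adding 999 before dividing by 1000 rounds any remainder up. */
	return ((ms * hz + 999) / 1000);
}
```

With hz=100 a 50 ms wait comes out as 5 ticks, with hz=1000 as 50 ticks, and with hz=128 as 7 ticks (6.4 rounded up).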
>>>
>>> Ok, I think I've finally caught up to Marius (at least in this
>>> situation).  
>>>
>>> The *ACTUAL* issue is that the fwohci_probe_phy() code isn't properly
>>> handling the transition state from LPS==0 to LPS==1.  In this period of
>>> time, the internal SCLK on the firewire board may not have started yet.
>>> There can be a delay between LPS becoming 1 and the SCLK
>>> actually starting.
>>>
>>> fwphy_rddata() appears to be *trying* to deal with this, but is
>>> obviously failing.  
>>>
>>> So "lps" has been set, but the PHY is not up yet.  In order to access
>>> PHY resources, we have to wait for SCLK to start (OHCI spec v1.1 table
>>> 6.1).
>>>
>>> I believe your error is defined in the OHCI spec, Appendix A.6, PCI Bus
>>> Errors.  The bus error is supposed to happen!  :)  The driver just isn't
>>> handling the error case properly.
>>>
>>> The proper fix is to handle the ERROR according to spec.  I will work on
>>> a proper solution this weekend.  In the meantime, here is a patch to get
>>> you by based on the pause() mechanism.
>>>
> 
> 
> Here is a different, more generic patch that seems to work for me on my
> machines.
> 
> Essentially, make fwphy_rddata() responsible for catching the error and
> implementing the pause().  This *should* have the same effect, unless I
> don't understand what I'm doing.
> 
> I have eliminated the LPS check for now (see the ifdefs) in
> fwohci_probe_phy().
> 
> If this works, I will clean up the ifdefs before I commit.
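
To make sure we mean the same thing by "fwphy_rddata() catches the error and pauses", here is a rough user-space sketch of that control flow only. Everything in it is hypothetical: struct fake_phy, fake_phy_read_once() and fake_pause() are stand-ins invented for illustration, not driver code. It also assumes the failed access is visible as a return value, which is not the case with the sparc64 bus error in the trace below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the PHY: reads fail until the clock has
 * "started", which here takes ready_after polls. */
struct fake_phy {
	int ready_after;	/* polls remaining before reads succeed */
	int value;		/* register value once ready */
	int pauses;		/* number of times the caller backed off */
};

/* Hypothetical raw read: returns 0 and stores the value on success,
 * -1 while the device is not ready. */
static int
fake_phy_read_once(struct fake_phy *phy, int *out)
{
	if (phy->ready_after > 0) {
		phy->ready_after--;
		return (-1);
	}
	*out = phy->value;
	return (0);
}

/* Stand-in for pause(9); a real driver would sleep here. */
static void
fake_pause(struct fake_phy *phy)
{
	phy->pauses++;
}

/* Read with bounded retries: back off after each failed attempt and
 * give up after max_attempts so a dead device cannot hang attach. */
static int
phy_read_retry(struct fake_phy *phy, int max_attempts, int *out)
{
	assert(phy != NULL && out != NULL);
	for (int attempt = 0; attempt < max_attempts; attempt++) {
		if (fake_phy_read_once(phy, out) == 0)
			return (0);
		fake_pause(phy);
	}
	return (-1);
}
```

The point of the bound is that a card whose SCLK never starts makes the read fail cleanly instead of spinning forever.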

Unfortunately it doesn't. To me it looks the same as the original trace.

Sorry for the bad news.
Andreas

u60# kldload firewire
fwohci0: <Texas Instruments TSB12LV23> mem 0x4008000-0x40087ff,0x400c000-0x400ffff irq 2008 at device 4.0 on pci0
fwohci0: latency timer 24 -> 32.
fwohci0: cache size 16 -> 16.
fwohci0: [ITHREAD]
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:10:74:60:00:00:ee:a9
fwohci0: resetting OHCI...done (loop=0)
panic: pcib: PCI bus B error AFAR 0x1ff840080ec AFSR 0x4000f00000000000
cpuid = 0
KDB: stack backtrace:
panic() at 0xc03316e8 = panic+0x1c8
psycho_pci_bus() at 0xc060aac8 = psycho_pci_bus+0x88
intr_event_handle() at 0xc0309b3c = intr_event_handle+0x5c
intr_execute_handlers() at 0xc061aab4 = intr_execute_handlers+0x14
intr_fast() at 0xc00812f0 = intr_fast+0x50
-- interrupt level=0xd pil=0 %o7=0xc0f9cccc --
fwphy_rddata() at 0xc0f9cd0c = fwphy_rddata+0x12c
fwohci_reset() at 0xc0fa113c = fwohci_reset+0x1fc
fwohci_init() at 0xc0fa22b4 = fwohci_init+0x9d4
fwohci_pci_attach() at 0xc0fa2d58 = fwohci_pci_attach+0x278
device_attach() at 0xc03600e4 = device_attach+0x4a4
device_probe_and_attach() at 0xc0361648 = device_probe_and_attach+0x28
pci_driver_added() at 0xc021c7f4 = pci_driver_added+0x154
devclass_driver_added() at 0xc035df34 = devclass_driver_added+0x74
devclass_add_driver() at 0xc035eb3c = devclass_add_driver+0x7c
driver_module_handler() at 0xc035fb78 = driver_module_handler+0x58
module_register_init() at 0xc031e934 = module_register_init+0x154
linker_load_module() at 0xc0315478 = linker_load_module+0xb38
kern_kldload() at 0xc0315a28 = kern_kldload+0xc8
kldload() at 0xc0315c20 = kldload+0x60
syscall() at 0xc062c1e8 = syscall+0x2e8
-- syscall (304, FreeBSD ELF64, kldload) %o7=0x1008e0 --
userland() at 0x4045dd68
user trace: trap %o7=0x1008e0
pc 0x4045dd68, sp 0x7fdffffe1e1
pc 0x1006f0, sp 0x7fdffffe2a1
pc 0x40206954, sp 0x7fdffffe361
done
KDB: enter: panic
[thread pid 1128 tid 100081 ]
Stopped at      0xc0367b40 = kdb_enter+0x80:    ta              %xcc, 1
db>




