Date:      Mon, 30 Mar 2009 21:49:26 +0200
From:      Andreas Tobler <andreast-list@fgznet.ch>
To:        Marius Strobl <marius@alchemy.franken.de>
Cc:        freebsd-sparc64@freebsd.org
Subject:   Re: kernel panic with firewire PCI card
Message-ID:  <49D12246.3080905@fgznet.ch>
In-Reply-To: <20090330191239.GA74661@alchemy.franken.de>
References:  <49CD39B7.3050500@nexus-ag.com> <20090328214138.GA93149@alchemy.franken.de> <49CFBAC6.5030809@fgznet.ch> <20090330191239.GA74661@alchemy.franken.de>

Marius Strobl wrote:
> On Sun, Mar 29, 2009 at 08:15:34PM +0200, Andreas Tobler wrote:
>> Marius Strobl wrote:
>>> PCI AFSR 0x4000000000000000 indicates that the primary error
>>> was a target abort. Given that no DMA is involved at this stage
>>> this means it actually was the OHCI chip which complained
>>> about the PIO access. If this is the first access after the
>>> reset (check with "l *(0xc0659be4)", "l *(fwphy_rddata+0xe8)"
>>> and "l *(fwohci_reset+0x298)" in gdb on the corresponding
>>> kernel.debug what code is actually involved) I'd suspect
>>> the problem to be a combination of a sloppy driver with a
>>> chip that takes some more time than the other contenders
>>> to get ready again after a reset, i.e. fwohci_reset() only
>>> tries 100 times with waiting one millisecond between tries
>>> for OHCI_HCC_RESET to clear after the reset (the latter part
>>> is in line with the OHCI specification). Increasing to f.e.
>>> 1000 tries should solve the panic then, if this is actually
>>> the cause. Generally fwohci(4) should be changed to fail if
>>> the chip doesn't become ready again after a reset instead
>>> of just ignoring that problem though. At least fwohci_reset()
>>> (there are probably more such functions in fwohci(4)) also
>>> seems to miss some bus space barriers, which also could be
>>> the cause of this panic.
>> I increased the for counter to 1000 in fwohci_reset().
>>
>>  	while(OREAD(sc, OHCI_HCCCTL) & OHCI_HCC_RESET) {
>> -		if (i++ > 100) break;
>> +		if (i++ > 1000) break;
>>  		DELAY(1000);
>>  	}
>>
>> This did not help so far.
> 
> Okay, this was my best guess based on the information
> available, sorry.

What? No need to be sorry, thank you! Your advice helped me dig deeper
into FreeBSD. I was unhappy that after every source modification I had
to power off and remove the card in order to install a new kernel.
So I learned how to configure and build a kernel with firewire support
as a module only. Now I can boot and install new kernels without
plugging the card in and out.
I see the positive aspect here :)

And feedback from the list is very much appreciated! Writing in the
direction of /dev/null is very frustrating. Getting feedback with
advice helps a lot here, even if the advice does not give the expected
results. It is still feedback.

Thanks!
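
Regarding the fwohci_reset() loop quoted above: the suggestion to fail
when the chip does not become ready, instead of silently ignoring it,
could look roughly like the following. This is only a self-contained
sketch and not the fwohci(4) code. struct fake_chip, fake_read_hcc(),
fake_delay_us(), wait_reset_clear() and the HCC_RESET_BIT value are
hypothetical stand-ins for the softc, OREAD(sc, OHCI_HCCCTL), DELAY()
and OHCI_HCC_RESET.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative value only; stands in for OHCI_HCC_RESET. */
#define HCC_RESET_BIT 0x00010000u

/* Hypothetical simulated chip, so the sketch runs without hardware. */
struct fake_chip {
	uint32_t hcc;			/* simulated HCControl register */
	int reads_until_ready;		/* 0 means the reset bit never clears */
};

/* Stand-in for OREAD(sc, OHCI_HCCCTL): clears the reset bit on the Nth read. */
static uint32_t
fake_read_hcc(struct fake_chip *c)
{
	if (c->reads_until_ready > 0 && --c->reads_until_ready == 0)
		c->hcc &= ~HCC_RESET_BIT;
	return (c->hcc);
}

/* Stand-in for DELAY(); the simulation does not actually wait. */
static void
fake_delay_us(unsigned int us)
{
	(void)us;
}

/*
 * Poll until the reset bit clears. Returns 0 on success and ETIMEDOUT
 * if the bit is still set after max_tries polls, so that the caller
 * can bail out instead of touching a chip that is not ready.
 */
static int
wait_reset_clear(struct fake_chip *c, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		if ((fake_read_hcc(c) & HCC_RESET_BIT) == 0)
			return (0);
		fake_delay_us(1000);	/* 1 ms between tries, as in the driver */
	}
	return (ETIMEDOUT);
}
```

In the real driver the caller would have to check the return value and
abort the attach. Any bus space barriers that may be missing are not
modelled here.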

> 
>> I also tried to check the addresses you mentioned with l *(0xXXXX)
>> But here I miss some things. I guess I need to invoke gdb somehow?
> 
> Yes, simply `gdb /path/to/kernel.debug`

Heh, that only works when the gdb host is of the same arch as the
debugging target, right?
Unfortunately I build everything (except ports) on a MacBook Pro VM
(amd64) instance and install it on my Mac and Sparc. There is no second
Sparc available.

> You might also want to bug firewire@ and simokawa@ regarding this.

I guess so, since fw support on my iMac (G3) has also been broken since
r187993. But going back to the previous revision does not help here on
sparc64, so that is a separate issue.

Thanks!
Andreas





