Date:      Fri, 6 Apr 2012 18:40:12 GMT
From:      Marius Strobl <marius@alchemy.franken.de>
To:        freebsd-sparc64@FreeBSD.org
Subject:   Re: sparc64/141918: [ehci] ehci_interrupt: unrecoverable error, controller halted (sparc64)
Message-ID:  <201204061840.q36IeC2s042548@freefall.freebsd.org>

The following reply was made to PR sparc64/141918; it has been noted by GNATS.

From: Marius Strobl <marius@alchemy.franken.de>
To: Manuel Tobias Schiller <mala@hinterbergen.de>
Cc: bug-followup@FreeBSD.org
Subject: Re: sparc64/141918: [ehci] ehci_interrupt: unrecoverable error, controller halted (sparc64)
Date: Fri, 6 Apr 2012 20:37:26 +0200

 --TRYliJ5NKNqkz5bu
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 On Fri, Apr 06, 2012 at 09:58:42AM +0200, Manuel Tobias Schiller wrote:
 > On Thu, 5 Apr 2012 18:21:24 +0200
 > Manuel Tobias Schiller <mala@hinterbergen.de> wrote:
 > 
 > > On Wed, 4 Apr 2012 14:59:46 +0200
 > > Marius Strobl <marius@alchemy.franken.de> wrote:
 > > 
 > > > Hrm, okay, would be interesting to know what the machine actually
 > > > does. Looking at the code I found another bug; the VIA-workaround
 > > > currently doesn't do anything:
 > > > http://people.freebsd.org/~marius/ehci_pci_fix_via_quirk.diff
 > > > This might apply for the insane I/O you've reported but I'm unsure
 > > > whether it makes a difference for the HSE interrupt.
 > > > 
 > > > Marius
 > > 
 > > From the looks of it (with your patch at
 > > http://people.freebsd.org/~marius/usb_busdma.diff), the machine starts
 > > booting, then tries to mount the filesystems residing on the USB disks,
 > > apparently does some I/O (while still processing interrupts), and after
 > > less than a minute locks up solid without any indication on the serial
 > > console as to what went wrong...
 > > 
 > > I've started another build with your "VIA quirk fix" but without the
 > > patch in the last paragraph (the machine locking up is a lot worse than
 > > just USB not working after some heavy I/O, so I left it out for now),
 > > but since I started the build without being properly awake this
 > > morning, I typed "make buildworld" where I wanted to type "make
 > > buildkernel", so it's going to take some time. Also, I'll be leaving
 > > CERN over easter, so I won't be running tests on that machine from
 > > tomorrow morning until Monday evening (I can compile kernels, though).
 > > Anyhow, I'll let you know what comes out.
 > > 
 > > Cheers, thanks a lot for your effort, and, of course, a Happy Easter!
 > > 
 > > Manuel
 > 
 > Hi,
 > 
 > the "VIA quirk fix" on its own gives the familiar message in dmesg
 > (unrecoverable error, controller halted), so I'm compiling a kernel which
 
 Oof, this likely means there's a more basic problem with this device.
 Have you already tried to re-seat the card in case there's an electrical
 problem?
 Please also provide the output of `pciconf -rb ehci0@pci0:2:5:2 0:255'
 from a booting kernel.
 FYI, after some digging I've found the following card
 ehci0@pci0:2:5:2: class=0x0c0320 card=0x31041106 chip=0x31041106 rev=0x6h0
 which is a newer revision of your device and works just fine in a T1-200
 including with the usb(4) fixes. The publicly available datasheets for
 the VIA USB controllers are minimal and exclude errata, and Linux also
 doesn't seem to use any additional workarounds, so I'm starting to run
 out of ideas as to what could be wrong with your revision. The only
 remaining thing I can currently think of trying is to test whether it
 chokes on the generic initialization done by the sparc64 PCI code, using
 the attached patch.
 
 > combines this fix with your latest busdma fix to try them both together;
 
 This combination is unlikely to make a difference.
 
 Marius
 
 
 --TRYliJ5NKNqkz5bu
 Content-Type: text/x-diff; charset=us-ascii
 Content-Disposition: attachment; filename="skip_via.diff"
 
 Index: ofw_pcibus.c
 ===================================================================
 --- ofw_pcibus.c	(revision 233234)
 +++ ofw_pcibus.c	(working copy)
 @@ -133,6 +133,13 @@ ofw_pcibus_setup_device(device_t bridge, uint32_t
  #ifndef SUN4V
  	uint32_t reg;
  
 +	if ((CS_READ(PCIR_VENDOR, 2) == 0x1106 &&
 +	    CS_READ(PCIR_DEVICE, 2) == 0x3104)) {
 +		device_printf(bridge, "skipping %d/%d/%d\n", busno, slot,
 +		    func);
 +		goto skip;
 +	}
 +
  	/*
  	 * Initialize the latency timer register for busmaster devices to
  	 * work properly.  This is another task which the firmware doesn't
 @@ -210,6 +217,7 @@ ofw_pcibus_setup_device(device_t bridge, uint32_t
  	    CS_READ(PCIR_DEVICE, 2) == 0x5229))
  		CS_WRITE(0x50, CS_READ(0x50, 1) | 0x3, 1);
  
 + skip:
  	/*
  	 * The preset in the intline register is usually wrong.  Reset
  	 * it to 255, so that the PCI code will reroute the interrupt if
 
 --TRYliJ5NKNqkz5bu--
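
The constants 0x1106 and 0x3104 tested in skip_via.diff correspond to the chip=0x31041106 value in the pciconf line quoted in the message: pciconf(8) prints the device ID in the upper 16 bits and the vendor ID in the lower 16 bits. As a minimal sketch of that relationship (the helper names chip_vendor, chip_device and is_via_3104 are hypothetical and are not part of the FreeBSD sources or of the patch), the check could be mirrored in userland like this:

```c
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only: split a pciconf(8)
 * "chip=" value into the two 16-bit IDs the patch reads separately
 * via PCIR_VENDOR and PCIR_DEVICE.
 */
static uint16_t
chip_vendor(uint32_t chip)
{
	/* Vendor ID lives in the low 16 bits. */
	return ((uint16_t)(chip & 0xffff));
}

static uint16_t
chip_device(uint32_t chip)
{
	/* Device ID lives in the high 16 bits. */
	return ((uint16_t)(chip >> 16));
}

/* Same comparison as the patch: VIA vendor ID and device ID 0x3104. */
static int
is_via_3104(uint32_t chip)
{
	return (chip_vendor(chip) == 0x1106 && chip_device(chip) == 0x3104);
}
```

Passing 0x31041106 to is_via_3104() returns nonzero. Note that the match is on vendor and device ID only, so it cannot tell the working newer revision apart from the failing one.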


