Date:      Sun, 27 Oct 2002 18:00:42 +0000
From:      Ian Dowse <iedowse@maths.tcd.ie>
To:        freebsd-mobile@freebsd.org
Subject:   Patch to fix/shorten "wi" freezes
Message-ID:   <200210271800.aa11251@salmon.maths.tcd.ie>


The wi driver can freeze the system for a long time, both when the
pccard is removed and when the hardware becomes confused. On -current
I've found that the whole machine sometimes becomes unresponsive for
minutes at a time, with messages such as:

	wi0: timeout in wi_cmd 0x0002; event status 0x8080
	wi0: timeout in wi_cmd 0x0000; event status 0x8080
	wi0: wi_cmd: busy bit won't clear.
	wi0: wi_cmd: busy bit won't clear.
	wi0: init failed
	wi0: wi_cmd: busy bit won't clear.
	... <repeated 20 times>
	wi0: wi_cmd: busy bit won't clear.
	wi0: failed to allocate 1594 bytes on NIC
	wi0: tx buffer allocation failed
	wi0: wi_cmd: busy bit won't clear.
	wi0: failed to allocate 1594 bytes on NIC

The "wi_cmd 0x0002" is WI_CMD_DISABLE from wi_stop(). Each of the
"busy bit won't clear" messages comes after a 5-second busy-loop
delay in wi_cmd(), so the above takes 2-3 minutes to complete.
This, BTW, is a Lucent silver card, probed as:

	wi0: <WaveLAN/IEEE> at port 0x100-0x13f irq 9 function 0
	    config 1 on pccard0
	wi0: 802.11 address: 00:02:2d:21:57:d3
	wi0: using Lucent Technologies, WaveLAN/IEEE
	wi0: Lucent Firmware: Station 6.16.01

The patch below does a few things:
- It adds a 20ms delay at the end of wi_init(). Calling wi_stop()
  too soon after wi_init() appears to cause the above timeouts in
  wi_stop(), and the delay seems to avoid them.
- The busy-bit loop timeout is reduced from 5 seconds to 500ms.
- When a status of 0xffff is returned or the "busy bit won't clear"
  error occurs, sc->wi_gone is set to 1, so that other operations
  will fail immediately instead of going back into the long busy
  loops. Since sc->wi_gone had been used as a sanity test in
  wi_generic_detach() to make sure devices are not detached twice,
  this has been changed to use the previously unused WI_FLAGS_ATTACHED
  flag. We also need to remove the wi_gone test in wi_stop(), since
  otherwise the untimeout() calls will be missed if wi_gone is set
  by something other than wi_generic_detach().
- The functions wi_cmd() and wi_seek() now test wi_gone, and return
  immediately if it is set.
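
The fail-fast idea in the last three points can be sketched outside
the kernel as follows. This is only an illustration under stated
assumptions, not the driver code: struct fake_softc, fake_busy() and
fake_cmd() are hypothetical stand-ins for the softc, the
CSR_READ_2(sc, WI_COMMAND) & WI_CMD_BUSY test and wi_cmd(), and the
hardware is simulated by a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's softc (not struct wi_softc). */
struct fake_softc {
	bool	gone;		/* plays the role of sc->wi_gone */
	int	busy_polls;	/* reads that still report busy; <0 = forever */
	int	polls_done;	/* number of simulated register reads */
};

#define MAX_POLLS	500	/* the patch polls 500 times, 1ms apart */

/* Simulated register read: returns true while the device is busy. */
static bool
fake_busy(struct fake_softc *sc)
{
	sc->polls_done++;
	if (sc->busy_polls < 0)
		return (true);		/* device never recovers */
	if (sc->busy_polls == 0)
		return (false);		/* busy bit has cleared */
	sc->busy_polls--;
	return (true);
}

/* Bounded busy-wait that marks the device gone on timeout. */
static int
fake_cmd(struct fake_softc *sc)
{
	int i;

	assert(sc != NULL);

	/* Fail immediately if an earlier call already gave up. */
	if (sc->gone)
		return (ENODEV);

	for (i = MAX_POLLS; i > 0; i--) {
		if (!fake_busy(sc))
			break;
		/* The real driver would call DELAY(1000) here. */
	}
	if (i == 0) {
		/* Remember the failure so later calls do not wait again. */
		sc->gone = true;
		return (ETIMEDOUT);
	}
	return (0);
}
```

With a device that never clears its busy bit, the first fake_cmd()
call polls MAX_POLLS times and returns ETIMEDOUT; every later call
returns ENODEV without polling at all, which is what keeps a series
of failed commands from adding up to minutes.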

For me this makes the card work much more reliably and it reduces
the length of any hangs to less than 1 second. I guess it will take
testing on other cards and configurations to see if this improves
things in general or causes problems with some combinations.

Ian

Index: if_wi.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/dev/wi/if_wi.c,v
retrieving revision 1.117
diff -u -r1.117 if_wi.c
--- if_wi.c	14 Oct 2002 01:59:57 -0000	1.117
+++ if_wi.c	27 Oct 2002 16:57:04 -0000
@@ -200,7 +200,7 @@
 	WI_LOCK(sc, s);
 	ifp = &sc->arpcom.ac_if;
 
-	if (sc->wi_gone) {
+	if ((sc->wi_flags & WI_FLAGS_ATTACHED) == 0) {
 		device_printf(dev, "already unloaded\n");
 		WI_UNLOCK(sc, s);
 		return(ENODEV);
@@ -214,6 +214,7 @@
 	ether_ifdetach(ifp, ETHER_BPF_SUPPORTED);
 	bus_teardown_intr(dev, sc->irq, sc->wi_intrhand);
 	wi_free(dev);
+	sc->wi_flags &= ~WI_FLAGS_ATTACHED;
 	sc->wi_gone = 1;
 
 	WI_UNLOCK(sc, s);
@@ -471,6 +472,7 @@
 	 */
 	ether_ifattach(ifp, ETHER_BPF_SUPPORTED);
 	callout_handle_init(&sc->wi_stat_ch);
+	sc->wi_flags |= WI_FLAGS_ATTACHED;
 	WI_UNLOCK(sc, s);
 
 	return(0);
@@ -1002,20 +1004,24 @@
 {
 	int			i, s = 0;
 	static volatile int count  = 0;
+
+	if (sc->wi_gone)
+		return (ENODEV);
 	
 	if (count > 1)
 		panic("Hey partner, hold on there!");
 	count++;
 
 	/* wait for the busy bit to clear */
-	for (i = 500; i > 0; i--) {	/* 5s */
+	for (i = 500; i > 0; i--) {	/* 500ms */
 		if (!(CSR_READ_2(sc, WI_COMMAND) & WI_CMD_BUSY)) {
 			break;
 		}
-		DELAY(10*1000);	/* 10 m sec */
+		DELAY(1000);	/* 1ms */
 	}
 	if (i == 0) {
 		device_printf(sc->dev, "wi_cmd: busy bit won't clear.\n" );
+		sc->wi_gone = 1;
 		count--;
 		return(ETIMEDOUT);
 	}
@@ -1052,6 +1058,8 @@
 	if (i == WI_TIMEOUT) {
 		device_printf(sc->dev,
 		    "timeout in wi_cmd 0x%04x; event status 0x%04x\n", cmd, s);
+		if (s == 0xffff)
+			sc->wi_gone = 1;
 		return(ETIMEDOUT);
 	}
 	return(0);
@@ -1364,6 +1372,9 @@
 	int			selreg, offreg;
 	int			status;
 
+	if (sc->wi_gone)
+		return (ENODEV);
+
 	switch (chan) {
 	case WI_BAP0:
 		selreg = WI_SEL0;
@@ -1391,6 +1402,8 @@
 	if (i == WI_TIMEOUT) {
 		device_printf(sc->dev, "timeout in wi_seek to %x/%x; last status %x\n",
 			id, off, status);
+		if (status == 0xffff)
+			sc->wi_gone = 1;
 		return(ETIMEDOUT);
 	}
 
@@ -2196,6 +2209,13 @@
 	sc->wi_stat_ch = timeout(wi_inquire, sc, hz * 60);
 	WI_UNLOCK(sc, s);
 
+	/*
+	 * A 10ms or greater delay here seems to avoid a problem that
+	 * causes some Lucent orinoco cards to time out in wi_stop()
+	 * if called immediately after wi_init(). Use 20ms to be safe.
+	 */
+	DELAY(20000);
+
 	return;
 }
 
@@ -2471,11 +2491,11 @@
 	int			s;
 
 	WI_LOCK(sc, s);
-
-	if (sc->wi_gone) {
-		WI_UNLOCK(sc, s);
-		return;
-	}
+	/*
+	 * Ignore wi_gone here, as we still need to do the untimeout calls.
+	 * Currently everything here should be safe to do even if the
+	 * hardware is gone.
+	 */
 
 	wihap_shutdown(sc);
 





