Date: Thu, 12 Oct 2000 16:34:24 +0200
From: sthaug@nethelp.no
To: gibbs@scsiguy.com
Cc: freebsd-scsi@FreeBSD.ORG
Subject: Re: Stressed SCSI subsystem locks up the system
Message-ID: <54202.971361264@verdi.nethelp.no>
In-Reply-To: Your message of "Wed, 11 Oct 2000 05:27:31 +0000"
References: <200010110527.e9B5RV603276@aslan.scsiguy.com>

> As always, I am interested in knowing the details of this problem and
> would like to resolve it. The easiest way to do this is to switch
> over to using 4.1-stable built from source so I can work directly with
> the site to debug the problem.

We have a similar problem (may not be the same). We have a mail server
with the following SCSI configuration:

ahc0: <Adaptec aic7890/91 Ultra2 SCSI adapter> port 0xe800-0xe8ff mem 0xfebff000-0xfebfffff irq 10 at device 14.0 on pci0
aic7890/91: Wide Channel A, SCSI Id=7, 32/255 SCBs
sa0 at ahc0 bus 0 target 2 lun 0
sa0: <ARCHIVE Python 04106-XXX 7270> Removable Sequential Access SCSI-2 device
sa0: 7.812MB/s transfers (7.812MHz, offset 15)
da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST39173LW 6246> Fixed Direct Access SCSI-2 device
da0: 80.000MB/s transfers (40.000MHz, offset 15, 16bit)
da0: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
da1 at ahc0 bus 0 target 6 lun 0
da1: <IBM DDRS-39130D DC1B> Fixed Direct Access SCSI-2 device
da1: 80.000MB/s transfers (40.000MHz, offset 15, 16bit)
da1: 8715MB (17850000 512 byte sectors: 255H 63S/T 1111C)

This server has been extremely stable with 4.1-STABLE and earlier. With
4.1.1-STABLE we have had two cases of the system crashing with "page
fault while in kernel mode" - and then it hangs while trying to sync the
disks (but still responds to ping!).
The instruction pointer that is printed is 0xc0135167 (same in both
cases), which is inside ahc_action():

c0134ca8 T ahc_done
c0134f78 t ahc_action
c01358bc t ahc_get_tran_settings

Specifically, line 441 in ahc_action, from

$FreeBSD: src/sys/dev/aic7xxx/aic7xxx_freebsd.c,v 1.3.2.1 2000/09/23 00:24:03 gibbs Exp $

   436          if ((scb = ahc_get_scb(ahc)) == NULL) {
   437
   438                  ahc_lock(ahc, &s);
   439                  ahc->flags |= AHC_RESOURCE_SHORTAGE;
   440                  ahc_unlock(ahc, &s);
   441                  xpt_freeze_simq(sim, /*count*/1);
   442                  ahc_set_transaction_status(scb, CAM_REQUEUE_REQ);
   443                  xpt_done(ccb);
   444                  return;

Line 441 of "../../dev/aic7xxx/aic7xxx_freebsd.c" starts at address
0xc0135159 <ahc_action+481> and ends at 0xc0135175 <ahc_action+509>.

At the moment I'm tempted to simply revert to the 4.1-STABLE code on
this host. It looks like the differences between 4.1-STABLE and
4.1.1-STABLE are rather large - aic7xxx_freebsd.c doesn't exist in
4.1-STABLE, ahc_action is in aic7xxx.c instead:

$FreeBSD: src/sys/dev/aic7xxx/aic7xxx.c,v 1.41.2.1 2000/03/18 23:00:11 gibbs Exp $

Any suggestions before I revert to 4.1-STABLE?

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message