Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 28 Oct 2011 09:26:03 +1100
From:      Peter Jeremy <peterjeremy@acm.org>
To:        freebsd-scsi@freebsd.org
Subject:   Watchdog timeouts on isp(4)
Message-ID:  <20111027222603.GA39946@server.vk2pj.dyndns.org>

next in thread | raw e-mail | index | archive | help

--7AUc2qLy4jB3hD7Z
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

I have a 16-CPU SunFire V890 running FreeBSD-current/sparc64 r226167.
Whilst doing some stress testing (6 parallel -j16 buildworlds with
regular "sysctl vm.vmtotal"), I have started seeing isp watchdog
timeouts.  I didn't see this with similar testing on about r223035.
Does anyone have any suggestions?

The messages look like:

(da4:isp0:0:4:0): first watchdog (handle 0xb4d820d9) timed out- deferring f=
or grace period
(da4:isp0:0:4:0): first watchdog (handle 0xb4d920ba) timed out- deferring f=
or grace period
(da11:isp1:0:5:0): first watchdog (handle 0x5f0620c9) timed out- deferring =
for grace period
(da11:isp1:0:5:0): first watchdog (handle 0x5f07203a) timed out- deferring =
for grace period
(da11:isp1:0:5:0): first watchdog (handle 0x5f08208c) timed out- deferring =
for grace period
=2E..
(da11:isp1:0:5:0): first watchdog (handle 0x5f762088) timed out- deferring =
for grace period
isp1: isp_watchdog: timeout for handle 0x5f0620c9
(da11:isp1:0:5:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0x4e 0x66 0x20 =
0x00 0x00 0x20 0x00  STS 0x0 XS_ERR=3D0xb
isp1: isp_watchdog: timeout for handle 0x5f07203a
(da11:isp1:0:5:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0x4e 0x66 0x40 =
0x00 0x00 0x20 0x00  STS 0x0 XS_ERR=3D0xb
isp1: isp_watchdog: timeout for handle 0x5f08208c
(da11:isp1:0:5:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0x4e 0x66 0x60 =
0x00 0x00 0x20 0x00  STS 0x0 XS_ERR=3D0xb
isp1: isp_watchdog: timeout for handle 0x5f09201f
(da11:isp1:0:5:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0x4e 0x66 0x80 =
0x00 0x00 0x20 0x00  STS 0x0 XS_ERR=3D0xb
isp1: bad request handle 0x5f0620c9 (iocb type 0x3)
isp1: bad request handle 0x5f07203a (iocb type 0x3)
=2E..
(da4:isp0:0:4:0): first watchdog (handle 0x5cf1206d) timed out- deferring f=
or grace period
(da4:isp0:0:4:0): first watchdog (handle 0x5cf2203a) timed out- deferring f=
or grace period
isp0: isp_watchdog: timeout for handle 0x5cad2046
(da4:isp0:0:4:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0xdd 0xe8 0xe0 0=
x00 0x00 0x20 0x00  STS 0x0 XS_ERR=3D0xb
isp0: bad request handle 0x5cad2046 (iocb type 0x3)
isp0: isp_watchdog: timeout for handle 0x5cdb20cb
(da4:isp0:0:4:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0xe3 0xa8 0x00 0=
x00 0x00 0x20 0x00  STS 0x0 XS_ERR=3D0xb
isp0: isp_watchdog: timeout for handle 0x5cdc2059
(da4:isp0:0:4:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0xe3 0xa8 0x20 0=
x00 0x00 0x20 0x00  STS 0x0 XS_ERR=3D0xb
isp0: isp_watchdog: timeout for handle 0x5cdd2020
(da4:isp0:0:4:0): FIN dl16384 resid 0 CDB=3D0x2a 0x00 0x0f 0xe3 0xa8 0x40 0=
x00 0x00 0x20 0x00  STS 0x0 XS_ERR=3D0xb
isp0: bad request handle 0x5cdb20cb (iocb type 0x3)
isp0: bad request handle 0x5cdc2059 (iocb type 0x3)
isp0: bad request handle 0x5cdd2020 (iocb type 0x3)

The da11 errors are unexpected because there should be no activity
on filesystems on that disk.

$ df -k
Filesystem 1024-blocks     Used    Avail Capacity  Mounted on
/dev/da4a      9850318  3495908  5566386    39%    /
devfs                1        1        0   100%    /dev
/dev/da4e       981998    17936   885504     2%    /tmp
/dev/da4d    127993860 60068290 57686062    51%    /var
/dev/da11a     9850318  1404716  7657578    16%    /8
/dev/da11e      981998     7870   895570     1%    /8/tmp
/dev/da11d   127993860 22228452 95525900    19%    /8/var
/dev/md0      28784540  1482784 24998996     6%    /a
$ swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/da0b        67110720        0 67110720     0%
/dev/da6b        67110720        0 67110720     0%
Total           134221440        0 134221440     0%

The isp's and associated disks are:

isp0: <Qlogic ISP 2200 PCI FC-AL Adapter> port 0x300-0x3ff mem 0x400000-0x4=
00fff at device 2.0 on pci1
isp0: invalid NVRAM header
isp0: invalid NVRAM header
isp0: bad frame length (0) from NVRAM- using 1024
isp0: bad execution throttle of 0- using 16
=2E..
isp1: <Qlogic ISP 2200 PCI FC-AL Adapter> port 0x1000-0x10ff mem 0x100000-0=
x100fff at device 4.0 on pci4
isp2: <Qlogic ISP 2200 PCI FC-AL Adapter> port 0x1100-0x11ff mem 0x102000-0=
x102fff at device 5.0 on pci4
=2E..
(probe6:isp0:0:6:0): TEST UNIT READY. CDB: 0 0 0 0 0 0=20
(probe6:isp0:0:6:0): CAM status: SCSI Status Error
(probe6:isp0:0:6:0): SCSI status: Check Condition
(probe6:isp0:0:6:0): SCSI sense: UNIT ATTENTION asc:29,1 (Power on occurred)
(probe518:isp1:0:6:0): TEST UNIT READY. CDB: 0 0 0 0 0 0=20
(probe518:isp1:0:6:0): CAM status: SCSI Status Error
(probe518:isp1:0:6:0): SCSI status: Check Condition
(probe518:isp1:0:6:0): SCSI sense: UNIT ATTENTION asc:29,1 (Power on occurr=
ed)
uhub0: 4 ports with 4 removable, self powered
ses0 at isp0 bus 0 scbus1 target 6 lun 0
ses0: <SUNW SUNWGS INT FCBPL 922A> Fixed Enclosure Services SCSI-3 device=
=20
ses0: 100.000MB/s transfers WWNN 0x508002000025a7c0 WWPN 0x508002000025a7c1=
 PortID 0xdc
ses0: SCSI-3 SES Device
ses1 at isp1 bus 0 scbus2 target 6 lun 0
ses1: <SUNW SUNWGS INT FCBPL 922A> Fixed Enclosure Services SCSI-3 device=
=20
ses1: 100.000MB/s transfers WWNN 0x508002000065a568 WWPN 0x508002000065a569=
 PortID 0xdc
ses1: SCSI-3 SES Device
da0 at isp0 bus 0 scbus1 target 0 lun 0
da0: <HITACHI HUS1014FASUN146G 2A08> Fixed Direct Access SCSI-3 device=20
da0: 100.000MB/s transfers WWNN 0x20000000875017a5 WWPN 0x21000000875017a5 =
PortID 0xef
da0: Command Queueing enabled
da0: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da1 at isp0 bus 0 scbus1 target 1 lun 0
da1: <HITACHI HUS1014FASUN146G 2A08> Fixed Direct Access SCSI-3 device=20
da1: 100.000MB/s transfers WWNN 0x20000000875014ba WWPN 0x21000000875014ba =
PortID 0xe8
da1: Command Queueing enabled
da1: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da2 at isp0 bus 0 scbus1 target 2 lun 0
da2: <HITACHI HUS1014FASUN146G 2A08> Fixed Direct Access SCSI-3 device=20
da2: 100.000MB/s transfers WWNN 0x2000000087501828 WWPN 0x2100000087501828 =
PortID 0xe4
da2: Command Queueing enabled
da2: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da3 at isp0 bus 0 scbus1 target 3 lun 0
da3: <HITACHI HUS1014FASUN146G 2A08> Fixed Direct Access SCSI-3 device=20
da3: 100.000MB/s transfers WWNN 0x200000008796f785 WWPN 0x210000008796f785 =
PortID 0xe2
da3: Command Queueing enabled
da3: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da4 at isp0 bus 0 scbus1 target 4 lun 0
da4: <HITACHI HUS1014FASUN146G 2A08> Fixed Direct Access SCSI-3 device=20
da4: 100.000MB/s transfers WWNN 0x2000000087501755 WWPN 0x2100000087501755 =
PortID 0xe1
da4: Command Queueing enabled
da4: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da5 at isp0 bus 0 scbus1 target 5 lun 0
da5: <HITACHI HUS1014FASUN146G 2A08> Fixed Direct Access SCSI-3 device=20
da5: 100.000MB/s transfers WWNN 0x2000000087505ebc WWPN 0x2100000087505ebc =
PortID 0xe0
da5: Command Queueing enabled
da5: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da6 at isp1 bus 0 scbus2 target 0 lun 0
da6: <FUJITSU MAW3147FCSUN146G 1303> Fixed Direct Access SCSI-4 device=20
da6: 100.000MB/s transfers WWNN 0x500000e01999c6b0 WWPN 0x500000e01999c6b1 =
PortID 0xef
da6: Command Queueing enabled
da6: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da7 at isp1 bus 0 scbus2 target 1 lun 0
da7: <FUJITSU MAW3147FCSUN146G 1303> Fixed Direct Access SCSI-4 device=20
da7: 100.000MB/s transfers WWNN 0x500000e01365b4e0 WWPN 0x500000e01365b4e1 =
PortID 0xe8
da7: Command Queueing enabled
da7: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da8 at isp1 bus 0 scbus2 target 2 lun 0
da8: <FUJITSU MAW3147FCSUN146G 1303> Fixed Direct Access SCSI-4 device=20
da8: 100.000MB/s transfers WWNN 0x500000e014422a50 WWPN 0x500000e014422a51 =
PortID 0xe4
da8: Command Queueing enabled
da8: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da9 at isp1 bus 0 scbus2 target 3 lun 0
da9: <FUJITSU MAW3147FCSUN146G 1303> Fixed Direct Access SCSI-4 device=20
da9: 100.000MB/s transfers WWNN 0x500000e01998e820 WWPN 0x500000e01998e821 =
PortID 0xe2
da9: Command Queueing enabled
da9: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da10 at isp1 bus 0 scbus2 target 4 lun 0
da10: <FUJITSU MAW3147FCSUN146G 1303> Fixed Direct Access SCSI-4 device=20
da10: 100.000MB/s transfers WWNN 0x500000e014397bb0 WWPN 0x500000e014397bb1=
 PortID 0xe1
da10: Command Queueing enabled
da10: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)
da11 at isp1 bus 0 scbus2 target 5 lun 0
da11: <FUJITSU MAW3147FCSUN146G 1303> Fixed Direct Access SCSI-4 device=20
da11: 100.000MB/s transfers WWNN 0x500000e0143f8ee0 WWPN 0x500000e0143f8ee1=
 PortID 0xe0
da11: Command Queueing enabled
da11: 140009MB (286739329 512 byte sectors: 255H 63S/T 17848C)

pciconf reports:
isp0@pci1:0:2:0:        class=3D0x010000 card=3D0x00000000 chip=3D0x2200107=
7 rev=3D0x05 hdr=3D0x00
    vendor     =3D 'QLogic Corp.'
    device     =3D 'QLA2200 64-bit Fibre Channel Adapter'
    class      =3D mass storage
    subclass   =3D SCSI
    bar   [10] =3D type I/O Port, range 32, base 0x300, size 256, enabled
    bar   [14] =3D type Memory, range 32, base 0x400000, size 4096, enabled
    cap 01[44] =3D powerspec 1  supports D0 D3  current D0
isp1@pci3:1:4:0:        class=3D0x010000 card=3D0x40831077 chip=3D0x2200107=
7 rev=3D0x05 hdr=3D0x00
    vendor     =3D 'QLogic Corp.'
    device     =3D 'QLA2200 64-bit Fibre Channel Adapter'
    class      =3D mass storage
    subclass   =3D SCSI
    bar   [10] =3D type I/O Port, range 32, base 0x1000, size 256, enabled
    bar   [14] =3D type Memory, range 32, base 0x100000, size 4096, enabled
    cap 01[44] =3D powerspec 1  supports D0 D3  current D0
isp2@pci3:1:5:0:        class=3D0x010000 card=3D0x40831077 chip=3D0x2200107=
7 rev=3D0x05 hdr=3D0x00
    vendor     =3D 'QLogic Corp.'
    device     =3D 'QLA2200 64-bit Fibre Channel Adapter'
    class      =3D mass storage
    subclass   =3D SCSI
    bar   [10] =3D type I/O Port, range 32, base 0x1100, size 256, enabled
    bar   [14] =3D type Memory, range 32, base 0x102000, size 4096, enabled
    cap 01[44] =3D powerspec 1  supports D0 D3  current D0

--=20
Peter Jeremy

--7AUc2qLy4jB3hD7Z
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (FreeBSD)

iEYEARECAAYFAk6p2nsACgkQ/opHv/APuIfIEACfWA9wU0okJROhk5ij7SJ3wgXE
G00AoImKh2MUQ+6VEB2Nx7IvaDpKBRgY
=rlxj
-----END PGP SIGNATURE-----

--7AUc2qLy4jB3hD7Z--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20111027222603.GA39946>