From owner-freebsd-scsi@FreeBSD.ORG Tue Nov 18 11:37:14 2008 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1AF531065672 for ; Tue, 18 Nov 2008 11:37:14 +0000 (UTC) (envelope-from Andre.Albsmeier@siemens.com) Received: from thoth.sbs.de (thoth.sbs.de [192.35.17.2]) by mx1.freebsd.org (Postfix) with ESMTP id 8F26C8FC0A for ; Tue, 18 Nov 2008 11:37:13 +0000 (UTC) (envelope-from Andre.Albsmeier@siemens.com) Received: from mail3.siemens.de (localhost [127.0.0.1]) by thoth.sbs.de (8.12.11.20060308/8.12.11) with ESMTP id mAIBO38t018086 for ; Tue, 18 Nov 2008 12:24:03 +0100 Received: from curry.mchp.siemens.de (curry.mchp.siemens.de [139.25.40.130]) by mail3.siemens.de (8.12.11.20060308/8.12.11) with ESMTP id mAIBO2ik003188 for ; Tue, 18 Nov 2008 12:24:02 +0100 Received: (from localhost) by curry.mchp.siemens.de (8.14.3/8.14.3) id mAIBO2xl043490; Date: Tue, 18 Nov 2008 12:24:02 +0100 From: Andre Albsmeier To: freebsd-scsi@freebsd.org Message-ID: <20081118112402.GA78188@curry.mchp.siemens.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Echelon: X-Advice: Drop that crappy M$-Outlook, I'm tired of your viruses! User-Agent: Mutt/1.5.18 (2008-05-17) Cc: Andre.Albsmeier@siemens.com Subject: Quantum SLDT600 write problems X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Nov 2008 11:37:14 -0000 Hello, for months, I am experiencing occasionally appearing problems using a Quantum SDLT600. What we do is simple: - open() /dev/sa0 - set the blocksize to 64k using ioctl() - write() data in 64k chunks - close() Sometimes, the write() comes back with EIO. This can happen whenever it likes to -- after a few GB, hundreds of GB or never. If it happens, the kernel spits out errors (see below). Otherwise, the machine runs rock solid as a server running quotas, samba, nfsd, dhcpd, NIS, ntpd, ... The complete hardware, apart from the SDLT drive itself, has been replaced a while ago. Earlier it was a 1,4GHz Tualatin on an Asus CUBX-L board using an Adaptec 29160 controller, now it is an 3GHz E8400 on an Asus P5W board using an Adaptec 39320LPE controller. Even the cable from the controller to the drive was changed. OS has always been a recent version of FreeBSD 6.x-STABLE (now 6.4). Any ideas what is happening here? Here are the kernel errors: Nov 18 12:04:13 server kernel: ahd3: Recovery Initiated - Card was not paused Nov 18 12:04:13 server kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<< Nov 18 12:04:13 server kernel: ahd3: Dumping Card State at program address 0x32 Mode 0x0 Nov 18 12:04:13 server kernel: INTSTAT[0x0] SELOID[0x0] SELID[0x0] HS_MAILBOX[0x0] Nov 18 12:04:13 server kernel: INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11] DFFSTAT[0x33] Nov 18 12:04:13 server kernel: SCSISIGI[0x0] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1] Nov 18 12:04:13 server kernel: SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x0] SEQINTCTL[0x0] Nov 18 12:04:13 server kernel: SEQ_FLAGS[0xc0] SEQ_FLAGS2[0x0] QFREEZE_COUNT[0xe1c] Nov 18 12:04:13 server kernel: KERNEL_QFREEZE_COUNT[0xe1c] MK_MESSAGE_SCB[0xff00] Nov 18 12:04:13 server kernel: MK_MESSAGE_SCSIID[0xff] SSTAT0[0x0] SSTAT1[0x8] SSTAT2[0x0] Nov 18 12:04:13 server kernel: SSTAT3[0x0] PERRDIAG[0x0] SIMODE1[0xa4] LQISTAT0[0x0] Nov 18 12:04:13 server kernel: LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x0] Nov 18 12:04:13 server kernel: LQOSTAT2[0x0] Nov 18 12:04:13 server kernel: Nov 18 12:04:13 server kernel: SCB Count = 16 CMDS_PENDING = 1 LASTSCB 0xffff CURRSCB 0x1 NEXTSCB 0x0 Nov 18 12:04:13 server kernel: qinstart = 9444 qinfifonext = 9444 Nov 18 12:04:13 server kernel: QINFIFO: Nov 18 12:04:13 server kernel: WAITING_TID_QUEUES: Nov 18 12:04:13 server kernel: Pending list: Nov 18 12:04:13 server kernel: 1 FIFO_USE[0x0] SCB_CONTROL[0x44] SCB_SCSIID[0x7] Nov 18 12:04:13 server kernel: Total 1 Nov 18 12:04:13 server kernel: Kernel Free SCB list: 2 15 13 12 11 10 9 8 7 6 5 4 3 14 0 Nov 18 12:04:13 server kernel: Sequencer Complete DMA-inprog list: Nov 18 12:04:13 server kernel: Sequencer Complete list: Nov 18 12:04:13 server kernel: Sequencer DMA-Up and Complete list: Nov 18 12:04:13 server kernel: Sequencer On QFreeze and Complete list: Nov 18 12:04:13 server kernel: Nov 18 12:04:13 server kernel: Nov 18 12:04:13 server kernel: ahd3: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0 Nov 18 12:04:13 server kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89] Nov 18 12:04:13 server kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0] Nov 18 12:04:13 server kernel: SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0 Nov 18 12:04:13 server kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10] Nov 18 12:04:13 server kernel: Nov 18 12:04:13 server kernel: ahd3: FIFO1 Free, LONGJMP == 0x81fc, SCB 0x1 Nov 18 12:04:13 server kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x4] DFSTATUS[0x89] Nov 18 12:04:13 server kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0] Nov 18 12:04:13 server kernel: SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0 Nov 18 12:04:13 server kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10] Nov 18 12:04:13 server kernel: LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 Nov 18 12:04:13 server kernel: ahd3: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42 Nov 18 12:04:13 server kernel: ahd3: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0 Nov 18 12:04:13 server kernel: ahd3: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0 Nov 18 12:04:13 server kernel: SIMODE0[0xc] Nov 18 12:04:13 server kernel: CCSCBCTL[0x4] Nov 18 12:04:13 server kernel: ahd3: REG0 == 0x6f74, SINDEX = 0x1b8, DINDEX = 0x1ba Nov 18 12:04:13 server kernel: ahd3: SCBPTR == 0x0, SCB_NEXT == 0xff00, SCB_NEXT2 == 0x0 Nov 18 12:04:13 server kernel: CDB 0 0 0 0 0 0 Nov 18 12:04:13 server kernel: STACK: 0x23 0x0 0x0 0x0 0x0 0x0 0x0 0x0 Nov 18 12:04:13 server kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>> Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): SCB 1 - timed out Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): Queuing a BDR SCB Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): Bus Device Reset Message Sent Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): no longer in timeout, status = 24b Nov 18 12:04:13 server kernel: ahd3: Bus Device Reset on A:0. 1 SCBs aborted Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): WRITE FILEMARKS(6). CDB: 10 0 0 0 2 0 Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): CAM Status: SCSI Status Error Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): SCSI Status: Check Condition Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): UNIT ATTENTION csi:0,49,cc,1e asc:29,3 Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): Bus device reset function occurred Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): Retries Exhausted Nov 18 12:04:13 server kernel: (sa0:ahd3:0:0:0): failed to write terminating filemark(s) Thanks, -Andre