From: Andrew Reilly
Date: Sat, 24 Oct 2009 13:22:38 +1100
To: freebsd-stable@freebsd.org, freebsd-scsi@freebsd.org
Subject: Some questions about da0 on USB2 (recent bad behaviour)
Message-ID: <20091024022238.GA9296@duncan.reilly.home>

Hi there,

I have a system with a couple of Western Digital "MyBook" USB2 drives connected to it,
and have started seeing some odd behaviour that I am not sure how to identify the cause of. Perhaps someone could offer a suggestion or two?

The behaviour that I've noticed (and I can't remember any particular event precipitating this, but I do track 8-STABLE approximately weekly, so things do change from time to time...) is that the drive will just stop for a couple of minutes, and then continue what it was doing. For a while that's all I could see: no error messages at all. The last time I booted, I turned on verbose booting and now I see that these periods of inactivity result in streams of syslog messages like:

Oct 24 12:48:55 duncan kernel: (da0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
Oct 24 12:48:55 duncan kernel: (da0:umass-sim0:0:0:0): Retrying Command
Oct 24 12:50:24 duncan kernel: (da0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
Oct 24 12:50:24 duncan kernel: (da0:umass-sim0:0:0:0): Retrying Command

The retry seems to be successful, because I'm not getting any hard error messages anywhere, and the disk activity does seem to proceed afterwards. The disk drive isn't making any bad/broken noises, either.

That drive is, according to dmesg.boot:

ugen1.2: at usbus1
umass0: on usbus1
umass0: SCSI over Bulk-Only; quirks = 0x0000
Root mount waiting for: usbus1
umass0:1:0:-1: Attached to scbus1
(probe0:umass-sim0:0:0:0): Down reving Protocol Version from 2 to 0?
pass0 at umass-sim0 bus 0 target 0 lun 0
pass0: Fixed Direct Access SCSI-0 device
pass0: 40.000MB/s transfers
GEOM: new disk da0
da0 at umass-sim0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-0 device
da0: 40.000MB/s transfers
da0: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)

What is the likelihood that these pauses and command retries are a sign that this specific drive is in the process of dying, physically? If that were the case, are there any diagnostic tools that I could run against it to show, say, internal error logs?

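One way to measure how far apart these retries are is to pull the timestamps out of the syslog lines. The following is a minimal Python sketch, assuming the messages have exactly the format quoted above. The function names and the default year are illustrative assumptions (syslog timestamps carry no year), and this is not an existing FreeBSD tool.

```python
import re
from datetime import datetime

# Matches lines such as:
# Oct 24 12:48:55 duncan kernel: (da0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR
ERROR_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"\((?P<dev>\w+):[^)]*\): Request completed with CAM_REQ_CMP_ERR"
)


def error_times(lines, device="da0", year=2009):
    """Return the timestamps of CAM_REQ_CMP_ERR reports for one device."""
    times = []
    for line in lines:
        m = ERROR_LINE.match(line)
        if m and m.group("dev") == device:
            # syslog omits the year, so the caller has to supply it (assumption).
            stamp = " ".join(m.group("stamp").split())
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def gaps_seconds(times):
    """Return the seconds elapsed between consecutive error reports."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# The four lines quoted in the message, used here as sample input.
SAMPLE = [
    "Oct 24 12:48:55 duncan kernel: (da0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR",
    "Oct 24 12:48:55 duncan kernel: (da0:umass-sim0:0:0:0): Retrying Command",
    "Oct 24 12:50:24 duncan kernel: (da0:umass-sim0:0:0:0): Request completed with CAM_REQ_CMP_ERR",
    "Oct 24 12:50:24 duncan kernel: (da0:umass-sim0:0:0:0): Retrying Command",
]

if __name__ == "__main__":
    times = error_times(SAMPLE)
    print(len(times), "errors; gaps in seconds:", gaps_seconds(times))
```

Fed a full log (for example the lines of /var/log/messages), the list of gaps would show whether the stalls come at regular intervals or cluster around particular periods of activity.
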
What is the significance of the "sim" part of the device designation umass-sim0? I've looked in all of the manual pages I can think of, but that clearly isn't enough.

usbdevs -v says "no USB controllers found", which I thought a bit unhelpful. I assume it is *supposed* to work; is there a trick? usbconfig shows my connected USB devices and hubs:

ugen0.1: at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON
ugen1.1: at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON
ugen1.2: at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON
ugen1.3: at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON
ugen0.2: at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON
ugen0.3: at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON

At this rate the dump/restore backup that I'm running to get the data off it will take a little over a day to finish (according to dump), even though systat shows the drive doing about 8MB/s while it's working, which would allow the dump to finish in about eight hours.

These modern, large drives are all very well, but they make doing any kind of system reconfiguration or backup really time consuming...

Cheers,

-- 
Andrew