Date:      Tue, 30 Aug 2011 14:48:32 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        David Magda <dmagda@ee.ryerson.ca>
Cc:        freebsd-stable@freebsd.org, dart@es.net
Subject:   Re: Unable to shutdown
Message-ID:  <20110830214832.GA87354@icarus.home.lan>
In-Reply-To: <f0ffdf9eccf14f42ee24f0982bb0fc4b.squirrel@webmail.ee.ryerson.ca>
References:  <CAN6yY1s3x1ojxh-Dx9Ht=L8M4frohLXcMLNgz%2BzgtBCDodBdsg@mail.gmail.com> <uh78vqd9u8e.fsf@P142.sics.se> <4E5BF15F.9070601@es.net> <CAN6yY1u6ZshVZT2DwaQ2Et7Y1JvNA8q%2BFj5os4SmK4=7=Z77vg@mail.gmail.com> <f0ffdf9eccf14f42ee24f0982bb0fc4b.squirrel@webmail.ee.ryerson.ca>

On Tue, Aug 30, 2011 at 01:29:02PM -0400, David Magda wrote:
> On Tue, August 30, 2011 11:50, Kevin Oberman wrote:
> [...]
> > The more I look at this, the more it seems to me that it is an issue
> > with the Seagate drive and not a FreeBSD issue. Probably a bug that is
> > never triggered on Windows, so is largely unnoticed. I suspect Windows
> > probably orders the commands in a subtly different order.
> [...]
> 
> Or not the drive per se, but the USB-to-IDE/SATA chipset.
> 
> A while back on the OpenSolaris zfs-discuss list there was an issue where
> USB drives would have corrupt ZFS pools if a drive was yanked without a
> 'zpool export' being run. Even though ZFS is supposed to always be
> consistent on-disk (because it's transactional), this wasn't happening.
> 
> It turned out that the chipset had a list of particular SATA commands that it
> allowed through to the drive, and all others were simply answered with
> "OK", regardless of what actual actions needed to be taken. One of the
> SATA commands that was NOT whitelisted was the 'cache flush'
> command--which ZFS needs to make sure that its data structures were
> written in the proper order.
> 
> Turns out the drive and its firmware were fine and doing things properly,
> it's just that the necessary commands weren't getting to it because of the
> USB adaptor's chipset.

I don't think that advice is applicable in this situation.  Here's why:

Kevin's original description indicates that if the drive (or the
enclosure's translation ASIC, for that matter) is in standby when the
system is shut down, the drive/ASIC never spins back up on I/O (the
flushing of all I/O buffers to disk).

If he issues "ls" commands or similar userland-induced I/O to the drive
prior to shutting the system down, the drive/ASIC spins up normally.

Here's Kevin's original quote:

>> The drive is "green" and spins down when idle.  If an attempt is made
>> to shutdown the system while the drive is spun down, the system goes
>> through the usual shutdown including flushing all buffers out to disk,
>> but when the final disk access is made to mark the file systems as
>> clean, the drive never spins up and the system hangs until it is
>> powered down.  I've found no way to avoid this other than to remember
>> to access the disk and cause it to spin up before shutting down.
>>
>> If I attempt to unmount the file systems when the drive is shut down,
>> the same thing happens, but I can recover as the second file system
>> is still mounted and an ls(1) to that file system will cause the disk
>> to spin up and everything is fine.

So the question is what's "unique" about flushing all I/O buffers to
disk during shutdown compared to issuing standard I/O in userland.  I
can speculate all day as to what the cause is, but it's highly unlikely
that the USB-to-SATA controller ASIC is causing the problem.

Furthermore, Windows doesn't have "special disk/enclosure drivers" for
such drives, so there's nothing "unique" Windows would be sending across
the wire, ATA-protocol-wise, that would explain why Windows works and
FreeBSD doesn't.  At least that's my opinion.

With ATA/SATA, the FLUSH CACHE (0xe7) and FLUSH CACHE EXT (0xea, for
48-bit LBAs) commands are separate from WRITE DMA (0xca) and WRITE DMA
EXT (0x35, for 48-bit LBAs).  Neither FLUSH CACHE command takes an LBA
as input.  To "flush buffers to disk" I imagine what the kernel should
be doing is issuing WRITE commands followed by FLUSH CACHE.  The WRITEs
should be "waking" the drive up.
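
To make the command ordering above concrete, here is a minimal sketch in
C.  It is not FreeBSD kernel code: the helper names
(ata_cmd_takes_lba, plan_buffer_flush) and the use_48bit flag are
hypothetical and exist only for illustration.  Only the four opcodes
come from the paragraph above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ATA command opcodes named in the paragraph above. */
enum ata_opcode {
    ATA_WRITE_DMA       = 0xca, /* write, 28-bit LBA */
    ATA_WRITE_DMA_EXT   = 0x35, /* write, 48-bit LBA */
    ATA_FLUSH_CACHE     = 0xe7, /* flush, no LBA argument */
    ATA_FLUSH_CACHE_EXT = 0xea  /* flush, 48-bit variant, no LBA argument */
};

/* Hypothetical helper: true if the command carries a target LBA. */
static bool ata_cmd_takes_lba(enum ata_opcode op)
{
    return op == ATA_WRITE_DMA || op == ATA_WRITE_DMA_EXT;
}

/*
 * Hypothetical helper: record the speculated "flush buffers to disk"
 * sequence, i.e. one WRITE per dirty buffer followed by one FLUSH CACHE.
 * use_48bit is an assumed flag meaning "use the EXT command variants".
 * Returns the number of opcodes stored in out.
 */
static size_t plan_buffer_flush(size_t n_writes, bool use_48bit,
                                enum ata_opcode *out, size_t out_len)
{
    size_t i;

    assert(out != NULL && out_len >= n_writes + 1);
    for (i = 0; i < n_writes; i++)
        out[i] = use_48bit ? ATA_WRITE_DMA_EXT : ATA_WRITE_DMA;
    /* The flush is a separate command issued after the writes. */
    out[n_writes] = use_48bit ? ATA_FLUSH_CACHE_EXT : ATA_FLUSH_CACHE;
    return n_writes + 1;
}
```

In this sketch the WRITEs come first, so they are the commands that
would have to bring a spun-down drive back up; the FLUSH CACHE only
follows them.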

But wait, there's more.

I want to point out to people that "sleep" and "standby" are two very
different things (they're separate ATA commands too).  So if you're
using "camcontrol sleep" you probably should be using "camcontrol
standby".  The man page is quite clear about the repercussions of the
former (and in that case I can imagine I/O to the drive failing or
simply timing out, given that, to my knowledge, a bus reset is not
performed during shutdown).
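
As a rough illustration of that difference, the sketch below lists the
ATA power-management opcodes involved.  The table layout, the
wakes_on_io field and the find_power_cmd helper are hypothetical.  The
wake behaviour reflects my reading of the ATA command set (a drive in
standby spins up on a media access, a drive in sleep needs a reset), so
treat it as an assumption rather than a statement about any particular
drive or enclosure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One ATA power-management command (illustrative layout). */
struct ata_power_cmd {
    const char   *name;
    unsigned char opcode;
    bool          wakes_on_io; /* assumed: ordinary I/O spins the drive up */
};

static const struct ata_power_cmd ata_power_cmds[] = {
    { "STANDBY IMMEDIATE", 0xe0, true  },
    { "STANDBY",           0xe2, true  },
    { "SLEEP",             0xe6, false }, /* assumed: needs a reset to wake */
};

/* Hypothetical helper: look a command up by name; NULL if not listed. */
static const struct ata_power_cmd *find_power_cmd(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(ata_power_cmds) / sizeof(ata_power_cmds[0]); i++) {
        if (strcmp(ata_power_cmds[i].name, name) == 0)
            return &ata_power_cmds[i];
    }
    return NULL;
}
```

Under those assumptions, shutdown-time I/O to a drive in standby should
wake it, while the same I/O to a drive in sleep would be expected to
fail or time out unless something resets the bus first.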

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



