From: "Valeri Galtsev"
To: "Jason Cox"
Cc: "Chris Stankevitz", "freebsd-questions"
Reply-To: galtsev@kicp.uchicago.edu
Date: Thu, 21 May 2015 11:31:52 -0500 (CDT)
Subject: Re: SLOG and SSDs: are "super" capacitors really needed?
Message-ID: <48018.128.135.70.2.1432225912.squirrel@cosmo.uchicago.edu>

On Thu, May 21, 2015 10:57 am, Jason Cox wrote:
> I know I might be off here, but I think it goes like this...
>
> Write transactions are done in memory, the ZIL/SLOG is just the backup
> copy. So a write goes into memory, then written to the pool. The purpose
> of the ZIL is if a transaction does not complete to disk and the power is
> yanked from the machine, it can replay that transaction from the ZIL. The
> reason having a fast SLOG device speeds up sync writes is that the SLOG is
> better able to keep in sync with RAM vs. when the ZIL is on the slower
> spinning disk. So ZFS will not acknowledge the write till the data is in
> RAM and secure in the ZIL.
>
> So it is very important to have a SLOG device with a super cap for when
> something like the power goes out and the SSD has not completed the write
> from its cache to nv storage, you just lost that transaction.

To rephrase that (I'm just stealing somebody else's phrase...): all
devices that have a RAM cache lie about "transaction complete". They
report it NOT when the transaction has been safely moved from volatile
cache into non-volatile storage, but as soon as the device has accepted
all the data (most of which may still be in volatile memory). This makes
their specs look good compared to the competition, but technically it is
bad: I for one would prefer to see "transaction complete" (or similar)
only when all data are already safely stored in the non-volatile part of
the storage device. But life is bad: there is no way of knowing it (as
far as I know) ;-(

Valeri

PS Make use of UPSes!

>
> On Thu, May 21, 2015 at 8:38 AM, Chris Stankevitz wrote:
>
>> When sync data is written to the ZIL (or in my case to a SLOG), ZFS
>> waits for the write to be "completed" before continuing. Once the
>> write has "completed", the sync data is considered written, even if it
>> has not yet made it to the real storage devices. Written data has
>> "completed" when the ZIL device (SLOG) reports that the data has been
>> written.
>>
>> Question: do SSD drives report the write has "completed" only after
>> the data has been burned into non-volatile storage? If so, then why
>> do people say a good SLOG SSD has "super capacitors" that allow the
>> drive to continue functioning for a short time after a power failure?
>> It seems to me that there are two scenarios, none of which need super
>> capacitors:
>>
>> 1. A transaction is completely written to the SLOG, but not the
>> storage devices, and the power goes out. No problem, data will write
>> to storage when the pool is imported.
>>
>> 2. A transaction is partially written to the SLOG, but not the storage
>> devices, and the power goes out. No problem, the transaction will be
>> lost and the pool will be imported with the previously committed
>> data/transaction.
>>
>> I don't see a scenario where a power-outage causes a "corrupted
>> transaction" to be posted.
>>
>> Now if an SSD reports data "written" before it makes it to
>> non-volatile storage, then that is another story... but I cannot
>> imagine a HDD manufacturer advertising data written that is not
>> actually written (or guaranteed to be written even in the face of a
>> power outage).
>>
>> Thank you,
>>
>> Chris
>> _______________________________________________
>> freebsd-questions@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
>> To unsubscribe, send any mail to "
>> freebsd-questions-unsubscribe@freebsd.org"
>>
>
> --
> Jason Cox
>

++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++
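
For anyone who wants to play with the argument in this thread, here is a
toy Python sketch of the three cases discussed: a log device that only
acknowledges after persisting, one that acknowledges from volatile cache,
and one that acknowledges from cache but has capacitors to drain it on
power loss. Everything here (ToyLogDevice, lost_after_power_cut, the
field names) is made up for illustration and is not a ZFS or driver API.
It also assumes the worst case, where power is cut before the device has
flushed anything in the background.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLogDevice:
    """Hypothetical model of a SLOG device; not a real driver or ZFS API."""
    acks_from_cache: bool  # True: reports "complete" while data is still in volatile cache
    has_supercap: bool     # True: can drain the cache after power is cut
    cache: list = field(default_factory=list)  # volatile RAM cache
    flash: list = field(default_factory=list)  # non-volatile storage

    def sync_write(self, txn: str) -> bool:
        """Accept a transaction; return True once the device reports completion."""
        self.cache.append(txn)
        if not self.acks_from_cache:
            self._drain()  # persist first, acknowledge afterwards
        return True

    def _drain(self) -> None:
        """Move everything from volatile cache to non-volatile storage."""
        self.flash.extend(self.cache)
        self.cache.clear()

    def power_loss(self) -> None:
        """Cut power: capacitors (if any) drain the cache, the rest is gone."""
        if self.has_supercap:
            self._drain()
        self.cache.clear()


def lost_after_power_cut(dev: ToyLogDevice, txns: list) -> list:
    """Return transactions that were acknowledged but cannot be replayed."""
    acked = [t for t in txns if dev.sync_write(t)]
    dev.power_loss()
    return [t for t in acked if t not in dev.flash]


if __name__ == "__main__":
    txns = ["txn-1", "txn-2", "txn-3"]
    for acks_from_cache in (False, True):
        for has_supercap in (False, True):
            dev = ToyLogDevice(acks_from_cache, has_supercap)
            lost = lost_after_power_cut(dev, txns)
            print(f"acks_from_cache={acks_from_cache} "
                  f"has_supercap={has_supercap} lost={lost}")
```

In this model the only combination that loses acknowledged transactions
is "acknowledges from cache, no capacitors", which is the case Jason and
I are worried about. Chris's two scenarios correspond to a device that
never acknowledges from cache.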