From owner-freebsd-fs@FreeBSD.ORG Wed Dec  8 17:30:39 2010
From: Julian Elischer <julian@freebsd.org>
Date: Wed, 08 Dec 2010 09:07:56 -0800
To: Oliver Fromme
Cc: freebsd-fs@freebsd.org, pjd@freebsd.org
Subject: Re: TRIM support for UFS?
Message-ID: <4CFFBB6C.1050400@freebsd.org>
In-Reply-To: <201012081658.oB8Gw9w3010495@lurza.secnetix.de>

Kirk does have some TRIM patches which he sent to me once.. let me
look... hmmm, ah, here it is.  This may or may not be out of date;
I'll let Kirk chime in if he thinks it's worth it.  I include the
email from him as an attachment.  Hopefully it won't get stripped
by the list, but you should both get it.

Julian

On 12/8/10 8:58 AM, Oliver Fromme wrote:
> Pawel Jakub Dawidek wrote:
> > On Tue, Dec 07, 2010 at 04:31:14PM +0100, Oliver Fromme wrote:
> > > I've bought an OCZ Vertex2 E (120 GB SSD) and installed
> > > FreeBSD i386 stable/8 on it, using UFS (UFS2, to be exact).
> > > I've made sure that the partitions are aligned properly,
> > > and used newfs with 4k fragsize and 32k blocksize.
> > > It works very well so far.
>
> (I should also mention that I mounted all filesystems from
> the SSD with the "noatime" option, to reduce writes during
> normal operation.)
>
> > > So, my question is: are there plans to add TRIM support
> > > to UFS?  Is anyone working on it?  Or is it already there
> > > and I just overlooked it?
> >
> > I hacked up this patch mostly for Kris and md(4) memory-backed UFS,
> > so that on file removal the space can be returned to the system.
>
> I see.
>
> > I think you should ask Kirk what to do about that, but I'm afraid my
> > patch can break SU - what if we TRIM, but then panic, and fsck decides
> > to actually use the block?
>
> Oh, you're right.  That could be a problem.
>
> Maybe it would be better to write a separate tool that
> performs TRIM commands on areas of the file system that
> have been unused for a while.
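(As an aside: FreeBSD disk devices already accept a BIO_DELETE from
userland through the DIOCGDELETE ioctl in sys/disk.h, so a tool like
that shouldn't need kernel changes.  Below is only a minimal sketch
of the idea; a real tool would walk the UFS cylinder-group bitmaps to
find ranges that are actually free, rather than taking an offset and
length on the command line.)

/*
 * trimrange.c -- minimal sketch, NOT a finished tool.  Assumes the
 * DIOCGDELETE ioctl (sys/disk.h), which passes a BIO_DELETE down
 * through GEOM.  A real tool would find free ranges by reading the
 * filesystem's cylinder-group bitmaps; here they come from argv.
 */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/disk.h>

#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	off_t args[2];
	int fd;

	if (argc != 4)
		errx(1, "usage: trimrange device offset length");
	if ((fd = open(argv[1], O_RDWR)) < 0)
		err(1, "open %s", argv[1]);

	args[0] = strtoll(argv[2], NULL, 0);	/* byte offset */
	args[1] = strtoll(argv[3], NULL, 0);	/* byte count */

	/* Ask the device to discard (TRIM) the given byte range. */
	if (ioctl(fd, DIOCGDELETE, args) < 0)
		err(1, "DIOCGDELETE");

	close(fd);
	return (0);
}

(Something like "./trimrange /dev/ada0 <offset> <length>", run only
on ranges known to be free, would be the intended use.)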
> I also remember that mav@ wrote that the TRIM command is
> very slow, so it's probably not feasible to execute it
> each time some blocks are freed: it would make the file
> system much slower and nullify all the advantages of the
> SSD.
>
> I just found his comment from r201139:
>
> "I have no idea whether it is normal, but for some reason it takes 200ms
> to handle any TRIM command on this drive, that was making delete extremely
> slow.  But TRIM command is able to accept long list of LBAs and the length of
> that list seems doesn't affect it's execution time.  Implemented request
> clusting algorithm allowed me to rise delete rate up to reasonable numbers,
> when many parallel DELETE requests running."
>
> > BTW. Have you actually observed any performance degradation without
> > TRIM?
>
> Not yet.  My SSD is still very new.  It carries only the
> base system (/home is on a normal 1 TB disk), so not many
> writes have happened so far.  But as soon as I start doing
> more write access (buildworld + installworld, updating
> ports and so on), I expect that performance will degrade
> over time.
>
> I've also heard from several people on various mailing
> lists that the performance of their SSD drives got worse
> after some time.
>
> That performance degradation is caused by so-called "static
> wear leveling".  To distribute wear equally over all blocks,
> the drive has to move the contents of blocks that are never
> (or rarely) written into other blocks, so that they can be
> overwritten.  If a block is known to be unused (which is the
> case when the drive is new, or after a TRIM command), there
> are no contents to move, so the write operation is much
> faster.  I think all modern SSD drives use static wear
> leveling.
>
> Without TRIM support in the file system, one work-around is
> to "newfs -E" the file system when the performance gets too
> bad.  Of course this requires a backup-restore cycle, so it
> is somewhat annoying.
>
> Another work-around is to leave some space unused, e.g.
> don't put a file system on the last 20% of the SSD.  Since
> those 20% are never written to, the SSD's firmware knows
> they are unused and can use them for wear leveling.  This
> will postpone the performance degradation somewhat, but it
> won't avoid it completely, and wasting space is not a very
> satisfying solution either.
>
> > I've similar SSDs and from what I tested they somehow handle
> > wear leveling internally.  You can TRIM the entire disk using the
> > simple program below, newfs it and test it.
>
> It does basically the same as "newfs -E", right?
>
> > Then fill it with random data, newfs it again, test it and compare
> > the results.
>
> Filling it just once will probably not have much of an
> effect.  In fact, wear leveling will probably not kick in
> if you just fill the whole disk, because all blocks are
> used equally anyway.
>
> The performance degradation will only start to occur after
> a while (weeks or months), when some blocks have been
> written much more often than others.  In this situation,
> (static) wear leveling will kick in and start moving data
> around in order to re-use seldom-written blocks.
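(To make the clustering idea in that commit message concrete: collect
pending delete ranges, sort them by LBA, and merge adjacent or
overlapping ones so a single 200ms TRIM covers as many as possible.
The sketch below uses hypothetical types and names, not the actual
driver code.)

/*
 * Rough sketch of TRIM request clustering, in the spirit of mav@'s
 * r201139 note.  Hypothetical types and names; the real logic lives
 * in the ATA driver.  "r" must already be sorted by offset.
 */
#include <stddef.h>
#include <stdint.h>

struct trim_range {
	uint64_t	offset;		/* first LBA */
	uint64_t	length;		/* number of LBAs */
};

/* Merge adjacent/overlapping ranges in place; returns the new count. */
static size_t
coalesce_trims(struct trim_range *r, size_t n)
{
	size_t i, out;
	uint64_t end;

	if (n == 0)
		return (0);
	for (out = 0, i = 1; i < n; i++) {
		end = r[i].offset + r[i].length;
		if (r[i].offset <= r[out].offset + r[out].length) {
			/* Touches or overlaps the current run: extend it. */
			if (end > r[out].offset + r[out].length)
				r[out].length = end - r[out].offset;
		} else
			r[++out] = r[i];	/* start a new run */
	}
	return (out + 1);
}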
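(Pawel's program itself was stripped somewhere in the quoting above,
and the sketch below is not his code; purely as a stand-in, a
whole-disk TRIM can be done with the same DIOCGDELETE ioctl as in the
earlier range sketch, after asking the device for its size with
DIOCGMEDIASIZE.)

/*
 * Stand-in for the stripped program: TRIM an entire device before
 * newfs.  Not Pawel's original code; assumes the DIOCGMEDIASIZE and
 * DIOCGDELETE ioctls from sys/disk.h.  Destroys all data, obviously.
 */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/disk.h>

#include <err.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	off_t args[2], mediasize;
	int fd;

	if (argc != 2)
		errx(1, "usage: trimdisk device");
	if ((fd = open(argv[1], O_RDWR)) < 0)
		err(1, "open %s", argv[1]);

	/* How big is the device? */
	if (ioctl(fd, DIOCGMEDIASIZE, &mediasize) < 0)
		err(1, "DIOCGMEDIASIZE");

	/* Discard the whole thing in one request. */
	args[0] = 0;
	args[1] = mediasize;
	if (ioctl(fd, DIOCGDELETE, args) < 0)
		err(1, "DIOCGDELETE");

	close(fd);
	return (0);
}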
>
> Best regards
>    Oliver

-------- Attached Message --------

From: Kirk McKusick <mckusick@chez.mckusick.com>
To: Julian Elischer
Cc: Scott Long
Date: Tue, 03 Nov 2009 20:48:05 -0800
Subject: Re: UFS2 and TRIM command
Message-Id: <200911040448.nA44m6eE095082@chez.mckusick.com>
In-Reply-To: <4AF0A8A2.4090305@elischer.org>

> Date: Tue, 03 Nov 2009 14:03:14 -0800
> From: Julian Elischer
> To: Kirk McKusick
> CC: Scott Long
> Subject: UFS2 and TRIM command
>
> Kirk,
>
> You mentioned at BSDCan that you had done some work with the 'trim'
> command to tell a drive that a filesystem is no longer interested
> in a particular block.
>
> Can you let us know what the state of that work was, and whether
> you have done anything more on it?
>
> Thanks,
>
> Julian

Enclosed is my proposed patch, which I sent to Poul-Henning Kamp.  He
was working with flash media at the time and wanted notification when
blocks were released, so that he could pre-clear them and they could
be used directly when next allocated.  I never heard back from him
and consequently never followed up on it.

	~Kirk

=-=-=-=

From: Kirk McKusick
Date: Wed, 21 May 2008 13:19:18 -0700
To: Poul-Henning Kamp
Subject: UFS and BIO_DELETE

I enclose below my proposed patch to add BIO_DELETE to UFS (it goes
at the end of ffs_blkfree).  As I have no way to test it, I am
wondering if you could let me know whether it works.  Also, I am
thinking of only enabling it for filesystems mounted with a new flag
requesting the behavior, since geteblk() is a rather expensive call
for the usual no-op case.

I did look at just allocating a `struct bio' as a local variable and
using that, but it looked like I would also need to come up with a
`producer' and/or `consumer' if I wanted to pass it to GEOM directly,
so in the end I went with this more expensive solution.  If there is
an easy way to just pass a bio structure to GEOM, I would much prefer
that approach.

	~Kirk

*** ffs_alloc.c	Wed May 21 20:11:04 2008
--- ffs_alloc.c.new	Wed May 21 20:10:50 2008
***************
*** 1945,1950 ****
--- 1945,1962 ----
  	ACTIVECLEAR(fs, cg);
  	UFS_UNLOCK(ump);
  	bdwrite(bp);
+ 	/*
+ 	 * Request that the block be cleared.
+ 	 */
+ 	bp = geteblk(size);
+ 	bp->b_iocmd = BIO_DELETE;
+ 	bp->b_vp = devvp;
+ 	bp->b_blkno = fsbtodb(fs, bno);
+ 	bp->b_offset = dbtob(bp->b_blkno);
+ 	bp->b_iooffset = bp->b_offset;
+ 	bp->b_bcount = size;
+ 	BUF_KERNPROC(bp);
+ 	BO_STRATEGY(&devvp->v_bufobj, bp);
  }
  
  #ifdef INVARIANTS
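(On the closing question -- passing a bio to GEOM without geteblk() --
one possible shape is sketched below.  It assumes, without the mail
above confirming it, that for a mounted device vnode the bufobj's
private pointer holds the g_consumer the mount opened; the function
names are the sketch's own, and includes beyond these follow
ffs_alloc.c.)

/*
 * Sketch only: hand a BIO_DELETE straight to GEOM instead of paying
 * for geteblk().  Assumes devvp->v_bufobj.bo_private holds the
 * g_consumer opened by the mount -- verify before relying on it.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bio.h>
#include <sys/vnode.h>
#include <geom/geom.h>
#include <ufs/ufs/dinode.h>
#include <ufs/ffs/fs.h>

static void
blkfree_trim_done(struct bio *bip)
{
	/* BIO_DELETE is advisory; nothing to do but free the bio. */
	g_destroy_bio(bip);
}

static void
blkfree_trim(struct fs *fs, struct vnode *devvp, ufs2_daddr_t bno,
    long size)
{
	struct bio *bip;

	bip = g_alloc_bio();
	bip->bio_cmd = BIO_DELETE;
	bip->bio_offset = dbtob(fsbtodb(fs, bno));
	bip->bio_length = size;
	bip->bio_done = blkfree_trim_done;
	g_io_request(bip, (struct g_consumer *)devvp->v_bufobj.bo_private);
}

(If the underlying provider doesn't support BIO_DELETE, the request
should presumably complete with an error such as EOPNOTSUPP, which
the done routine can simply ignore.)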