From owner-freebsd-scsi@FreeBSD.ORG Thu Jan 24 11:19:24 2013
Delivered-To: freebsd-scsi@freebsd.org
From: "Steven Hartland"
To: "Borja Marcos"
Cc: FreeBSD Filesystems
References: <492280E6-E3EE-4540-92CE-C535C8943CCF@sarenet.es>
Subject: Re: Problem adding SCSI quirks for a SSD, 4K sector and ZFS
Date: Thu, 24 Jan 2013 11:19:52 -0000
List-Id: SCSI subsystem
----- Original Message -----
From: "Borja Marcos"
Cc: "FreeBSD Filesystems"
Sent: Thursday, January 24, 2013 9:46 AM
Subject: Problem adding SCSI quirks for a SSD, 4K sector and ZFS

>
> Hello,
>
> Crossposting to FreeBSD-fs, as I am wondering if I have had a problem with ZFS and sector size detection as well.
>
> I am doing tests with an OCZ Vertex 4 connected to a SAS backplane.
>
> < OCZ-VERTEX4 1.5> at scbus6 target 22 lun 0 (pass19,da15)
>
> (The blank before "OCZ" really appears there)
>
> pass19: < OCZ-VERTEX4 1.5> Fixed Direct Access SCSI-5 device
> pass19: Serial Number OCZ-1SVG6KZ2YRMSS8E1
> pass19: 3.300MB/s transfers
>
> I am bypassing an "aac" RAID card so that the disks are directly attached to the da driver, instead of relying on the
> so-called JBOD feature.
>
> I have had a weird problem, with the disk being unresponsive to the READ CAPACITY(16) command. Weirdly, it seems to
> time out.
>
> So, just to complete the tests, I have added a quirk to scsi_da.c. Anyway, I also need the disk to be recognized as a
> 4K sector drive.
>
> I created a new quirk, called it DA_Q_NO_RC16, and added an entry to the quirk table, so that these drives are
> recognized as 4K drives and the driver doesn't try to send a RC(16) command.
>
> diff scsi_da.c.orig scsi_da.c
> 93c93,94
> <       DA_Q_4K                 = 0x08
> ---
> >       DA_Q_4K                 = 0x08,
> >       DA_Q_NO_RC16            = 0x10
> 811a813,817
> >               /* OCZ Vertex 4 firmware 1.5 */
> >               { T_DIRECT, SIP_MEDIA_FIXED, "", "OCZ-VERTEX4", "*" },
> >               /*quirks*/DA_Q_NO_RC16 | DA_Q_4K
> >       },
> >       {
> 1635,1636c1641,1646
> <       /* Predict whether device may support READ CAPACITY(16). */
> <       if (SID_ANSI_REV(&cgd->inq_data) >= SCSI_REV_SPC3) {
> ---
> >       /*
> >        * Predict whether device may support READ CAPACITY(16).
> >        * BUT Some disks don't support RC(16) even though they should.
> >        */
> >       if ((SID_ANSI_REV(&cgd->inq_data) >= SCSI_REV_SPC3)
> >               && !(softc->quirks & DA_Q_NO_RC16) ) {
>
> I think it's working. I haven't seen any more RC(16) errors, and the disk is working fine. Anyway, I am not sure I've
> done it right. After adding the 4K quirk and rebooting, GEOM_PART complained that the partitions weren't aligned to 4K:
>
> /var/log/messages.0:Jan 23 16:01:30 kernel: GEOM_PART: partition 1 is not aligned on 4096 bytes
> /var/log/messages.0:Jan 23 16:01:30 kernel: GEOM_PART: partition 2 is not aligned on 4096 bytes
>
> So it seems it works. However, when using the disk for ZFS, it still detects a 512 byte sector size, which is odd.
>
> Jan 23 16:01:30 rasputin kernel: GEOM: new disk da15
> Jan 23 16:01:30 rasputin kernel: da15 at aacp0 bus 0 scbus6 target 22 lun 0
> Jan 23 16:01:30 rasputin kernel: da15: < OCZ-VERTEX4 1.5> Fixed Direct Access SCSI-5 device
> Jan 23 16:01:30 rasputin kernel: da15: Serial Number OCZ-1SVG6KZ2YRMSS8E1
> Jan 23 16:01:30 rasputin kernel: da15: 3.300MB/s transfers
> Jan 23 16:01:30 rasputin kernel: da15: 488386MB (1000215216 512 byte sectors: 255H 63S/T 62260C)
>
> diskinfo is returning a sector size of 512 bytes, and a stripesize of 4096. Is this correct? ZFS is still detecting
> it as a 512 byte sector disk.
>
> /dev/da15
>         512                     # sectorsize
>         512110190592            # mediasize in bytes (477G)
>         1000215216              # mediasize in sectors
>         4096                    # stripesize
>         0                       # stripeoffset
>         62260                   # Cylinders according to firmware.
>         255                     # Heads according to firmware.
>         63                      # Sectors according to firmware.
>         OCZ-1SVG6KZ2YRMSS8E1    # Disk ident.
>
> So, to summarize:
>
> If the quirk was working, should diskinfo return a sector size of 512 bytes, or is it correct to show a "stripesize"
> of 4096?
>
> Do we have a bug either on ZFS or the disk drivers?
> The same experiment on another system (both are 9.1-RELEASE) with a similar drive attached to a SATA controller,
> also adding a 4K sector quirk for it, defines a stripe size instead of a sector size.

The simple answer is that ZFS doesn't understand quirks.

The attached patch does what you're looking for, along with a few other
things; see the notes at the top for details.

It's not a final version, as there's still some discussion about
implementation details, but it should do what you're looking for.

    Regards
    Steve

[Attachment: zzz-zfs-ashift-fix.patch]

Changes zfs zpool initial / desired ashift to be based off stripesize
instead of sectorsize making it compatible with drives marked with
the 4k sector size quirk.

Without the correct min block size BIO_DELETE requests passed to
a large number of current SSDs via TRIM don't actually perform
any LBA TRIM so it's vital for the correct operation of TRIM to get
the correct min block size.

To do this we added the additional dashift (desired ashift) to
vdev_open_func_t calls. This was needed as just updating ashift to
be based off stripesize would mean that a device's reported minimum
transfer size (ashift) could increase and that in turn would cause
member devices to be unusable and hence break pools with error
ZFS-8000-5E.

The global minimum ashift used for new zpools can now also be
tuned using the vfs.zfs.min_create_ashift sysctl. This defaults
to 12 (4096 byte blocks) in order to optimise for newer disks which
are migrating from 512 to 4096 byte sectors.

The value of vfs.zfs.min_create_ashift is limited to a min of
ZFS_MIN_ASHIFT (SPA_MINBLOCKSHIFT, 9) and a max of ZFS_MAX_ASHIFT (13).
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c.orig	2011-06-06 09:36:46.000000000 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c	2012-11-02 14:47:55.293668071 +0000
@@ -32,6 +32,8 @@
 #include
 #include
 
+extern int zfs_min_ashift;
+
 /*
  * Virtual device vector for disks.
  */
@@ -103,7 +105,7 @@
 }
 
 static int
-vdev_disk_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_disk_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	spa_t *spa = vd->vdev_spa;
 	vdev_disk_t *dvd;
@@ -284,7 +286,7 @@
 	}
 
 	/*
-	 * Determine the device's minimum transfer size.
+	 * Determine the device's minimum and desired transfer size.
 	 * If the ioctl isn't supported, assume DEV_BSIZE.
 	 */
 	if (ldi_ioctl(dvd->vd_lh, DKIOCGMEDIAINFOEXT, (intptr_t)&dkmext,
@@ -292,6 +294,7 @@
 		dkmext.dki_pbsize = DEV_BSIZE;
 
 	*ashift = highbit(MAX(dkmext.dki_pbsize, SPA_MINBLOCKSIZE)) - 1;
+	*dashift = highbit(MAX(dkmext.dki_pbsize, (1ULL << zfs_min_ashift))) - 1;
 
 	/*
 	 * Clear the nowritecache bit, so that on a vdev_reopen() we will
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c.orig	2012-01-05 22:31:25.000000000 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c	2012-11-02 14:47:38.252107541 +0000
@@ -30,6 +30,8 @@
 #include
 #include
 
+extern int zfs_min_ashift;
+
 /*
  * Virtual device vector for files.
  */
@@ -47,7 +49,7 @@
 }
 
 static int
-vdev_file_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_file_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	vdev_file_t *vf;
 	vnode_t *vp;
@@ -127,6 +129,7 @@
 
 	*psize = vattr.va_size;
 	*ashift = SPA_MINBLOCKSHIFT;
+	*dashift = zfs_min_ashift;
 
 	return (0);
 }
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c.orig	2012-11-02 12:20:15.918986181 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c	2012-11-02 14:47:48.135273692 +0000
@@ -36,6 +36,8 @@
 #include
 #include
 
+extern int zfs_min_ashift;
+
 /*
  * Virtual device vector for GEOM.
  */
@@ -408,7 +410,7 @@
 }
 
 static int
-vdev_geom_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_geom_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	struct g_provider *pp;
 	struct g_consumer *cp;
@@ -494,9 +496,10 @@
 	*psize = pp->mediasize;
 
 	/*
-	 * Determine the device's minimum transfer size.
+	 * Determine the device's minimum and desired transfer size.
 	 */
 	*ashift = highbit(MAX(pp->sectorsize, SPA_MINBLOCKSIZE)) - 1;
+	*dashift = highbit(MAX(pp->stripesize, (1ULL << zfs_min_ashift))) - 1;
 
 	/*
 	 * Clear the nowritecache settings, so that on a vdev_reopen()
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c.orig	2012-07-03 11:49:22.342245151 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c	2012-07-03 11:58:02.161948585 +0000
@@ -127,7 +127,7 @@
 }
 
 static int
-vdev_mirror_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift)
+vdev_mirror_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift, uint64_t *dashift)
 {
 	int numerrors = 0;
 	int lasterror = 0;
@@ -150,6 +150,7 @@
 
 		*asize = MIN(*asize - 1, cvd->vdev_asize - 1) + 1;
 		*ashift = MAX(*ashift, cvd->vdev_ashift);
+		*dashift = MAX(*dashift, cvd->vdev_dashift);
 	}
 
 	if (numerrors == vd->vdev_children) {
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c.orig	2012-07-03 11:49:10.545275865 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c	2012-07-03 11:58:07.670470640 +0000
@@ -40,7 +40,7 @@
 
 /* ARGSUSED */
 static int
-vdev_missing_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_missing_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	/*
 	 * Really this should just fail. But then the root vdev will be in the
@@ -50,6 +50,7 @@
 	 */
 	*psize = 0;
 	*ashift = 0;
+	*dashift = 0;
 	return (0);
 }
 
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c.orig	2012-07-03 11:49:03.675875505 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c	2012-07-03 11:58:15.334806334 +0000
@@ -1447,7 +1447,7 @@
 }
 
 static int
-vdev_raidz_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift)
+vdev_raidz_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift, uint64_t *dashift)
 {
 	vdev_t *cvd;
 	uint64_t nparity = vd->vdev_nparity;
@@ -1476,6 +1476,7 @@
 
 		*asize = MIN(*asize - 1, cvd->vdev_asize - 1) + 1;
 		*ashift = MAX(*ashift, cvd->vdev_ashift);
+		*dashift = MAX(*dashift, cvd->vdev_dashift);
 	}
 
 	*asize *= vd->vdev_children;
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c.orig	2012-07-03 11:49:27.901760380 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c	2012-07-03 11:58:19.704427068 +0000
@@ -50,7 +50,7 @@
 }
 
 static int
-vdev_root_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift)
+vdev_root_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift, uint64_t *dashift)
 {
 	int lasterror = 0;
 	int numerrors = 0;
@@ -78,6 +78,7 @@
 
 	*asize = 0;
 	*ashift = 0;
+	*dashift = 0;
 
 	return (0);
 }
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c.orig	2012-10-22 20:41:50.234005351 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	2012-10-22 20:42:16.355805894 +0000
@@ -1125,6 +1125,7 @@
 	uint64_t osize = 0;
 	uint64_t asize, psize;
 	uint64_t ashift = 0;
+	uint64_t dashift = 0;
 
 	ASSERT(vd->vdev_open_thread == curthread ||
 	    spa_config_held(spa, SCL_STATE_ALL, RW_WRITER) == SCL_STATE_ALL);
@@ -1154,7 +1155,7 @@
 		return (ENXIO);
 	}
 
-	error = vd->vdev_ops->vdev_op_open(vd, &osize, &ashift);
+	error = vd->vdev_ops->vdev_op_open(vd, &osize, &ashift, &dashift);
 
 	/*
 	 * Reset the vdev_reopening flag so that we actually close
@@ -1255,14 +1256,16 @@
 		 */
 		vd->vdev_asize = asize;
 		vd->vdev_ashift = MAX(ashift, vd->vdev_ashift);
+		vd->vdev_dashift = MAX(dashift, vd->vdev_dashift);
 	} else {
 		/*
 		 * Make sure the alignment requirement hasn't increased.
 		 */
 		if (ashift > vd->vdev_top->vdev_ashift) {
+			printf("ZFS ashift open failure of %s (%ju > %ju)\n", vd->vdev_path, (uintmax_t)ashift, (uintmax_t)vd->vdev_top->vdev_ashift);
 			vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN,
 			    VDEV_AUX_BAD_LABEL);
 			return (EINVAL);
 		}
 	}
 
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c.orig	2012-11-05 15:27:52.092194343 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c	2012-11-05 15:53:26.449021023 +0000
@@ -145,9 +145,12 @@
 #include
 
 static boolean_t vdev_trim_on_init = B_TRUE;
+static boolean_t vdev_dashift_enable = B_TRUE;
 SYSCTL_DECL(_vfs_zfs_vdev);
 SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, trim_on_init, CTLFLAG_RW,
     &vdev_trim_on_init, 0, "Enable/disable full vdev trim on initialisation");
+SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, optimal_ashift, CTLFLAG_RW,
+    &vdev_dashift_enable, 0, "Enable/disable optimal ashift usage on initialisation");
 
 /*
  * Basic routines to read and write from a vdev label.
@@ -282,6 +285,16 @@
 		    vd->vdev_ms_array) == 0);
 		VERIFY(nvlist_add_uint64(nv, ZPOOL_CONFIG_METASLAB_SHIFT,
 		    vd->vdev_ms_shift) == 0);
+		/*
+		 * We use the max of ashift and dashift (the desired/optimal
+		 * ashift), which is typically the stripesize of a device, to
+		 * ensure we get the best performance from underlying devices.
+		 *
+		 * It's done here as it should only ever have an effect on new
+		 * zpool creation.
+		 */
+		if (vdev_dashift_enable)
+			vd->vdev_ashift = MAX(vd->vdev_ashift, vd->vdev_dashift);
 		VERIFY(nvlist_add_uint64(nv, ZPOOL_CONFIG_ASHIFT,
 		    vd->vdev_ashift) == 0);
 		VERIFY(nvlist_add_uint64(nv, ZPOOL_CONFIG_ASIZE,
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h.orig	2012-10-22 20:40:08.361577293 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h	2012-10-22 21:02:52.447781800 +0000
@@ -55,7 +55,7 @@
 /*
  * Virtual device operations
  */
-typedef int	vdev_open_func_t(vdev_t *vd, uint64_t *size, uint64_t *ashift);
+typedef int	vdev_open_func_t(vdev_t *vd, uint64_t *size, uint64_t *ashift, uint64_t *dashift);
 typedef void	vdev_close_func_t(vdev_t *vd);
 typedef uint64_t vdev_asize_func_t(vdev_t *vd, uint64_t psize);
 typedef int	vdev_io_start_func_t(zio_t *zio);
@@ -119,6 +119,7 @@
 	uint64_t	vdev_asize;	/* allocatable device capacity	*/
 	uint64_t	vdev_min_asize;	/* min acceptable asize		*/
 	uint64_t	vdev_ashift;	/* block alignment shift	*/
+	uint64_t	vdev_dashift;	/* desired blk alignment shift	*/
 	uint64_t	vdev_state;	/* see VDEV_STATE_* #defines	*/
 	uint64_t	vdev_prevstate;	/* used when reopening a vdev	*/
 	vdev_ops_t	*vdev_ops;	/* vdev operations		*/
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c.orig	2012-11-02 14:56:29.474248887 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c	2012-11-03 01:27:28.066912403 +0000
@@ -41,6 +41,30 @@
 #include
 #include
 
+#define	ZFS_MIN_ASHIFT	SPA_MINBLOCKSHIFT
+/*
+ * Max ashift - limited by how labels are accessed by zio_read_phys using offsets
+ * within vdev_label_t
+ *
+ * If label access is fixed to work with ashift properly then the max should be
+ * set to SPA_MAXBLOCKSHIFT
+ */
+#define	ZFS_MAX_ASHIFT	13
+/*
+ * Optimum ashift - defaults to 12 which results in a min block size of 4096 as
+ * this is the optimum value for newer disks which are migrating from 512 to 4096
+ * byte sectors
+ */
+#define	ZFS_OPTIMUM_ASHIFT	12
+
+/*
+ * Minimum ashift used when creating new pools
+ *
+ * This can be tuned using the sysctl vfs.zfs.min_create_ashift but is limited
+ * to a min of ZFS_MIN_ASHIFT and a max of ZFS_MAX_ASHIFT
+ *
+ */
+int zfs_min_ashift = MAX(SPA_MINBLOCKSHIFT, ZFS_OPTIMUM_ASHIFT);
 int zfs_no_write_throttle = 0;
 int zfs_write_limit_shift = 3;			/* 1/8th of physical memory */
 int zfs_txg_synctime_ms = 1000;		/* target millisecs to sync a txg */
@@ -54,6 +78,9 @@
 
 static pgcnt_t old_physmem = 0;
 
+#ifdef _KERNEL
+static int min_ashift_sysctl(SYSCTL_HANDLER_ARGS);
+
 SYSCTL_DECL(_vfs_zfs);
 TUNABLE_INT("vfs.zfs.no_write_throttle", &zfs_no_write_throttle);
 SYSCTL_INT(_vfs_zfs, OID_AUTO, no_write_throttle, CTLFLAG_RDTUN,
@@ -78,6 +105,32 @@
 TUNABLE_QUAD("vfs.zfs.write_limit_override", &zfs_write_limit_override);
 SYSCTL_QUAD(_vfs_zfs, OID_AUTO, write_limit_override, CTLFLAG_RDTUN,
     &zfs_write_limit_override, 0, "");
+SYSCTL_PROC(_vfs_zfs, OID_AUTO, min_create_ashift, CTLTYPE_INT | CTLFLAG_RW,
+    &zfs_min_ashift, 0, min_ashift_sysctl, "I",
+    "Minimum ashift used when creating new pools");
+
+static int
+min_ashift_sysctl(SYSCTL_HANDLER_ARGS)
+{
+	int error, value;
+
+	value = *(int *)arg1;
+
+	error = sysctl_handle_int(oidp, &value, 0, req);
+
+	if ((error != 0) || (req->newptr == NULL))
+		return (error);
+
+	if (value < ZFS_MIN_ASHIFT)
+		value = ZFS_MIN_ASHIFT;
+	else if (value > ZFS_MAX_ASHIFT)
+		value = ZFS_MAX_ASHIFT;
+
+	*(int *)arg1 = value;
+
+	return (0);
+}
+#endif
 
 int
 dsl_pool_open_special_dir(dsl_pool_t *dp, const char *name, dsl_dir_t **ddp)
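
For anyone following the arithmetic in the patch, here is a rough standalone userland sketch of how ashift is derived from sectorsize and dashift from stripesize, and how the sysctl value is clamped. It is not code from the patch or from ZFS: highbit64(), max_u64(), min_ashift(), desired_ashift(), clamp_min_create_ashift() and create_ashift() are hypothetical names made up for this illustration, and highbit64() assumes the kernel's highbit() returns the 1-based index of the highest set bit (0 for an input of 0). The constants are copied from the patch.

```c
#include <stdint.h>

/* Constants copied from the patch; everything else here is illustrative. */
#define SPA_MINBLOCKSHIFT	9
#define SPA_MINBLOCKSIZE	(1ULL << SPA_MINBLOCKSHIFT)
#define ZFS_MIN_ASHIFT		SPA_MINBLOCKSHIFT
#define ZFS_MAX_ASHIFT		13

/*
 * Hypothetical stand-in for highbit(): 1-based index of the highest
 * set bit, or 0 when no bit is set (an assumption about its contract).
 */
static int
highbit64(uint64_t v)
{
	int h = 0;

	while (v != 0) {
		h++;
		v >>= 1;
	}
	return (h);
}

static uint64_t
max_u64(uint64_t a, uint64_t b)
{
	return (a > b ? a : b);
}

/* Minimum transfer size shift, derived from the sector size. */
static int
min_ashift(uint64_t sectorsize)
{
	return (highbit64(max_u64(sectorsize, SPA_MINBLOCKSIZE)) - 1);
}

/* Desired shift, derived from the stripe size and the tunable floor. */
static int
desired_ashift(uint64_t stripesize, int min_create_ashift)
{
	return (highbit64(max_u64(stripesize, 1ULL << min_create_ashift)) - 1);
}

/* Same clamping as the sysctl handler in the patch applies to new values. */
static int
clamp_min_create_ashift(int value)
{
	if (value < ZFS_MIN_ASHIFT)
		return (ZFS_MIN_ASHIFT);
	if (value > ZFS_MAX_ASHIFT)
		return (ZFS_MAX_ASHIFT);
	return (value);
}

/* Shift recorded at pool creation: max(ashift, dashift) when enabled. */
static int
create_ashift(uint64_t sectorsize, uint64_t stripesize,
    int min_create_ashift, int dashift_enable)
{
	int ashift = min_ashift(sectorsize);
	int dashift = desired_ashift(stripesize, min_create_ashift);

	if (dashift_enable && dashift > ashift)
		return (dashift);
	return (ashift);
}
```

With the values diskinfo reported for da15 (sectorsize 512, stripesize 4096), min_ashift(512) gives 9 and desired_ashift(4096, 12) gives 12, so under these assumptions a pool created with the patch applied would be expected to record ashift 12 rather than 9.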