Date: Mon, 10 Jun 2013 18:25:18 +0900 (JST) From: Hiroki Sato <hrs@FreeBSD.org> To: killing@multiplay.co.uk Cc: rc@FreeBSD.org, zfs-devel@FreeBSD.org Subject: Re: Feeback on patch to add ZFS support to mdconfig rc.d scripts Message-ID: <20130610.182518.138879193135244843.hrs@allbsd.org> In-Reply-To: <5E2843FF4724492FBF2A29E43B698385@multiplay.co.uk> References: <B5D8F93BF50D4D57BF31A4B2FCBA7621@multiplay.co.uk> <20130609.041903.541782050134731313.hrs@allbsd.org> <5E2843FF4724492FBF2A29E43B698385@multiplay.co.uk>
next in thread | previous in thread | raw e-mail | index | archive | help
----Security_Multipart0(Mon_Jun_10_18_25_18_2013_749)-- Content-Type: Multipart/Mixed; boundary="--Next_Part(Mon_Jun_10_18_25_18_2013_246)--" Content-Transfer-Encoding: 7bit ----Next_Part(Mon_Jun_10_18_25_18_2013_246)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit "Steven Hartland" <killing@multiplay.co.uk> wrote in <5E2843FF4724492FBF2A29E43B698385@multiplay.co.uk>: ki> ----- Original Message ----- ki> From: "Hiroki Sato" <hrs@FreeBSD.org> ki> > I think this should be separated into adding non-fs md support into ki> > rc.d/mdconfig and on-the-fly zpool creation/import into rc.d/zfs ki> > since the rc.d/mdconfig script does not (and should not) depend on ki> > zfs.ko module. ki> > ki> > rc.d/mdconfig is located before mountcritlocal, so probably ki> > rc.d/zfs_md or something is needed just after rc.d/mdconfig2. ki> ki> Thanks for that Hiroki. If I understand correctly that your only ki> concern is the zfs module dependency? If so then its easier to make ki> that a dynamic dependency using a prestart. Not only that. What I am concerned about is adding ZFS support into rc.d/mdconfig makes mount sequence in rc.d script more complicated. Currently, /etc/rc.d/mdconfig runs, /etc/fstab is parsed, and then "zfs mount -a" is done in rc.d/zfs. The rc.d scripts is not good at handling dependency, so mounting/unmounting ZFS datasets in multiple scripts make difficult to use these scripts separetely at run time while probably they work fine at boot time. If we support md-backed zpool, "rc.d/zfs stop" should also handle unmounting all of ZFS datasets and destroy the md devices, for example, because it is responsible for stopping the ZFS and its related functionality. This is difficult to realize by using only rc.d scripts, I created and attached a prototype to add ZFS support directly to mount_mfs(8). Does this satisfy your needs? Just adding /etc/fstab entry like the following, md-backed zpools are automatically created and mounted via rc.d/mountcritlocal. If it is too early, "late" flag can be used as other file systems. Unmounting them is handled only by rc.d/zfs in this case. The mount options in an fstab entry look a bit odd because most of letters are already used in mount_mfs(8), but they are flexible to support zpool's command-line arguments. ---- # /etc/fstab examples: # # create swap-backed md and zpool "poola" on it, and then mount it on /a. md /a mfs rw,-s1g,-zpoola,-kcompression:on,-kexec:off 0 0 # # create vnode-backed md and zpool "poolb" on it, and then mount it on /b. md /b mfs rw,-s1g,-zpoolb,-F/var/tmp/zero 0 0 # # import (-P flag) a zpool "zpoolc" on vnode-backed md "/var/tmp/zero". # Mountpoint is automatically determined (/c is ignored). md /c mfs rw,-s1g,-zpoolc,-F/var/tmp/zero,-P 0 0 # # import a zpool "zpoold" on vnode-backed md at rc.d/mountlate stage. # Mountpoint is automatically determined (/d is ignored). md /d mfs rw,-s1g,-zpoold,-F/var/tmp/zero,-P,late 0 0 ---- The following command line should work with the above example: # mount /a # sh /etc/rc.d/zfs stop -- Hiroki ----Next_Part(Mon_Jun_10_18_25_18_2013_246)-- Content-Type: Text/X-Patch; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="mdmfs_zfs_support.20130610-1.diff" Index: etc/rc.d/zfs =================================================================== --- etc/rc.d/zfs (revision 251176) +++ etc/rc.d/zfs (working copy) @@ -48,6 +48,26 @@ zfs_stop_main() { + # Export md-backed zpools and destroy the md devices if any + while read dev mt fstype opt; do + case $dev:$opt in + \#*) # skip comment line + ;; + md:*,-F/*,-z*|md:*,-z*,-F/*) + opt=${opt##*,-z} + poolname=${opt%%,*} + # Check if $poolname exists. + df -t zfs $poolname > /dev/null 2>&1 || continue + set -- `zpool list -Hv $poolname` + if zpool export $poolname && mdconfig -d -u $9; then + : + else + warn "Cannot unmount zpool $poolname." + fi + ;; + esac + done < /etc/fstab + zfs unshare -a zfs unmount -a } Index: sbin/mdmfs/mdmfs.8 =================================================================== --- sbin/mdmfs/mdmfs.8 (revision 251176) +++ sbin/mdmfs/mdmfs.8 (working copy) @@ -25,7 +25,7 @@ .\" .\" $FreeBSD$ .\" -.Dd September 4, 2011 +.Dd June 10, 2013 .Dt MDMFS 8 .Os .Sh NAME @@ -46,6 +46,7 @@ .Op Fl F Ar file .Op Fl f Ar frag-size .Op Fl i Ar bytes +.Op Fl k Ar zfs-filesystem-property:value .Op Fl m Ar percent-free .Op Fl n Ar rotational-positions .Op Fl O Ar optimization @@ -54,6 +55,8 @@ .Op Fl s Ar size .Op Fl v Ar version .Op Fl w Ar user : Ns Ar group +.Op Fl z Ar zfs-zpoolname +.Op Fl Z Ar zfs-property:value .Ar md-device .Ar mount-point .Sh DESCRIPTION @@ -263,6 +266,36 @@ .It Fl X Print what command will be run before running it, and other assorted debugging information. +.It Fl z Ar zfs-zpoolname +Specify to use ZFS and the pool name as +.Ar zfs-zpoolname . +When this is specified, +parameters for +.Xr newfs 8 +are ignored except for a +.Fl P +flag. +.It Fl Z Ar zfs-property:value +Specify zpool property. +This will be passed to +.Xr zpool 8 +utility as +.Fl o Ar zfs-property=value . +The default value is +.Fl Z Ar cechefile:none . +.It Fl k Ar zfs-filesystem-property:value +Specify zfs property. +This will be passed to +.Xr zpool 8 +utility as +.Fl O Ar zfs-filesystem-property=value . +The default values are +.Fl k Ar atime:off , +.Fl k Ar primarycache:none , +and +.Fl k Ar sync:disabled +for swap-backed disk, +and none for vnode-backed disk. .El .Pp The @@ -289,6 +322,13 @@ .Xr newfs 8 as .Fl o . +The values in +.Fl z , +.Fl Z , +and +.Fl k +options are passed to +.Xr zpool 8 . The .Fl o option is passed to Index: sbin/mdmfs/mdmfs.c =================================================================== --- sbin/mdmfs/mdmfs.c (revision 251176) +++ sbin/mdmfs/mdmfs.c (working copy) @@ -81,6 +81,8 @@ static void do_mount(const char *, const char *); static void do_mtptsetup(const char *, struct mtpt_info *); static void do_newfs(const char *); +static void do_zpool(const char *, const char *, const char *, + const enum md_types, int); static void extract_ugid(const char *, struct mtpt_info *); static int run(int *, const char *, ...) __printflike(2, 3); static void usage(void); @@ -90,11 +92,11 @@ { struct mtpt_info mi; /* Mountpoint info. */ char *mdconfig_arg, *newfs_arg, /* Args to helper programs. */ - *mount_arg; + *mount_arg, *zpool_arg; enum md_types mdtype; /* The type of our memory disk. */ bool have_mdtype; - bool detach, softdep, autounit, newfs; - char *mtpoint, *unitstr; + bool detach, softdep, autounit, newfs, zfs; + char *mtpoint, *unitstr, *poolname; char *p; int ch; void *set; @@ -106,6 +108,7 @@ softdep = true; autounit = false; newfs = true; + zfs = false; have_mdtype = false; mdtype = MD_SWAP; mdname = MD_NAME; @@ -118,6 +121,7 @@ */ mdconfig_arg = strdup(""); newfs_arg = strdup(""); + zpool_arg = strdup(""); mount_arg = strdup(""); /* If we were started as mount_mfs or mfs, imply -C. */ @@ -129,7 +133,7 @@ } while ((ch = getopt(argc, argv, - "a:b:Cc:Dd:E:e:F:f:hi:LlMm:NnO:o:Pp:Ss:tUv:w:X")) != -1) + "a:B:b:Cc:Dd:E:e:F:f:hi:k:LlMm:NnO:o:Pp:Ss:tUv:w:Xz:Z:")) != -1) switch (ch) { case 'a': argappend(&newfs_arg, "-a %s", optarg); @@ -171,6 +175,14 @@ case 'i': argappend(&newfs_arg, "-i %s", optarg); break; + case 'k': + if (zfs != true) + errx(1, "-z poolname required first"); + p = optarg; + while ((p = strchr(p, ':')) != NULL) + p[0] = (char)'='; + argappend(&zpool_arg, "-O %s", optarg); + break; case 'L': loudsubs = true; break; @@ -231,6 +243,18 @@ case 'X': debug = true; break; + case 'z': + poolname = strdup(optarg); + zfs = true; + break; + case 'Z': + if (zfs != true) + errx(1, "-z poolname required first"); + p = optarg; + while ((p = strchr(p, ':')) != NULL) + p[0] = (char)'='; + argappend(&zpool_arg, "-o %s", optarg); + break; default: usage(); } @@ -272,9 +296,13 @@ do_mdconfig_attach_au(mdconfig_arg, mdtype); else do_mdconfig_attach(mdconfig_arg, mdtype); - if (newfs) - do_newfs(newfs_arg); - do_mount(mount_arg, mtpoint); + if (zfs) + do_zpool(zpool_arg, poolname, mtpoint, mdtype, newfs); + else { + if (newfs) + do_newfs(newfs_arg); + do_mount(mount_arg, mtpoint); + } do_mtptsetup(mtpoint, &mi); return (0); @@ -514,6 +542,63 @@ } /* + * Put a zpool on the memory disk. When vnode is specified, try to import + * zpool. + */ +static void +do_zpool(const char *args, const char *poolname, const char *mtpoint, + const enum md_types mdtype, int newfs) +{ + const char *cmd; + const char *addargs; + char mtarg[PATH_MAX + 10]; + char mdarg[PATH_MAX + 10]; + int rv; + + switch (mdtype) { + case MD_SWAP: + case MD_MALLOC: + cmd = "create"; + /* A set of reasonable default parameters. */ + addargs = " -o cachefile=none " + "-O primarycache=none " + "-O sync=disabled "; + snprintf(mtarg, sizeof(mtarg), " -m %s" , mtpoint); + mtarg[sizeof(mtarg) - 1] = '\0'; + snprintf(mdarg, sizeof(mdarg), " /dev/%s%d" , mdname, unit); + mdarg[sizeof(mdarg) - 1] = '\0'; + break; + case MD_VNODE: + /* Ignore return code. */ + run(NULL, "%s export %s", _PATH_ZPOOL, poolname); + + if (newfs) { + cmd = "create"; + /* Override the existing pool. */ + addargs = " -f -o cachefile=none"; + snprintf(mtarg, sizeof(mtarg), " -m %s" , mtpoint); + mtarg[sizeof(mtarg) - 1] = '\0'; + snprintf(mdarg, sizeof(mdarg), " /dev/%s%d" , mdname, + unit); + mdarg[sizeof(mdarg) - 1] = '\0'; + } else { + cmd = "import"; + addargs = " -o cachefile=none"; + mtarg[0] = '\0'; + mdarg[0] = '\0'; + } + break; + default: + abort(); + } + + rv = run(NULL, "%s %s%s%s%s %s%s", + _PATH_ZPOOL, cmd, addargs, args, mtarg, poolname, mdarg); + if (rv) + errx(1, "zpool exited with error code %d", rv); +} + +/* * 'str' should be a user and group name similar to the last argument * to chown(1); i.e., a user, followed by a colon, followed by a * group. The user and group in 'str' may be either a [ug]id or a Index: include/paths.h =================================================================== --- include/paths.h (revision 251176) +++ include/paths.h (working copy) @@ -88,6 +88,7 @@ #define _PATH_UFSSUSPEND "/dev/ufssuspend" #define _PATH_VI "/usr/bin/vi" #define _PATH_WALL "/usr/bin/wall" +#define _PATH_ZPOOL "/sbin/zpool" /* Provide trailing slash, since mostly used for building pathnames. */ #define _PATH_DEV "/dev/" ----Next_Part(Mon_Jun_10_18_25_18_2013_246)---- ----Security_Multipart0(Mon_Jun_10_18_25_18_2013_749)-- Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (FreeBSD) iEYEABECAAYFAlG1m34ACgkQTyzT2CeTzy0OKgCfRLFGzjAuf/MPKmBSBzZILRd1 GQQAn1vs/zlIGHU5zZLVqzB+HO3WaRBx =26qm -----END PGP SIGNATURE----- ----Security_Multipart0(Mon_Jun_10_18_25_18_2013_749)----
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130610.182518.138879193135244843.hrs>