From: Sergey Kandaurov
To: Rick Macklem
Cc: freebsd-fs@freebsd.org
Date: Tue, 17 May 2011 13:36:43 +0400
Subject: [old nfsclient] different nmount() args passed from mount vs. mount_nfs

Hi.

First, sorry for the long mail.
I just tried to describe everything in full detail.

When mounting NFS with some options, I found that /sbin/mount and
/sbin/mount_nfs pass options to nmount() differently, which results in
bad things (TM). I traced the options and here they are:

From mount(8) -> mount_nfs(8):
  "rw" -> ""
  "addr" -> {something valid }
  "fh" -> 5
  "sec" -> "sys"
  "nfsv3" -> 0x0 => NFSMNT_NFSV3
  "hostname" -> "dev2.mail:/home/svn/freebsd/head"
  "fstype" -> "oldnfs"
  "fspath" -> "/usr/src"
  "errmsg" -> "" (nil)

From pre-r221124 mount(8):
= "fstype" -> "oldnfs"
  "hostname" -> "dev2.mail"
= "fspath" -> "/usr/src"
  "from" -> "dev2.mail:/home/svn/freebsd/head"
= "errmsg" -> "" (nil)

Note that pre-r221124 mount(8) knows nothing about oldnfs.

1. The "hostname" option is passed differently from mount(8) and mount_nfs(8).

When I force mount(8) to mount an oldnfs file system directly (that is,
without handing the nmount(2) call off to mount_nfs(8)), I get this error:

  ./mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
  mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument

Hmm.. this may be because mount(8) passes the value in $hostname:$path
format (see the traces above). It might be due to the different way the
old nfsclient parses args, but I am not sure and may be wrong. Anyway,
it does not matter now.

The actual problem manifests when running the command with a pre-r221124
mount(8) binary. It knows nothing about "oldnfs" and (attention!) calls
nmount(2) directly instead of handing the call off to the mount_nfs(8)
binary as is usually done, and this is where the "unsanitized nmount(2)
args" problem is hidden.
[New mount knows about "oldnfs" and passes the call to mount_oldnfs(8),
which prepares all the nmount(2) args correctly and so hides the problem.]
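For illustration, the dispatch decision described above can be sketched in
userland C roughly as follows. This is only a sketch under assumptions: the
function name, the list names and the list contents are made up for
illustration and are not copied from sbin/mount. The only point is that a
binary whose list lacks "oldnfs" falls through to a direct nmount(2) call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical lists of filesystem types that mount(8) hands off to an
 * external mount_<fstype> helper.  The entries are illustrative only;
 * the one difference that matters here is "oldnfs".
 */
static const char *const helper_fs_old[] = { "nfs", NULL };
static const char *const helper_fs_new[] = { "nfs", "oldnfs", NULL };

/* Return 1 if fstype should be mounted through a mount_<fstype> helper. */
static int
uses_helper(const char *const *list, const char *fstype)
{
	assert(list != NULL && fstype != NULL);
	for (size_t i = 0; list[i] != NULL; i++)
		if (strcmp(list[i], fstype) == 0)
			return (1);
	return (0);	/* caller falls through to a direct nmount(2) */
}
```

With these assumed lists, uses_helper(helper_fs_new, "oldnfs") is true, so
mount_oldnfs(8) gets to build the args, while
uses_helper(helper_fs_old, "oldnfs") is false, so the raw command-line
options go straight to the kernel.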
To prove it, here is how old and new mount(8) work differently:

1) new mount(8) as of current

  mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
  exec: mount_oldnfs dev2.mail:/home/svn/freebsd/head /usr/src

2) old mount(8) as of pre-r221124

  ./mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
  mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src

Ok, back to the first point: a different "hostname" mount option.
When I first ran into this, I tried to specify the value for "hostname"
explicitly. Here it comes:

  ./mount -t oldnfs -o hostname=dev2.mail dev2.mail:/home/svn/freebsd/head /usr/src

[CABOOM!] It just crashed. Do not do this :)

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x1
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff805da299
stack pointer           = 0x28:0xffffff807bef6240
frame pointer           = 0x28:0xffffff807bef62a0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 2541 (mount)
db> bt
Tracing pid 2541 tid 100076 td 0xfffffe0001ace460
nfs_connect() at 0xffffffff805da299 = nfs_connect+0x79
nfs_request() at 0xffffffff805da978 = nfs_request+0x398
nfs_getattr() at 0xffffffff805e2a6c = nfs_getattr+0x2bc
VOP_GETATTR_APV() at 0xffffffff806f4283 = VOP_GETATTR_APV+0xd3
mountnfs() at 0xffffffff805de739 = mountnfs+0x329
nfs_mount() at 0xffffffff805dffc7 = nfs_mount+0xcf7
vfs_donmount() at 0xffffffff804d46ff = vfs_donmount+0x82f
nmount() at 0xffffffff804d54f3 = nmount+0x63
syscallenter() at 0xffffffff804861cb = syscallenter+0x1cb
syscall() at 0xffffffff806ae710 = syscall+0x60
Xfast_syscall() at 0xffffffff8069922d = Xfast_syscall+0xdd
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x800ab444c, rsp = 0x7fffffffca48, rbp = 0x801009058 ---

As you can see from the nmount(2) args traces above, mount(8) itself
doesn't pass the "addr" option to the nmount(2)
syscall, while nfs_mount() expects to receive it, which is the problem.
Later, deep inside nmount(2), the code in /sys/nfsclient/nfs_krpc.c tries
to dereference the addr value and page faults here in nfs_connect():

		vers = NFS_VER3;
	else if (nmp->nm_flag & NFSMNT_NFSV4)
		vers = NFS_VER4;
XXX saddr is NULL, the next line will crash
	if (saddr->sa_family == AF_INET)
		if (nmp->nm_sotype == SOCK_DGRAM)
			nconf = getnetconfigent("udp");

I think that nfsclient, probably in sys/nfsclient/nfs_vfsops.c:nfs_mount(),
should handle a missing value for the "addr" and/or "fh" mount options.
It doesn't check for that currently:

% static int
% nfs_mount(struct mount *mp)
% {
%	struct nfs_args args = {
% [...]
%	    .addr = NULL,
%	};
%	int error, ret, has_nfs_args_opt;
%	int has_addr_opt, has_fh_opt, has_hostname_opt;
%	struct sockaddr *nam;

addr is initialized with NULL. nam is used later to point to the
args.addr value.

%	if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) {
%		error = nfs_mountroot(mp);
%		goto out;
%	}

We are not mounting root, so this is not our case.

%	if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) {
[...]
%		has_nfs_args_opt = 1;
%	}

We do not use the old mount(2) interface, so this is not our case either.

%	if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0)
%		args.flags |= NFSMNT_NFSV3;

mount(8) doesn't pass the nfsv3 option, so NFSMNT_NFSV3 isn't set.

%	if (vfs_getopt(mp->mnt_optnew, "addr", (void **)&args.addr,
%	    &args.addrlen) == 0) {
%		has_addr_opt = 1;
%		if (args.addrlen > SOCK_MAXADDRLEN) {
%			error = ENAMETOOLONG;
%			goto out;
%		}
%		nam = malloc(args.addrlen, M_SONAME,
%		    M_WAITOK);
%		bcopy(args.addr, nam, args.addrlen);
%		nam->sa_len = args.addrlen;
%	}

mount(8) doesn't pass the addr option, so args.addr isn't set, hence
struct sockaddr *nam is also NULL and has_addr_opt is 0.
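A guard for the missing "addr"/"fh" case could look roughly like the
userland sketch below. It is not the kernel code and not a proposed patch:
struct opt, opt_lookup() and check_required_opts() are hypothetical names
invented here, with opt_lookup() standing in for vfs_getopt() over a plain
array of name/value pairs. The idea is simply to return EINVAL before
anything tries to use the address.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one name/value pair handed to nmount(2). */
struct opt {
	const char *name;
	const void *value;
	size_t len;
};

/* Hypothetical stand-in for vfs_getopt(): 0 if found, ENOENT if not. */
static int
opt_lookup(const struct opt *opts, size_t nopts, const char *name,
    const void **valp, size_t *lenp)
{
	assert(name != NULL);
	for (size_t i = 0; i < nopts; i++) {
		if (strcmp(opts[i].name, name) == 0) {
			if (valp != NULL)
				*valp = opts[i].value;
			if (lenp != NULL)
				*lenp = opts[i].len;
			return (0);
		}
	}
	return (ENOENT);
}

/* Refuse the mount unless "addr" and "fh" are present and non-empty. */
static int
check_required_opts(const struct opt *opts, size_t nopts)
{
	const void *val = NULL;
	size_t len = 0;

	if (opt_lookup(opts, nopts, "addr", &val, &len) != 0 ||
	    val == NULL || len == 0)
		return (EINVAL);
	if (opt_lookup(opts, nopts, "fh", &val, &len) != 0 ||
	    val == NULL || len == 0)
		return (EINVAL);
	return (0);
}
```

Fed the option set that pre-r221124 mount(8) passes ("fstype", "hostname",
"fspath", "from", "errmsg"), check_required_opts() returns EINVAL instead of
letting a NULL address travel further. In the real nfs_mount() such a check
would presumably have to be skipped when has_nfs_args_opt is set, since in
that case addr and fh arrive inside the nfs_args structure.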
%	if (vfs_getopt(mp->mnt_optnew, "hostname", (void **)&args.hostname,
%	    NULL) == 0) {
%		has_hostname_opt = 1;
%	}
%	if (args.hostname == NULL) {
%		vfs_mount_error(mp, "Invalid hostname");
%		error = EINVAL;
%		goto out;
%	}

I don't know why I got this error here; I didn't analyze it deeply:
"mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument"

%	if (mp->mnt_flag & MNT_UPDATE) {
[...]

That's the update case, which is not ours.

%	if (has_nfs_args_opt) {

has_nfs_args_opt is 0, as we don't use the legacy mount(2) interface, see
above. So the whole block is skipped. Still, see below.

%		/*
%		 * In the 'nfs_args' case, the pointers in the args
%		 * structure are in userland - we copy them in here.
%		 */
%		if (!has_fh_opt) {
%			error = copyin((caddr_t)args.fh, (caddr_t)nfh,
%			    args.fhsize);
%			if (error) {
%				goto out;
%			}
%			args.fh = nfh;
%		}

has_fh_opt is 0, as mount(8) didn't pass "fh" to nmount(2), though this
part is not executed anyway.

%		if (!has_hostname_opt) {
%			error = copyinstr(args.hostname, hst, MNAMELEN-1, &len);
%			if (error) {
%				goto out;
%			}
%			bzero(&hst[len], MNAMELEN - len);
%			args.hostname = hst;

has_hostname_opt is 1, as mount(8) passes "hostname" to nmount(2), though
this part is not executed anyway.

%		}
%		if (!has_addr_opt) {
%			/* sockargs() call must be after above copyin() calls */
%			printf("args.addr: %p\n", args.addr);
%			error = getsockaddr(&nam, (caddr_t)args.addr,
%			    args.addrlen);
%			printf("error: %d\n", error);
%			if (error) {
%				goto out;
%			}
%		}

has_addr_opt is 0, as mount(8) didn't pass "addr" to nmount(2), though
this part is not executed anyway.

%	}
%	error = mountnfs(&args, mp, nam, args.hostname, &vp,
%	    curthread->td_ucred, negnametimeo);

mountnfs() is called with nam == NULL, and then it crashes deep in
/sys/nfsclient/nfs_krpc.c:nfs_connect().

Also compare the ddb backtrace with the one from new mount(8), which
hands the call off to mount_nfs(8). I got it by adding kdb_enter() just
before the NULL pointer dereference.
db> bt
Tracing pid 2143 tid 100117 td 0xfffffe0001c58000
kdb_enter() at 0xffffffff80477d1b = kdb_enter+0x3b
nfs_connect() at 0xffffffff805da7e8 = nfs_connect+0x88
nfs_request() at 0xffffffff805daec8 = nfs_request+0x398
nfs_fsinfo() at 0xffffffff805ddec0 = nfs_fsinfo+0xd0
mountnfs() at 0xffffffff805ded44 = mountnfs+0x3e4
nfs_mount() at 0xffffffff805e051f = nfs_mount+0xcff
vfs_donmount() at 0xffffffff804d5092 = vfs_donmount+0xc92
nmount() at 0xffffffff804d5a33 = nmount+0x63
syscallenter() at 0xffffffff804866eb = syscallenter+0x1cb
syscall() at 0xffffffff806aec90 = syscall+0x60
Xfast_syscall() at 0xffffffff806997ad = Xfast_syscall+0xdd
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8008a544c, rsp = 0x7fffffffd258, rbp = 0x7fffffffd30c ---

The two backtraces differ slightly because NFSMNT_NFSV3 is not set in
the old mount(8) case. From sys/nfsclient/nfs_vfsops.c:mountnfs():

	if (argp->flags & NFSMNT_NFSV3)
		nfs_fsinfo(nmp, *vpp, curthread->td_ucred, curthread);
	else
		VOP_GETATTR(*vpp, &attrs, curthread->td_ucred);

-- 
wbr,
pluknet