From owner-freebsd-hackers@freebsd.org Sun Aug 27 04:51:54 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 60B3FDE74D2; Sun, 27 Aug 2017 04:51:54 +0000 (UTC) (envelope-from rlibby@gmail.com) Received: from mail-pf0-x230.google.com (mail-pf0-x230.google.com [IPv6:2607:f8b0:400e:c00::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 36318653FC; Sun, 27 Aug 2017 04:51:54 +0000 (UTC) (envelope-from rlibby@gmail.com) Received: by mail-pf0-x230.google.com with SMTP id r62so6687389pfj.0; Sat, 26 Aug 2017 21:51:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=V5IiacMvApYtqoP6AfNVrCyHRg+tv7rMabWHTy9F11k=; b=hY0f0zoZ/cMSCuMVtBYmULy7uRcM41we9MGvq0UrWSLMUURfeibREJhNv2HklosSu/ BUzNvPCIqprCegrF1rj43XFCI0aH1C/Y68sfIuGwv2j9ksehkQE7BU4WRN+DxRvX7AgY LFC1yHwC3mXChu3VOE3T1K7VixV6blr/B4g3p0sgHqKEfJvs8d9kaWavKWpZgnB9AF71 At85HMg5bM4VMZmB5UMgdtXl7PrrGSNtmOqATfSHEDwjAPeEx3xE+YHVK/o4P13RfE8y XflE1qbZVSeAxLE7SWkYJ7/e/JWajbQW6gzg2X8mzc27vI7RU25VGLm8Fvxd5TlsbTA5 pQ+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=V5IiacMvApYtqoP6AfNVrCyHRg+tv7rMabWHTy9F11k=; b=K9XwzFH+GmsdapH/cCJywdQAc1S5iCYb7NHGFXs0NViKWYNfqA2gNajFSNK0TSho2C MMwXRO0PJTAss9Bsm7hV2Z83TWwJp9juygCav7HJsJHDd9tH6QTQzo0QpzxwvVr5se78 RYBCS0Kij2GjaREpxBmSwirjaVW+D1vcAqoL18X62IwZSmIYMLpdWR6vF4u3ynKS1JOs otGJk5ZGYP/iCPElTd3tdNlAj03ROV/mvfd690pvnQmHANjTa4IRuInB6S4S2j0kl0sX mgQzdRS69ZSV40HC9xgOdmhDrI/DHIIqZgYXoNHSVNUs4RoqQ78ZgwE/bZ1O7aCegEPJ 41eA== X-Gm-Message-State: 
AHYfb5jQMCJcxRoG+SoBopNlrHd7N9x2PWy/V+GDmwSvnC3InutbZO20 lttqW5Q1xHMV3HlE2z8YFER7sr2rFA== X-Received: by 10.84.232.10 with SMTP id h10mr3778390plk.261.1503809513373; Sat, 26 Aug 2017 21:51:53 -0700 (PDT) MIME-Version: 1.0 Received: by 10.100.207.193 with HTTP; Sat, 26 Aug 2017 21:51:52 -0700 (PDT) In-Reply-To: References: From: Ryan Libby Date: Sat, 26 Aug 2017 21:51:52 -0700 Message-ID: Subject: Re: Compiling the kernel using GCC To: Aijaz Baig Cc: FreeBSD Hackers , FreeBSD Current Content-Type: text/plain; charset="UTF-8" X-Mailman-Approved-At: Sun, 27 Aug 2017 10:51:53 +0000 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Aug 2017 04:51:54 -0000 On Sat, Aug 26, 2017 at 2:41 AM, Aijaz Baig wrote: > Has anyone been able to successfully compile the kernel using GCC as > against CLANG the default compiler on most later versions of FreeBSD? I was > able to successfully buildworld. After which I reboot and now /usr/bin/cc > points to GCC as requested. However kernel fails to link [...] > make -D KERNFAST -m /usr/src/share/mk KERNEL=kernel all -DNO_MODULES_OBJ > linking kernel.full > ck_array.o: In function `ck_cc_popcount': > /usr/src/sys/contrib/ck/include/gcc/ck_cc.h:139: undefined reference to > `__popcountdi2' [...] Do you need ck? Can you share your KERNCONF? And the branch and revision you are building? I have been building an amd64 GENERIC kernel on head with various gccs. I think that does not build ck. The main method I have been focusing on for now is make CROSS_TOOLCHAIN=amd64-gcc TARGET=amd64 TARGET_ARCH=amd64 buildkernel with the amd64-xtoolchain-gcc-0.2 package installed. But I have also built kernel-only successfully with gcc 6 with src.conf e.g. as below. 
% cat /etc/src.conf.gcc6 WITH_GCC=yes WITH_GCC_BOOTSTRAP=yes WITH_GCC_IS_CC=yes WITHOUT_CLANG=yes WITHOUT_CLANG_BOOTSTRAP=yes WITHOUT_CLANG_IS_CC=yes WITHOUT_FORMAT_EXTENSIONS=yes CC=/usr/local/bin/gcc6 CXX=/usr/local/bin/g++6 From owner-freebsd-hackers@freebsd.org Sun Aug 27 13:18:11 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 33AA6DEFD42 for ; Sun, 27 Aug 2017 13:18:11 +0000 (UTC) (envelope-from daniel@roe.ch) Received: from schoggimuss.roe.ch (schoggimuss.roe.ch [IPv6:2a03:da40:0:35::10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E929E71B9E for ; Sun, 27 Aug 2017 13:18:10 +0000 (UTC) (envelope-from daniel@roe.ch) Received: from daniel (ssh-from [178.197.225.197]) by schoggimuss.roe.ch (envelope-from ) with LOCAL id 1dlxS2-0007PJ-HB for freebsd-hackers@freebsd.org; Sun, 27 Aug 2017 15:18:06 +0200 Date: Sun, 27 Aug 2017 15:18:06 +0200 From: Daniel Roethlisberger To: freebsd-hackers@freebsd.org Subject: Re: [PATCH] O_NOATIME support for open(2) Message-ID: <20170827131806.GB21456@schoggimuss.roe.ch> Mail-Followup-To: freebsd-hackers@freebsd.org References: <20170826161827.GA21456@schoggimuss.roe.ch> <20170826175606.GQ1700@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170826175606.GQ1700@kib.kiev.ua> User-Agent: Mutt/1.8.3 (2017-05-23) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Aug 2017 13:18:11 -0000 Konstantin Belousov 2017-08-26: > On Sat, Aug 26, 2017 at 06:18:27PM +0200, Daniel Roethlisberger wrote: > > I'm trying to implement 
O_NOATIME support for open(2) in order to > > provide a more elegant way for backup/archiving software to > > prevent atime clobbering. Except for a 2008 thread on this list > > I did not find any material; not sure if anybody is interested in > > this or if there are reasons why this was never implemented. > Please point out the thread, e.g. by providing a link to the first > message in the thread in mailman archive. https://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/thread.html#26531 https://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026531.html > > The attached patch against 11.1 implements O_NOATIME support for > > open(2); it prevents read(2) and mmap(2) from clobbering atime if > > the file descriptor was opened with O_NOATIME. O_NOATIME is only > > permitted for root and the owner of the file. Currently it is > > only implemented for ufs/ffs. It seems to work for me but has > > not been extensively tested. > What would happen when additional page-in occurs on the mmaped area ? With mmap, the vnode is marked for atime update at the time of calling mmap (unless O_NOATIME is set on the fd). I do not see how the patch would impact page-ins in any way. Can you elaborate? > > I am interested in feedback from people who know their way around > > I/O and VFS code before I extend this to other file systems, make > > O_NOATIME tunable by fcntl(2), wire it to the Linux compat layer > > and write docs. Does the implementation look sane? Did I miss > > something important? > > > > Specifically, is there a better way to pass O_NOATIME into > > vm_mmap_vnode other than adding an additional boolean_t argument? > > I did not use an additional mmap flag because that would have > > required additional logic to prevent userland from passing the > > flag to the mmap syscall. > If you need two booleans to the function, consider substituting the > arguments with the single u_int flags, and define two flags, one for > the writecounted, one for noatime. 
Thanks for the feedback, I appreciate it. Daniel -- Daniel Roethlisberger http://daniel.roe.ch/ From owner-freebsd-hackers@freebsd.org Sun Aug 27 14:17:19 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 32618DF0DD5 for ; Sun, 27 Aug 2017 14:17:19 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BCB417328C for ; Sun, 27 Aug 2017 14:17:18 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v7REH8FS033008 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Sun, 27 Aug 2017 17:17:08 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v7REH8FS033008 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id v7REH8jk033007 for freebsd-hackers@freebsd.org; Sun, 27 Aug 2017 17:17:08 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sun, 27 Aug 2017 17:17:08 +0300 From: Konstantin Belousov To: freebsd-hackers@freebsd.org Subject: Re: [PATCH] O_NOATIME support for open(2) Message-ID: <20170827141708.GV1700@kib.kiev.ua> References: <20170826161827.GA21456@schoggimuss.roe.ch> <20170826175606.GQ1700@kib.kiev.ua> <20170827131806.GB21456@schoggimuss.roe.ch> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170827131806.GB21456@schoggimuss.roe.ch> User-Agent: Mutt/1.8.3 (2017-05-23) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, 
DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Aug 2017 14:17:19 -0000 On Sun, Aug 27, 2017 at 03:18:06PM +0200, Daniel Roethlisberger wrote: > Konstantin Belousov 2017-08-26: > > On Sat, Aug 26, 2017 at 06:18:27PM +0200, Daniel Roethlisberger wrote: > > > I'm trying to implement O_NOATIME support for open(2) in order to > > > provide a more elegant way for backup/archiving software to > > > prevent atime clobbering. Except for a 2008 thread on this list > > > I did not find any material; not sure if anybody is interested in > > > this or if there are reasons why this was never implemented. > > Please point out the thread, e.g. by providing a link to the first > > message in the thread in mailman archive. > > https://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/thread.html#26531 > https://lists.freebsd.org/pipermail/freebsd-hackers/2008-October/026531.html > Thank you. > > > The attached patch against 11.1 implements O_NOATIME support for > > > open(2); it prevents read(2) and mmap(2) from clobbering atime if > > > the file descriptor was opened with O_NOATIME. O_NOATIME is only > > > permitted for root and the owner of the file. Currently it is > > > only implemented for ufs/ffs. It seems to work for me but has > > > not been extensively tested. > > What would happen when additional page-in occurs on the mmaped area ? > > With mmap, the vnode is marked for atime update at the time of > calling mmap (unless O_NOATIME is set on the fd). I do not see > how the patch would impact page-ins in any way. Can you > elaborate? I mean, do we have some code paths which would cause page-ins to set atime ? 
If we currently do not have that, fine. My brief reading of the code suggests that we do not, at least for UFS. Somewhat related, if an image file is opened O_EXEC | O_NOATIME, does calling fexecve(2) on the fd prevents atime update with your patch ? It seems to me that the case is not handled. Note that in kernel code, we usually prefer O_XXX spelling for the open flags over the FXXX. From owner-freebsd-hackers@freebsd.org Sun Aug 27 23:54:29 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6ACB5E012C3 for ; Sun, 27 Aug 2017 23:54:29 +0000 (UTC) (envelope-from daniel@roe.ch) Received: from schoggimuss.roe.ch (schoggimuss.roe.ch [IPv6:2a03:da40:0:35::10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 15E23261D for ; Sun, 27 Aug 2017 23:54:28 +0000 (UTC) (envelope-from daniel@roe.ch) Received: from daniel (ssh-from [212.51.147.3]) by schoggimuss.roe.ch (envelope-from ) with LOCAL id 1dm7No-00097s-So for freebsd-hackers@freebsd.org; Mon, 28 Aug 2017 01:54:24 +0200 Date: Mon, 28 Aug 2017 01:54:24 +0200 From: Daniel Roethlisberger To: freebsd-hackers@freebsd.org Subject: Re: [PATCH] O_NOATIME support for open(2) Message-ID: <20170827235424.GA34762@schoggimuss.roe.ch> Mail-Followup-To: freebsd-hackers@freebsd.org References: <20170826161827.GA21456@schoggimuss.roe.ch> <20170826175606.GQ1700@kib.kiev.ua> <20170827131806.GB21456@schoggimuss.roe.ch> <20170827141708.GV1700@kib.kiev.ua> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="envbJBWh7q8WU6mo" Content-Disposition: inline In-Reply-To: <20170827141708.GV1700@kib.kiev.ua> User-Agent: Mutt/1.8.3 (2017-05-23) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Aug 2017 23:54:29 -0000 --envbJBWh7q8WU6mo Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Konstantin Belousov 2017-08-27: > > > > The attached patch against 11.1 implements O_NOATIME support for > > > > open(2); it prevents read(2) and mmap(2) from clobbering atime if > > > > the file descriptor was opened with O_NOATIME. O_NOATIME is only > > > > permitted for root and the owner of the file. Currently it is > > > > only implemented for ufs/ffs. It seems to work for me but has > > > > not been extensively tested. > > > What would happen when additional page-in occurs on the mmaped area ? > > > > With mmap, the vnode is marked for atime update at the time of > > calling mmap (unless O_NOATIME is set on the fd). I do not see > > how the patch would impact page-ins in any way. Can you > > elaborate? > > I mean, do we have some code paths which would cause page-ins to set > atime ? If we currently do not have that, fine. My brief reading of the > code suggests that we do not, at least for UFS. Thanks for clarifying. I am not aware of any atime-updating page-in code paths. > Somewhat related, if an image file is opened O_EXEC | O_NOATIME, does > calling fexecve(2) on the fd prevents atime update with your patch ? > It seems to me that the case is not handled. Correct, thanks. > Note that in kernel code, we usually prefer O_XXX spelling for the open > flags over the FXXX. I removed FNOATIME entirely based on your feedback. Attached is a revised patch that incorporates most of the feedback, supports setting and clearing O_NOATIME through fcntl(2), and handles many of the local file systems. It also enables O_NOATIME in the Linux compat shim. Docs are still missing, and testing on file systems other than UFS has been very limited or nonexistent. Additional feedback is much appreciated, especially on the code in kern_fcntl and vn_open_vnode.
Daniel -- Daniel Roethlisberger http://daniel.roe.ch/ --envbJBWh7q8WU6mo Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="onoatime-v3.diff" diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c index 17819bc..ea736f1 100644 --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c @@ -820,7 +820,13 @@ zfs_read(vnode_t *vp, uio_t *uio, int ioflag, cred_t *cr, caller_context_t *ct) out: zfs_range_unlock(rl); +#ifdef __FreeBSD__ + if ((ioflag & IO_NOATIME) == 0) { + ZFS_ACCESSTIME_STAMP(zfsvfs, zp); + } +#else ZFS_ACCESSTIME_STAMP(zfsvfs, zp); +#endif /* __FreeBSD__ */ ZFS_EXIT(zfsvfs); return (error); } diff --git a/sys/compat/linux/linux_file.c b/sys/compat/linux/linux_file.c index 09133eb..2bf40c4 100644 --- a/sys/compat/linux/linux_file.c +++ b/sys/compat/linux/linux_file.c @@ -130,7 +130,8 @@ linux_common_open(struct thread *td, int dirfd, char *path, int l_flags, int mod bsd_flags |= O_NOFOLLOW; if (l_flags & LINUX_O_DIRECTORY) bsd_flags |= O_DIRECTORY; - /* XXX LINUX_O_NOATIME: unable to be easily implemented. 
*/ + if (l_flags & LINUX_O_NOATIME) + bsd_flags |= O_NOATIME; error = kern_openat(td, dirfd, path, UIO_SYSSPACE, bsd_flags, mode); if (error != 0) @@ -1327,6 +1328,8 @@ fcntl_common(struct thread *td, struct linux_fcntl_args *args) if (result & O_DIRECT) td->td_retval[0] |= LINUX_O_DIRECT; #endif + if (result & O_NOATIME) + td->td_retval[0] |= LINUX_O_NOATIME; return (error); case LINUX_F_SETFL: @@ -1347,6 +1350,8 @@ fcntl_common(struct thread *td, struct linux_fcntl_args *args) if (args->arg & LINUX_O_DIRECT) arg |= O_DIRECT; #endif + if (args->arg & LINUX_O_NOATIME) + arg |= O_NOATIME; return (kern_fcntl(td, args->fd, F_SETFL, arg)); case LINUX_F_GETLK: diff --git a/sys/fs/ext2fs/ext2_vnops.c b/sys/fs/ext2fs/ext2_vnops.c index da3b267..a9d4702 100644 --- a/sys/fs/ext2fs/ext2_vnops.c +++ b/sys/fs/ext2fs/ext2_vnops.c @@ -1765,6 +1765,7 @@ ext2_ind_read(struct vop_read_args *ap) } if ((error == 0 || uio->uio_resid != orig_resid) && + ((ioflag & IO_NOATIME) == 0) && (vp->v_mount->mnt_flag & (MNT_NOATIME | MNT_RDONLY)) == 0) ip->i_flag |= IN_ACCESS; return (error); diff --git a/sys/fs/msdosfs/msdosfs_vnops.c b/sys/fs/msdosfs/msdosfs_vnops.c index cfabfaa..03b0ab2 100644 --- a/sys/fs/msdosfs/msdosfs_vnops.c +++ b/sys/fs/msdosfs/msdosfs_vnops.c @@ -600,6 +600,7 @@ msdosfs_read(struct vop_read_args *ap) brelse(bp); } while (error == 0 && uio->uio_resid > 0 && n != 0); if (!isadir && (error == 0 || uio->uio_resid != orig_resid) && + ((ap->a_ioflag & IO_NOATIME) == 0) && (vp->v_mount->mnt_flag & (MNT_NOATIME | MNT_RDONLY)) == 0) dep->de_flag |= DE_ACCESS; return (error); diff --git a/sys/fs/tmpfs/tmpfs_vnops.c b/sys/fs/tmpfs/tmpfs_vnops.c index 21baeb7..543d7fc 100644 --- a/sys/fs/tmpfs/tmpfs_vnops.c +++ b/sys/fs/tmpfs/tmpfs_vnops.c @@ -476,7 +476,8 @@ tmpfs_read(struct vop_read_args *v) if (uio->uio_offset < 0) return (EINVAL); node = VP_TO_TMPFS_NODE(vp); - tmpfs_set_status(node, TMPFS_NODE_ACCESSED); + if ((v->a_ioflag & IO_NOATIME) == 0) + tmpfs_set_status(node, 
TMPFS_NODE_ACCESSED); return (uiomove_object(node->tn_reg.tn_aobj, node->tn_size, uio)); } diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c index bea81a5..0f97601 100644 --- a/sys/kern/kern_descrip.c +++ b/sys/kern/kern_descrip.c @@ -553,6 +553,27 @@ kern_fcntl(struct thread *td, int fd, int cmd, intptr_t arg) cap_rights_init(&rights, CAP_FCNTL), F_SETFL, &fp); if (error != 0) break; + if ((arg & O_NOATIME) && ((fp->f_flag & O_NOATIME) == 0)) { + if (fp->f_type != DTYPE_VNODE) { + error = ENOTTY; + fdrop(fp, td); + break; + } + if ((fp->f_flag & FREAD) == 0) { + error = EOPNOTSUPP; + fdrop(fp, td); + break; + } + vp = fp->f_vnode; + vrefact(vp); + vn_lock(vp, LK_SHARED | LK_RETRY); + error = VOP_ACCESS(vp, VADMIN, td->td_ucred, td); + vput(vp); + if (error != 0) { + fdrop(fp, td); + break; + } + } do { tmp = flg = fp->f_flag; tmp &= ~FCNTLFLAGS; @@ -2820,10 +2841,23 @@ fgetvp_read(struct thread *td, int fd, cap_rights_t *rightsp, struct vnode **vpp } int -fgetvp_exec(struct thread *td, int fd, cap_rights_t *rightsp, struct vnode **vpp) +fgetvp_exec(struct thread *td, int fd, cap_rights_t *rightsp, struct vnode **vpp, + struct file **fpp) { + int error; - return (_fgetvp(td, fd, FEXEC, rightsp, vpp)); + *vpp = NULL; + error = _fget(td, fd, fpp, FEXEC, rightsp, NULL); + if (error != 0) + return (error); + if ((*fpp)->f_vnode == NULL) { + error = EINVAL; + } else { + *vpp = (*fpp)->f_vnode; + vrefact(*vpp); + } + + return (error); } #ifdef notyet diff --git a/sys/kern/kern_exec.c b/sys/kern/kern_exec.c index 1a3cc42..0fcfeee 100644 --- a/sys/kern/kern_exec.c +++ b/sys/kern/kern_exec.c @@ -381,6 +381,8 @@ do_execve(td, args, mac_p) #ifdef HWPMC_HOOKS struct pmckern_procexec pe; #endif + struct file *fp; + boolean_t noatime = FALSE; static const char fexecv_proc_title[] = "(fexecv)"; imgp = &image_params; @@ -453,9 +455,11 @@ interpret: * Descriptors opened only with O_EXEC or O_RDONLY are allowed. 
*/ error = fgetvp_exec(td, args->fd, - cap_rights_init(&rights, CAP_FEXECVE), &newtextvp); + cap_rights_init(&rights, CAP_FEXECVE), &newtextvp, &fp); if (error) goto exec_fail; + noatime = fp->f_flag & O_NOATIME; + fdrop(fp, td); vn_lock(newtextvp, LK_EXCLUSIVE | LK_RETRY); AUDIT_ARG_VNODE1(newtextvp); imgp->vp = newtextvp; @@ -880,7 +884,8 @@ interpret: else exec_setregs(td, imgp, (u_long)(uintptr_t)stack_base); - vfs_mark_atime(imgp->vp, td->td_ucred); + if (!noatime) + vfs_mark_atime(imgp->vp, td->td_ucred); SDT_PROBE1(proc, , , exec__success, args->fname); diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c index 3138dda..6c64da4 100644 --- a/sys/kern/vfs_vnops.c +++ b/sys/kern/vfs_vnops.c @@ -309,6 +309,8 @@ vn_open_vnode(struct vnode *vp, int fmode, struct ucred *cred, return (EOPNOTSUPP); if (vp->v_type != VDIR && fmode & O_DIRECTORY) return (ENOTDIR); + if ((fmode & O_NOATIME) && ((fmode & FREAD) == 0)) + return (EOPNOTSUPP); accmode = 0; if (fmode & (FWRITE | O_TRUNC)) { if (vp->v_type == VDIR) @@ -317,6 +319,8 @@ vn_open_vnode(struct vnode *vp, int fmode, struct ucred *cred, } if (fmode & FREAD) accmode |= VREAD; + if ((fmode & O_NOATIME) && (fmode & FREAD)) + accmode |= VADMIN; if (fmode & FEXEC) accmode |= VEXEC; if ((fmode & O_APPEND) && (fmode & FWRITE)) @@ -798,6 +802,8 @@ vn_read(fp, uio, active_cred, flags, td) ioflag |= IO_NDELAY; if (fp->f_flag & O_DIRECT) ioflag |= IO_DIRECT; + if (fp->f_flag & O_NOATIME) + ioflag |= IO_NOATIME; advice = get_advice(fp, uio); vn_lock(vp, LK_SHARED | LK_RETRY); @@ -2398,6 +2404,7 @@ vn_mmap(struct file *fp, vm_map_t map, vm_offset_t *addr, vm_size_t size, vm_object_t object; vm_prot_t maxprot; boolean_t writecounted; + boolean_t noatime; int error; #if defined(COMPAT_FREEBSD7) || defined(COMPAT_FREEBSD6) || \ @@ -2470,9 +2477,10 @@ vn_mmap(struct file *fp, vm_map_t map, vm_offset_t *addr, vm_size_t size, foff < 0 || foff > OFF_MAX - size) return (EINVAL); + noatime = fp->f_flag & O_NOATIME; writecounted = 
FALSE; error = vm_mmap_vnode(td, size, prot, &maxprot, &flags, vp, - &foff, &object, &writecounted); + &foff, &object, &writecounted, noatime); if (error != 0) return (error); error = vm_mmap_object(map, addr, size, prot, maxprot, flags, object, diff --git a/sys/sys/fcntl.h b/sys/sys/fcntl.h index d1d0062..71a8e59 100644 --- a/sys/sys/fcntl.h +++ b/sys/sys/fcntl.h @@ -133,6 +133,8 @@ typedef __pid_t pid_t; #define O_VERIFY 0x00200000 /* open only after verification */ #endif +#define O_NOATIME 0x00400000 /* do not update atime */ + /* * XXX missing O_DSYNC, O_RSYNC. */ @@ -150,9 +152,9 @@ typedef __pid_t pid_t; #define OFLAGS(fflags) ((fflags) & O_EXEC ? (fflags) : (fflags) - 1) /* bits to save after open */ -#define FMASK (FREAD|FWRITE|FAPPEND|FASYNC|FFSYNC|FNONBLOCK|O_DIRECT|FEXEC) +#define FMASK (FREAD|FWRITE|FAPPEND|FASYNC|FFSYNC|FNONBLOCK|O_DIRECT|FEXEC|O_NOATIME) /* bits settable by fcntl(F_SETFL, ...) */ -#define FCNTLFLAGS (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FRDAHEAD|O_DIRECT) +#define FCNTLFLAGS (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FRDAHEAD|O_DIRECT|O_NOATIME) #if defined(COMPAT_FREEBSD7) || defined(COMPAT_FREEBSD6) || \ defined(COMPAT_FREEBSD5) || defined(COMPAT_FREEBSD4) @@ -164,7 +166,7 @@ typedef __pid_t pid_t; #define FPOSIXSHM O_NOFOLLOW #undef FCNTLFLAGS #define FCNTLFLAGS (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FPOSIXSHM|FRDAHEAD| \ - O_DIRECT) + O_DIRECT|O_NOATIME) #endif #endif diff --git a/sys/sys/file.h b/sys/sys/file.h index d14ec32..d70eb6d 100644 --- a/sys/sys/file.h +++ b/sys/sys/file.h @@ -259,7 +259,7 @@ void finit(struct file *, u_int, short, void *, struct fileops *); int fgetvp(struct thread *td, int fd, cap_rights_t *rightsp, struct vnode **vpp); int fgetvp_exec(struct thread *td, int fd, cap_rights_t *rightsp, - struct vnode **vpp); + struct vnode **vpp, struct file **fpp); int fgetvp_rights(struct thread *td, int fd, cap_rights_t *needrightsp, struct filecaps *havecaps, struct vnode **vpp); int fgetvp_read(struct thread *td, int fd, cap_rights_t 
*rightsp, diff --git a/sys/sys/vnode.h b/sys/sys/vnode.h index dedbec6..28e0923 100644 --- a/sys/sys/vnode.h +++ b/sys/sys/vnode.h @@ -302,6 +302,7 @@ struct vattr { #define IO_INVAL 0x0040 /* invalidate after I/O */ #define IO_SYNC 0x0080 /* do I/O synchronously */ #define IO_DIRECT 0x0100 /* attempt to bypass buffer cache */ +#define IO_NOATIME 0x0200 /* do not update atime */ #define IO_EXT 0x0400 /* operate on external attributes */ #define IO_NORMAL 0x0800 /* operate on regular data */ #define IO_NOMACCHECK 0x1000 /* MAC checks unnecessary */ diff --git a/sys/ufs/ffs/ffs_vnops.c b/sys/ufs/ffs/ffs_vnops.c index b1de1b8..aa3aeb4 100644 --- a/sys/ufs/ffs/ffs_vnops.c +++ b/sys/ufs/ffs/ffs_vnops.c @@ -671,6 +671,7 @@ ffs_read(ap) } if ((error == 0 || uio->uio_resid != orig_resid) && + ((ioflag & IO_NOATIME) == 0) && (vp->v_mount->mnt_flag & (MNT_NOATIME | MNT_RDONLY)) == 0 && (ip->i_flag & IN_ACCESS) == 0) { VI_LOCK(vp); diff --git a/sys/vm/vm_extern.h b/sys/vm/vm_extern.h index c37973d..7513122 100644 --- a/sys/vm/vm_extern.h +++ b/sys/vm/vm_extern.h @@ -94,7 +94,7 @@ int vm_mmap_to_errno(int rv); int vm_mmap_cdev(struct thread *, vm_size_t, vm_prot_t, vm_prot_t *, int *, struct cdev *, struct cdevsw *, vm_ooffset_t *, vm_object_t *); int vm_mmap_vnode(struct thread *, vm_size_t, vm_prot_t, vm_prot_t *, int *, - struct vnode *, vm_ooffset_t *, vm_object_t *, boolean_t *); + struct vnode *, vm_ooffset_t *, vm_object_t *, boolean_t *, boolean_t); void vm_set_page_size(void); void vm_sync_icache(vm_map_t, vm_offset_t, vm_size_t); typedef int (*pmap_pinit_t)(struct pmap *pmap); diff --git a/sys/vm/vm_mmap.c b/sys/vm/vm_mmap.c index d0f14f3..470a459 100644 --- a/sys/vm/vm_mmap.c +++ b/sys/vm/vm_mmap.c @@ -1192,7 +1192,7 @@ int vm_mmap_vnode(struct thread *td, vm_size_t objsize, vm_prot_t prot, vm_prot_t *maxprotp, int *flagsp, struct vnode *vp, vm_ooffset_t *foffp, vm_object_t *objp, - boolean_t *writecounted) + boolean_t *writecounted, boolean_t noatime) { struct 
vattr va; vm_object_t obj; @@ -1283,7 +1283,8 @@ vm_mmap_vnode(struct thread *td, vm_size_t objsize, *objp = obj; *flagsp = flags; - vfs_mark_atime(vp, cred); + if (!noatime) + vfs_mark_atime(vp, cred); done: if (error != 0 && *writecounted) { @@ -1400,7 +1401,7 @@ vm_mmap(vm_map_t map, vm_offset_t *addr, vm_size_t size, vm_prot_t prot, } case OBJT_VNODE: error = vm_mmap_vnode(td, size, prot, &maxprot, &flags, - handle, &foff, &object, &writecounted); + handle, &foff, &object, &writecounted, FALSE); break; case OBJT_DEFAULT: if (handle == NULL) { --envbJBWh7q8WU6mo-- From owner-freebsd-hackers@freebsd.org Mon Aug 28 02:31:42 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 463ABE039F5 for ; Mon, 28 Aug 2017 02:31:42 +0000 (UTC) (envelope-from cedric.blancher@gmail.com) Received: from mail-io0-x231.google.com (mail-io0-x231.google.com [IPv6:2607:f8b0:4001:c06::231]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 10CCE65252 for ; Mon, 28 Aug 2017 02:31:42 +0000 (UTC) (envelope-from cedric.blancher@gmail.com) Received: by mail-io0-x231.google.com with SMTP id g33so12180806ioj.3 for ; Sun, 27 Aug 2017 19:31:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=ds6LJnvDJ8eWtBcjlq0lth9tBzmNAv+bRntApi+Y98w=; b=PtFGJyqotu9yk48pjpYxaagfhdaGRXGsvIIASa9WQtJliR8nBfbZ5zgOkxcHGmpUEC tQExBYZHaBt1sNGPynd8vD/+X5XMcrrGBXxVykija49B4HETT2fjq82nxqEOAyNYt9NQ Vpb/jKUGR+fNa6dwxs35dnn+20+rz59hdRwlczHgiuIoCues295PCK7dFCk6WhkjldKI OsIHN4pkzRUwqSsBrTtpZyHiaShBoOVmMT3dZ+mTQWaIDKYplK8gB05P7Z19Q1x4GVFe c/3JXq7Q76HlBFwEz1gpeoXU/svXS+H+DoRczT7pRju37QNc3YId/5SLXLasau4w24TI gFAg== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=ds6LJnvDJ8eWtBcjlq0lth9tBzmNAv+bRntApi+Y98w=; b=KfIJ+oQpqDFXjV58eppvEf6YUHinHE7Jaa3su4DqfpcIWoRXi4ExUJtTqQAKwT4vCJ FphnRSMYdnmznBX6l0PAdKapCJCCC34FhPcprajt82Q4+iqbmoEt59ptZH/HCx862ZtC RTHh1+JErQt9MqUUuCVBOxyBjiuePF8VoQ6gaYvCBLO0AFmfV92SmaGEXqMs5eFbejsx tsLyvZYqBgXXmRgU8Gh4SIOAfRwL3YFzSBZtncQo8rqqZA3CtP/Rdmk6C4FYPKu4F4Ul 5cFlXcrbkk+daM3YTbaMUBKAvSu6IYP4mcKBYvEzAB33qvVzgPd3q7lLrSboiE/kRByu bCrg== X-Gm-Message-State: AHYfb5jslH+CYeU2FJata9dWCsAZe7E+y8Nk6dRk1LNQt7GIV9XGRWAw j4n24n7gWsV4ITEvv15t+H1lcOda4A== X-Received: by 10.107.35.84 with SMTP id j81mr5337725ioj.131.1503887501165; Sun, 27 Aug 2017 19:31:41 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.150.203 with HTTP; Sun, 27 Aug 2017 19:31:40 -0700 (PDT) In-Reply-To: <20170826161827.GA21456@schoggimuss.roe.ch> References: <20170826161827.GA21456@schoggimuss.roe.ch> From: Cedric Blancher Date: Mon, 28 Aug 2017 04:31:40 +0200 Message-ID: Subject: Re: [PATCH] O_NOATIME support for open(2) To: "freebsd-hackers@freebsd.org" Content-Type: text/plain; charset="UTF-8" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Aug 2017 02:31:42 -0000 You know, this was long discussed in a Solaris rfe, and it was found that O_NOATIME has serious security implications and can be used to circumvent atime-based monitoring. So basically, you open a security hole with this. Ced On 26 August 2017 at 18:18, Daniel Roethlisberger wrote: > I'm trying to implement O_NOATIME support for open(2) in order to > provide a more elegant way for backup/archiving software to > prevent atime clobbering. 
Except for a 2008 thread on this list > I did not find any material; not sure if anybody is interested in > this or if there are reasons why this was never implemented. > > The attached patch against 11.1 implements O_NOATIME support for > open(2); it prevents read(2) and mmap(2) from clobbering atime if > the file descriptor was opened with O_NOATIME. O_NOATIME is only > permitted for root and the owner of the file. Currently it is > only implemented for ufs/ffs. It seems to work for me but has > not been extensively tested. > > I am interested in feedback from people who know their way around > I/O and VFS code before I extend this to other file systems, make > O_NOATIME tunable by fcntl(2), wire it to the Linux compat layer > and write docs. Does the implementation look sane? Did I miss > something important? > > Specifically, is there a better way to pass O_NOATIME into > vm_mmap_vnode other than adding an additional boolean_t argument? > I did not use an additional mmap flag because that would have > required additional logic to prevent userland from passing the > flag to the mmap syscall. 
> > Daniel > > -- > Daniel Roethlisberger > http://daniel.roe.ch/ > > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" -- Cedric Blancher [https://plus.google.com/u/0/+CedricBlancher/] Institute Pasteur From owner-freebsd-hackers@freebsd.org Mon Aug 28 06:55:11 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E7820E074E7 for ; Mon, 28 Aug 2017 06:55:11 +0000 (UTC) (envelope-from ed@nuxi.nl) Received: from mail-yw0-x230.google.com (mail-yw0-x230.google.com [IPv6:2607:f8b0:4002:c05::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9BD306B915 for ; Mon, 28 Aug 2017 06:55:11 +0000 (UTC) (envelope-from ed@nuxi.nl) Received: by mail-yw0-x230.google.com with SMTP id s143so24522246ywg.0 for ; Sun, 27 Aug 2017 23:55:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nuxi-nl.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=DvodAsPZpi2wonsgqc1V0TKjRmiLF0x3zShDcRQIQuw=; b=O9zCMtlK22rDzs/GFznZmXLiUSnwGnZN9OWsuGmuqHCVuxlUsSCwWrcFtFPZ8FT8NB 9PyCvdH6nYRwlnmN767kQMdbPyARuVFduIhl7FZ+/qyrZsU5LjO2QajJ3/yNW6u9BPS0 AeDWxRWXPkeajxgFmWtyv+tS4RAhQNTGMCZMFcQPzmUhr7e0YeF0RuRqYVUuwieUkkXb Ivd1TAM6Dy00HAcf6kXLHci8fDL3+KT8TR+SB3t5ATQHb0Se5yLU1vPETrly7lU/0zHE AQ4+QdJ7RTiISH/0082nfmusdq49x0MRLGJm+IpwA/qeSZG3AQ1n9zEjj6iSE0TXlv1H Wfhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; 
bh=DvodAsPZpi2wonsgqc1V0TKjRmiLF0x3zShDcRQIQuw=; b=cl99oktWL1B/ezpentXKfqe105/EbwkdnMYIhwB5ySvx7y2K7D7Va7AZ3XK8uvd5k3 UKuCF7W4/05R49XBQYA5IwjOBEbkXQL/iSU/gQDf29G3dnUjzepPms8DFkaJ7CkR+wbS xmewC9r8Qlhm+cYL72Z/Vq0qyE6csBoEHxYPhX7Nsq/ZQPEHrKZc0l5j8KGK8s4AZqqv Q8Ci5xaQ1KiXA1tTRBk/2xsQJy7QSTCphy7EEbTvRQ4WOvtVvifv7sUY/03QsIukip3H blSCqwKMQVjlYloa9CxflVyRD5aGWwqBsMA44z9scNtzhV7CVX0KADM8gtB41E0nwjUC mpJg== X-Gm-Message-State: AHYfb5joDZjW03qKz4volKKEeDN4SgVW9eRUHwdttLDaIf5VpoEh5z8A ACOx6EFZwDv1iPI0GCxT8lvkTnMevtgv X-Received: by 10.129.50.206 with SMTP id y197mr2857517ywy.314.1503903310687; Sun, 27 Aug 2017 23:55:10 -0700 (PDT) MIME-Version: 1.0 Received: by 10.13.227.193 with HTTP; Sun, 27 Aug 2017 23:54:40 -0700 (PDT) In-Reply-To: References: <1C5A448F-C91A-4599-8500-E4E46E6F5205@dsl-only.net> From: Ed Schouten Date: Mon, 28 Aug 2017 08:54:40 +0200 Message-ID: Subject: Re: svn commit: r322875 - head/sys/dev/nvme To: Mark Millard Cc: David Chisnall , Warner Losh , svn-src-head@freebsd.org, FreeBSD Current , FreeBSD-STABLE Mailing List , freebsd-hackers Content-Type: text/plain; charset="UTF-8" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Aug 2017 06:55:12 -0000 Mark, 2017-08-25 14:53 GMT+02:00 Ed Schouten : > 2017-08-25 9:46 GMT+02:00 Mark Millard : >> It appears that at least 11.1-STABLE -r322807 does not handle >> -std=c++98 styles of use of _Static_assert for g++7 in that >> g++7 reports an error: > > Maybe we need to do something like this? 
> > Index: sys/sys/cdefs.h > =================================================================== > --- sys/sys/cdefs.h (revision 322887) > +++ sys/sys/cdefs.h (working copy) > @@ -294,7 +294,7 @@ > #if (defined(__cplusplus) && __cplusplus >= 201103L) || \ > __has_extension(cxx_static_assert) > #define _Static_assert(x, y) static_assert(x, y) > -#elif __GNUC_PREREQ__(4,6) > +#elif __GNUC_PREREQ__(4,6) && !defined(__cplusplus) > /* Nothing, gcc 4.6 and higher has _Static_assert built-in */ > #elif defined(__COUNTER__) > #define _Static_assert(x, y) __Static_assert(x, __COUNTER__) Could you let me know whether this patch fixes the build for you? If so, I'll commit it! -- Ed Schouten Nuxi, 's-Hertogenbosch, the Netherlands KvK-nr.: 62051717 From owner-freebsd-hackers@freebsd.org Mon Aug 28 07:21:39 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C29CFE07B21 for ; Mon, 28 Aug 2017 07:21:39 +0000 (UTC) (envelope-from daniel@roe.ch) Received: from schoggimuss.roe.ch (schoggimuss.roe.ch [IPv6:2a03:da40:0:35::10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 8B3CB6C561 for ; Mon, 28 Aug 2017 07:21:39 +0000 (UTC) (envelope-from daniel@roe.ch) Received: from daniel (ssh-from [178.197.231.54]) by schoggimuss.roe.ch (envelope-from ) with LOCAL id 1dmEMZ-000AVh-G9 for freebsd-hackers@freebsd.org; Mon, 28 Aug 2017 09:21:35 +0200 Date: Mon, 28 Aug 2017 09:21:35 +0200 From: Daniel Roethlisberger To: freebsd-hackers@freebsd.org Subject: Re: [PATCH] O_NOATIME support for open(2) Message-ID: <20170828072135.GA40198@schoggimuss.roe.ch> Mail-Followup-To: freebsd-hackers@freebsd.org References: <20170826161827.GA21456@schoggimuss.roe.ch> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline 
In-Reply-To: User-Agent: Mutt/1.8.3 (2017-05-23) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Aug 2017 07:21:39 -0000 Cedric Blancher 2017-08-28: > You know, this was long discussed in a Solaris rfe, Can you provide a pointer to the discussion you are referring to? > and it was found that O_NOATIME has serious security > implications and can be used to circumvent atime-based > monitoring. So basically, you open a security hole with this. Can you elaborate on what exactly you mean by "atime-based monitoring"? Are you thinking about DFIR? How would the "serious security implications" differ from those of utimes(2)? Note that the use of O_NOATIME is restricted to the file owner and root. My take would be that atimes should not be confused with auditing. Daniel -- Daniel Roethlisberger http://daniel.roe.ch/ From owner-freebsd-hackers@freebsd.org Mon Aug 28 09:02:14 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id ED1E7E09401 for ; Mon, 28 Aug 2017 09:02:14 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: from asp.reflexion.net (outbound-mail-210-57.reflexion.net [208.70.210.57]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9D7CA6EFC6 for ; Mon, 28 Aug 2017 09:02:14 +0000 (UTC) (envelope-from markmi@dsl-only.net) Received: (qmail 2939 invoked from network); 28 Aug 2017 09:02:07 -0000 Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1) by 0 (rfx-qmail) with SMTP; 28 Aug 2017 09:02:07 -0000 Received: by mail-cs-01.app.dca.reflexion.local (Reflexion email security v8.40.2) with SMTP; Mon,
28 Aug 2017 05:02:07 -0400 (EDT) Received: (qmail 3497 invoked from network); 28 Aug 2017 09:02:07 -0000 Received: from unknown (HELO iron2.pdx.net) (69.64.224.71) by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 28 Aug 2017 09:02:07 -0000 Received: from [192.168.1.109] (c-67-170-167-181.hsd1.or.comcast.net [67.170.167.181]) by iron2.pdx.net (Postfix) with ESMTPSA id 54F3EEC8816; Mon, 28 Aug 2017 02:02:06 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: svn commit: r322875 - head/sys/dev/nvme From: Mark Millard In-Reply-To: Date: Mon, 28 Aug 2017 02:02:05 -0700 Cc: David Chisnall , Warner Losh , svn-src-head@freebsd.org, FreeBSD Current , FreeBSD-STABLE Mailing List , freebsd-hackers Content-Transfer-Encoding: 7bit Message-Id: <0639EC0E-1F0F-4CB0-A3FE-4E8CD814B6D3@dsl-only.net> References: <1C5A448F-C91A-4599-8500-E4E46E6F5205@dsl-only.net> To: Ed Schouten X-Mailer: Apple Mail (2.3273) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Aug 2017 09:02:15 -0000 On 2017-Aug-27, at 11:54 PM, Ed Schouten wrote: > 2017-08-25 14:53 GMT+02:00 Ed Schouten : >> 2017-08-25 9:46 GMT+02:00 Mark Millard : >>> It appears that at least 11.1-STABLE -r322807 does not handle >>> -std=c++98 styles of use of _Static_assert for g++7 in that >>> g++7 reports an error: >> >> Maybe we need to do something like this? 
>> >> Index: sys/sys/cdefs.h >> =================================================================== >> --- sys/sys/cdefs.h (revision 322887) >> +++ sys/sys/cdefs.h (working copy) >> @@ -294,7 +294,7 @@ >> #if (defined(__cplusplus) && __cplusplus >= 201103L) || \ >> __has_extension(cxx_static_assert) >> #define _Static_assert(x, y) static_assert(x, y) >> -#elif __GNUC_PREREQ__(4,6) >> +#elif __GNUC_PREREQ__(4,6) && !defined(__cplusplus) >> /* Nothing, gcc 4.6 and higher has _Static_assert built-in */ >> #elif defined(__COUNTER__) >> #define _Static_assert(x, y) __Static_assert(x, __COUNTER__) > > Could you let me know whether this patch fixes the build for you? If > so, I'll commit it! As a variant of stable/11 -r322807 . . . buildworld and buildkernel seem to work fine. (I did not try any port [re-]builds.) Based on the same main.cc as before . . . g++7 -std=c++98 main.cc g++7 -Wpedantic -std=c++98 main.cc g++7 -std=c++03 main.cc g++7 -Wpedantic -std=c++03 main.cc no longer complain (so no error, no warning). clang++ -Wpedantic -std=c++11 main.cc clang++ -Wpedantic -std=c++98 main.cc clang++ -Wpedantic -std=c++03 main.cc each still give the warning but no error. g++7 -Wpedantic -std=c++11 main.cc g++7 -std=c++11 main.cc clang++ -std=c++11 main.cc clang++ -std=c++98 main.cc clang++ -std=c++03 main.cc are still silent, no errors, no warnings. Note that clang here is version 4 --the same as in my original report that had the g++7 rejection example. This is because of the stable/11 context that I used. (An intended MFC had been listed.) If needed I could probably try under some version of head (and so test clang version 5). 
=== Mark Millard markmi at dsl-only.net From owner-freebsd-hackers@freebsd.org Mon Aug 28 09:36:49 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E10ECE09D3C for ; Mon, 28 Aug 2017 09:36:49 +0000 (UTC) (envelope-from ed@nuxi.nl) Received: from mail-yw0-x22f.google.com (mail-yw0-x22f.google.com [IPv6:2607:f8b0:4002:c05::22f]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9B14270177 for ; Mon, 28 Aug 2017 09:36:49 +0000 (UTC) (envelope-from ed@nuxi.nl) Received: by mail-yw0-x22f.google.com with SMTP id s187so13809322ywf.2 for ; Mon, 28 Aug 2017 02:36:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nuxi-nl.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=u/B4CIKTIpXtJTDiWp0pYzBNCK+pwX/dz8cXZe59aU8=; b=OwN4Jvtgsv4wQf6E8c3As7Ou7Z1YkhCYLnKYu8zHXGLKeyX+xTCHlpOBDvhVe10Zvq ElF9ntamn5VUAT/O7Ez2jGTc0rpMacYEWoo6q6BMzQjcfUp+d96BwKq664r9s5MHIkvJ xxfTWcZ4M36HU7iv5qlXL82htQWtiGdhaAtf2Irc8hLIJUVgf9LP48HZ1TNMrM7yjWVA 3nK2iFESEU1UiAlnzZu669y9XN63++gj2rilZo/Qf0hhsMSY2/fq/j/avj/qX1V8ZUJX lXIHdMsBMYYslASPcMxtYzzYv1IHHwyK9mEfjqZD5oRw4+gBERdnYINkmcCvEOPAtk1M 0LZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=u/B4CIKTIpXtJTDiWp0pYzBNCK+pwX/dz8cXZe59aU8=; b=qh/EHMIbrBfDEXyumhU0/laJMsVp4YoBFn+icfpEZqVlGQOl/uGnQGYlV+B2tA8N/z o1DNHoJGZXrSasPOYKj9bk0TrhLhLlJWnTfB6dadznk/54FPBW72c3prmxtw1WmjhBYS SZcl/EpmdT6IXYUMLBf/UVgWg/tvXtL8eOTX1lfsdHUCZFMAXhD2vBpOak6TGUQojAS9 7IDTNIShIUhMSyt2yzP6wv7mc6aX6kSaMHspDEJatb+z4NcZb0ftviYakRfvaQcrAMCB 
vHDuHVJM4c9Jd9dXsfYHll7ZfaWOq+JyaSxwzTMm5wBDNhBsAak/61SKmDf/0L6rHKGI +QMA== X-Gm-Message-State: AHYfb5hY2xcqlibmEzF0QCdfOXDm6ispJO/FREz1SThfQJaoTv6C56iq WECRg1f8qTDqupui0AfygxByKjZ0ja7o X-Received: by 10.37.171.208 with SMTP id v74mr5173459ybi.99.1503913008729; Mon, 28 Aug 2017 02:36:48 -0700 (PDT) MIME-Version: 1.0 Received: by 10.13.227.193 with HTTP; Mon, 28 Aug 2017 02:36:18 -0700 (PDT) In-Reply-To: <0639EC0E-1F0F-4CB0-A3FE-4E8CD814B6D3@dsl-only.net> References: <1C5A448F-C91A-4599-8500-E4E46E6F5205@dsl-only.net> <0639EC0E-1F0F-4CB0-A3FE-4E8CD814B6D3@dsl-only.net> From: Ed Schouten Date: Mon, 28 Aug 2017 11:36:18 +0200 Message-ID: Subject: Re: svn commit: r322875 - head/sys/dev/nvme To: Mark Millard Cc: David Chisnall , Warner Losh , svn-src-head@freebsd.org, FreeBSD Current , FreeBSD-STABLE Mailing List , freebsd-hackers Content-Type: text/plain; charset="UTF-8" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Aug 2017 09:36:50 -0000 2017-08-28 11:02 GMT+02:00 Mark Millard : > Based on the same main.cc as before . . . > > g++7 -std=c++98 main.cc > g++7 -Wpedantic -std=c++98 main.cc > g++7 -std=c++03 main.cc > g++7 -Wpedantic -std=c++03 main.cc > > no longer complain (so no error, no > warning). Perfect! I've committed this change as r322965. Thanks for testing! 
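For readers following the thread, here is a minimal sketch of the kind of fallback the last #elif branch in the patch context selects when the compiler has no built-in _Static_assert. It uses a common typedef-with-negative-array-size technique; MY_STATIC_ASSERT and the helper macros are hypothetical names for illustration, not the definitions in sys/cdefs.h, and __COUNTER__ is assumed to be available (it is a GCC/Clang extension). The message argument is accepted but unused.

```c
/*
 * Sketch of a static assertion for compilers without a built-in
 * _Static_assert. All macro names here are hypothetical.
 */
#define MY_SA_CONCAT_(a, b)	a##b
#define MY_SA_CONCAT(a, b)	MY_SA_CONCAT_(a, b)

/*
 * A negative array size is a compile-time error, so a false condition
 * stops the build. __COUNTER__ keeps each typedef name unique.
 */
#define MY_STATIC_ASSERT(cond, msg) \
	typedef char MY_SA_CONCAT(my_static_assert_, __COUNTER__)[(cond) ? 1 : -1]

/* Both checks hold on any conforming implementation. */
MY_STATIC_ASSERT(sizeof(char) == 1, "char must be one byte");
MY_STATIC_ASSERT(sizeof(long) >= sizeof(int), "long narrower than int");

/* Returns 1; reaching it at all means the assertions above compiled. */
int
static_asserts_held(void)
{
	return (1);
}
```

Changing either condition to something false should make the translation unit fail to compile, which is the whole point of the construct.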
-- Ed Schouten Nuxi, 's-Hertogenbosch, the Netherlands KvK-nr.: 62051717 From owner-freebsd-hackers@freebsd.org Thu Aug 31 11:12:37 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2110CE1C13A; Thu, 31 Aug 2017 11:12:37 +0000 (UTC) (envelope-from eugen@grosbein.net) Received: from hz.grosbein.net (hz.grosbein.net [78.47.246.247]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "hz.grosbein.net", Issuer "hz.grosbein.net" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id A8FA073A86; Thu, 31 Aug 2017 11:12:36 +0000 (UTC) (envelope-from eugen@grosbein.net) Received: from eg.sd.rdtc.ru (root@eg.sd.rdtc.ru [62.231.161.221] (may be forged)) by hz.grosbein.net (8.15.2/8.15.2) with ESMTPS id v7VBCUhj044682 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 31 Aug 2017 13:12:31 +0200 (CEST) (envelope-from eugen@grosbein.net) X-Envelope-From: eugen@grosbein.net X-Envelope-To: ycyc321@gmail.com Received: from eg.sd.rdtc.ru (eugen@localhost [127.0.0.1]) by eg.sd.rdtc.ru (8.15.2/8.15.2) with ESMTP id v7VBCQg4021630; Thu, 31 Aug 2017 18:12:26 +0700 (+07) (envelope-from eugen@grosbein.net) Subject: Re: How to know the address ranges of kernel stacks, for user processes and kernel threads? 
To: Yue Chen , freebsd-hackers@freebsd.org, freebsd-current@freebsd.org References: From: Eugene Grosbein Message-ID: <59A7EF1A.7020502@grosbein.net> Date: Thu, 31 Aug 2017 18:12:26 +0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Status: Yes, score=5.5 required=5.0 tests=BAYES_00, DATE_IN_FUTURE_96_Q, LOCAL_FROM,RDNS_NONE autolearn=no autolearn_force=no version=3.4.1 X-Spam-Report: * 3.3 DATE_IN_FUTURE_96_Q Date: is 4 days to 4 months after Received: date * -2.3 BAYES_00 BODY: Bayes spam probability is 0 to 1% * [score: 0.0000] * 1.9 RDNS_NONE Delivered to internal network by a host with no rDNS * 2.6 LOCAL_FROM From my domains X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on hz.grosbein.net X-Spam-Flag: YES X-Spam-Level: ***** X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Aug 2017 11:12:37 -0000 On 29.01.2015 07:54, Yue Chen wrote: > It seems that each kernel stack has two pages (IA-32) to use. Does x86_64 > still have two pages or more? One can change the number of kernel stack pages on the i386 and amd64 platforms via /boot/loader.conf, without rebuilding the kernel, using the kern.kstack_pages tunable. It defaults to 2 on i386 and to 4 on amd64, but I raise it from 2 to 4 on my i386 systems running IPSEC tunnels and/or wifi because these subsystems are stack-greedy.
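As a minimal sketch of the setting described above, assuming the usual /boot/loader.conf name="value" syntax; the value 4 is only the one mentioned for i386, not a recommendation:

```
# /boot/loader.conf
# Kernel stack size in pages; takes effect at the next boot.
kern.kstack_pages="4"
```

After the reboot, running sysctl kern.kstack_pages should report the value in effect.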
See also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219476 From owner-freebsd-hackers@freebsd.org Thu Aug 31 13:00:56 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 319D6E1E4D0 for ; Thu, 31 Aug 2017 13:00:56 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.116.210]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E4CB377281 for ; Thu, 31 Aug 2017 13:00:54 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from bach.cs.huji.ac.il ([132.65.80.20]) by kabab.cs.huji.ac.il with esmtp id 1dnOtE-000NBd-Hq for freebsd-hackers@freebsd.org; Thu, 31 Aug 2017 15:48:08 +0300 From: Daniel Braniss Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: slow pxeboot on newer dell/optiplex Message-Id: <1536BD70-C292-4435-9DD4-0BA81A0B242B@cs.huji.ac.il> Date: Thu, 31 Aug 2017 15:48:08 +0300 To: Freebsd hackers list X-Mailer: Apple Mail (2.3273) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Aug 2017 13:00:56 -0000 Hi, A Dell OptiPlex 9020 boots via pxeboot just fine, but newer ones, e.g. the 7050, boot very slowly; even after upgrading the BIOS (which fixed some serial issues) it’s still very slow. BTW, booting Linux is OK. I will try iPXE later, but if anyone has any ideas, you are very welcome!
thanks, danny From owner-freebsd-hackers@freebsd.org Thu Aug 31 13:40:21 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 220A3E1EDC3 for ; Thu, 31 Aug 2017 13:40:21 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [IPv6:2607:f3e0:80:80::2]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 69A347C560 for ; Thu, 31 Aug 2017 13:40:20 +0000 (UTC) (envelope-from mike@sentex.net) Received: from lava.sentex.ca (lava.sentex.ca [IPv6:2607:f3e0:0:5::11]) by smarthost2.sentex.ca (8.15.2/8.15.2) with ESMTPS id v7VDeIDB060773 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Thu, 31 Aug 2017 09:40:19 -0400 (EDT) (envelope-from mike@sentex.net) Received: from [192.168.43.26] (saphire3.sentex.ca [192.168.43.26]) by lava.sentex.ca (8.15.2/8.15.2) with ESMTP id v7VDeGFs094357; Thu, 31 Aug 2017 09:40:16 -0400 (EDT) (envelope-from mike@sentex.net) Subject: Re: slow pxeboot on newer dell/optiplex To: Daniel Braniss , Freebsd hackers list References: <1536BD70-C292-4435-9DD4-0BA81A0B242B@cs.huji.ac.il> From: Mike Tancsa Organization: Sentex Communications Message-ID: <494d3688-655a-92a2-2254-59b1494a82a0@sentex.net> Date: Thu, 31 Aug 2017 09:40:15 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: <1536BD70-C292-4435-9DD4-0BA81A0B242B@cs.huji.ac.il> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.78 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Aug 2017 13:40:21 -0000 On 8/31/2017 8:48 AM, Daniel Braniss wrote: > Hi, > A dell optiplex 9020 boots via pxeboot just fine, newer ones, e.g. 7050 boot very slow, > even after upgrading the bios (which fixed some serial issues) it’s still very slow. > btw, bootting linux is ok. > I will try ipxe later, but if anyone has any ideas, you are very welcome!

I think I am seeing the same issue, or at least the same symptoms, on a SuperMicro X11SSL-F. Upgrading the BIOS didn't fix the issue either (version 2.0a, release date 03/09/2017).

0(4usupermicro)# pciconf -lcvb igb0
igb0@pci0:3:0:0: class=0x020000 card=0x153315d9 chip=0x15338086 rev=0x03 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'I210 Gigabit Network Connection'
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 32, base 0xdf500000, size 524288, enabled
    bar [18] = type I/O Port, range 32, base 0xd000, size 32, enabled
    bar [1c] = type Memory, range 32, base 0xdf580000, size 16384, enabled
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
    cap 11[70] = MSI-X supports 5 messages, enabled
        Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR NS
        link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[140] = Serial 1 0cc47affffe3b9a4
    ecap 0017[1a0] = TPH Requester 1
0(4usupermicro)#

What is the NIC in your Dell?
---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-hackers@freebsd.org Thu Aug 31 14:01:11 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 73159E1F227 for ; Thu, 31 Aug 2017 14:01:11 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [IPv6:2607:f3e0:80:80::2]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 402577D235 for ; Thu, 31 Aug 2017 14:01:11 +0000 (UTC) (envelope-from mike@sentex.net) Received: from lava.sentex.ca (lava.sentex.ca [IPv6:2607:f3e0:0:5::11]) by smarthost2.sentex.ca (8.15.2/8.15.2) with ESMTPS id v7VE1AsI065671 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Thu, 31 Aug 2017 10:01:10 -0400 (EDT) (envelope-from mike@sentex.net) Received: from [192.168.43.26] (saphire3.sentex.net [192.168.43.26]) by lava.sentex.ca (8.15.2/8.15.2) with ESMTP id v7VE18se094423; Thu, 31 Aug 2017 10:01:08 -0400 (EDT) (envelope-from mike@sentex.net) Subject: Re: slow pxeboot on newer dell/optiplex From: Mike Tancsa To: Daniel Braniss , Freebsd hackers list References: <1536BD70-C292-4435-9DD4-0BA81A0B242B@cs.huji.ac.il> <494d3688-655a-92a2-2254-59b1494a82a0@sentex.net> Organization: Sentex Communications Message-ID: <93865b13-9b06-c727-950d-490d27741dc5@sentex.net> Date: Thu, 31 Aug 2017 10:01:07 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: <494d3688-655a-92a2-2254-59b1494a82a0@sentex.net> Content-Type: text/plain; charset=utf-8 Content-Language: 
en-US Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.78 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Aug 2017 14:01:11 -0000 On 8/31/2017 9:40 AM, Mike Tancsa wrote: > On 8/31/2017 8:48 AM, Daniel Braniss wrote: >> Hi, >> A dell optiplex 9020 boots via pxeboot just fine, newer ones, e.g. 7050 boot very slow, >> even after upgrading the bios (which fixed some serial issues) it’s still very slow. >> btw, bootting linux is ok. >> I will try ipxe later, but if anyone has any ideas, you are very welcome! > > I think I am seeing the same issue-- or at least the same symptoms-- on > a SuperMicro X11SSL-F. Upgrading the BIOS didnt fix the issue either > (version 2.0a Release Date: 03/09/2017) More details in https://lists.freebsd.org/pipermail/freebsd-questions/2017-March/276305.html ---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-hackers@freebsd.org Thu Aug 31 14:02:08 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 89539E1F41C for ; Thu, 31 Aug 2017 14:02:08 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.116.210]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4114A7D416 for ; Thu, 31 Aug 2017 14:02:07 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from imac.bs.cs.huji.ac.il ([132.65.179.42]) by kabab.cs.huji.ac.il with esmtp id 1dnQ2f-0001yu-B9; Thu, 31 Aug 2017 17:01:57 
+0300 Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: slow pxeboot on newer dell/optiplex From: Daniel Braniss In-Reply-To: <494d3688-655a-92a2-2254-59b1494a82a0@sentex.net> Date: Thu, 31 Aug 2017 17:01:56 +0300 Cc: Freebsd hackers list Message-Id: <268D525C-F99B-434B-BB66-27DE95AC872F@cs.huji.ac.il> References: <1536BD70-C292-4435-9DD4-0BA81A0B242B@cs.huji.ac.il> <494d3688-655a-92a2-2254-59b1494a82a0@sentex.net> To: Mike Tancsa X-Mailer: Apple Mail (2.3124) Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Aug 2017 14:02:08 -0000 > On 31 Aug 2017, at 4:40 PM, Mike Tancsa wrote: > > On 8/31/2017 8:48 AM, Daniel Braniss wrote: >> Hi, >> A dell optiplex 9020 boots via pxeboot just fine, newer ones, e.g. 7050 boot very slow, >> even after upgrading the bios (which fixed some serial issues) it’s still very slow. >> btw, bootting linux is ok. >> I will try ipxe later, but if anyone has any ideas, you are very welcome! > > I think I am seeing the same issue-- or at least the same symptoms-- on > a SuperMicro X11SSL-F.
Upgrading the BIOS didnt fix the issue either
> (version 2.0a Release Date: 03/09/2017)
>
>
> 0(4usupermicro)# pciconf -lcvb igb0
> igb0@pci0:3:0:0: class=0x020000 card=0x153315d9 chip=0x15338086
> rev=0x03 hdr=0x00
> vendor = 'Intel Corporation'
> device = 'I210 Gigabit Network Connection'
> class = network
> subclass = ethernet
> bar [10] = type Memory, range 32, base 0xdf500000, size 524288,
> enabled
> bar [18] = type I/O Port, range 32, base 0xd000, size 32, enabled
> bar [1c] = type Memory, range 32, base 0xdf580000, size 16384, enabled
> cap 01[40] = powerspec 3 supports D0 D3 current D0
> cap 05[50] = MSI supports 1 message, 64 bit, vector masks
> cap 11[70] = MSI-X supports 5 messages, enabled
> Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
> cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR NS
> link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
> ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
> ecap 0003[140] = Serial 1 0cc47affffe3b9a4
> ecap 0017[1a0] = TPH Requester 1
> 0(4usupermicro)#
>
>
> What is the NIC in your Dell ?
em0@pci0:0:31:6: class=0x020000 card=0x07a11028 chip=0x15e38086 rev=0x00 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Ethernet Connection (5) I219-LM'
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 32, base 0xf7100000, size 131072, enabled
    cap 01[c8] = powerspec 3 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP

the thing is, after it boots, all is ok, so it’s something in the pxe that pxeboot is in conflict with …

danny

> > ---Mike > > > -- > ------------------- > Mike Tancsa, tel +1 519 651 3400 > Sentex Communications, mike@sentex.net > Providing Internet services since 1994 www.sentex.net > Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-hackers@freebsd.org Thu Aug 31 14:14:26 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6C983E1F7A4 for ; Thu, 31 Aug 2017 14:14:26 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [IPv6:2607:f3e0:80:80::2]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 393347DADD for ; Thu, 31 Aug 2017 14:14:26 +0000 (UTC) (envelope-from mike@sentex.net) Received: from lava.sentex.ca (lava.sentex.ca [IPv6:2607:f3e0:0:5::11]) by smarthost2.sentex.ca (8.15.2/8.15.2) with ESMTPS id v7VEEO5c068994 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Thu, 31 Aug 2017 10:14:25 -0400 (EDT) (envelope-from mike@sentex.net) Received: from [192.168.43.26] (saphire3.sentex.ca [192.168.43.26]) by lava.sentex.ca (8.15.2/8.15.2) with ESMTP id v7VEENpj094444; Thu, 31 Aug 2017 10:14:23 -0400 (EDT)
(envelope-from mike@sentex.net) Subject: Re: slow pxeboot on newer dell/optiplex To: Daniel Braniss Cc: Freebsd hackers list References: <1536BD70-C292-4435-9DD4-0BA81A0B242B@cs.huji.ac.il> <494d3688-655a-92a2-2254-59b1494a82a0@sentex.net> <268D525C-F99B-434B-BB66-27DE95AC872F@cs.huji.ac.il> From: Mike Tancsa Organization: Sentex Communications Message-ID: <6284eb62-91cf-c4c9-78ff-347ec5318696@sentex.net> Date: Thu, 31 Aug 2017 10:14:22 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: <268D525C-F99B-434B-BB66-27DE95AC872F@cs.huji.ac.il> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.78 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 31 Aug 2017 14:14:26 -0000 On 8/31/2017 10:01 AM, Daniel Braniss wrote: > > the thing is, after it boots, all is ok, so it’s something in the pxe > that pxeboot is in > conflict with … Yes, same here. 
Once the kernel is loaded, network throughput is normal ---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-hackers@freebsd.org Fri Sep 1 06:50:47 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 396FBE0CB27 for ; Fri, 1 Sep 2017 06:50:47 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.116.210]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DC5F663E94 for ; Fri, 1 Sep 2017 06:50:46 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from bach.cs.huji.ac.il ([132.65.80.20]) by kabab.cs.huji.ac.il with esmtp id 1dnfmm-0002uA-UH; Fri, 01 Sep 2017 09:50:36 +0300 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: slow pxeboot on newer dell/optiplex From: Daniel Braniss In-Reply-To: <6284eb62-91cf-c4c9-78ff-347ec5318696@sentex.net> Date: Fri, 1 Sep 2017 09:50:36 +0300 Cc: Freebsd hackers list Content-Transfer-Encoding: quoted-printable Message-Id: <4EB138E7-1DCF-4F00-8E02-120F5704C6C4@cs.huji.ac.il> References: <1536BD70-C292-4435-9DD4-0BA81A0B242B@cs.huji.ac.il> <494d3688-655a-92a2-2254-59b1494a82a0@sentex.net> <268D525C-F99B-434B-BB66-27DE95AC872F@cs.huji.ac.il> <6284eb62-91cf-c4c9-78ff-347ec5318696@sentex.net> To: Mike Tancsa X-Mailer: Apple Mail (2.3273) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Sep 2017 06:50:47 -0000 I tried 
undionly.kpxe, the loading of pxeboot went really fast, but as soon as pxeboot started it slowed down. danny > On 31 Aug 2017, at 17:14, Mike Tancsa wrote: > > On 8/31/2017 10:01 AM, Daniel Braniss wrote: >> >> the thing is, after it boots, all is ok, so it’s something in the pxe >> that pxeboot is in >> conflict with … > Yes, same here. Once the kernel is loaded, network throughput is normal > > ---Mike > > > -- > ------------------- > Mike Tancsa, tel +1 519 651 3400 > Sentex Communications, mike@sentex.net > Providing Internet services since 1994 www.sentex.net > Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-hackers@freebsd.org Fri Sep 1 15:43:47 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id ABEAEE15E26 for ; Fri, 1 Sep 2017 15:43:47 +0000 (UTC) (envelope-from devgs@ukr.net) Received: from frv191.fwdcdn.com (frv191.fwdcdn.com [212.42.77.191]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6FB0476FA9 for ; Fri, 1 Sep 2017 15:43:46 +0000 (UTC) (envelope-from devgs@ukr.net) Received: from [10.10.2.23] (helo=frv198.fwdcdn.com) by frv191.fwdcdn.com with esmtp ID 1dnnks-000DE4-Ik for freebsd-hackers@freebsd.org; Fri, 01 Sep 2017 18:21:10 +0300 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=ukr.net; s=ffe; h=Content-Transfer-Encoding:Content-Type:MIME-Version:Message-Id:To: Subject:From:Date:Sender:Reply-To:Cc:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: In-Reply-To:References:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=PuQCoafWGHc1y1XBWkuf33OMlHg4Wr1RXSnIhqlGT3k=; b=MzNvjhl9OU1/AISnEJjbvFHNKx
+ZJwRxEXXydrUwIf2Ng+FAW+GvTZ59qP7iW1hEEef1WCDCSmUbbXI6BlQffqit/FqvHzh9RoOhb27 m1oe6AkInk5WoLAQKnrSJg4bWB3dscXYgzbw2ioi7j7Mm6p6Ldb4VvhA6Io0jbwq0ZtQ=; Received: from [10.10.10.33] (helo=frv33.fwdcdn.com) by frv198.fwdcdn.com with smtp ID 1dnnki-000D4F-T5 for freebsd-hackers@freebsd.org; Fri, 01 Sep 2017 18:21:00 +0300 Date: Fri, 01 Sep 2017 18:21:00 +0300 From: Paul Subject: High CPU usage in kernel on highly contended lock file To: freebsd-stable@freebsd.org, freebsd-hackers@freebsd.org X-Mailer: mail.ukr.net 5.0 Message-Id: <1504278581.38180443.tad3fj7o@frv33.fwdcdn.com> Received: from devgs@ukr.net by frv33.fwdcdn.com; Fri, 01 Sep 2017 18:21:00 +0300 MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: binary X-Mailman-Approved-At: Fri, 01 Sep 2017 15:57:14 +0000 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Sep 2017 15:43:47 -0000 It seems that a lot of CPU resources are spent when trying to get an exclusive lock on a file from multiple processes concurrently. By multiple I mean hundreds. It seems that there's an initial cost to each fcntl() call. Each process that tries to lock the file consumes some amount of CPU and cools down. However, each repeated fcntl() call will consume the same amount of resources again. It seems as if entering the "wait queue" is expensive.
Environment:

#uname -a
FreeBSD test.com 11.1-STABLE FreeBSD 11.1-STABLE #0 r322650: Thu Aug 31 19:49:49 EEST 2017 root@test.com:/usr/obj/usr/src/sys/SERVER amd64

Test case:

test.c:

// Header names were lost in the archive; these are the ones the code needs.
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

// Modified from the SIGCHLD handler, hence volatile sig_atomic_t.
static volatile sig_atomic_t child_count = 0;

static void schild_handler(int sig)
{
    --child_count;
}

// Empty on purpose: SIGALRM only has to interrupt the blocked fcntl().
static void alarm_handler(int sig)
{
}

void lock_write(int fd)
{
    struct flock fl;

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 1;

    do {
        // Simulate interruption with alarm to re-enter the wait queue.
        alarm(1);
    } while (fcntl(fd, F_SETLKW, &fl) < 0);
}

int main(int argc, char ** argv)
{
    if (argc < 2) {
        return 1;
    }

    signal(SIGCHLD, schild_handler);

    struct sigaction sig_action;
    memset(&sig_action, 0, sizeof sig_action);
    sig_action.sa_handler = alarm_handler;
    sigemptyset(&sig_action.sa_mask);
    sigaction(SIGALRM, &sig_action, NULL);

    int fd = open(argv[1], O_CREAT|O_RDWR, 0777);

    for (int i = 0; i < 300; ++i) {
        pid_t child_pid = fork();
        if (!child_pid) {
            // Lock the descriptor.
            lock_write(fd);
            // Simulate some work.
            sleep(1);
            return 0;
        }
        ++child_count;
    }

    do {
        printf("\rchild count: %5d\n", (int)child_count);
        sleep(1);
    } while (child_count);

    return 0;
}

Commands:

# cd /tmp
# ~~~~~ Create test.c
# clang -o test test.c
# ./test 11111

Note that on Linux, even if 1000 children are spawned instead of 300, none of them ever appears in top. This is a huge problem, because our current software uses lock files for synchronization. At times when a lot of processes of said software are spawned (prime time), the system becomes totally unresponsive, with a load average over 1000.
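The retry loop in lock_write() re-enters the wait queue on every fcntl() failure, whatever the cause. The following is only a sketch, not part of the original report: a variant that retries on EINTR alone and counts the re-entries, so the number of wait-queue entries per process can be compared with the CPU it consumes. The name lock_write_counted and the retries counter are hypothetical, and the caller is assumed to have installed a SIGALRM handler without SA_RESTART, as the test case does.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical variant of lock_write(): take a write lock on byte 0 of fd.
 * Returns 0 once the lock is held, or -1 with errno set on any error
 * other than EINTR. *retries receives the number of interrupted waits,
 * i.e. how many times the wait queue was re-entered.
 */
int lock_write_counted(int fd, unsigned *retries)
{
    struct flock fl;
    int saved_errno;

    memset(&fl, 0, sizeof fl);      /* leave no field uninitialized */
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 1;

    *retries = 0;
    for (;;) {
        alarm(1);                   /* interrupt the wait after one second */
        if (fcntl(fd, F_SETLKW, &fl) == 0) {
            alarm(0);               /* cancel the pending alarm */
            return 0;
        }
        if (errno != EINTR) {       /* e.g. EBADF or EDEADLK: give up */
            saved_errno = errno;
            alarm(0);
            errno = saved_errno;
            return -1;
        }
        ++*retries;                 /* interrupted: queue up again */
    }
}
```

A child in the test case could call lock_write_counted(fd, &n) in place of lock_write(fd) and print n before exiting.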
From owner-freebsd-hackers@freebsd.org Fri Sep 1 18:26:36 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D9488E1FA7E for ; Fri, 1 Sep 2017 18:26:36 +0000 (UTC) (envelope-from j.deboynepollard-newsgroups@ntlworld.com) Received: from know-smtprelay-omc-10.server.virginmedia.net (know-smtprelay-omc-10.server.virginmedia.net [80.0.253.74]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (Client CN "Bizanga Labs SMTP Client Certificate", Issuer "Bizanga Labs CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 33C86835CE for ; Fri, 1 Sep 2017 18:26:35 +0000 (UTC) (envelope-from j.deboynepollard-newsgroups@ntlworld.com) Received: from [192.168.1.5] ([86.10.211.13]) by know-smtprelay-10-imp with bizsmtp id 4JRN1w00P0HtmFq01JRNN3; Fri, 01 Sep 2017 19:25:22 +0100 X-Originating-IP: [86.10.211.13] X-Authenticated-User: J.deBoynePollard-newsgroups@NTLWorld.COM X-Spam: 0 X-Authority: v=2.1 cv=SeoKDalu c=1 sm=1 tr=0 a=SB7hr1IvJSWWr45F2gQiKw==:117 a=SB7hr1IvJSWWr45F2gQiKw==:17 a=L9H7d07YOLsA:10 a=9cW_t1CCXrUA:10 a=s5jvgZ67dGcA:10 a=x7bEGLp0ZPQA:10 a=r77TgQKjGQsHNAKrUKIA:9 a=89gA_nVi2DQupjHvDmAA:9 a=8hjerSVpUnQ0UbXx:21 a=paioX-cxIUvzcqAK:21 a=pILNOxqGKmIA:10 a=hww22_IDFtXdcSnDCc0A:9 a=_W_S_7VecoQA:10 Subject: Archnosh 1.35 networking References: <20170831222647.19f1a5c1@kadisius> To: Supervision , Debian users , FreeBSD Hackers From: Jonathan de Boyne Pollard Message-ID: <7b3f64d0-1bed-4dad-c9db-75e6f39bab36@NTLWorld.COM> Date: Fri, 1 Sep 2017 19:25:22 +0100 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1 MIME-Version: 1.0 In-Reply-To: <20170831222647.19f1a5c1@kadisius> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.23 X-BeenThere: freebsd-hackers@freebsd.org 
X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Sep 2017 18:26:37 -0000 Thomas: > [...] the new networking documentation [...] > This will benefit a general readership, as well as you looking towards Archnosh 1.35. When you upgrade to 1.35, there will be two ways of configuring things. You can either write a Debian-style /etc/network/interfaces file, or you can employ a FreeBSD-like rc.conf. Both styles will work on FreeBSD, TrueOS, Debian, and (I hope) Arch. The former you will find is translated into the latter. For more detail on that translation process, see the new doco. Here is an example from one of my machines. jdebp % cat /etc/network/interfaces # This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). auto lo iface lo inet loopback iface lo inet static address 127.53.53.1 netmask 255.0.0.0 scope host iface lo inet6 loopback iface lo inet6 static address ::2/128 scope host allow-auto eth1 iface eth1 inet dhcp iface eth1 inet6 auto allow-auto rename2 iface rename2 inet dhcp iface rename2 inet6 auto jdebp % sed -ne '/etc.network.interfaces/,/console-setup:/p' /etc/system-control/convert/rc.conf # Converted from /etc/network/interfaces: network_interfaces="eth1 rename2 lo " ifconfig_eth1="AUTO DHCP inet " ifconfig_rename2="AUTO DHCP inet " ifconfig_lo="AUTO inet 127.0.0.1 " ifconfig_lo_ipv6="inet6 ::1 " ifconfig_lo_aliases="inet 127.53.53.1 netmask 255.0.0.0 inet6 ::2/128" # Converted from /etc/default/console-setup: jdebp % From owner-freebsd-hackers@freebsd.org Fri Sep 1 19:24:43 2017 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 726F1E01B2D; Fri, 1 Sep 2017 19:24:43 +0000 (UTC) 
(envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E6475F05; Fri, 1 Sep 2017 19:24:42 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v81JOYrB099198 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Fri, 1 Sep 2017 22:24:34 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v81JOYrB099198 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id v81JOXvF099197; Fri, 1 Sep 2017 22:24:33 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 1 Sep 2017 22:24:33 +0300 From: Konstantin Belousov To: Paul Cc: freebsd-stable@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: High CPU usage in kernel on highly contended lock file Message-ID: <20170901192433.GA1700@kib.kiev.ua> References: <1504278581.38180443.tad3fj7o@frv33.fwdcdn.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1504278581.38180443.tad3fj7o@frv33.fwdcdn.com> User-Agent: Mutt/1.8.3 (2017-05-23) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Sep 2017 19:24:43 -0000 On Fri, Sep 01, 2017 at 06:21:00PM +0300, Paul wrote: > > It seems that a lot of CPU 
resources are spent when trying to get an exclusive lock on a file from multiple processes concurrently. By multiple I mean hundreds. > It seems that there's an initial cost to each fcntl() call. Each process that tries to lock the file consumes some amount of CPU and cools down. > However, each repeated fcntl() call will consume the same amount of resources again. It seems as if entering the "wait queue" is expensive. Yes, our lockf is somewhat expensive; I believe this is because the implementation tries to maintain fairness. In other words, the lock requesters are put on a queue in order. Another heavy feature is the deadlock detection. POSIX seems to state that the detection is optional, but perhaps it is required for reliable operation of the network locking protocols for NFS. Sure, there can be opportunities to optimize the current algorithms. Somebody interested in such optimization should start with profiling the kernel.
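Before profiling the kernel itself, a rough user-space estimate of the per-call cost may help decide where to look. What follows is a minimal sketch under stated assumptions, not a substitute for kernel profiling: lock_sys_cpu_us is a hypothetical helper that reads the process's system CPU time with getrusage() before and after a single fcntl(F_SETLKW) call. It only shows CPU time charged to the calling process, at the resolution getrusage() offers, and says nothing about which kernel code path spent it.

```c
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

/* Convert a struct timeval to microseconds. */
static int64_t tv_to_us(struct timeval tv)
{
    return (int64_t)tv.tv_sec * 1000000 + tv.tv_usec;
}

/*
 * Hypothetical helper: take a write lock on byte 0 of fd with F_SETLKW
 * and return the system CPU time, in microseconds, charged to this
 * process during the call. Returns -1 if getrusage() or fcntl() fails.
 */
int64_t lock_sys_cpu_us(int fd)
{
    struct flock fl;
    struct rusage before, after;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 1;

    if (getrusage(RUSAGE_SELF, &before) != 0)
        return -1;
    if (fcntl(fd, F_SETLKW, &fl) != 0)
        return -1;
    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;

    /* ru_stime is the time spent executing in kernel mode. */
    return tv_to_us(after.ru_stime) - tv_to_us(before.ru_stime);
}
```

Calling this from each child of the test case and summing the results would give a per-process figure to compare against what top reports.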