From owner-freebsd-current@freebsd.org Sun Mar 12 17:47:23 2017 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2E80FD09DB7 for ; Sun, 12 Mar 2017 17:47:23 +0000 (UTC) (envelope-from ian@freebsd.org) Received: from pmta2.delivery6.ore.mailhop.org (pmta2.delivery6.ore.mailhop.org [54.200.129.228]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1198211A0 for ; Sun, 12 Mar 2017 17:47:22 +0000 (UTC) (envelope-from ian@freebsd.org) X-MHO-User: d8b3be6b-074b-11e7-b3c2-c9f38144898e X-Report-Abuse-To: https://support.duocircle.com/support/solutions/articles/5000540958-duocircle-standard-smtp-abuse-information X-Originating-IP: 73.78.92.27 X-Mail-Handler: DuoCircle Outbound SMTP Received: from ilsoft.org (unknown [73.78.92.27]) by outbound2.ore.mailhop.org (Halon) with ESMTPSA id d8b3be6b-074b-11e7-b3c2-c9f38144898e; Sun, 12 Mar 2017 17:46:40 +0000 (UTC) Received: from rev (rev [172.22.42.240]) by ilsoft.org (8.15.2/8.15.2) with ESMTP id v2CHlJ1q001580; Sun, 12 Mar 2017 11:47:20 -0600 (MDT) (envelope-from ian@freebsd.org) Message-ID: <1489340839.40576.82.camel@freebsd.org> Subject: Re: process killed: text file modification From: Ian Lepore To: Gergely Czuczy , freebsd-current@freebsd.org Date: Sun, 12 Mar 2017 11:47:19 -0600 In-Reply-To: <5ac94b9a-7ced-9eff-d746-7dddaaeca516@harmless.hu> References: <646c1395-9482-b214-118c-01573243ae5a@harmless.hu> <45436522-77df-f894-0569-737a6a74958f@harmless.hu> <55189643.aaZPuY77s8@ralph.baldwin.cx> <3ed3e4a3-23af-7267-39f1-9090093c9c1e@harmless.hu> <5ac94b9a-7ced-9eff-d746-7dddaaeca516@harmless.hu> Content-Type: text/plain; charset="ISO-8859-1" X-Mailer: Evolution 3.18.5.1 FreeBSD GNOME Team Port Mime-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Mar 2017 17:47:23 -0000 On Thu, 2017-03-09 at 21:07 +0100, Gergely Czuczy wrote: > > On 2017. 03. 09. 20:47, Gergely Czuczy wrote: > > > > > > > > On 2017. 03. 09. 19:44, John Baldwin wrote: > > > > > > On Thursday, March 09, 2017 03:31:56 PM Gergely Czuczy wrote: > > > > > > > > [+freebsd-fs] > > > > > > > > > > > > On 2017. 03. 09. 14:20, Gergely Czuczy wrote: > > > > > > > > > > On 2017. 03. 09. 11:27, Gergely Czuczy wrote: > > > > > > > > > > > > Hello, > > > > > > > > > > > > I'm trying to build a few things from ports on an rpi3, the > > > > > > ports > > > > > > collection is mounted over NFS from another machine. When > > > > > > it's trying > > > > > > to build pkg i'm getting the error message in syslog: > > > > > > > > > > > > rpi3 kernel: pid 4451 (sh), uid 0, was killed: text file > > > > > > modification > > > > > > > > > > > > The report to pkg@: > > > > > > https://lists.freebsd.org/pipermail/freebsd-pkg/2017-March/ > > > > > > 002048.html  > > > > > > > > > > > > > > > > > > In ports-mgmt/pkg's config.log It fails at the following > > > > > > entry: > > > > > > configure:3726: checking whether we are cross compiling > > > > > > configure:3734: cc -o conftest -O2 -pipe  -Wno-error > > > > > > -fno-strict-aliasing   conftest.c  >&5 > > > > > > configure:3738: $? = 0 > > > > > > configure:3745: ./conftest > > > > > > configure:3749: $? = 137 > > > > > > configure:3756: error: in  > > > > > > `/usr/ports/ports-mgmt/pkg/work/pkg-1.10.0': > > > > > > configure:3760: error: cannot run C compiled programs. > > > > > > If you meant to cross compile, use `--host'. > > > > > > See `config.log' for more details > > > > > > > > > > > > # uname -a > > > > > > FreeBSD rpi3 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r314949: > > > > > > Thu Mar 9 > > > > > > 08:58:46 CET 2017 > > > > > > aegir@marvin.harmless.hu:/tank/rpi3/crochet/work/obj/arm64. > > > > > > aarch64/tank/rpi3/src/sys/AEGIR  > > > > > > > > > > > > arm64 > > > > > So far, a few additions: > > > > > Time is synced between the NFS server and the client. > > > > > it's an open() call which is getting the kill, and it's not > > > > > the file > > > > > what's being opened, but the process executing it. > > > > > Here's a simple code that reproduces it: > > > > > #include > > > > > > > > > > int main() { > > > > > > > > > >    FILE *f = fopen ("/bar", "w"); > > > > > > > > > >    fclose(f); > > > > >    return 0; > > > > > } > > > > > > > > > > Conditions to reproduce it: > > > > >   - The resulting binary must be executed from the nfs mount > > > > >   - The binary must be built after mounting the NFS share. > > > > > > > > > > I haven't tried building it on a different host, I don't have > > > > > access > > > > > to multiple RPis. Also, if I build the binary, umount/remount > > > > > the NFS > > > > > mount point, which has the binary, execute it, then it works. > > > > > > > > > > I've also tried this with the raspbsd.org's image, I could > > > > > reproduce > > > > > it as well. > > > > > > > > > > Another interesting thing is, when I first booted the RPi up, > > > > > the NFS > > > > > server was a 10.2-STABLE, and later got updated to 11-STABLE. > > > > > While it > > > > > was 10.2 I've tried to build some port, and I don't remember > > > > > having > > > > > this issue. > > > > > > > > > > So, could someone please help me figure this out and fix it? > > > > > This > > > > > stuff should work pretty much. > > > > > > > > > So, this error message comes from here: > > > > https://svnweb.freebsd.org/base/head/sys/fs/nfsclient/nfs_clbio > > > > .c?revision=314436&view=markup#l1674  > > > > > > > > > > > > It's the NFS_TIMESPEC_COMPARE(&np->n_mtime, &np- > > > > >n_vattr.na_mtime) > > > > comparision that fails, np should be the NFS node structure, > > > > from the > > > > vnode's v_data, and n_vattr is the attribute cache. As I've > > > > seen these > > > > two are being updated together, so I don't really see by the > > > > code why > > > > they might differ. Could someone please take a look at it, with > > > > more > > > > experience in the NFS code? -czg > > > Can you print out the two mtimes?  I wonder if what's happening > > > is that > > > your server uses different granularity (for example just seconds) > > > than > > > your client, so on the client we generate a timestamp with a non- > > > zero > > > nanoseconds but when the server receives that timestamp it > > > "truncates" > > > it.  During open() we forcefully re-fetch the timestamp (for CTO > > > consistency) and then notice it doesn't match.  For now I would > > > start > > > with comparing the timestamps and maybe the > > > vfs.timestamp_precision > > > sysctls on client and server (if server is a FreeBSD box). > > Here are the time values: > > Mar  9 19:46:01 rpi3 kernel: np->n_mtime: -3298114786344 +  > > -3298114786336  &np->n_vattr.na_mtime: -3298114786616 + > > -3298114786608 > > Mar  9 19:46:01 rpi3 kernel: pid 912 (csh), uid 0, was killed: > > text  > > file modification > > Mar  9 19:46:01 rpi3 kernel: np->n_mtime: -3298114786344 +  > > -3298114786336  &np->n_vattr.na_mtime: -3298114786616 + > > -3298114786608 > > Mar  9 19:46:01 rpi3 kernel: pid 912 (csh), uid 0, was killed: > > text  > > file modification > > > > Printed this way: > >                          printf("np->n_mtime: %ji + %ji  > > &np->n_vattr.na_mtime: %ji + %ji", > > (intmax_t)(&np->n_mtime.tv_sec), > > (intmax_t)(&np->n_mtime.tv_nsec), > > (intmax_t)(&np->n_vattr.na_mtime.tv_sec), > > (intmax_t)(&np->n_vattr.na_mtime.tv_nsec)); > Sorry, I made a typo there. Here's it now: > Mar  9 20:05:35 rpi3 kernel: np->n_mtime: 1489089935 + 219323000  > &np->n_vattr.na_mtime: 1489089935 + 221438000 > Mar  9 20:05:35 rpi3 kernel: pid 847 (csh), uid 0, was killed: text > file  > modification > Mar  9 20:05:35 rpi3 kernel: np->n_mtime: 1489089935 + 219323000  > &np->n_vattr.na_mtime: 1489089935 + 221438000 > Mar  9 20:05:35 rpi3 kernel: pid 847 (csh), uid 0, was killed: text > file  > modification > > That's a difference of 2115 micro seconds. I think this is a problem that sysctl vfs.timestamp_precision is supposed to help solve.  Unfortunately, that knob doesn't provide very much control; you can truncate timestamps to full seconds, microseconds, or 1/Hz which may be milliseconds on many systems.  You've got a difference of about 2ms, so only the full seconds granularity might help. It's also not clear to me whether that setting would have to be changed on the server, or the client, or both. -- Ian