From owner-freebsd-hackers@freebsd.org Sun Jun 9 08:49:53 2019
From: Mark Millard
Subject: Re: crash of 32-bit powerpc -r347549 kernel built via system-clang-8, _init_tls is where the initial DIAGNOSTICS-reported SIGSEGV happens
Date: Sun, 9 Jun 2019 01:49:42 -0700
References: <8F272F27-0BC3-402A-810A-4608162F9EEE@yahoo.com>
To: FreeBSD Hackers, FreeBSD PowerPC ML
In-Reply-To: <8F272F27-0BC3-402A-810A-4608162F9EEE@yahoo.com>
Message-Id: <35F598E5-2400-4768-8B39-BC5F9B051443@yahoo.com>

So far I've not been able to find the code that is supposed
to establish the value of environ in /sbin/init as matching
the value of arginfo->ps_envstr from the exec_copyout_strings
use by do_execve in the kernel.

Anyone know where to point me to for what I seem to have
missed?

The issue driving the question is having the *sp++ in
_init_tls code below get SIGSEGV on 32-bit FreeBSD when
built via system-clang-8 and devel/powerpc64-binutils:

        sp = (Elf_Addr *) environ;
        while (*sp++ != 0)
                ;


The below is relevant detail that I've found.

_start in /sbin/init 's instance of lib/csu/powerpc/crt1.c
calls _init_tls that is from lib/libc/gen/tls.c but first
might assign to environ :

. . .
#include "ignore_init.c"
. . .
void
_start(int argc, char **argv, char **env,
    const struct Struct_Obj_Entry *obj __unused, void (*cleanup)(void),
    struct ps_strings *ps_strings)
{


        handle_argv(argc, argv, env);

        if (ps_strings != (struct ps_strings *)0)
                __ps_strings = ps_strings;

        if (&_DYNAMIC != NULL)
                atexit(cleanup);
        else
                _init_tls();

#ifdef GCRT
        atexit(_mcleanup);
        monstartup(&eprol, &etext);
#endif

        handle_static_init(argc, argv, env);
        exit(main(argc, argv, env));
}

lib/csu/common/ignore_init.c has:

char **environ;
. . .
static inline void
handle_argv(int argc, char *argv[], char **env)
{
        const char *s;

        if (environ == NULL)
                environ = env;
        if (argc > 0 && argv[0] != NULL) {
                __progname = argv[0];
                for (s = __progname; *s != '\0'; s++) {
                        if (*s == '/')
                                __progname = s + 1;
                }
        }
}

So _start's char**env argument might be used to assign
environ. But either way I've not managed to find the
binding to the kernel exec_copyout_strings operation.

_init_tls has the *sp++ loop that I referenced earlier:

extern char **environ;

void
_init_tls(void)
{
#ifndef PIC
        Elf_Addr *sp;
        Elf_Auxinfo *aux, *auxp;
        Elf_Phdr *phdr;
        size_t phent, phnum;
        int i;
        void *tls;

        sp = (Elf_Addr *) environ;
        while (*sp++ != 0)
                ;
. . .


On the kernel side for invoking /sbin/init is . . .

From /usr/src/sys/sys/imgact.h :

struct image_args {
        char *buf;              /* pointer to string buffer */
        void *bufkva;           /* cookie for string buffer KVA */
        char *begin_argv;       /* beginning of argv in buf */
        char *begin_envv;       /* (interal use only) beginning of envv in buf,
                                 * access with exec_args_get_begin_envv(). */
        char *endp;             /* current `end' pointer of arg & env strings */
        char *fname;            /* pointer to filename of executable (system space) */
        char *fname_buf;        /* pointer to optional malloc(M_TEMP) buffer */
        int stringspace;        /* space left in arg & env buffer */
        int argc;               /* count of argument strings */
        int envc;               /* count of environment strings */
        int fd;                 /* file descriptor of the executable */
        struct filedesc *fdp;   /* new file descriptor table */
};

do_execve from sys/kern/kern_exec.c has use, including envc
but avoiding begin_envv (via starting from begin_argv):

static int
do_execve(struct thread *td, struct image_args *args, struct mac *mac_p)
{
. . .
        /*
         * Copy out strings (args and env) and initialize stack base.
         */
        stack_base = (*p->p_sysent->sv_copyout_strings)(imgp);


The exec_copyout_strings code (accessed via ->sv_copyout_strings)
does

        stack_base = (register_t *)vectp;

        stringp = imgp->args->begin_argv;
        argc = imgp->args->argc;
        envc = imgp->args->envc;
. . .

        /* a null vector table pointer separates the argp's from the envp's */
        suword(vectp++, 0);

        suword(&arginfo->ps_envstr, (long)(intptr_t)vectp);
        suword32(&arginfo->ps_nenvstr, envc);

        /*
         * Fill in environment portion of vector table.
         */
        for (; envc > 0; --envc) {
                suword(vectp++, (long)(intptr_t)destp);
                while (*stringp++ != 0)
                        destp++;
                destp++;
        }

        /* end of vector table is a null pointer */
        suword(vectp, 0);
. . .

(From what I've seen for /sbin/init being invoked, envc==0 .)

The use involves struct ps_strings from /usr/src/sys/sys/exec.h :

struct ps_strings {
        char    **ps_argvstr;   /* first of 0 or more argument strings */
        unsigned int ps_nargvstr; /* the number of argument strings */
        char    **ps_envstr;    /* first of 0 or more environment strings */
        unsigned int ps_nenvstr; /* the number of environment strings */
};


The initialization of the begin_envv and envc for much of
the code seems to trace back to:

static void
start_init(void *dummy)
{
        struct image_args args;
. . .
        while ((path = strsep(&tmp_init_path, ":")) != NULL) {
                if (bootverbose)
                        printf("start_init: trying %s\n", path);

                memset(&args, 0, sizeof(args));
. . .

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-hackers@freebsd.org Sun Jun 9 20:21:13 2019
From: Mark Millard
Subject: Re: crash of 32-bit powerpc -r347549 kernel built via system-clang-8, _init_tls is where the initial DIAGNOSTICS-reported SIGSEGV happens
Date: Sun, 9 Jun 2019 13:10:53 -0700
References: <8F272F27-0BC3-402A-810A-4608162F9EEE@yahoo.com> <35F598E5-2400-4768-8B39-BC5F9B051443@yahoo.com>
To: FreeBSD Hackers, FreeBSD PowerPC ML
In-Reply-To: <35F598E5-2400-4768-8B39-BC5F9B051443@yahoo.com>
Message-Id: <141293A3-0111-4E08-AA76-2F9DBBEA5A58@yahoo.com>

[Never mind: I found exec_setregs in
/usr/src/sys/powerpc/powerpc/exec_machdep.c and where it is used.]

On 2019-Jun-9, at 01:49, Mark Millard wrote:

> So far I've not been able to find the code that is supposed
> to establish the value of environ in /sbin/init as matching
> the value of arginfo->ps_envstr from the exec_copyout_strings
> use by do_execve in the kernel.
>
> Anyone know where to point me to for what I seem to have
> missed?
>
> The issue driving the question is having the *sp++ in
> _init_tls code below get SIGSEGV on 32-bit FreeBSD when
> built via system-clang-8 and devel/powerpc64-binutils:
>
>         sp = (Elf_Addr *) environ;
>         while (*sp++ != 0)
>                 ;
>
>
> The below is relevant detail that I've found.
>
> _start in /sbin/init 's instance of lib/csu/powerpc/crt1.c
> calls _init_tls that is from lib/libc/gen/tls.c but first
> might assign to environ :
>
> . . .
> #include "ignore_init.c"
> . . .
> void
> _start(int argc, char **argv, char **env,
>     const struct Struct_Obj_Entry *obj __unused, void (*cleanup)(void),
>     struct ps_strings *ps_strings)
> {
>
>
>         handle_argv(argc, argv, env);
>
>         if (ps_strings != (struct ps_strings *)0)
>                 __ps_strings = ps_strings;
>
>         if (&_DYNAMIC != NULL)
>                 atexit(cleanup);
>         else
>                 _init_tls();
>
> #ifdef GCRT
>         atexit(_mcleanup);
>         monstartup(&eprol, &etext);
> #endif
>
>         handle_static_init(argc, argv, env);
>         exit(main(argc, argv, env));
> }
>
> lib/csu/common/ignore_init.c has:
>
> char **environ;
> . . .
> static inline void
> handle_argv(int argc, char *argv[], char **env)
> {
>         const char *s;
>
>         if (environ == NULL)
>                 environ = env;
>         if (argc > 0 && argv[0] != NULL) {
>                 __progname = argv[0];
>                 for (s = __progname; *s != '\0'; s++) {
>                         if (*s == '/')
>                                 __progname = s + 1;
>                 }
>         }
> }
>
> So _start's char**env argument might be used to assign
> environ. But either way I've not managed to find the
> binding to the kernel exec_copyout_strings operation.
>
> _init_tls has the *sp++ loop that I referenced earlier:
>
> extern char **environ;
>
> void
> _init_tls(void)
> {
> #ifndef PIC
>         Elf_Addr *sp;
>         Elf_Auxinfo *aux, *auxp;
>         Elf_Phdr *phdr;
>         size_t phent, phnum;
>         int i;
>         void *tls;
>
>         sp = (Elf_Addr *) environ;
>         while (*sp++ != 0)
>                 ;
> . . .
>
>
> On the kernel side for invoking /sbin/init is . . .
>
> From /usr/src/sys/sys/imgact.h :
>
> struct image_args {
>         char *buf;              /* pointer to string buffer */
>         void *bufkva;           /* cookie for string buffer KVA */
>         char *begin_argv;       /* beginning of argv in buf */
>         char *begin_envv;       /* (interal use only) beginning of envv in buf,
>                                  * access with exec_args_get_begin_envv(). */
>         char *endp;             /* current `end' pointer of arg & env strings */
>         char *fname;            /* pointer to filename of executable (system space) */
>         char *fname_buf;        /* pointer to optional malloc(M_TEMP) buffer */
>         int stringspace;        /* space left in arg & env buffer */
>         int argc;               /* count of argument strings */
>         int envc;               /* count of environment strings */
>         int fd;                 /* file descriptor of the executable */
>         struct filedesc *fdp;   /* new file descriptor table */
> };
>
> do_execve from sys/kern/kern_exec.c has use, including envc
> but avoiding begin_envv (via starting from begin_argv):
>
> static int
> do_execve(struct thread *td, struct image_args *args, struct mac *mac_p)
> {
> . . .
>         /*
>          * Copy out strings (args and env) and initialize stack base.
>          */
>         stack_base = (*p->p_sysent->sv_copyout_strings)(imgp);
>
>
> The exec_copyout_strings code (accessed via ->sv_copyout_strings)
> does
>
>         stack_base = (register_t *)vectp;
>
>         stringp = imgp->args->begin_argv;
>         argc = imgp->args->argc;
>         envc = imgp->args->envc;
> . . .
>
>         /* a null vector table pointer separates the argp's from the envp's */
>         suword(vectp++, 0);
>
>         suword(&arginfo->ps_envstr, (long)(intptr_t)vectp);
>         suword32(&arginfo->ps_nenvstr, envc);
>
>         /*
>          * Fill in environment portion of vector table.
>          */
>         for (; envc > 0; --envc) {
>                 suword(vectp++, (long)(intptr_t)destp);
>                 while (*stringp++ != 0)
>                         destp++;
>                 destp++;
>         }
>
>         /* end of vector table is a null pointer */
>         suword(vectp, 0);
> . . .
>
> (From what I've seen for /sbin/init being invoked, envc==0 .)
>
> The use involves struct ps_strings from /usr/src/sys/sys/exec.h :
>
> struct ps_strings {
>         char    **ps_argvstr;   /* first of 0 or more argument strings */
>         unsigned int ps_nargvstr; /* the number of argument strings */
>         char    **ps_envstr;    /* first of 0 or more environment strings */
>         unsigned int ps_nenvstr; /* the number of environment strings */
> };
>
>
> The initialization of the begin_envv and envc for much of
> the code seems to trace back to:
>
> static void
> start_init(void *dummy)
> {
>         struct image_args args;
> . . .
>         while ((path = strsep(&tmp_init_path, ":")) != NULL) {
>                 if (bootverbose)
>                         printf("start_init: trying %s\n", path);
>
>                 memset(&args, 0, sizeof(args));
> . . .

I found it:

/usr/src/sys/powerpc/powerpc/exec_machdep.c has exec_setregs
that is accessed (via sv_setregs). This sets up arguments for
_start .

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-hackers@freebsd.org Mon Jun 10 06:16:12 2019
From: Mark Millard
Subject: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ?
Message-Id: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com>
Date: Sun, 9 Jun 2019 23:15:57 -0700
To: FreeBSD Hackers, freeBSD PowerPC ML

This leads up to questioning if .sbss and .bss in /sbin/init
are always correctly zeroed. But I may have missed something
in the sequencing. (The code is not familiar material.)

If I've tracked it down right:

sys/kern/kern_exec.c uses kern_execve to deal with
starting up /sbin/init.

kern_execve uses do_execve.
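As an illustration of the property being questioned, here is a minimal
user-space sketch; it is not the kernel's code path. For a PT_LOAD
segment, the first FileSiz bytes come from the file and the remaining
bytes up to MemSiz (where .sbss and .bss live) are expected to read as
zero. The names load_segment_sketch and tail_is_zero are hypothetical,
and the plain buffers stand in for the file image and the mapped memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for loading one PT_LOAD segment: copy the
 * file-backed bytes, then zero the remainder up to memsz.
 * Returns the number of bytes that were zero-filled.
 */
static size_t
load_segment_sketch(unsigned char *mem, const unsigned char *file,
    size_t filesz, size_t memsz)
{
        assert(memsz >= filesz);        /* a valid header never has memsz < filesz */
        memcpy(mem, file, filesz);
        memset(mem + filesz, 0, memsz - filesz);
        return (memsz - filesz);
}

/* Returns 1 if every byte in [filesz, memsz) reads back as zero. */
static int
tail_is_zero(const unsigned char *mem, size_t filesz, size_t memsz)
{
        size_t i;

        for (i = filesz; i < memsz; i++) {
                if (mem[i] != 0)
                        return (0);
        }
        return (1);
}
```

Filling the destination buffer with a nonzero pattern before calling
load_segment_sketch, and then checking tail_is_zero, models the failure
mode under discussion: stale bytes surviving past FileSiz.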

do_execve uses:

/*
 * Each of the items is a pointer to a `const struct execsw', hence the
 * double pointer here.
 */
static const struct execsw **execsw;
. . .
        /*
         * Loop through the list of image activators, calling each one.
         * An activator returns -1 if there is no match, 0 on success,
         * and an error otherwise.
         */
        for (i = 0; error == -1 && execsw[i]; ++i) {
                if (execsw[i]->ex_imgact == NULL ||
                    execsw[i]->ex_imgact == img_first) {
                        continue;
                }
                error = (*execsw[i]->ex_imgact)(imgp);
        }

/usr/src/sys/kern/imgact_elf.c has:

/*
 * Tell kern_execve.c about it, with a little help from the linker.
 */
static struct execsw __elfN(execsw) = {
        .ex_imgact = __CONCAT(exec_, __elfN(imgact)),
        .ex_name = __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE))
};
EXEC_SET(__CONCAT(elf, __ELF_WORD_SIZE), __elfN(execsw));

__CONCAT(exec_, __elfN(imgact)) uses
__elfN(load_sections) .

__elfN(load_sections) uses __elfN(load_section).

__elfN(load_section) uses vm_imgact_map_page
to set up for its copyout. This appears to be
how the FileSiz (not including .sbss or .bss)
vs. MemSiz (including .sbss and .bss) is
handled (attempted?).

vm_imgact_map_page uses vm_imgact_hold_page.

vm_imgact_hold_page uses vm_pager_get_pages.

vm_pager_get_pages uses vm_page_zero_invalid
to "Zero out partially filled data".

But vm_page_zero_invalid does not zero every "invalid"
byte but works in terms of units of DEV_BSIZE :

void
vm_page_zero_invalid(vm_page_t m, boolean_t setvalid)
{
        int b;
        int i;

        VM_OBJECT_ASSERT_WLOCKED(m->object);
        /*
         * Scan the valid bits looking for invalid sections that
         * must be zeroed.  Invalid sub-DEV_BSIZE'd areas ( where the
         * valid bit may be set ) have already been zeroed by
         * vm_page_set_validclean().
         */
        for (b = i = 0; i <= PAGE_SIZE / DEV_BSIZE; ++i) {
                if (i == (PAGE_SIZE / DEV_BSIZE) ||
                    (m->valid & ((vm_page_bits_t)1 << i))) {
                        if (i > b) {
                                pmap_zero_page_area(m,
                                    b << DEV_BSHIFT, (i - b) << DEV_BSHIFT);
                        }
                        b = i + 1;
                }
        }

        /*
         * setvalid is TRUE when we can safely set the zero'd areas
         * as being valid.  We can do this if there are no cache consistancy
         * issues.  e.g. it is ok to do with UFS, but not ok to do with NFS.
         */
        if (setvalid)
                m->valid = VM_PAGE_BITS_ALL;
}

The comment indicates that areas of "sub-DEV_BSIZE"
should have been handled previously by
vm_page_set_validclean . But no part of the sequence
appears to use vm_page_set_validclean .

So, if, say, char**environ ends up at the start of .sbss
consistently, does environ always end up zeroed independently
of FileSz for the PT_LOAD that spans them?

The following is not necessarily an example of problematical
figures but is just for showing an example structure of what
FileSiz covers vs. MemSiz for PT_LOAD's that involve .sbss
and .bss :

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x01800000 0x01800000 0x1222dc 0x1222dc R E 0x10000
  LOAD           0x123000 0x01933000 0x01933000 0x0618c  0x32e88  RWE 0x10000
  NOTE           0x0000d4 0x018000d4 0x018000d4 0x00048  0x00048  R   0x4
  TLS            0x123000 0x01933000 0x01933000 0x00b10  0x00b1d  R   0x10
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000  0x00000  RW  0x10

 Section to Segment mapping:
  Segment Sections...
   00     .note.tag .init .text .fini .rodata .eh_frame
   01     .tdata .tbss .init_array .fini_array .ctors .dtors .jcr .data.rel.ro .data .got .sbss .bss
   02     .note.tag
   03     .tdata .tbss
   04

There are 24 section headers, starting at offset 0x14eb20:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk  Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0    0  0
  [ 1] .note.tag         NOTE            018000d4 0000d4 000048 00   A  0    0  4
  [ 2] .init             PROGBITS        0180011c 00011c 000034 00  AX  0    0  4
  [ 3] .text             PROGBITS        01800150 000150 111e14 00  AX  0    0 16
  [ 4] .fini             PROGBITS        01911f64 111f64 000030 00  AX  0    0  4
  [ 5] .rodata           PROGBITS        01911fc0 111fc0 010318 00   A  0    0 64
  [ 6] .eh_frame         PROGBITS        019222d8 1222d8 000004 00   A  0    0  4
  [ 7] .tdata            PROGBITS        01933000 123000 000b10 00 WAT  0    0 16
  [ 8] .tbss             NOBITS          01933b10 123b10 00000d 00 WAT  0    0  4
  [ 9] .init_array       INIT_ARRAY      01933b10 123b10 000008 04  WA  0    0  4
  [10] .fini_array       FINI_ARRAY      01933b18 123b18 000004 04  WA  0    0  4
  [11] .ctors            PROGBITS        01933b1c 123b1c 000008 00  WA  0    0  4
  [12] .dtors            PROGBITS        01933b24 123b24 000008 00  WA  0    0  4
  [13] .jcr              PROGBITS        01933b2c 123b2c 000004 00  WA  0    0  4
  [14] .data.rel.ro      PROGBITS        01933b30 123b30 002ee4 00  WA  0    0  4
  [15] .data             PROGBITS        01936a18 126a18 002763 00  WA  0    0  8
  [16] .got              PROGBITS        0193917c 12917c 000010 04 WAX  0    0  4
  [17] .sbss             NOBITS          0193918c 12918c 0000b0 00  WA  0    0  4
  [18] .bss              NOBITS          01939240 12918c 02cc48 00  WA  0    0 64
  [19] .comment          PROGBITS        00000000 12918c 0073d4 01  MS  0    0  1
  [20] .gnu_debuglink    PROGBITS        00000000 130560 000010 00      0    0  4
  [21] .symtab           SYMTAB          00000000 130570 00fc40 10     22 1681  4
  [22] .strtab           STRTAB          00000000 1401b0 00e8b3 00      0    0  1
  [23] .shstrtab         STRTAB          00000000 14ea63 0000bc 00      0    0  1

. . .
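The figures above can be checked with a little arithmetic. The sketch
below assumes a 4096-byte page and a 512-byte DEV_BSIZE (both are
assumptions about this configuration, not values taken from the
listing), and the helper names are hypothetical. It locates where the
file-backed part of the second LOAD segment ends, relative to page and
DEV_BSIZE block boundaries.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE        4096u   /* assumed page size */
#define SKETCH_DEV_BSIZE        512u    /* assumed DEV_BSIZE */

static_assert(SKETCH_PAGE_SIZE % SKETCH_DEV_BSIZE == 0,
    "a page must hold a whole number of DEV_BSIZE blocks");

/* End of the file-backed part of a segment: VirtAddr + FileSiz. */
static uint32_t
file_end(uint32_t vaddr, uint32_t filesz)
{
        return (vaddr + filesz);
}

/* Offset of an address within its page. */
static uint32_t
page_offset(uint32_t addr)
{
        return (addr % SKETCH_PAGE_SIZE);
}

/* Index, within the page, of the DEV_BSIZE block that holds addr. */
static uint32_t
block_index(uint32_t addr)
{
        return (page_offset(addr) / SKETCH_DEV_BSIZE);
}

/* Number of bytes from addr to the end of its DEV_BSIZE block. */
static uint32_t
block_tail(uint32_t addr)
{
        return (SKETCH_DEV_BSIZE - page_offset(addr) % SKETCH_DEV_BSIZE);
}
```

With the second LOAD header's values (VirtAddr 0x01933000, FileSiz
0x0618c), file_end gives 0x0193918c, which is the .sbss start address in
the section headers. Under the assumed sizes that address is 0x18c bytes
into its page, in the first DEV_BSIZE block, with 116 bytes left before
the block ends. Whether those bytes get zeroed at image activation is
the open question in this thread; the sketch only locates them.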
2652: 000000000193918c 4 OBJECT GLOBAL DEFAULT 17 environ =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar) From owner-freebsd-hackers@freebsd.org Mon Jun 10 14:37:30 2019 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8DD9715C0D7A; Mon, 10 Jun 2019 14:37:30 +0000 (UTC) (envelope-from cse.cem@gmail.com) Received: from mail-it1-f179.google.com (mail-it1-f179.google.com [209.85.166.179]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7DD23760FB; Mon, 10 Jun 2019 14:37:29 +0000 (UTC) (envelope-from cse.cem@gmail.com) Received: by mail-it1-f179.google.com with SMTP id m3so13629830itl.1; Mon, 10 Jun 2019 07:37:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:reply-to :from:date:message-id:subject:to:cc; bh=K+bRhC4gixUlUgBIyTPWieIOkecQGcFgRpJlZAizZvk=; b=nbjZlvJ31VKE/7ZJ28GMrZ3RhwrAULI2VFJbWkoQ4ktbteJXkNXkZKpju3s8ayEovS mBmOBJemKO1kUeNBdNYfabNDzU09u6Ztcd5N9MKUUQYIVEDazKhq4+Dz9vXj8ZJoAof0 PLELsVwtkcT4rwHSpixU3V6ZBsz1MXcE2VhKtS3FgTrI7PYH8BxdnDRaqxLlePjWMx0v 86lfy3ypOWHxShrgCz2j4z7JTiqTxellY7DRpPoCHWgaJaCck6jWkpX4nLQOftScdtD1 ZTGYT6vp8+ocIGHwUpyAVIDFTUOjJuw2bfYQ11Luuhdyvmn3OrUdCgi85Mw0VGB5Ssbs cyYg== X-Gm-Message-State: APjAAAXCtILqf7+csKf5/wfy/szsCXqpGiJ0Zjjj9fwJzuzwRezqtdrf CuMuiXsDIbHrtBrMxhkGl4hmzUN2 X-Google-Smtp-Source: APXvYqyTDc0eYM8I6wf+c+OyCaO+7SrKn8hwB7hru5oBCHfdKz9tswSzUPaZ+3oR0h/GLMOgw1cKEA== X-Received: by 2002:a24:240c:: with SMTP id f12mr14071978ita.14.1560177443082; Mon, 10 Jun 2019 07:37:23 -0700 (PDT) Received: from mail-it1-f182.google.com 
(mail-it1-f182.google.com. [209.85.166.182]) by smtp.gmail.com with ESMTPSA id m129sm4957061itd.6.2019.06.10.07.37.22 (version=TLS1_3 cipher=AEAD-AES128-GCM-SHA256 bits=128/128); Mon, 10 Jun 2019 07:37:22 -0700 (PDT) Received: by mail-it1-f182.google.com with SMTP id m187so13627998ite.3; Mon, 10 Jun 2019 07:37:22 -0700 (PDT) X-Received: by 2002:a24:a43:: with SMTP id 64mr14907281itw.100.1560177442118; Mon, 10 Jun 2019 07:37:22 -0700 (PDT) MIME-Version: 1.0 References: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com> In-Reply-To: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com> Reply-To: cem@freebsd.org From: Conrad Meyer Date: Mon, 10 Jun 2019 07:37:11 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ? To: Mark Millard Cc: FreeBSD Hackers , freeBSD PowerPC ML Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 7DD23760FB X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of csecem@gmail.com designates 209.85.166.179 as permitted sender) smtp.mailfrom=csecem@gmail.com X-Spamd-Result: default: False [-4.54 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; HAS_REPLYTO(0.00)[cem@freebsd.org]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17]; IP_SCORE(-2.55)[ip: (-6.98), ipnet: 209.85.128.0/17(-3.40), asn: 15169(-2.28), country: US(-0.06)]; REPLYTO_ADDR_EQ_FROM(0.00)[]; RCVD_COUNT_THREE(0.00)[4]; TO_DN_ALL(0.00)[]; MX_GOOD(-0.01)[cached: alt3.gmail-smtp-in.l.google.com]; NEURAL_HAM_SHORT(-0.98)[-0.984,0]; FORGED_SENDER(0.30)[cem@freebsd.org,csecem@gmail.com]; FREEMAIL_TO(0.00)[yahoo.com]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; TAGGED_FROM(0.00)[]; FROM_NEQ_ENVFROM(0.00)[cem@freebsd.org,csecem@gmail.com]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; 
MIME_TRACE(0.00)[0:+]; DMARC_NA(0.00)[freebsd.org]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[179.166.85.209.list.dnswl.org : 127.0.5.0]; RCVD_TLS_LAST(0.00)[]; SUBJECT_ENDS_QUESTION(1.00)[] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2019 14:37:30 -0000 Hi Mark, On Sun, Jun 9, 2019 at 11:17 PM Mark Millard via freebsd-hackers wrote: > ... > vm_pager_get_pages uses vm_page_zero_invalid > to "Zero out partially filled data". > > But vm_page_zero_invalid does not zero every "invalid" > byte but works in terms of units of DEV_BSIZE : > ... > The comment indicates that areas of "sub-DEV_BSIZE" > should have been handled previously by > vm_page_set_validclean . Or another VM routine, yes (e.g., vm_page_set_valid_range). The valid and dirty bitmasks in vm_page only have a single bit per DEV_BSIZE region, so care must be taken when marking any sub-DEV_BSIZE region as valid to zero out the rest of the DEV_BSIZE region. This is part of the VM page contract. I'm not sure it's related to the BSS, though. > So, if, say, char**environ ends up at the start of .sbss > consistently, does environ always end up zeroed independently > of FileSz for the PT_LOAD that spans them? It is required to be zeroed, yes. If not, there is a bug. If FileSz covers BSS, that's a bug in the linker. Either the trailing bytes of the corresponding page in the executable should be zero (wasteful; on amd64 ".comment" is packed in there instead), or the linker/loader must zero them at initialization. I'm not familiar with the particular details here, but if you are interested I would suggest looking at __elfN(load_section) in sys/kern/imgact_elf.c. > The following is not necessarily an example of problematical > figures but is just for showing an example structure of what > FileSiz covers vs. 
> MemSiz for PT_LOAD's that involve .sbss
> and .bss :
> ...

Your 2nd LOAD phdr's FileSiz matches up exactly with Segment .sbss
Offset minus Segment .tdata Offset, i.e., none of the FileSiz
corresponds to the (s)bss regions.  (Good!  At least the static linker
part looks sane.)  That said, the boundary is not page-aligned and the
section alignment requirement is much lower than page_size, so the
beginning of bss will share a file page with some data.  Something
should zero it at image activation.

(Tangent: sbss/bss probably do not need to be RWE on PPC!  On amd64,
init has three LOAD segments rather than two: one for rodata (R), one
for .text, .init, etc (RX); and one for .data (RW).)

Best,
Conrad

From owner-freebsd-hackers@freebsd.org Mon Jun 10 18:25:01 2019
Subject: Re: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ?
From: Mark Millard
Date: Mon, 10 Jun 2019 11:24:54 -0700
Cc: FreeBSD Hackers, freeBSD PowerPC ML, Alfredo Dal Ava Junior, Justin Hibbits
Message-Id: <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com>
References: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com>
To: Conrad Meyer

[Looks like Conrad M. is partially confirming that my trace of the
issue is reasonable.]
On 2019-Jun-10, at 07:37, Conrad Meyer wrote:

> Hi Mark,
>
> On Sun, Jun 9, 2019 at 11:17 PM Mark Millard via freebsd-hackers
> wrote:
>> ...
>> vm_pager_get_pages uses vm_page_zero_invalid
>> to "Zero out partially filled data".
>>
>> But vm_page_zero_invalid does not zero every "invalid"
>> byte but works in terms of units of DEV_BSIZE :
>> ...
>> The comment indicates that areas of "sub-DEV_BSIZE"
>> should have been handled previously by
>> vm_page_set_validclean .
>
> Or another VM routine, yes (e.g., vm_page_set_valid_range).  The valid
> and dirty bitmasks in vm_page only have a single bit per DEV_BSIZE
> region, so care must be taken when marking any sub-DEV_BSIZE region as
> valid to zero out the rest of the DEV_BSIZE region.  This is part of
> the VM page contract.  I'm not sure it's related to the BSS, though.

Yea, I had written from what I'd seen in __elfN(load_section):

QUOTE
__elfN(load_section) uses vm_imgact_map_page
to set up for its copyout. This appears to be
how the FileSiz (not including .sbss or .bss)
vs. MemSiz (including .sbss and .bss) is
handled (attempted?).
END QUOTE

The copyout only copies through the last byte for filesz
but the vm_imgact_map_page does not zero out all the
bytes after that on that page:

	/*
	 * We have to get the remaining bit of the file into the first part
	 * of the oversized map segment.  This is normally because the .data
	 * segment in the file is extended to provide bss.  It's a neat idea
	 * to try and save a page, but it's a pain in the behind to implement.
	 */
	copy_len = filsz == 0 ? 0 : (offset + filsz) - trunc_page(offset +
	    filsz);
	map_addr = trunc_page((vm_offset_t)vmaddr + filsz);
	map_len = round_page((vm_offset_t)vmaddr + memsz) - map_addr;
. . .
	if (copy_len != 0) {
		sf = vm_imgact_map_page(object, offset + filsz);
		if (sf == NULL)
			return (EIO);

		/* send the page fragment to user space */
		off = trunc_page(offset + filsz) - trunc_page(offset + filsz);
		error = copyout((caddr_t)sf_buf_kva(sf) + off,
		    (caddr_t)map_addr, copy_len);
		vm_imgact_unmap_page(sf);
		if (error != 0)
			return (error);
	}

I looked into the details of the DEV_BSIZE code after sending
the original message and so realized that my provided example
/sbin/init readelf material was a good example of the issue
if I'd not missed something.

>> So, if, say, char**environ ends up at the start of .sbss
>> consistently, does environ always end up zeroed independently
>> of FileSz for the PT_LOAD that spans them?
>
> It is required to be zeroed, yes.  If not, there is a bug.  If FileSz
> covers BSS, that's a bug in the linker.  Either the trailing bytes of
> the corresponding page in the executable should be zero (wasteful; on
> amd64 ".comment" is packed in there instead), or the linker/loader
> must zero them at initialization.  I'm not familiar with the
> particular details here, but if you are interested I would suggest
> looking at __elfN(load_section) in sys/kern/imgact_elf.c.

I had looked at it some, see the material around the earlier quote
above.

>> The following is not necessarily an example of problematical
>> figures but is just for showing an example structure of what
>> FileSiz covers vs. MemSiz for PT_LOAD's that involve .sbss
>> and .bss :
>> ...
>
> Your 2nd LOAD phdr's FileSiz matches up exactly with Segment .sbss
> Offset minus Segment .tdata Offset, i.e., none of the FileSiz
> corresponds to the (s)bss regions.  (Good!  At least the static linker
> part looks sane.)  That said, the boundary is not page-aligned and the
> section alignment requirement is much lower than page_size, so the
> beginning of bss will share a file page with some data.  Something
> should zero it at image activation.
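
The DEV_BSIZE granularity discussed above can be illustrated with a small stand-alone sketch. It is not the kernel's code: block_bits() and sub_block_tail() are hypothetical helpers, and the 512-byte DEV_BSIZE and 4 KiB page size are assumptions made for the example. It shows which per-block bits a byte range touches and how many bytes past the end of the data still fall inside the last touched block.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch: 512-byte DEV_BSIZE, 4 KiB pages. */
#define DEV_BSIZE_ASSUMED 512u
#define PAGE_SIZE_ASSUMED 4096u

/* One bit per DEV_BSIZE block of the page touched by [base, base + size). */
static uint32_t
block_bits(uint32_t base, uint32_t size)
{
	uint32_t first, last;

	assert(base + size <= PAGE_SIZE_ASSUMED);
	if (size == 0)
		return (0);
	first = base / DEV_BSIZE_ASSUMED;
	last = (base + size - 1) / DEV_BSIZE_ASSUMED;
	return ((2u << last) - (1u << first));
}

/* Bytes between the end of the data and the end of its last block. */
static uint32_t
sub_block_tail(uint32_t base, uint32_t size)
{
	uint32_t end = base + size;

	return ((DEV_BSIZE_ASSUMED - end % DEV_BSIZE_ASSUMED) %
	    DEV_BSIZE_ASSUMED);
}
```

For a hypothetical 0x848-byte fragment at the start of a page, block_bits(0, 0x848) covers five blocks (0x1f) and sub_block_tail(0, 0x848) is 0x1b8: those are the bytes that a block-granular "valid" bit would cover without any file data behind them.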
And, so far, I've not found anything in _start or before that does
zero any "sub-DEV_BSIZE" part after FileSz for the PT_LOAD in
question.

Thanks for checking my trace of the issue. It is good to have some
confirmation that I'd not missed something.

> (Tangent: sbss/bss probably do not need to be RWE on PPC!  On amd64,
> init has three LOAD segments rather than two: one for rodata (R), one
> for .text, .init, etc (RX); and one for .data (RW).)

Yea, the section header flags indicate just WA for .sbss and .bss (but
WAX for .got).

But such is more general: for example, the beginning of .rodata
(not executable) shares the tail part of a page with .fini
(executable) in the example. .got has executable code but is in
the middle of sections that do not. For something like /sbin/init it
is so small that the middle of a page can be the only part that is
executable, as in the example. (It is not forced onto its own page.)

The form of .got used is also writable: WAX for section header flags.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away
in early 2018-Mar)

From owner-freebsd-hackers@freebsd.org Mon Jun 10 19:20:37 2019
Subject: Re: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ?
From: Mark Millard
In-Reply-To: <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com>
Date: Mon, 10 Jun 2019 12:20:27 -0700
Cc: Alfredo Dal Ava Junior, Justin Hibbits
Message-Id: <47E002B7-D4A1-4C4B-BFFD-D926263D895E@yahoo.com>
References: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com> <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com>
To: FreeBSD Hackers, freeBSD PowerPC ML

[I decided to compare some readelf information from some
other architectures. I was surprised by some of it.
But .bss seems to be forced to start with a large alignment
to avoid such issues as I originally traced.]

On 2019-Jun-10, at 11:24, Mark Millard wrote:

> ...
amd64's /sbin/init :

There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x0001f8 0x0001f8 R   0x8
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x039e94 0x039e94 R   0x1000
  LOAD           0x03a000 0x000000000023a000 0x000000000023a000 0x0e8e40 0x0e8e40 R E 0x1000
  LOAD           0x123000 0x0000000000323000 0x0000000000323000 0x005848 0x2381d9 RW  0x1000
  TLS            0x127000 0x0000000000327000 0x0000000000327000 0x001800 0x001820 R   0x10
  GNU_RELRO      0x127000 0x0000000000327000 0x0000000000327000 0x001848 0x001848 R   0x1
  GNU_EH_FRAME   0x01b270 0x000000000021b270 0x000000000021b270 0x00504c 0x00504c R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
  NOTE           0x000238 0x0000000000200238 0x0000000000200238 0x000048 0x000048 R   0x4

 Section to Segment mapping:
  Segment Sections...
   00
   01     .note.tag .rela.plt .rodata .eh_frame_hdr .eh_frame
   02     .text .init .fini .plt
   03     .data .got.plt .tdata .tbss .ctors .dtors .jcr .init_array .fini_array .bss
   04     .tdata .tbss
   05     .tdata .tbss .ctors .dtors .jcr .init_array .fini_array
   06     .eh_frame_hdr
   07
   08     .note.tag

There are 27 section headers, starting at offset 0x157938:

Section Headers:
  [Nr] Name              Type            Addr             Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .note.tag         NOTE            0000000000200238 000238 000048 00   A  0   0  4
  [ 2] .rela.plt         RELA            0000000000200280 000280 000030 18  AI  0  11  8
  [ 3] .rodata           PROGBITS        00000000002002c0 0002c0 01afb0 00 AMS  0   0 64
  [ 4] .eh_frame_hdr     PROGBITS        000000000021b270 01b270 00504c 00   A  0   0  4
  [ 5] .eh_frame         PROGBITS        00000000002202c0 0202c0 019bd4 00   A  0   0  8
  [ 6] .text             PROGBITS        000000000023a000 03a000 0e8dfc 00  AX  0   0 16
  [ 7] .init             PROGBITS        0000000000322dfc 122dfc 00000e 00  AX  0   0  4
  [ 8] .fini             PROGBITS        0000000000322e0c 122e0c 00000e 00  AX  0   0  4
  [ 9] .plt              PROGBITS        0000000000322e20 122e20 000020 00  AX  0   0 16
  [10] .data             PROGBITS        0000000000323000 123000 003a80 00  WA  0   0 16
  [11] .got.plt          PROGBITS        0000000000326a80 126a80 000010 00  WA  0   0  8
  [12] .tdata            PROGBITS        0000000000327000 127000 001800 00 WAT  0   0 16
  [13] .tbss             NOBITS          0000000000328800 128800 000020 00 WAT  0   0  8
  [14] .ctors            PROGBITS        0000000000328800 128800 000010 00  WA  0   0  8
  [15] .dtors            PROGBITS        0000000000328810 128810 000010 00  WA  0   0  8
  [16] .jcr              PROGBITS        0000000000328820 128820 000008 00  WA  0   0  8
  [17] .init_array       INIT_ARRAY      0000000000328828 128828 000018 00  WA  0   0  8
  [18] .fini_array       FINI_ARRAY      0000000000328840 128840 000008 00  WA  0   0  8
  [19] .bss              NOBITS          0000000000329000 128848 2321d9 00  WA  0   0 64
  [20] .comment          PROGBITS        0000000000000000 128848 0074d4 01  MS  0   0  1
  [21] .gnu.warning.mkte PROGBITS        0000000000000000 12fd1c 000043 00      0   0  1
  [22] .gnu.warning.f_pr PROGBITS        0000000000000000 12fd5f 000043 00      0   0  1
  [23] .gnu_debuglink    PROGBITS        0000000000000000 1478b0 000010 00      0   0  1
  [24] .shstrtab         STRTAB          0000000000000000 1478c0 0000f1 00      0   0  1
  [25] .symtab           SYMTAB          0000000000000000 12fda8 017b08 18     26 1707  8
  [26] .strtab           STRTAB          0000000000000000 1479b1 00ff84 00      0   0  1

Note that there is space after .fini_array+8 before .bss starts
with a sizable alignment. The MemSiz for 03 does span .bss .
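
For illustration, the page arithmetic from the __elfN(load_section) excerpt quoted earlier in this thread can be applied to the amd64 RW LOAD segment above. This is a stand-alone sketch, not kernel code: struct load_math, load_math(), trunc_pg() and round_pg() are hypothetical names, and a 4096-byte page is assumed from the 0x1000 Align value. The tail_len field is an addition for the example and does not appear in the quoted excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, taken from the 0x1000 Align of the LOAD segments. */
#define PAGE_SIZE_ASSUMED 0x1000ULL

/* User-space stand-ins for the kernel's trunc_page()/round_page(). */
static uint64_t
trunc_pg(uint64_t x)
{
	return (x & ~(PAGE_SIZE_ASSUMED - 1));
}

static uint64_t
round_pg(uint64_t x)
{
	return (trunc_pg(x + PAGE_SIZE_ASSUMED - 1));
}

struct load_math {
	uint64_t copy_len;	/* file bytes copied into the page at map_addr */
	uint64_t map_addr;	/* page containing the end of the file data */
	uint64_t map_len;	/* from map_addr to the page-rounded segment end */
	uint64_t tail_len;	/* bytes after the file data in that first page */
};

/* Same expressions as the quoted excerpt, plus the tail_len helper value. */
static struct load_math
load_math(uint64_t offset, uint64_t vmaddr, uint64_t filsz, uint64_t memsz)
{
	struct load_math m;

	assert(memsz >= filsz);
	m.copy_len = filsz == 0 ? 0 :
	    (offset + filsz) - trunc_pg(offset + filsz);
	m.map_addr = trunc_pg(vmaddr + filsz);
	m.map_len = round_pg(vmaddr + memsz) - m.map_addr;
	m.tail_len = round_pg(vmaddr + filsz) - (vmaddr + filsz);
	return (m);
}
```

With the amd64 figures (offset 0x123000, vmaddr 0x323000, filsz 0x5848, memsz 0x2381d9) this gives copy_len 0x848, map_addr 0x328000 and map_len 0x234000. The remaining 0x7b8 bytes of that page run from 0x328848 up to 0x329000, which is where .bss starts in the listing.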
armv7's /sbin/init is different about MemSiz spanning .bss:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x00010034 0x00010034 0x00120 0x00120 R   0x4
  LOAD           0x000000 0x00010000 0x00010000 0x10674 0x10674 R   0x1000
  LOAD           0x011000 0x00021000 0x00021000 0xe9c54 0xe9c54 R E 0x1000
  LOAD           0x0fb000 0x0010b000 0x0010b000 0x03b88 0x30ccd RW  0x1000
  TLS            0x0fe000 0x0010e000 0x0010e000 0x00b60 0x00b70 R   0x20
  GNU_RELRO      0x0fe000 0x0010e000 0x0010e000 0x00b88 0x00b88 R   0x1
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0
  NOTE           0x000154 0x00010154 0x00010154 0x00064 0x00064 R   0x4
  ARM_EXIDX      0x0001b8 0x000101b8 0x000101b8 0x00220 0x00220 R   0x4

(NOTE: 0x0010b000+0x30ccd==0x13BCCD . Compare this to
the later .bss Addr of 0x10f000.)

 Section to Segment mapping:
  Segment Sections...
   00
   01     .note.tag .ARM.exidx .rodata .ARM.extab
   02     .text .init .fini
   03     .data .tdata .tbss .jcr .init_array .fini_array .got .bss
   04     .tdata .tbss
   05     .tdata .tbss .jcr .init_array .fini_array .got
   06
   07     .note.tag
   08     .ARM.exidx

There are 24 section headers, starting at offset 0x12be3c:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .note.tag         NOTE            00010154 000154 000064 00   A  0   0  4
  [ 2] .ARM.exidx        ARM_EXIDX       000101b8 0001b8 000220 00   A  5   0  4
  [ 3] .rodata           PROGBITS        00010400 000400 01022c 00 AMS  0   0 64
  [ 4] .ARM.extab        PROGBITS        0002062c 01062c 000048 00   A  0   0  4
  [ 5] .text             PROGBITS        00021000 011000 0e9c14 00  AX  0   0 128
  [ 6] .init             PROGBITS        0010ac20 0fac20 000014 00  AX  0   0 16
  [ 7] .fini             PROGBITS        0010ac40 0fac40 000014 00  AX  0   0 16
  [ 8] .data             PROGBITS        0010b000 0fb000 002734 00  WA  0   0  8
  [ 9] .tdata            PROGBITS        0010e000 0fe000 000b60 00 WAT  0   0 16
  [10] .tbss             NOBITS          0010eb60 0feb60 000010 00 WAT  0   0  4
  [11] .jcr              PROGBITS        0010eb60 0feb60 000000 00  WA  0   0  4
  [12] .init_array       INIT_ARRAY      0010eb60 0feb60 000008 00  WA  0   0  4
  [13] .fini_array       FINI_ARRAY      0010eb68 0feb68 000004 00  WA  0   0  4
  [14] .got              PROGBITS        0010eb6c 0feb6c 00001c 00  WA  0   0  4
  [15] .bss              NOBITS          0010f000 0feb88 02cccd 00  WA  0   0 64
  [16] .comment          PROGBITS        00000000 0feb88 0074b6 01  MS  0   0  1
  [17] .ARM.attributes   ARM_ATTRIBUTES  00000000 10603e 00004f 00      0   0  1
  [18] .gnu.warning.mkte PROGBITS        00000000 10608d 000043 00      0   0  1
  [19] .gnu.warning.f_pr PROGBITS        00000000 1060d0 000043 00      0   0  1
  [20] .gnu_debuglink    PROGBITS        00000000 11b314 000010 00      0   0  1
  [21] .shstrtab         STRTAB          00000000 11b324 0000e3 00      0   0  1
  [22] .symtab           SYMTAB          00000000 106114 015200 10     23 3063  4
  [23] .strtab           STRTAB          00000000 11b407 010a32 00      0   0  1

Note that there is space after .got+0x1c before .bss starts
with a sizable alignment. The MemSiz for 03 does *not* span
.bss , unlike for amd64 (and the rest).

aarch64's /sbin/init is similar to amd64 instead of armv7:

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x0001c0 0x0001c0 R   0x8
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x01624f 0x01624f R   0x10000
  LOAD           0x020000 0x0000000000220000 0x0000000000220000 0x0dd354 0x0dd354 R E 0x10000
  LOAD           0x100000 0x0000000000300000 0x0000000000300000 0x011840 0x252111 RW  0x10000
  TLS            0x110000 0x0000000000310000 0x0000000000310000 0x001800 0x001820 R   0x40
  GNU_RELRO      0x110000 0x0000000000310000 0x0000000000310000 0x001840 0x001840 R   0x1
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
  NOTE           0x000200 0x0000000000200200 0x0000000000200200 0x000048 0x000048 R   0x4

 Section to Segment mapping:
  Segment Sections...
   00
   01     .note.tag .rodata
   02     .text .init .fini
   03     .data .tdata .tbss .jcr .init_array .fini_array .got .bss
   04     .tdata .tbss
   05     .tdata .tbss .jcr .init_array .fini_array .got
   06
   07     .note.tag

There are 21 section headers, starting at offset 0x14b6f0:

Section Headers:
  [Nr] Name              Type            Addr             Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .note.tag         NOTE            0000000000200200 000200 000048 00   A  0   0  4
  [ 2] .rodata           PROGBITS        0000000000200280 000280 015fcf 00 AMS  0   0 64
  [ 3] .text             PROGBITS        0000000000220000 020000 0dd31c 00  AX  0   0 64
  [ 4] .init             PROGBITS        00000000002fd320 0fd320 000014 00  AX  0   0 16
  [ 5] .fini             PROGBITS        00000000002fd340 0fd340 000014 00  AX  0   0 16
  [ 6] .data             PROGBITS        0000000000300000 100000 003a20 00  WA  0   0 16
  [ 7] .tdata            PROGBITS        0000000000310000 110000 001800 00 WAT  0   0 16
  [ 8] .tbss             NOBITS          0000000000311800 111800 000020 00 WAT  0   0  8
  [ 9] .jcr              PROGBITS        0000000000311800 111800 000000 00  WA  0   0  8
  [10] .init_array       INIT_ARRAY      0000000000311800 111800 000018 00  WA  0   0  8
  [11] .fini_array       FINI_ARRAY      0000000000311818 111818 000008 00  WA  0   0  8
  [12] .got              PROGBITS        0000000000311820 111820 000020 00  WA  0   0  8
  [13] .bss              NOBITS          0000000000320000 111840 232111 00  WA  0   0 64
  [14] .comment          PROGBITS        0000000000000000 111840 007191 01  MS  0   0  1
  [15] .gnu.warning.mkte PROGBITS        0000000000000000 1189d1 000043 00      0   0  1
  [16] .gnu.warning.f_pr PROGBITS        0000000000000000 118a14 000043 00      0   0  1
  [17] .gnu_debuglink    PROGBITS        0000000000000000 13b7f8 000010 00      0   0  1
  [18] .shstrtab         STRTAB          0000000000000000 13b808 0000bd 00      0   0  1
  [19] .symtab           SYMTAB          0000000000000000 118a58 022da0 18     20 3621  8
  [20] .strtab           STRTAB          0000000000000000 13b8c5 00fe2b 00      0   0  1

Note that there is space after .got+0x20 before .bss starts
with a sizable alignment. The MemSiz for 03 does span .bss ,
like for amd64 (and all but armv7).
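
The "space before .bss" notes can be checked with a little arithmetic. The sketch below is only an observation about the posted numbers, not a statement about how the linker decides placement: align_up() and gap_before_bss() are hypothetical helpers. In the amd64, armv7 and aarch64 listings, rounding the end of the last file-backed section up to the LOAD segment's Align value (0x1000, 0x1000, 0x10000) lands exactly on the .bss Addr, whereas rounding up to the section's own alignment of 64 would not.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t
align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr + align - 1) & ~(align - 1));
}

/* Padding between the end of the last file-backed section and .bss. */
static uint64_t
gap_before_bss(uint64_t last_addr, uint64_t last_size, uint64_t bss_addr)
{
	assert(bss_addr >= last_addr + last_size);
	return (bss_addr - (last_addr + last_size));
}
```

For example, gap_before_bss(0x328840, 0x8, 0x329000) is 0x7b8 for amd64, gap_before_bss(0x10eb6c, 0x1c, 0x10f000) is 0x478 for armv7, and gap_before_bss(0x311820, 0x20, 0x320000) is 0xe7c0 for aarch64.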
powerpc64's /sbin/init is similar to amd64 as well:

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x0001f8 0x0001f8 R   0x8
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x039e94 0x039e94 R   0x1000
  LOAD           0x03a000 0x000000000023a000 0x000000000023a000 0x0e8e40 0x0e8e40 R E 0x1000
  LOAD           0x123000 0x0000000000323000 0x0000000000323000 0x005848 0x2381d9 RW  0x1000
  TLS            0x127000 0x0000000000327000 0x0000000000327000 0x001800 0x001820 R   0x10
  GNU_RELRO      0x127000 0x0000000000327000 0x0000000000327000 0x001848 0x001848 R   0x1
  GNU_EH_FRAME   0x01b270 0x000000000021b270 0x000000000021b270 0x00504c 0x00504c R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
  NOTE           0x000238 0x0000000000200238 0x0000000000200238 0x000048 0x000048 R   0x4

 Section to Segment mapping:
  Segment Sections...
   00
   01     .note.tag .rela.plt .rodata .eh_frame_hdr .eh_frame
   02     .text .init .fini .plt
   03     .data .got.plt .tdata .tbss .ctors .dtors .jcr .init_array .fini_array .bss
   04     .tdata .tbss
   05     .tdata .tbss .ctors .dtors .jcr .init_array .fini_array
   06     .eh_frame_hdr
   07
   08     .note.tag

There are 27 section headers, starting at offset 0x157938:

Section Headers:
  [Nr] Name              Type            Addr             Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .note.tag         NOTE            0000000000200238 000238 000048 00   A  0   0  4
  [ 2] .rela.plt         RELA            0000000000200280 000280 000030 18  AI  0  11  8
  [ 3] .rodata           PROGBITS        00000000002002c0 0002c0 01afb0 00 AMS  0   0 64
  [ 4] .eh_frame_hdr     PROGBITS        000000000021b270 01b270 00504c 00   A  0   0  4
  [ 5] .eh_frame         PROGBITS        00000000002202c0 0202c0 019bd4 00   A  0   0  8
  [ 6] .text             PROGBITS        000000000023a000 03a000 0e8dfc 00  AX  0   0 16
  [ 7] .init             PROGBITS        0000000000322dfc 122dfc 00000e 00  AX  0   0  4
  [ 8] .fini             PROGBITS        0000000000322e0c 122e0c 00000e 00  AX  0   0  4
  [ 9] .plt              PROGBITS        0000000000322e20 122e20 000020 00  AX  0   0 16
  [10] .data             PROGBITS        0000000000323000 123000 003a80 00  WA  0   0 16
  [11] .got.plt          PROGBITS        0000000000326a80 126a80 000010 00  WA  0   0  8
  [12] .tdata            PROGBITS        0000000000327000 127000 001800 00 WAT  0   0 16
  [13] .tbss             NOBITS          0000000000328800 128800 000020 00 WAT  0   0  8
  [14] .ctors            PROGBITS        0000000000328800 128800 000010 00  WA  0   0  8
  [15] .dtors            PROGBITS        0000000000328810 128810 000010 00  WA  0   0  8
  [16] .jcr              PROGBITS        0000000000328820 128820 000008 00  WA  0   0  8
  [17] .init_array       INIT_ARRAY      0000000000328828 128828 000018 00  WA  0   0  8
  [18] .fini_array       FINI_ARRAY      0000000000328840 128840 000008 00  WA  0   0  8
  [19] .bss              NOBITS          0000000000329000 128848 2321d9 00  WA  0   0 64
  [20] .comment          PROGBITS        0000000000000000 128848 0074d4 01  MS  0   0  1
  [21] .gnu.warning.mkte PROGBITS        0000000000000000 12fd1c 000043 00      0   0  1
  [22] .gnu.warning.f_pr PROGBITS        0000000000000000 12fd5f 000043 00      0   0  1
  [23] .gnu_debuglink    PROGBITS        0000000000000000 1478b0 000010 00      0   0  1
  [24] .shstrtab         STRTAB          0000000000000000 1478c0 0000f1 00      0   0  1
  [25] .symtab           SYMTAB          0000000000000000 12fda8 017b08 18     26 1707  8
  [26] .strtab           STRTAB          0000000000000000 1479b1 00ff84 00      0   0  1

Note that there is space after .fini_array+8 before .bss starts
with a sizable alignment. The MemSiz for 03 does span .bss ,
like for amd64 (and all but armv7).
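
As a cross-check of the "MemSiz spans .bss" notes, the end of LOAD segment 03 (VirtAddr + MemSiz) can be compared with the end of .bss (Addr + Size) for each listing. The sketch below uses only numbers copied from the readelf output in this message; struct seg_bss, listings[] and memsz_spans_bss() are hypothetical names, and the powerpc64 row repeats the figures exactly as posted.

```c
#include <assert.h>
#include <stdint.h>

struct seg_bss {
	const char *arch;
	uint64_t p_vaddr, p_memsz;	/* LOAD segment 03 */
	uint64_t bss_addr, bss_size;	/* .bss section header */
};

/* Figures copied from the readelf listings in this message. */
static const struct seg_bss listings[] = {
	{ "amd64",     0x323000, 0x2381d9, 0x329000, 0x2321d9 },
	{ "armv7",     0x10b000, 0x030ccd, 0x10f000, 0x02cccd },
	{ "aarch64",   0x300000, 0x252111, 0x320000, 0x232111 },
	{ "powerpc64", 0x323000, 0x2381d9, 0x329000, 0x2321d9 },
};

/* Nonzero when [bss_addr, bss_addr + bss_size) lies inside the segment. */
static int
memsz_spans_bss(const struct seg_bss *s)
{
	uint64_t seg_end = s->p_vaddr + s->p_memsz;
	uint64_t bss_end = s->bss_addr + s->bss_size;

	return (s->bss_addr >= s->p_vaddr && bss_end <= seg_end);
}
```

By this arithmetic the segment end equals the .bss end in every row: 0x55b1d9 for amd64 and the powerpc64 figures as posted, 0x552111 for aarch64, and 0x13bccd for armv7. The armv7 result suggests that its MemSiz also reaches the end of .bss, so the armv7 note above may be worth rechecking.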
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away
in early 2018-Mar)

From owner-freebsd-hackers@freebsd.org Mon Jun 10 23:29:27 2019
Subject: Re: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ?
From: Mark Millard
In-Reply-To: <47E002B7-D4A1-4C4B-BFFD-D926263D895E@yahoo.com>
Date: Mon, 10 Jun 2019 16:19:14 -0700
Cc: Alfredo Dal Ava Junior, Justin Hibbits
Message-Id: <48148449-93B0-446C-AA28-F211FFAE1A8B@yahoo.com>
References: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com> <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com> <47E002B7-D4A1-4C4B-BFFD-D926263D895E@yahoo.com>
To: FreeBSD Hackers, freeBSD PowerPC ML
X-Spamd-Result: default: False [4.49 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; SUBJECT_ENDS_QUESTION(1.00)[]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36646, ipnet:66.163.184.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; NEURAL_SPAM_SHORT(0.77)[0.767,0];
MIME_GOOD(-0.10)[text/plain]; IP_SCORE(1.57)[ip: (5.47), ipnet: 66.163.184.0/21(1.35), asn: 36646(1.08), country: US(-0.06)]; NEURAL_SPAM_MEDIUM(0.75)[0.747,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.92)[0.917,0]; RCVD_IN_DNSWL_NONE(0.00)[205.191.163.66.list.dnswl.org : 127.0.5.0]; RCVD_TLS_LAST(0.00)[] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 Jun 2019 23:29:27 -0000

[Forcing an appropriately large .sbss alignment was not enough to
avoid the clang-based problem for *sp++ related environ code
in _init_tls.]

On 2019-Jun-10, at 12:20, Mark Millard wrote:

> [I decided to compare some readelf information from some
> other architectures. I was surprised by some of it. But
> .bss seems to be forced to start with a large alignment
> to avoid such issues as I originally traced.]
>
> On 2019-Jun-10, at 11:24, Mark Millard wrote:
>
>> [Looks like Conrad M. is partially confirming my trace of the
>> issue is reasonable.]
>>
>> On 2019-Jun-10, at 07:37, Conrad Meyer wrote:
>>
>>> Hi Mark,
>>>
>>> On Sun, Jun 9, 2019 at 11:17 PM Mark Millard via freebsd-hackers
>>> wrote:
>>>> ...
>>>> vm_pager_get_pages uses vm_page_zero_invalid
>>>> to "Zero out partially filled data".
>>>>
>>>> But vm_page_zero_invalid does not zero every "invalid"
>>>> byte but works in terms of units of DEV_BSIZE:
>>>> ...
>>>> The comment indicates that areas of "sub-DEV_BSIZE"
>>>> should have been handled previously by
>>>> vm_page_set_validclean.
>>>
>>> Or another VM routine, yes (e.g., vm_page_set_valid_range). The valid
>>> and dirty bitmasks in vm_page only have a single bit per DEV_BSIZE
>>> region, so care must be taken when marking any sub-DEV_BSIZE region as
>>> valid to zero out the rest of the DEV_BSIZE region. This is part of
>>> the VM page contract. I'm not sure it's related to the BSS, though.
>>
>> Yea, I had written from what I'd seen in __elfN(load_section):
>>
>> QUOTE
>> __elfN(load_section) uses vm_imgact_map_page
>> to set up for its copyout. This appears to be
>> how the FileSiz (not including .sbss or .bss)
>> vs. MemSiz (including .sbss and .bss) is
>> handled (attempted?).
>> END QUOTE
>>
>> The copyout only copies through the last byte for filesz
>> but the vm_imgact_map_page does not zero out all the
>> bytes after that on that page:
>>
>>         /*
>>          * We have to get the remaining bit of the file into the first part
>>          * of the oversized map segment. This is normally because the .data
>>          * segment in the file is extended to provide bss. It's a neat idea
>>          * to try and save a page, but it's a pain in the behind to implement.
>>          */
>>         copy_len = filsz == 0 ? 0 : (offset + filsz) - trunc_page(offset +
>>             filsz);
>>         map_addr = trunc_page((vm_offset_t)vmaddr + filsz);
>>         map_len = round_page((vm_offset_t)vmaddr + memsz) - map_addr;
>> . . .
>>         if (copy_len != 0) {
>>                 sf = vm_imgact_map_page(object, offset + filsz);
>>                 if (sf == NULL)
>>                         return (EIO);
>>
>>                 /* send the page fragment to user space */
>>                 off = trunc_page(offset + filsz) - trunc_page(offset + filsz);
>>                 error = copyout((caddr_t)sf_buf_kva(sf) + off,
>>                     (caddr_t)map_addr, copy_len);
>>                 vm_imgact_unmap_page(sf);
>>                 if (error != 0)
>>                         return (error);
>>         }
>>
>> I looked into the details of the DEV_BSIZE code after sending
>> the original message and so realized that my provided example
>> /sbin/init readelf material was a good example of the issue
>> if I'd not missed something.
>>
>>>> So, if, say, char**environ ends up at the start of .sbss
>>>> consistently, does environ always end up zeroed independently
>>>> of FileSz for the PT_LOAD that spans them?
>>>
>>> It is required to be zeroed, yes. If not, there is a bug. If FileSz
>>> covers BSS, that's a bug in the linker. Either the trailing bytes of
>>> the corresponding page in the executable should be zero (wasteful; on
>>> amd64 ".comment" is packed in there instead), or the linker/loader
>>> must zero them at initialization. I'm not familiar with the
>>> particular details here, but if you are interested I would suggest
>>> looking at __elfN(load_section) in sys/kern/imgact_elf.c.
>>
>> I had looked at it some, see the material around the earlier quote
>> above.
>>
>>>> The following is not necessarily an example of problematical
>>>> figures but is just for showing an example structure of what
>>>> FileSiz covers vs. MemSiz for PT_LOAD's that involve .sbss
>>>> and .bss:
>>>> ...
>>>
>>> Your 2nd LOAD phdr's FileSiz matches up exactly with Segment .sbss
>>> Offset minus Segment .tdata Offset, i.e., none of the FileSiz
>>> corresponds to the (s)bss regions. (Good! At least the static linker
>>> part looks sane.) That said, the boundary is not page-aligned and the
>>> section alignment requirement is much lower than page_size, so the
>>> beginning of bss will share a file page with some data. Something
>>> should zero it at image activation.
>>
>> And, so far, I've not found anything in _start or before that does
>> zero any "sub-DEV_BSIZE" part after FileSz for the PT_LOAD in
>> question.
>>
>> Thanks for checking my trace of the issue. It is good to have some
>> confirmation that I'd not missed something.
>>
>>> (Tangent: sbss/bss probably do not need to be RWE on PPC! On amd64,
>>> init has three LOAD segments rather than two: one for rodata (R), one
>>> for .text, .init, etc (RX); and one for .data (RW).)
>>
>> Yea, the section header flags indicate just WA for .sbss and .bss (but
>> WAX for .got).
>>
>> But such is more general: for example, the beginning of .rodata
>> (not executable) shares the tail part of a page with .fini
>> (executable) in the example. .got has executable code but is in
>> the middle of sections that do not.
>> For something like /sbin/init it
>> is so small that the middle of a page can be the only part that is
>> executable, as in the example. (It is not forced onto its own page.)
>>
>> The form of .got used is also writable: WAX for section header flags.
>
> amd64's /sbin/init:
>
> There are 9 program headers, starting at offset 64
>
> Program Headers:
> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> PHDR 0x000040 0x0000000000200040 0x0000000000200040 0x0001f8 0x0001f8 R 0x8
> LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x039e94 0x039e94 R 0x1000
> LOAD 0x03a000 0x000000000023a000 0x000000000023a000 0x0e8e40 0x0e8e40 R E 0x1000
> LOAD 0x123000 0x0000000000323000 0x0000000000323000 0x005848 0x2381d9 RW 0x1000
> TLS 0x127000 0x0000000000327000 0x0000000000327000 0x001800 0x001820 R 0x10
> GNU_RELRO 0x127000 0x0000000000327000 0x0000000000327000 0x001848 0x001848 R 0x1
> GNU_EH_FRAME 0x01b270 0x000000000021b270 0x000000000021b270 0x00504c 0x00504c R 0x4
> GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0
> NOTE 0x000238 0x0000000000200238 0x0000000000200238 0x000048 0x000048 R 0x4
>
> Section to Segment mapping:
> Segment Sections...
> 00
> 01 .note.tag .rela.plt .rodata .eh_frame_hdr .eh_frame
> 02 .text .init .fini .plt
> 03 .data .got.plt .tdata .tbss .ctors .dtors .jcr .init_array .fini_array .bss
> 04 .tdata .tbss
> 05 .tdata .tbss .ctors .dtors .jcr .init_array .fini_array
> 06 .eh_frame_hdr
> 07
> 08 .note.tag
> There are 27 section headers, starting at offset 0x157938:
>
> Section Headers:
> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
> [ 0] NULL 0000000000000000 000000 000000 00 0 0 0
> [ 1] .note.tag NOTE 0000000000200238 000238 000048 00 A 0 0 4
> [ 2] .rela.plt RELA 0000000000200280 000280 000030 18 AI 0 11 8
> [ 3] .rodata PROGBITS 00000000002002c0 0002c0 01afb0 00 AMS 0 0 64
> [ 4] .eh_frame_hdr PROGBITS 000000000021b270 01b270 00504c 00 A 0 0 4
> [ 5] .eh_frame PROGBITS 00000000002202c0 0202c0 019bd4 00 A 0 0 8
> [ 6] .text PROGBITS 000000000023a000 03a000 0e8dfc 00 AX 0 0 16
> [ 7] .init PROGBITS 0000000000322dfc 122dfc 00000e 00 AX 0 0 4
> [ 8] .fini PROGBITS 0000000000322e0c 122e0c 00000e 00 AX 0 0 4
> [ 9] .plt PROGBITS 0000000000322e20 122e20 000020 00 AX 0 0 16
> [10] .data PROGBITS 0000000000323000 123000 003a80 00 WA 0 0 16
> [11] .got.plt PROGBITS 0000000000326a80 126a80 000010 00 WA 0 0 8
> [12] .tdata PROGBITS 0000000000327000 127000 001800 00 WAT 0 0 16
> [13] .tbss NOBITS 0000000000328800 128800 000020 00 WAT 0 0 8
> [14] .ctors PROGBITS 0000000000328800 128800 000010 00 WA 0 0 8
> [15] .dtors PROGBITS 0000000000328810 128810 000010 00 WA 0 0 8
> [16] .jcr PROGBITS 0000000000328820 128820 000008 00 WA 0 0 8
> [17] .init_array INIT_ARRAY 0000000000328828 128828 000018 00 WA 0 0 8
> [18] .fini_array FINI_ARRAY 0000000000328840 128840 000008 00 WA 0 0 8
> [19] .bss NOBITS 0000000000329000 128848 2321d9 00 WA 0 0 64
> [20] .comment PROGBITS 0000000000000000 128848 0074d4 01 MS 0 0 1
> [21] .gnu.warning.mkte PROGBITS 0000000000000000 12fd1c 000043 00 0 0 1
> [22] .gnu.warning.f_pr PROGBITS 0000000000000000 12fd5f 000043 00 0 0 1
> [23] .gnu_debuglink PROGBITS 0000000000000000 1478b0 000010 00 0 0 1
> [24] .shstrtab STRTAB 0000000000000000 1478c0 0000f1 00 0 0 1
> [25] .symtab SYMTAB 0000000000000000 12fda8 017b08 18 26 1707 8
> [26] .strtab STRTAB 0000000000000000 1479b1 00ff84 00 0 0 1
>
> Note that there is space after .fini_array+8 before .bss starts
> with a sizable alignment. The MemSiz for 03 does span .bss.
>
> armv7's /sbin/init is different about MemSiz spanning .bss:
>
> Program Headers:
> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> PHDR 0x000034 0x00010034 0x00010034 0x00120 0x00120 R 0x4
> LOAD 0x000000 0x00010000 0x00010000 0x10674 0x10674 R 0x1000
> LOAD 0x011000 0x00021000 0x00021000 0xe9c54 0xe9c54 R E 0x1000
> LOAD 0x0fb000 0x0010b000 0x0010b000 0x03b88 0x30ccd RW 0x1000
> TLS 0x0fe000 0x0010e000 0x0010e000 0x00b60 0x00b70 R 0x20
> GNU_RELRO 0x0fe000 0x0010e000 0x0010e000 0x00b88 0x00b88 R 0x1
> GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0
> NOTE 0x000154 0x00010154 0x00010154 0x00064 0x00064 R 0x4
> ARM_EXIDX 0x0001b8 0x000101b8 0x000101b8 0x00220 0x00220 R 0x4
>
> (NOTE: 0x0010b000+0x30ccd==0x13BCCD. Compare this to the later .bss
> Addr of 0x10f000.)
>
> Section to Segment mapping:
> Segment Sections...
> 00
> 01 .note.tag .ARM.exidx .rodata .ARM.extab
> 02 .text .init .fini
> 03 .data .tdata .tbss .jcr .init_array .fini_array .got .bss
> 04 .tdata .tbss
> 05 .tdata .tbss .jcr .init_array .fini_array .got
> 06
> 07 .note.tag
> 08 .ARM.exidx
> There are 24 section headers, starting at offset 0x12be3c:
>
> Section Headers:
> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
> [ 0] NULL 00000000 000000 000000 00 0 0 0
> [ 1] .note.tag NOTE 00010154 000154 000064 00 A 0 0 4
> [ 2] .ARM.exidx ARM_EXIDX 000101b8 0001b8 000220 00 A 5 0 4
> [ 3] .rodata PROGBITS 00010400 000400 01022c 00 AMS 0 0 64
> [ 4] .ARM.extab PROGBITS 0002062c 01062c 000048 00 A 0 0 4
> [ 5] .text PROGBITS 00021000 011000 0e9c14 00 AX 0 0 128
> [ 6] .init PROGBITS 0010ac20 0fac20 000014 00 AX 0 0 16
> [ 7] .fini PROGBITS 0010ac40 0fac40 000014 00 AX 0 0 16
> [ 8] .data PROGBITS 0010b000 0fb000 002734 00 WA 0 0 8
> [ 9] .tdata PROGBITS 0010e000 0fe000 000b60 00 WAT 0 0 16
> [10] .tbss NOBITS 0010eb60 0feb60 000010 00 WAT 0 0 4
> [11] .jcr PROGBITS 0010eb60 0feb60 000000 00 WA 0 0 4
> [12] .init_array INIT_ARRAY 0010eb60 0feb60 000008 00 WA 0 0 4
> [13] .fini_array FINI_ARRAY 0010eb68 0feb68 000004 00 WA 0 0 4
> [14] .got PROGBITS 0010eb6c 0feb6c 00001c 00 WA 0 0 4
> [15] .bss NOBITS 0010f000 0feb88 02cccd 00 WA 0 0 64
> [16] .comment PROGBITS 00000000 0feb88 0074b6 01 MS 0 0 1
> [17] .ARM.attributes ARM_ATTRIBUTES 00000000 10603e 00004f 00 0 0 1
> [18] .gnu.warning.mkte PROGBITS 00000000 10608d 000043 00 0 0 1
> [19] .gnu.warning.f_pr PROGBITS 00000000 1060d0 000043 00 0 0 1
> [20] .gnu_debuglink PROGBITS 00000000 11b314 000010 00 0 0 1
> [21] .shstrtab STRTAB 00000000 11b324 0000e3 00 0 0 1
> [22] .symtab SYMTAB 00000000 106114 015200 10 23 3063 4
> [23] .strtab STRTAB 00000000 11b407 010a32 00 0 0 1
>
> Note that there is space after .got+0x1c before .bss starts
> with a sizable alignment. The MemSiz for 03 does *not* span
> .bss, unlike for amd64 (and the rest).
>
> aarch64's /sbin/init is similar to amd64 instead of armv7:
>
> Program Headers:
> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> PHDR 0x000040 0x0000000000200040 0x0000000000200040 0x0001c0 0x0001c0 R 0x8
> LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x01624f 0x01624f R 0x10000
> LOAD 0x020000 0x0000000000220000 0x0000000000220000 0x0dd354 0x0dd354 R E 0x10000
> LOAD 0x100000 0x0000000000300000 0x0000000000300000 0x011840 0x252111 RW 0x10000
> TLS 0x110000 0x0000000000310000 0x0000000000310000 0x001800 0x001820 R 0x40
> GNU_RELRO 0x110000 0x0000000000310000 0x0000000000310000 0x001840 0x001840 R 0x1
> GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0
> NOTE 0x000200 0x0000000000200200 0x0000000000200200 0x000048 0x000048 R 0x4
>
> Section to Segment mapping:
> Segment Sections...
> 00
> 01 .note.tag .rodata
> 02 .text .init .fini
> 03 .data .tdata .tbss .jcr .init_array .fini_array .got .bss
> 04 .tdata .tbss
> 05 .tdata .tbss .jcr .init_array .fini_array .got
> 06
> 07 .note.tag
> There are 21 section headers, starting at offset 0x14b6f0:
>
> Section Headers:
> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
> [ 0] NULL 0000000000000000 000000 000000 00 0 0 0
> [ 1] .note.tag NOTE 0000000000200200 000200 000048 00 A 0 0 4
> [ 2] .rodata PROGBITS 0000000000200280 000280 015fcf 00 AMS 0 0 64
> [ 3] .text PROGBITS 0000000000220000 020000 0dd31c 00 AX 0 0 64
> [ 4] .init PROGBITS 00000000002fd320 0fd320 000014 00 AX 0 0 16
> [ 5] .fini PROGBITS 00000000002fd340 0fd340 000014 00 AX 0 0 16
> [ 6] .data PROGBITS 0000000000300000 100000 003a20 00 WA 0 0 16
> [ 7] .tdata PROGBITS 0000000000310000 110000 001800 00 WAT 0 0 16
> [ 8] .tbss NOBITS 0000000000311800 111800 000020 00 WAT 0 0 8
> [ 9] .jcr PROGBITS 0000000000311800 111800 000000 00 WA 0 0 8
> [10] .init_array INIT_ARRAY 0000000000311800 111800 000018 00 WA 0 0 8
> [11] .fini_array FINI_ARRAY 0000000000311818 111818 000008 00 WA 0 0 8
> [12] .got PROGBITS 0000000000311820 111820 000020 00 WA 0 0 8
> [13] .bss NOBITS 0000000000320000 111840 232111 00 WA 0 0 64
> [14] .comment PROGBITS 0000000000000000 111840 007191 01 MS 0 0 1
> [15] .gnu.warning.mkte PROGBITS 0000000000000000 1189d1 000043 00 0 0 1
> [16] .gnu.warning.f_pr PROGBITS 0000000000000000 118a14 000043 00 0 0 1
> [17] .gnu_debuglink PROGBITS 0000000000000000 13b7f8 000010 00 0 0 1
> [18] .shstrtab STRTAB 0000000000000000 13b808 0000bd 00 0 0 1
> [19] .symtab SYMTAB 0000000000000000 118a58 022da0 18 20 3621 8
> [20] .strtab STRTAB 0000000000000000 13b8c5 00fe2b 00 0 0 1
>
> Note that there is space after .got+0x20 before .bss starts
> with a sizable alignment. The MemSiz for 03 does span
> .bss, like for amd64 (and all but armv7).
>
> powerpc64's /sbin/init is similar to amd64 as well:
>
> Program Headers:
> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> PHDR 0x000040 0x0000000000200040 0x0000000000200040 0x0001f8 0x0001f8 R 0x8
> LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x039e94 0x039e94 R 0x1000
> LOAD 0x03a000 0x000000000023a000 0x000000000023a000 0x0e8e40 0x0e8e40 R E 0x1000
> LOAD 0x123000 0x0000000000323000 0x0000000000323000 0x005848 0x2381d9 RW 0x1000
> TLS 0x127000 0x0000000000327000 0x0000000000327000 0x001800 0x001820 R 0x10
> GNU_RELRO 0x127000 0x0000000000327000 0x0000000000327000 0x001848 0x001848 R 0x1
> GNU_EH_FRAME 0x01b270 0x000000000021b270 0x000000000021b270 0x00504c 0x00504c R 0x4
> GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0
> NOTE 0x000238 0x0000000000200238 0x0000000000200238 0x000048 0x000048 R 0x4
>
> Section to Segment mapping:
> Segment Sections...
> 00
> 01 .note.tag .rela.plt .rodata .eh_frame_hdr .eh_frame
> 02 .text .init .fini .plt
> 03 .data .got.plt .tdata .tbss .ctors .dtors .jcr .init_array .fini_array .bss
> 04 .tdata .tbss
> 05 .tdata .tbss .ctors .dtors .jcr .init_array .fini_array
> 06 .eh_frame_hdr
> 07
> 08 .note.tag
> There are 27 section headers, starting at offset 0x157938:
>
> Section Headers:
> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
> [ 0] NULL 0000000000000000 000000 000000 00 0 0 0
> [ 1] .note.tag NOTE 0000000000200238 000238 000048 00 A 0 0 4
> [ 2] .rela.plt RELA 0000000000200280 000280 000030 18 AI 0 11 8
> [ 3] .rodata PROGBITS 00000000002002c0 0002c0 01afb0 00 AMS 0 0 64
> [ 4] .eh_frame_hdr PROGBITS 000000000021b270 01b270 00504c 00 A 0 0 4
> [ 5] .eh_frame PROGBITS 00000000002202c0 0202c0 019bd4 00 A 0 0 8
> [ 6] .text PROGBITS 000000000023a000 03a000 0e8dfc 00 AX 0 0 16
> [ 7] .init PROGBITS 0000000000322dfc 122dfc 00000e 00 AX 0 0 4
> [ 8] .fini PROGBITS 0000000000322e0c 122e0c 00000e 00 AX 0 0 4
> [ 9] .plt PROGBITS 0000000000322e20 122e20 000020 00 AX 0 0 16
> [10] .data PROGBITS 0000000000323000 123000 003a80 00 WA 0 0 16
> [11] .got.plt PROGBITS 0000000000326a80 126a80 000010 00 WA 0 0 8
> [12] .tdata PROGBITS 0000000000327000 127000 001800 00 WAT 0 0 16
> [13] .tbss NOBITS 0000000000328800 128800 000020 00 WAT 0 0 8
> [14] .ctors PROGBITS 0000000000328800 128800 000010 00 WA 0 0 8
> [15] .dtors PROGBITS 0000000000328810 128810 000010 00 WA 0 0 8
> [16] .jcr PROGBITS 0000000000328820 128820 000008 00 WA 0 0 8
> [17] .init_array INIT_ARRAY 0000000000328828 128828 000018 00 WA 0 0 8
> [18] .fini_array FINI_ARRAY 0000000000328840 128840 000008 00 WA 0 0 8
> [19] .bss NOBITS 0000000000329000 128848 2321d9 00 WA 0 0 64
> [20] .comment PROGBITS 0000000000000000 128848 0074d4 01 MS 0 0 1
> [21] .gnu.warning.mkte PROGBITS 0000000000000000 12fd1c 000043 00 0 0 1
> [22] .gnu.warning.f_pr PROGBITS 0000000000000000 12fd5f 000043 00 0 0 1
> [23] .gnu_debuglink PROGBITS 0000000000000000 1478b0 000010 00 0 0 1
> [24] .shstrtab STRTAB 0000000000000000 1478c0 0000f1 00 0 0 1
> [25] .symtab SYMTAB 0000000000000000 12fda8 017b08 18 26 1707 8
> [26] .strtab STRTAB 0000000000000000 1479b1 00ff84 00 0 0 1
>
> Note that there is space after .fini_array+8 before .bss starts
> with a sizable alignment. The MemSiz for 03 does span
> .bss, like for amd64 (and all but armv7).

I temporarily forced my 32-bit powerpc /sbin/init to have:

Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
. . .
[16] .got PROGBITS 0193845c 12845c 000010 04 WAX 0 0 4
[17] .sbss NOBITS 01939000 12846c 0000b0 00 WA 0 0 4
[18] .bss NOBITS 019390c0 12846c 02cc48 00 WA 0 0 64
. . .

It was not enough to avoid the problems I've elsewhere reported
for *sp++ getting SIGSEGV (environ related activity in _init_tls).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-hackers@freebsd.org Tue Jun 11 20:12:37 2019 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DBEBA15C5A72 for ; Tue, 11 Jun 2019 20:12:36 +0000 (UTC) (envelope-from asomers@gmail.com) Received: from mail-lj1-f178.google.com (mail-lj1-f178.google.com [209.85.208.178]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0CB177637A for ; Tue, 11 Jun 2019 20:12:35 +0000 (UTC) (envelope-from asomers@gmail.com) Received: by mail-lj1-f178.google.com with SMTP id h10so7063273ljg.0 for ; Tue, 11 Jun 2019 13:12:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net;
s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=aj4Sb4OSVtCIUgi5ykPzbFS/iEFCxUoroEPEdGQRqnQ=; b=fDOi2s8mhbyCUoGFwvGb5O+oci9dkgxXcVtjkqmOn2lVIVYAEmTY9eK37xAmRiGOvN cXFEUMgXgeO/yNU0s92aJNnyaanlYUijO8FNBlKdAVahtyqp0ZtGLE6qpPvYOoNtYALI iU629YfCbdxTUTfY2WEo9aXDujOPw2L+eFtCd0iXCxSR0KbNNkLd+p59W68XC9UDpxsn 9mg+34EyKHJrpKqvKmZLapdysc7Ga3+isTNpgk2EwQwUpvzby0KEnVK0e2sjxu4SCi+W V2Eyp0/cPhNiyzki1MMQ73D+Be2z7cZYtAb+ScHk1a3Spw0Svr3zSbeVwadXEe9HKvwS kNlw== X-Gm-Message-State: APjAAAVCThx1CtX75p02FpeHASqJX3XUPubt/bdOOc+GDXL3eFj5I09+ OAdBQsyRajgnEswhTl3aQoIToHhD44Z067YbCuhetOMP X-Google-Smtp-Source: APXvYqyh/9m/tHqmRpLXD6fMkUvN7eDbXRyFHEFNKPFk/4b/0GQNjoWeK9n+3i3GXDVqYUvn1D/2U7twxTmpfbVPOlY= X-Received: by 2002:a2e:12dc:: with SMTP id 89mr17883261ljs.40.1560283953759; Tue, 11 Jun 2019 13:12:33 -0700 (PDT) MIME-Version: 1.0 From: Alan Somers Date: Tue, 11 Jun 2019 14:12:22 -0600 Message-ID: Subject: panic: vm_fault_hold: fault on nofault entry in fusefs To: FreeBSD Hackers Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 0CB177637A X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; spf=pass (mx1.freebsd.org: domain of asomers@gmail.com designates 209.85.208.178 as permitted sender) smtp.mailfrom=asomers@gmail.com X-Spamd-Result: default: False [-4.02 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-0.999,0]; RCVD_COUNT_TWO(0.00)[2]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-hackers@freebsd.org]; DMARC_NA(0.00)[freebsd.org]; RCPT_COUNT_ONE(0.00)[1]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_TRACE(0.00)[0:+]; TO_DN_ALL(0.00)[]; MX_GOOD(-0.01)[cached: alt3.gmail-smtp-in.l.google.com]; NEURAL_HAM_SHORT(-0.76)[-0.757,0]; RCVD_IN_DNSWL_NONE(0.00)[178.208.85.209.list.dnswl.org : 127.0.5.0]; RCVD_TLS_LAST(0.00)[]; FORGED_SENDER(0.30)[asomers@freebsd.org,asomers@gmail.com]; R_DKIM_NA(0.00)[]; 
FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; FROM_NEQ_ENVFROM(0.00)[asomers@freebsd.org,asomers@gmail.com]; IP_SCORE(-1.25)[ip: (-0.50), ipnet: 209.85.128.0/17(-3.41), asn: 15169(-2.30), country: US(-0.06)]; TO_DOM_EQ_FROM_DOM(0.00)[] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Jun 2019 20:12:37 -0000

Can somebody please help me to debug a fusefs problem? I have a 100%
reproducible panic with the above message. Evidently there's
something I don't know about buf(9) and uiomove(9). The good news is
that the panic is sufficiently reproducible and sufficiently
instrumented that I know exactly what's happening; I just don't know
why. Here's a summary of what happens.

1) fusefs's VOP_WRITE method gets called with a buffer that spans a
logical block boundary, but does not extend the size of the file.
2) It splits the write into two parts. Each one calls getblk to
allocate a struct buf, fills in the old data with a read, and fills
the new data with uiomove.
3) After the file gets close()ed, VOP_INACTIVE calls vn_fsync_buf to
flush dirty buffers.
4) VOP_STRATEGY successfully writes the first buffer and frees it with
bufdone().
5) VOP_STRATEGY tries to write the second buffer, but panics during
uiomove. The address that caused the panic is always exactly 4KB into
the buffer.

So what am I doing wrong? The address that causes the panic in step 5
was successfully accessed in step 2, so this isn't some kind of buffer
overrun. Does it have something to do with the fact that the read
operation in step 2 called bufdone()? Seems unlikely because it did
that for both buffers, yet only the second one panics. Or does the
address actually fault during both VOP_WRITE and VOP_STRATEGY, but
something low down handles the fault in the first case? I'd be
grateful for any help that anyone can offer.
-Alan

P.S. Here's the panic's stack

panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe0004591000
cpuid = 1
time = 1560283621
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0031c21f80
vpanic() at vpanic+0x19d/frame 0xfffffe0031c21fd0
panic() at panic+0x43/frame 0xfffffe0031c22030
vm_fault_hold() at vm_fault_hold+0x2064/frame 0xfffffe0031c22170
vm_fault() at vm_fault+0x60/frame 0xfffffe0031c221b0
trap_pfault() at trap_pfault+0x188/frame 0xfffffe0031c22200
trap() at trap+0x2b4/frame 0xfffffe0031c22310
calltrap() at calltrap+0x8/frame 0xfffffe0031c22310
--- trap 0xc, rip = 0xffffffff8108c9e6, rsp = 0xfffffe0031c223e0, rbp = 0xfffffe0031c223e0 ---
memmove_erms() at memmove_erms+0x116/frame 0xfffffe0031c223e0
uiomove_faultflag() at uiomove_faultflag+0x146/frame 0xfffffe0031c22420
fuse_write_directbackend() at fuse_write_directbackend+0x1cd/frame 0xfffffe0031c224f0
fuse_io_strategy() at fuse_io_strategy+0x24d/frame 0xfffffe0031c22590
fuse_vnop_strategy() at fuse_vnop_strategy+0x2a/frame 0xfffffe0031c225a0
VOP_STRATEGY_APV() at VOP_STRATEGY_APV+0x63/frame 0xfffffe0031c225c0
bufstrategy() at bufstrategy+0x44/frame 0xfffffe0031c225f0
bufwrite() at bufwrite+0x259/frame 0xfffffe0031c22640
vn_fsync_buf() at vn_fsync_buf+0x23e/frame 0xfffffe0031c226a0
fuse_vnop_inactive() at fuse_vnop_inactive+0x7e/frame 0xfffffe0031c226e0
VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x63/frame 0xfffffe0031c22700
vinactive() at vinactive+0xcd/frame 0xfffffe0031c22750
vputx() at vputx+0x2d0/frame 0xfffffe0031c227b0
vn_close1() at vn_close1+0x116/frame 0xfffffe0031c22820
vn_closefile() at vn_closefile+0x4c/frame 0xfffffe0031c228a0
_fdrop() at _fdrop+0x1a/frame 0xfffffe0031c228c0
closef() at closef+0x1ec/frame 0xfffffe0031c22950
closefp() at closefp+0x9c/frame 0xfffffe0031c22990
amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe0031c22ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0031c22ab0
--- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8006842ba, rsp = 0x7fffffffe748, rbp = 0x7fffffffe760 ---
KDB: enter: panic

From owner-freebsd-hackers@freebsd.org Tue Jun 11 20:30:26 2019 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AA93515C6652 for ; Tue, 11 Jun 2019 20:30:26 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 16D94775D3; Tue, 11 Jun 2019 20:30:25 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id x5BKUIYR006844 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Tue, 11 Jun 2019 23:30:21 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua x5BKUIYR006844 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id x5BKUIRW006842; Tue, 11 Jun 2019 23:30:18 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 11 Jun 2019 23:30:18 +0300 From: Konstantin Belousov To: Alan Somers Cc: FreeBSD Hackers Subject: Re: panic: vm_fault_hold: fault on nofault entry in fusefs Message-ID: <20190611203018.GC75280@kib.kiev.ua> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.12.0 (2019-05-25) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version:
SpamAssassin 3.4.2 (2018-09-13) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Jun 2019 20:30:26 -0000

On Tue, Jun 11, 2019 at 02:12:22PM -0600, Alan Somers wrote:
> Can somebody please help me to debug a fusefs problem? I have a 100%
> reproducible panic with the above message. Evidently there's
> something I don't know about buf(9) and uiomove(9). The good news is
> that the panic is sufficiently reproducible and sufficiently
> instrumented that I know exactly what's happening; I just don't know
> why. Here's a summary of what happens.
>
> 1) fusefs's VOP_WRITE method gets called with a buffer that spans a
> logical block boundary, but does not extend the size of the file.
> 2) It splits the write into two parts. Each one calls getblk to
> allocate a struct buf, fills in the old data with a read, and fills
> the new data with uiomove.
> 3) After the file gets close()ed, VOP_INACTIVE calls vn_fsync_buf to
> flush dirty buffers.
> 4) VOP_STRATEGY successfully writes the first buffer and frees it with
> bufdone().
> 5) VOP_STRATEGY tries to write the second buffer, but panics during
> uiomove. The address that caused the panic is always exactly 4KB into
> the buffer.
>
> So what am I doing wrong? The address that causes the panic in step 5
> was successfully accessed in step 2, so this isn't some kind of buffer
> overrun. Does it have something to do with the fact that the read
> operation in step 2 called bufdone()? Seems unlikely because it did
> that for both buffers, yet only the second one panics. Or does the
> address actually fault during both VOP_WRITE and VOP_STRATEGY, but
> something low down handles the fault in the first case? I'd be
> grateful for any help that anyone can offer.
> -Alan
>
> P.S. Here's the panic's stack
> panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe0004591000
> cpuid = 1
> time = 1560283621
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0031c21f80
> vpanic() at vpanic+0x19d/frame 0xfffffe0031c21fd0
> panic() at panic+0x43/frame 0xfffffe0031c22030
> vm_fault_hold() at vm_fault_hold+0x2064/frame 0xfffffe0031c22170
> vm_fault() at vm_fault+0x60/frame 0xfffffe0031c221b0
> trap_pfault() at trap_pfault+0x188/frame 0xfffffe0031c22200
> trap() at trap+0x2b4/frame 0xfffffe0031c22310
> calltrap() at calltrap+0x8/frame 0xfffffe0031c22310
> --- trap 0xc, rip = 0xffffffff8108c9e6, rsp = 0xfffffe0031c223e0, rbp = 0xfffffe0031c223e0 ---
> memmove_erms() at memmove_erms+0x116/frame 0xfffffe0031c223e0
> uiomove_faultflag() at uiomove_faultflag+0x146/frame 0xfffffe0031c22420
> fuse_write_directbackend() at fuse_write_directbackend+0x1cd/frame 0xfffffe0031c224f0
> fuse_io_strategy() at fuse_io_strategy+0x24d/frame 0xfffffe0031c22590
> fuse_vnop_strategy() at fuse_vnop_strategy+0x2a/frame 0xfffffe0031c225a0
> VOP_STRATEGY_APV() at VOP_STRATEGY_APV+0x63/frame 0xfffffe0031c225c0
> bufstrategy() at bufstrategy+0x44/frame 0xfffffe0031c225f0
> bufwrite() at bufwrite+0x259/frame 0xfffffe0031c22640
> vn_fsync_buf() at vn_fsync_buf+0x23e/frame 0xfffffe0031c226a0
> fuse_vnop_inactive() at fuse_vnop_inactive+0x7e/frame 0xfffffe0031c226e0
> VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x63/frame 0xfffffe0031c22700
> vinactive() at vinactive+0xcd/frame 0xfffffe0031c22750
> vputx() at vputx+0x2d0/frame 0xfffffe0031c227b0
> vn_close1() at vn_close1+0x116/frame 0xfffffe0031c22820
> vn_closefile() at vn_closefile+0x4c/frame 0xfffffe0031c228a0
> _fdrop() at _fdrop+0x1a/frame 0xfffffe0031c228c0
> closef() at closef+0x1ec/frame 0xfffffe0031c22950
> closefp() at closefp+0x9c/frame 0xfffffe0031c22990
> amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe0031c22ab0
> fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0031c22ab0
> --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8006842ba, rsp = 0x7fffffffe748, rbp = 0x7fffffffe760 ---
> KDB: enter: panic

Start with dumping core. Then print out the struct buf and show it.

From owner-freebsd-hackers@freebsd.org Tue Jun 11 21:47:02 2019 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B537515C809F for ; Tue, 11 Jun 2019 21:47:02 +0000 (UTC) (envelope-from asomers@gmail.com) Received: from mail-lf1-f42.google.com (mail-lf1-f42.google.com [209.85.167.42]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3379C81BFF for ; Tue, 11 Jun 2019 21:47:02 +0000 (UTC) (envelope-from asomers@gmail.com) Received: by mail-lf1-f42.google.com with SMTP id y198so10464797lfa.1 for ; Tue, 11 Jun 2019 14:47:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=cN5iYXVq5PUUCcd3LX3jWGQjvZRoyG6iMbVVrHDQ0gs=; b=dqCGcxNhpbnb5EhTAOIJw/7HkCH7rbmRU3xuyoY31Mu/7cG21k6F6Y1w6E992lV/nz WGThXYOJvdHWJH1ovciiXCxEp3SrfnMZ75eBcctRkbAwds3QzON/VRJIfWVvNKvDkyAm zBco9viDXUT73pdvOCwPD4Np9HeTXLkYrxpqLGeQYBeusAXDCyvlfuYLTrbuM1AY12Fb Sav/bbtLPrT8J09vrencurtgv4nDc+U7DzhhP/wmf/X0JfvH4/lFSAlH37XZ4GbLZL5X 5RbRrzX1f97ukeTvA0+h8qSU4KzFVzzzuQN/wm4ACRI8rrESEL1G4jztBEjlTQGs+C7u jKIw== X-Gm-Message-State: APjAAAW4hZvdOkwc8F6za1oH6ekmqE1GgprnoS3EOGpTeiA4WmEyBBoX I6PQlvl6zHu3XpQ5oO0UposRiwiWgJFekAtRhxE= X-Google-Smtp-Source: APXvYqz3JtNnU9li+L2zDajpvO4DW2QCJ6o3qxIvti3TPWVaS8TQqeIHc6DCDFHR1hqjqkZ8dDNw5imFGpUBRxa1Ia8= X-Received: by
2002:a19:5218:: with SMTP id m24mr26794394lfb.109.1560289614662; Tue, 11 Jun 2019 14:46:54 -0700 (PDT) MIME-Version: 1.0 References: <20190611203018.GC75280@kib.kiev.ua> In-Reply-To: <20190611203018.GC75280@kib.kiev.ua> From: Alan Somers Date: Tue, 11 Jun 2019 15:46:42 -0600 Message-ID: Subject: Re: panic: vm_fault_hold: fault on nofault entry in fusefs To: Konstantin Belousov Cc: FreeBSD Hackers Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 3379C81BFF X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-6.95 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; NEURAL_HAM_SHORT(-0.95)[-0.952,0]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; REPLY(-4.00)[] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Jun 2019 21:47:03 -0000 On Tue, Jun 11, 2019 at 2:30 PM Konstantin Belousov wrote: > > On Tue, Jun 11, 2019 at 02:12:22PM -0600, Alan Somers wrote: > > Can somebody please help me to debug a fusefs problem? I have a 100% > > reproducible panic with the above message. Evidentially there's > > something I don't know about buf(9) and uiomove(9). The good news is > > that the panic is sufficiently reproducible and sufficiently > > instrumented that I know exactly what's happening; I just don't know > > why. Here's a summary of what happens. > > > > 1) fusefs's VOP_WRITE method gets called with a buffer that spans a > > logical block boundary, but does not extend the size of the file. > > 2) It splits the write into two parts. Each one calls getblk to > > allocate a struct buf, fills in the old data with a read, and fills > > the new data with uiomove. > > 3) After the file gets close()ed, VOP_INACTIVE calls vn_fsync_buf to > > flush dirty buffers. > > 4) VOP_STRATEGY successfully writes the first buffer and frees it with > > bufdone(). 
> > 5) VOP_STRATEGY tries to write the second buffer, but panics during > > uiomove. The address that caused the panic is always exactly 4KB into > > the buffer. > > > > So what am I doing wrong? The address that causes the panic in step 5 > > was successfully accessed in step 2, so this isn't some kind of buffer > > overrun. Does it have something to do with the fact that the read > > operation in step 2 called bufdone()? Seems unlikely because it did > > that for both buffers, yet only the second one panics. Or does the > > address actually fault during both VOP_WRITE and VOP_STRATEGY, but > > something low down handles the fault in the first case? I'd be > > grateful for any help that anyone can offer. > > -Alan > > > > P.S. > > Here's the panic's stack > > panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe0004591000 > > cpuid = 1 > > time = 1560283621 > > KDB: stack backtrace: > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0031c21f80 > > vpanic() at vpanic+0x19d/frame 0xfffffe0031c21fd0 > > panic() at panic+0x43/frame 0xfffffe0031c22030 > > vm_fault_hold() at vm_fault_hold+0x2064/frame 0xfffffe0031c22170 > > vm_fault() at vm_fault+0x60/frame 0xfffffe0031c221b0 > > trap_pfault() at trap_pfault+0x188/frame 0xfffffe0031c22200 > > trap() at trap+0x2b4/frame 0xfffffe0031c22310 > > calltrap() at calltrap+0x8/frame 0xfffffe0031c22310 > > --- trap 0xc, rip = 0xffffffff8108c9e6, rsp = 0xfffffe0031c223e0, rbp > > = 0xfffffe0031c223e0 --- > > memmove_erms() at memmove_erms+0x116/frame 0xfffffe0031c223e0 > > uiomove_faultflag() at uiomove_faultflag+0x146/frame 0xfffffe0031c22420 > > fuse_write_directbackend() at fuse_write_directbackend+0x1cd/frame > > 0xfffffe0031c224f0 > > fuse_io_strategy() at fuse_io_strategy+0x24d/frame 0xfffffe0031c22590 > > fuse_vnop_strategy() at fuse_vnop_strategy+0x2a/frame 0xfffffe0031c225a0 > > VOP_STRATEGY_APV() at VOP_STRATEGY_APV+0x63/frame 0xfffffe0031c225c0 > > bufstrategy() at bufstrategy+0x44/frame 
0xfffffe0031c225f0
> > bufwrite() at bufwrite+0x259/frame 0xfffffe0031c22640
> > vn_fsync_buf() at vn_fsync_buf+0x23e/frame 0xfffffe0031c226a0
> > fuse_vnop_inactive() at fuse_vnop_inactive+0x7e/frame 0xfffffe0031c226e0
> > VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x63/frame 0xfffffe0031c22700
> > vinactive() at vinactive+0xcd/frame 0xfffffe0031c22750
> > vputx() at vputx+0x2d0/frame 0xfffffe0031c227b0
> > vn_close1() at vn_close1+0x116/frame 0xfffffe0031c22820
> > vn_closefile() at vn_closefile+0x4c/frame 0xfffffe0031c228a0
> > _fdrop() at _fdrop+0x1a/frame 0xfffffe0031c228c0
> > closef() at closef+0x1ec/frame 0xfffffe0031c22950
> > closefp() at closefp+0x9c/frame 0xfffffe0031c22990
> > amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe0031c22ab0
> > fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0031c22ab0
> > --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8006842ba, rsp =
> > 0x7fffffffe748, rbp = 0x7fffffffe760 ---
> > KDB: enter: panic
> Start with dumping core. Then print out the struct buf and show it.

Thanks for the tip. I think I've figured it out: after VOP_WRITE but
before VOP_INACTIVE, a VOP_SETATTR was truncating the file. And a legacy
of fuse_io.c's origins as a copy/paste of the NFS client is that it has
two different ways to track the valid region of a buf.
-Alan From owner-freebsd-hackers@freebsd.org Wed Jun 12 02:55:43 2019 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 62E6B15CD10F for ; Wed, 12 Jun 2019 02:55:43 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic306-3.consmr.mail.bf2.yahoo.com (sonic306-3.consmr.mail.bf2.yahoo.com [74.6.132.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BDDB48A353 for ; Wed, 12 Jun 2019 02:55:41 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: ujxxOJkVM1klOzCC4HbQGNAx4wjwMjJpIAuHGeZgunXOeQiCknDIO7eX0VJJ9h4 vHwnyMU4w24JZAaFS8J8gsggPHcSAw1nfQFYAo1gthUJWX1IV4ygWKqOJP5GG6pYpTngSpnBQ.2F t_J.gOXsajBR.h8qNpPj5atURXZHtEkYcdH7V8VxqyqUA4zU1qN1c5hLV.I4AvGS8eBnWm5NbPml P.sOKWaBum4o5oUy2HGX64fWPt2HM_mR_eW900fDvtx6KIQ2KI0HNm0LHBTY0K0ixSdDTTIXZVPv cX5ifKAmihZir8hoV6wjNATPBaJ5X2zTVZpvyQrATvNu1Vgb9A.MO0QvRjBMUCPjdECIka39tmyi PIEfLmzKaVmWLSCOqG2eyVQ9YMpmfJRYrEOeHNJ.3jQNvQNhfRmx9Jt4tJl7IYdvcnhzLcidOTXl yg2TLg_ZzIVzZ3GSqsLtDQ0fCb_yUHinxTqeeBL9ymcg4xE1toKLP_vLhDpPV8QZoQn4P0j5ymB4 pfnMz1nR1gOgNd9I0vTgZmehMEG6_wVzKSXsFwXJA7BojAX1_u5Glrbkbj1ZWNv4ImjGyvmfcUeQ 2rBaEEuEM3kLiliTCfUXEU66r_m5uPZs.jKxthcQqrbm7sHH812xnvYq7QWnc.kz6Yne7EZtVt1m pNI83nM4aqOSfSNAl0Nu8PsyG4pUzDPIm05Vd0PfLPX7ugEAi9Qrbh2WbTc_nOffXSp4or4Ra9d0 i3A4R6aVY3FLadXD9zoEur3BeYGgBXekP6A3G6TWdJJdJoWvwjUT2sDoaNxBprSwez1CpUFA9fYm v3inTeA4WWdcpG_pHwo4e2fIZdDdt.6o1IJzoSB4HnkdeDEuKVKiUjtn72TY3w1amUJyQUU38a5h cZZl21v7Y0tnAykY8B9WcX7Zzoto6ThwyhDjmTbFBynMOlBi54nfuD49bZKfSwlOFVJhEIa3YUSb QRhMJE1FJm5ZQ7FdQgxUAcu_ZeVD84bGm5y59vLeH3GWmZ7ff4M.RIzqSm4VWKS7n1_Ft9kJDpGO LV2H2MnZ98n.Qul71PwEeWy_Po14V0R_wYjsJ0ghCEr7bp9pq0eqpZ1HW7eB8AVmaPvVBvPHJx0H avsKW_mE7UZdNtqjSWkBpHZqPkyVcIMWUWlyHDNw7pU8EnJ8_TzYM1zD3hA03vxTSLNUf_Jv.KCK 
cho2WGLoIYyDBZTXLhIn8k1cL0YqxKTivOyYxhWWkmZlCFXRSt9Nqy6Jp0b84qFGPJHHwsKc3QQ- - Received: from sonic.gate.mail.ne1.yahoo.com by sonic306.consmr.mail.bf2.yahoo.com with HTTP; Wed, 12 Jun 2019 02:55:34 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.115]) ([67.170.167.181]) by smtp429.mail.bf1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID d8f426307aa9de76d78aa62e4bebd608; Wed, 12 Jun 2019 02:55:33 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) Subject: Re: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ? From: Mark Millard In-Reply-To: <48148449-93B0-446C-AA28-F211FFAE1A8B@yahoo.com> Date: Tue, 11 Jun 2019 19:55:31 -0700 Cc: Alfredo Dal Ava Junior , Justin Hibbits Content-Transfer-Encoding: quoted-printable Message-Id: <86F7C4C4-2BB6-40F0-B5D3-C80ECB4A97CF@yahoo.com> References: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com> <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com> <47E002B7-D4A1-4C4B-BFFD-D926263D895E@yahoo.com> <48148449-93B0-446C-AA28-F211FFAE1A8B@yahoo.com> To: FreeBSD Hackers , freeBSD PowerPC ML X-Mailer: Apple Mail (2.3445.104.11) X-Rspamd-Queue-Id: BDDB48A353 X-Spamd-Bar: +++ X-Spamd-Result: default: False [3.94 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; SUBJECT_ENDS_QUESTION(1.00)[]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:26101, ipnet:74.6.128.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; NEURAL_SPAM_SHORT(0.91)[0.907,0]; MIME_GOOD(-0.10)[text/plain]; 
IP_SCORE(1.37)[ip: (4.15), ipnet: 74.6.128.0/21(1.52), asn: 26101(1.22), country: US(-0.06)]; NEURAL_SPAM_MEDIUM(0.28)[0.279,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.90)[0.896,0]; RCVD_IN_DNSWL_NONE(0.00)[42.132.6.74.list.dnswl.org : 127.0.5.0]; RCVD_TLS_LAST(0.00)[]
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 12 Jun 2019 02:55:43 -0000

[Using a debug.minidump=0 style dump, I have confirmed that .sbss is
not being zeroed out and that environ thereby starts out non-zero
(garbage).]

> On 2019-Jun-10, at 16:19, Mark Millard wrote:
>
> [Forcing an appropriate large .sbss alignment was not enough
> to avoid the clang-based problem for *sp++ related environ
> code in _init_tls .]
>
> On 2019-Jun-10, at 12:20, Mark Millard wrote:
>
>> [I decided to compare some readelf information from some
>> other architectures. I was surprised by some of it. But
>> .bss seems to be forced to start with a large alignment
>> to avoid such issues as I originally traced.]
>>
>> On 2019-Jun-10, at 11:24, Mark Millard wrote:
>>
>>> [Looks like Conrad M. is partially confirming my trace of the
>>> issue is reasonable.]
>>>
>>> On 2019-Jun-10, at 07:37, Conrad Meyer wrote:
>>>
>>>> Hi Mark,
>>>>
>>>> On Sun, Jun 9, 2019 at 11:17 PM Mark Millard via freebsd-hackers
>>>> wrote:
>>>>> ...
>>>>> vm_pager_get_pages uses vm_page_zero_invalid
>>>>> to "Zero out partially filled data".
>>>>>
>>>>> But vm_page_zero_invalid does not zero every "invalid"
>>>>> byte but works in terms of units of DEV_BSIZE :
>>>>> ...
>>>>> The comment indicates that areas of "sub-DEV_BSIZE"
>>>>> should have been handled previously by
>>>>> vm_page_set_validclean .
>>>>
>>>> Or another VM routine, yes (e.g., vm_page_set_valid_range). The valid
>>>> and dirty bitmasks in vm_page only have a single bit per DEV_BSIZE
>>>> region, so care must be taken when marking any sub-DEV_BSIZE region as
>>>> valid to zero out the rest of the DEV_BSIZE region. This is part of
>>>> the VM page contract. I'm not sure it's related to the BSS, though.
>>>
>>> Yea, I had written from what I'd seen in __elfN(load_section):
>>>
>>> QUOTE
>>> __elfN(load_section) uses vm_imgact_map_page
>>> to set up for its copyout. This appears to be
>>> how the FileSiz (not including .sbss or .bss)
>>> vs. MemSiz (including .sbss and .bss) is
>>> handled (attempted?).
>>> END QUOTE
>>>
>>> The copyout only copies through the last byte for filesz
>>> but the vm_imgact_map_page does not zero out all the
>>> bytes after that on that page:
>>>
>>>     /*
>>>      * We have to get the remaining bit of the file into the first part
>>>      * of the oversized map segment. This is normally because the .data
>>>      * segment in the file is extended to provide bss. It's a neat idea
>>>      * to try and save a page, but it's a pain in the behind to implement.
>>>      */
>>>     copy_len = filsz == 0 ? 0 : (offset + filsz) - trunc_page(offset +
>>>         filsz);
>>>     map_addr = trunc_page((vm_offset_t)vmaddr + filsz);
>>>     map_len = round_page((vm_offset_t)vmaddr + memsz) - map_addr;
>>> . . .
>>>     if (copy_len != 0) {
>>>         sf = vm_imgact_map_page(object, offset + filsz);
>>>         if (sf == NULL)
>>>             return (EIO);
>>>
>>>         /* send the page fragment to user space */
>>>         off = trunc_page(offset + filsz) - trunc_page(offset + filsz);
>>>         error = copyout((caddr_t)sf_buf_kva(sf) + off,
>>>             (caddr_t)map_addr, copy_len);
>>>         vm_imgact_unmap_page(sf);
>>>         if (error != 0)
>>>             return (error);
>>>     }
>>>
>>> I looked into the details of the DEV_BSIZE code after sending
>>> the original message and so realized that my provided example
>>> /sbin/init readelf material was a good example of the issue
>>> if I'd not missed something.
>>>
>>>>> So, if, say, char**environ ends up at the start of .sbss
>>>>> consistently, does environ always end up zeroed independently
>>>>> of FileSz for the PT_LOAD that spans them?
>>>>
>>>> It is required to be zeroed, yes. If not, there is a bug. If FileSz
>>>> covers BSS, that's a bug in the linker. Either the trailing bytes of
>>>> the corresponding page in the executable should be zero (wasteful; on
>>>> amd64 ".comment" is packed in there instead), or the linker/loader
>>>> must zero them at initialization. I'm not familiar with the
>>>> particular details here, but if you are interested I would suggest
>>>> looking at __elfN(load_section) in sys/kern/imgact_elf.c.
>>>
>>> I had looked at it some, see the material around the earlier quote
>>> above.
>>>
>>>>> The following is not necessarily an example of problematical
>>>>> figures but is just for showing an example structure of what
>>>>> FileSiz covers vs. MemSiz for PT_LOAD's that involve .sbss
>>>>> and .bss :
>>>>> ...
>>>>
>>>> Your 2nd LOAD phdr's FileSiz matches up exactly with Segment .sbss
>>>> Offset minus Segment .tdata Offset, i.e., none of the FileSiz
>>>> corresponds to the (s)bss regions. (Good! At least the static linker
>>>> part looks sane.) That said, the boundary is not page-aligned and the
>>>> section alignment requirement is much lower than page_size, so the
>>>> beginning of bss will share a file page with some data. Something
>>>> should zero it at image activation.
>>>
>>> And, so far, I've not found anything in _start or before that does
>>> zero any "sub-DEV_BSIZE" part after FileSz for the PT_LOAD in
>>> question.
>>>
>>> Thanks for checking my trace of the issue. It is good to have some
>>> confirmation that I'd not missed something.
>>>
>>>> (Tangent: sbss/bss probably do not need to be RWE on PPC!
On amd64,
>>>> init has three LOAD segments rather than two: one for rodata (R), one
>>>> for .text, .init, etc (RX); and one for .data (RW).)
>>>
>>> Yea, the section header flags indicate just WA for .sbss and .bss (but
>>> WAX for .got).
>>>
>>> But such is more general: for example, the beginning of .rodata
>>> (not executable) shares the tail part of a page with .fini
>>> (executable) in the example. .got has executable code but is in
>>> the middle of sections that do not. For something like /sbin/init it
>>> is so small that the middle of a page can be the only part that is
>>> executable, as in the example. (It is not forced onto its own page.)
>>>
>>> The form of .got used is also writable: WAX for section header flags.
>>
>>
>>
>> amd64's /sbin/init :
>>
>> There are 9 program headers, starting at offset 64
>>
>> Program Headers:
>>   Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
>>   PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x0001f8 0x0001f8 R   0x8
>>   LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x039e94 0x039e94 R   0x1000
>>   LOAD           0x03a000 0x000000000023a000 0x000000000023a000 0x0e8e40 0x0e8e40 R E 0x1000
>>   LOAD           0x123000 0x0000000000323000 0x0000000000323000 0x005848 0x2381d9 RW  0x1000
>>   TLS            0x127000 0x0000000000327000 0x0000000000327000 0x001800 0x001820 R   0x10
>>   GNU_RELRO      0x127000 0x0000000000327000 0x0000000000327000 0x001848 0x001848 R   0x1
>>   GNU_EH_FRAME   0x01b270 0x000000000021b270 0x000000000021b270 0x00504c 0x00504c R   0x4
>>   GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
>>   NOTE           0x000238 0x0000000000200238 0x0000000000200238 0x000048 0x000048 R   0x4
>>
>> Section to Segment mapping:
>>   Segment Sections...
>>    00
>>    01     .note.tag .rela.plt .rodata .eh_frame_hdr .eh_frame
>>    02     .text .init .fini .plt
>>    03     .data .got.plt .tdata .tbss .ctors .dtors .jcr .init_array .fini_array .bss
>>    04     .tdata .tbss
>>    05     .tdata .tbss .ctors .dtors .jcr .init_array .fini_array
>>    06     .eh_frame_hdr
>>    07
>>    08     .note.tag
>> There are 27 section headers, starting at offset 0x157938:
>>
>> Section Headers:
>>   [Nr] Name              Type       Addr             Off    Size   ES Flg Lk  Inf  Al
>>   [ 0]                   NULL       0000000000000000 000000 000000 00      0    0   0
>>   [ 1] .note.tag         NOTE       0000000000200238 000238 000048 00   A  0    0   4
>>   [ 2] .rela.plt         RELA       0000000000200280 000280 000030 18  AI  0   11   8
>>   [ 3] .rodata           PROGBITS   00000000002002c0 0002c0 01afb0 00 AMS  0    0  64
>>   [ 4] .eh_frame_hdr     PROGBITS   000000000021b270 01b270 00504c 00   A  0    0   4
>>   [ 5] .eh_frame         PROGBITS   00000000002202c0 0202c0 019bd4 00   A  0    0   8
>>   [ 6] .text             PROGBITS   000000000023a000 03a000 0e8dfc 00  AX  0    0  16
>>   [ 7] .init             PROGBITS   0000000000322dfc 122dfc 00000e 00  AX  0    0   4
>>   [ 8] .fini             PROGBITS   0000000000322e0c 122e0c 00000e 00  AX  0    0   4
>>   [ 9] .plt              PROGBITS   0000000000322e20 122e20 000020 00  AX  0    0  16
>>   [10] .data             PROGBITS   0000000000323000 123000 003a80 00  WA  0    0  16
>>   [11] .got.plt          PROGBITS   0000000000326a80 126a80 000010 00  WA  0    0   8
>>   [12] .tdata            PROGBITS   0000000000327000 127000 001800 00 WAT  0    0  16
>>   [13] .tbss             NOBITS     0000000000328800 128800 000020 00 WAT  0    0   8
>>   [14] .ctors            PROGBITS   0000000000328800 128800 000010 00  WA  0    0   8
>>   [15] .dtors            PROGBITS   0000000000328810 128810 000010 00  WA  0    0   8
>>   [16] .jcr              PROGBITS   0000000000328820 128820 000008 00  WA  0    0   8
>>   [17] .init_array       INIT_ARRAY 0000000000328828 128828 000018 00  WA  0    0   8
>>   [18] .fini_array       FINI_ARRAY 0000000000328840 128840 000008 00  WA  0    0   8
>>   [19] .bss              NOBITS     0000000000329000 128848 2321d9 00  WA  0    0  64
>>   [20] .comment          PROGBITS   0000000000000000 128848 0074d4 01  MS  0    0   1
>>   [21] .gnu.warning.mkte PROGBITS   0000000000000000 12fd1c 000043 00      0    0   1
>>   [22] .gnu.warning.f_pr PROGBITS   0000000000000000 12fd5f 000043 00      0    0   1
>>   [23] .gnu_debuglink    PROGBITS   0000000000000000 1478b0 000010 00      0    0   1
>>   [24] .shstrtab         STRTAB     0000000000000000 1478c0 0000f1 00      0    0   1
>>   [25] .symtab           SYMTAB     0000000000000000 12fda8 017b08 18     26 1707   8
>>   [26] .strtab           STRTAB     0000000000000000 1479b1 00ff84 00      0    0   1
>>
>> Note that there is space after .fini_array+8 before .bss starts
>> with a sizable alignment. The MemSiz for 03 does span .bss .
>>
>> armv7's /sbin/init is different about MemSiz spanning .bss:
>>
>> Program Headers:
>>   Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
>>   PHDR           0x000034 0x00010034 0x00010034 0x00120 0x00120 R   0x4
>>   LOAD           0x000000 0x00010000 0x00010000 0x10674 0x10674 R   0x1000
>>   LOAD           0x011000 0x00021000 0x00021000 0xe9c54 0xe9c54 R E 0x1000
>>   LOAD           0x0fb000 0x0010b000 0x0010b000 0x03b88 0x30ccd RW  0x1000
>>   TLS            0x0fe000 0x0010e000 0x0010e000 0x00b60 0x00b70 R   0x20
>>   GNU_RELRO      0x0fe000 0x0010e000 0x0010e000 0x00b88 0x00b88 R   0x1
>>   GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0
>>   NOTE           0x000154 0x00010154 0x00010154 0x00064 0x00064 R   0x4
>>   ARM_EXIDX      0x0001b8 0x000101b8 0x000101b8 0x00220 0x00220 R   0x4
>>
>> (NOTE: 0x0010b000+0x30ccd==0x13BCCD . Compare this to the later .bss
>> Addr of 0x10f000.)
>>
>> Section to Segment mapping:
>>   Segment Sections...
>>    00
>>    01     .note.tag .ARM.exidx .rodata .ARM.extab
>>    02     .text .init .fini
>>    03     .data .tdata .tbss .jcr .init_array .fini_array .got .bss
>>    04     .tdata .tbss
>>    05     .tdata .tbss .jcr .init_array .fini_array .got
>>    06
>>    07     .note.tag
>>    08     .ARM.exidx
>> There are 24 section headers, starting at offset 0x12be3c:
>>
>> Section Headers:
>>   [Nr] Name              Type           Addr     Off    Size   ES Flg Lk  Inf  Al
>>   [ 0]                   NULL           00000000 000000 000000 00      0    0   0
>>   [ 1] .note.tag         NOTE           00010154 000154 000064 00   A  0    0   4
>>   [ 2] .ARM.exidx        ARM_EXIDX      000101b8 0001b8 000220 00   A  5    0   4
>>   [ 3] .rodata           PROGBITS       00010400 000400 01022c 00 AMS  0    0  64
>>   [ 4] .ARM.extab        PROGBITS       0002062c 01062c 000048 00   A  0    0   4
>>   [ 5] .text             PROGBITS       00021000 011000 0e9c14 00  AX  0    0 128
>>   [ 6] .init             PROGBITS       0010ac20 0fac20 000014 00  AX  0    0  16
>>   [ 7] .fini             PROGBITS       0010ac40 0fac40 000014 00  AX  0    0  16
>>   [ 8] .data             PROGBITS       0010b000 0fb000 002734 00  WA  0    0   8
>>   [ 9] .tdata            PROGBITS       0010e000 0fe000 000b60 00 WAT  0    0  16
>>   [10] .tbss             NOBITS         0010eb60 0feb60 000010 00 WAT  0    0   4
>>   [11] .jcr              PROGBITS       0010eb60 0feb60 000000 00  WA  0    0   4
>>   [12] .init_array       INIT_ARRAY     0010eb60 0feb60 000008 00  WA  0    0   4
>>   [13] .fini_array       FINI_ARRAY     0010eb68 0feb68 000004 00  WA  0    0   4
>>   [14] .got              PROGBITS       0010eb6c 0feb6c 00001c 00  WA  0    0   4
>>   [15] .bss              NOBITS         0010f000 0feb88 02cccd 00  WA  0    0  64
>>   [16] .comment          PROGBITS       00000000 0feb88 0074b6 01  MS  0    0   1
>>   [17] .ARM.attributes   ARM_ATTRIBUTES 00000000 10603e 00004f 00      0    0   1
>>   [18] .gnu.warning.mkte PROGBITS       00000000 10608d 000043 00      0    0   1
>>   [19] .gnu.warning.f_pr PROGBITS       00000000 1060d0 000043 00      0    0   1
>>   [20] .gnu_debuglink    PROGBITS       00000000 11b314 000010 00      0    0   1
>>   [21] .shstrtab         STRTAB         00000000 11b324 0000e3 00      0    0   1
>>   [22] .symtab           SYMTAB         00000000 106114 015200 10     23 3063   4
>>   [23] .strtab           STRTAB         00000000 11b407 010a32 00      0    0   1
>>
>> Note that there is space after .got+0x1c before .bss starts
>> with a sizable
alignment. The MemSiz for 03 does *not* span >> .bss , unlike for amd64 (and the rest). >>=20 >>=20 >> aarch64's /sbin/init is similar to amd64 instead of armv7: >>=20 >> Program Headers: >> Type Offset VirtAddr PhysAddr FileSiz = MemSiz Flg Align >> PHDR 0x000040 0x0000000000200040 0x0000000000200040 = 0x0001c0 0x0001c0 R 0x8 >> LOAD 0x000000 0x0000000000200000 0x0000000000200000 = 0x01624f 0x01624f R 0x10000 >> LOAD 0x020000 0x0000000000220000 0x0000000000220000 = 0x0dd354 0x0dd354 R E 0x10000 >> LOAD 0x100000 0x0000000000300000 0x0000000000300000 = 0x011840 0x252111 RW 0x10000 >> TLS 0x110000 0x0000000000310000 0x0000000000310000 = 0x001800 0x001820 R 0x40 >> GNU_RELRO 0x110000 0x0000000000310000 0x0000000000310000 = 0x001840 0x001840 R 0x1 >> GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 = 0x000000 0x000000 RW 0 >> NOTE 0x000200 0x0000000000200200 0x0000000000200200 = 0x000048 0x000048 R 0x4 >>=20 >> Section to Segment mapping: >> Segment Sections... >> 00 =20 >> 01 .note.tag .rodata=20 >> 02 .text .init .fini=20 >> 03 .data .tdata .tbss .jcr .init_array .fini_array .got .bss=20 >> 04 .tdata .tbss=20 >> 05 .tdata .tbss .jcr .init_array .fini_array .got=20 >> 06 =20 >> 07 .note.tag=20 >> There are 21 section headers, starting at offset 0x14b6f0: >>=20 >> Section Headers: >> [Nr] Name Type Addr Off Size = ES Flg Lk Inf Al >> [ 0] NULL 0000000000000000 000000 000000 = 00 0 0 0 >> [ 1] .note.tag NOTE 0000000000200200 000200 000048 = 00 A 0 0 4 >> [ 2] .rodata PROGBITS 0000000000200280 000280 015fcf = 00 AMS 0 0 64 >> [ 3] .text PROGBITS 0000000000220000 020000 0dd31c = 00 AX 0 0 64 >> [ 4] .init PROGBITS 00000000002fd320 0fd320 000014 = 00 AX 0 0 16 >> [ 5] .fini PROGBITS 00000000002fd340 0fd340 000014 = 00 AX 0 0 16 >> [ 6] .data PROGBITS 0000000000300000 100000 003a20 = 00 WA 0 0 16 >> [ 7] .tdata PROGBITS 0000000000310000 110000 001800 = 00 WAT 0 0 16 >> [ 8] .tbss NOBITS 0000000000311800 111800 000020 = 00 WAT 0 0 8 >> [ 9] .jcr PROGBITS 
0000000000311800 111800 000000 = 00 WA 0 0 8 >> [10] .init_array INIT_ARRAY 0000000000311800 111800 000018 = 00 WA 0 0 8 >> [11] .fini_array FINI_ARRAY 0000000000311818 111818 000008 = 00 WA 0 0 8 >> [12] .got PROGBITS 0000000000311820 111820 000020 = 00 WA 0 0 8 >> [13] .bss NOBITS 0000000000320000 111840 232111 = 00 WA 0 0 64 >> [14] .comment PROGBITS 0000000000000000 111840 007191 = 01 MS 0 0 1 >> [15] .gnu.warning.mkte PROGBITS 0000000000000000 1189d1 000043 = 00 0 0 1 >> [16] .gnu.warning.f_pr PROGBITS 0000000000000000 118a14 000043 = 00 0 0 1 >> [17] .gnu_debuglink PROGBITS 0000000000000000 13b7f8 000010 = 00 0 0 1 >> [18] .shstrtab STRTAB 0000000000000000 13b808 0000bd = 00 0 0 1 >> [19] .symtab SYMTAB 0000000000000000 118a58 022da0 = 18 20 3621 8 >> [20] .strtab STRTAB 0000000000000000 13b8c5 00fe2b = 00 0 0 1 >>=20 >> Note that there is space after .got+0x20 before .bss starts >> with a sizable alignment. The MemSiz for 03 does span >> .bss , like for amd64 (and all but armv7). >>=20 >> powerpc64's /sbin/init is similar to amd64 as well: >>=20 >> Program Headers: >> Type Offset VirtAddr PhysAddr FileSiz = MemSiz Flg Align >> PHDR 0x000040 0x0000000000200040 0x0000000000200040 = 0x0001f8 0x0001f8 R 0x8 >> LOAD 0x000000 0x0000000000200000 0x0000000000200000 = 0x039e94 0x039e94 R 0x1000 >> LOAD 0x03a000 0x000000000023a000 0x000000000023a000 = 0x0e8e40 0x0e8e40 R E 0x1000 >> LOAD 0x123000 0x0000000000323000 0x0000000000323000 = 0x005848 0x2381d9 RW 0x1000 >> TLS 0x127000 0x0000000000327000 0x0000000000327000 = 0x001800 0x001820 R 0x10 >> GNU_RELRO 0x127000 0x0000000000327000 0x0000000000327000 = 0x001848 0x001848 R 0x1 >> GNU_EH_FRAME 0x01b270 0x000000000021b270 0x000000000021b270 = 0x00504c 0x00504c R 0x4 >> GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 = 0x000000 0x000000 RW 0 >> NOTE 0x000238 0x0000000000200238 0x0000000000200238 = 0x000048 0x000048 R 0x4 >>=20 >> Section to Segment mapping: >> Segment Sections... 
>> 00 =20 >> 01 .note.tag .rela.plt .rodata .eh_frame_hdr .eh_frame=20 >> 02 .text .init .fini .plt=20 >> 03 .data .got.plt .tdata .tbss .ctors .dtors .jcr .init_array = .fini_array .bss=20 >> 04 .tdata .tbss=20 >> 05 .tdata .tbss .ctors .dtors .jcr .init_array .fini_array=20 >> 06 .eh_frame_hdr=20 >> 07 =20 >> 08 .note.tag=20 >> There are 27 section headers, starting at offset 0x157938: >>=20 >> Section Headers: >> [Nr] Name Type Addr Off Size = ES Flg Lk Inf Al >> [ 0] NULL 0000000000000000 000000 000000 = 00 0 0 0 >> [ 1] .note.tag NOTE 0000000000200238 000238 000048 = 00 A 0 0 4 >> [ 2] .rela.plt RELA 0000000000200280 000280 000030 = 18 AI 0 11 8 >> [ 3] .rodata PROGBITS 00000000002002c0 0002c0 01afb0 = 00 AMS 0 0 64 >> [ 4] .eh_frame_hdr PROGBITS 000000000021b270 01b270 00504c = 00 A 0 0 4 >> [ 5] .eh_frame PROGBITS 00000000002202c0 0202c0 019bd4 = 00 A 0 0 8 >> [ 6] .text PROGBITS 000000000023a000 03a000 0e8dfc = 00 AX 0 0 16 >> [ 7] .init PROGBITS 0000000000322dfc 122dfc 00000e = 00 AX 0 0 4 >> [ 8] .fini PROGBITS 0000000000322e0c 122e0c 00000e = 00 AX 0 0 4 >> [ 9] .plt PROGBITS 0000000000322e20 122e20 000020 = 00 AX 0 0 16 >> [10] .data PROGBITS 0000000000323000 123000 003a80 = 00 WA 0 0 16 >> [11] .got.plt PROGBITS 0000000000326a80 126a80 000010 = 00 WA 0 0 8 >> [12] .tdata PROGBITS 0000000000327000 127000 001800 = 00 WAT 0 0 16 >> [13] .tbss NOBITS 0000000000328800 128800 000020 = 00 WAT 0 0 8 >> [14] .ctors PROGBITS 0000000000328800 128800 000010 = 00 WA 0 0 8 >> [15] .dtors PROGBITS 0000000000328810 128810 000010 = 00 WA 0 0 8 >> [16] .jcr PROGBITS 0000000000328820 128820 000008 = 00 WA 0 0 8 >> [17] .init_array INIT_ARRAY 0000000000328828 128828 000018 = 00 WA 0 0 8 >> [18] .fini_array FINI_ARRAY 0000000000328840 128840 000008 = 00 WA 0 0 8 >> [19] .bss NOBITS 0000000000329000 128848 2321d9 = 00 WA 0 0 64 >> [20] .comment PROGBITS 0000000000000000 128848 0074d4 = 01 MS 0 0 1 >> [21] .gnu.warning.mkte PROGBITS 0000000000000000 12fd1c 000043 = 00 0 0 1 
>>   [22] .gnu.warning.f_pr PROGBITS        0000000000000000 12fd5f 000043 00      0    0  1
>>   [23] .gnu_debuglink    PROGBITS        0000000000000000 1478b0 000010 00      0    0  1
>>   [24] .shstrtab         STRTAB          0000000000000000 1478c0 0000f1 00      0    0  1
>>   [25] .symtab           SYMTAB          0000000000000000 12fda8 017b08 18     26 1707  8
>>   [26] .strtab           STRTAB          0000000000000000 1479b1 00ff84 00      0    0  1
>>
>>
>> Note that there is space after .fini_array+8 before .bss starts
>> with a sizable alignment. The MemSiz for 03 does span
>> .bss , like for amd64 (and all but armv7).
>
> I temporarily forced my 32-bit powerpc /sbin/init to have:
>
> Section Headers:
>   [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
> . . .
>   [16] .got              PROGBITS        0193845c 12845c 000010 04 WAX  0   0  4
>   [17] .sbss             NOBITS          01939000 12846c 0000b0 00  WA  0   0  4
>   [18] .bss              NOBITS          019390c0 12846c 02cc48 00  WA  0   0 64
> . . .
>
> It was not enough to avoid the problems I've elsewhere
> reported for *sp++ getting SIGSEGV ( environ related
> activity in _init_tls ).

I used debug.minidump=0 in /boot/loader.conf for
causing a dump for the crash and a libkvm modified
enough for my working boot environment to allow me
to examine the memory-image bytes of such a dump,
with libkvm used via /usr/local/bin/kgdb . (No support
for automatically translating user-space addresses
or other such.)

For the clang based debug buildworld and debug buildkernel
context with /sbin/init having:

  [16] .got              PROGBITS        01956ccc 146ccc 000010 04 WAX  0   0  4
  [17] .sbss             NOBITS          01956cdc 146cdc 0000b0 00  WA  0   0  4
  [18] .bss              NOBITS          01956dc0 146cdc 02ee28 00  WA  0   0 64

I confirmed that .sbss in /sbin/init's address space
is not zeroed (so environ is not assigned by handle_argv ).
I also confirmed that _start was given a good env value
(in %r5) based on where the value was stored on the
stack. It is just that the value was not used.
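
As a minimal sketch of why a non-zeroed .sbss leads to the good env value being ignored: this assumes the startup code only adopts the env pointer when environ is still NULL. The names sketch_environ, sketch_handle_argv and startup_adopts_env are hypothetical stand-ins for illustration, not the actual csu code.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the global that normally lives in
 * .sbss/.bss and is expected to start out as NULL.
 */
static char **sketch_environ;

/*
 * Simplified sketch of the startup-time check (assumed shape):
 * the env pointer handed to _start is only adopted when the
 * global has not been set yet.
 */
static void
sketch_handle_argv(char **env)
{
	if (sketch_environ == NULL)
		sketch_environ = env;
}

/*
 * Returns 1 if the good env pointer ends up being used, given
 * the bytes that the global started out with; 0 otherwise.
 */
int
startup_adopts_env(char **initial_bytes, char **env)
{
	sketch_environ = initial_bytes;
	sketch_handle_argv(env);
	return (sketch_environ == env);
}
```

With a properly zeroed global ( initial_bytes == NULL ) the good env value is adopted. With any stale non-zero bytes there, the check is skipped and later code dereferences the garbage instead, which would match the failure point moving around with the garbage content.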
The detailed obvious-failure point (crash) can change based
on the garbage in the .sbss and, for the build that I used
this time, that happened in __je_arena_malloc_hard instead
of before _init_tls called _libc_allocate_tls . (I traced
the call chain in the dump.)


Based on what I've seen in the dump there seem to be special
uses of some values (that also have normal uses, of
course):

0xfa5005af: as yet invalid page content.
0x1c000020: as yet unassigned user-space-stack memory for /sbin/init.

These are the same locations that I previously reported as
showing up in the DSI read trap reports for /sbin/init failing.
The specific build here failed with a different value.

For reference relative to libkvm:

# svnlite diff /usr/src/lib/libkvm/
Index: /usr/src/lib/libkvm/kvm_powerpc.c
===================================================================
--- /usr/src/lib/libkvm/kvm_powerpc.c	(revision 347549)
+++ /usr/src/lib/libkvm/kvm_powerpc.c	(working copy)
@@ -211,6 +211,53 @@
 	if (be32toh(vm->ph->p_paddr) == 0xffffffff)
 		return ((int)powerpc_va2off(kd, va, ofs));
 
+	// HACK in something for what I observe in
+	// a debug.minidump=0 vmcore.* for 32-bit powerpc
+	//
+	if (  be32toh(vm->ph->p_vaddr) == 0xffffffff
+	   && be32toh(vm->ph->p_paddr) == 0
+	   && be16toh(vm->eh->e_phnum) == 1
+	   ) {
+		// Presumes p_memsz is either unsigned
+		// 32-bit or is 64-bit, same for va .
+
+		if (be32toh(vm->ph->p_memsz) <= va)
+			return 0; // Like powerpc_va2off
+
+		// If ofs was (signed) 32-bit there
+		// would be a problem for sufficiently
+		// large positive memsz's and va's
+		// near the end --because of p_offset
+		// and dmphdrsz causing overflow/wrapping
+		// for some large va values.
+		// Presumes 64-bit ofs for such cases.
+		// Also presumes dmphdrsz+p_offset
+		// is non-negative so that small
+		// non-negative va values have no
+		// problems with ofs going negative.
+
+		*ofs = vm->dmphdrsz
+		     + be32toh(vm->ph->p_offset)
+		     + va;
+
+		// The normal return value overflows/wraps
+		// for p_memsz == 0x80000000u when va == 0 .
+		// Avoid this by depending on calling code's
+		// loop for sufficiently large cases.
+		// This code presumes p_memsz/2 <= INT_MAX .
+		// 32-bit powerpc FreeBSD does not allow
+		// using more than 2 GiBytes of RAM but
+		// does allow using 2 GiBytes on 64-bit
+		// hardware.
+		//
+		if (  (int)be32toh(vm->ph->p_memsz) < 0
+		   && va < be32toh(vm->ph->p_memsz)/2
+		   )
+			return be32toh(vm->ph->p_memsz)/2;
+
+		return be32toh(vm->ph->p_memsz) - va;
+	}
+
 	_kvm_err(kd, kd->program, "Raw corefile not supported");
 	return (0);
 }
Index: /usr/src/lib/libkvm/kvm_private.c
===================================================================
--- /usr/src/lib/libkvm/kvm_private.c	(revision 347549)
+++ /usr/src/lib/libkvm/kvm_private.c	(working copy)
@@ -131,7 +131,9 @@
 {
 
 	return (kd->nlehdr.e_ident[EI_CLASS] == class &&
-	    kd->nlehdr.e_type == ET_EXEC &&
+	    (  kd->nlehdr.e_type == ET_EXEC ||
+	       kd->nlehdr.e_type == ET_DYN
+	    ) &&
 	    kd->nlehdr.e_machine == machine);
 }
 

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-hackers@freebsd.org Wed Jun 12 04:54:10 2019
Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5B20E15CED0D for ; Wed, 12 Jun 2019 04:54:10 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic301-20.consmr.mail.gq1.yahoo.com (sonic301-20.consmr.mail.gq1.yahoo.com [98.137.64.146]) (using TLSv1.2 with
cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CFC598D232 for ; Wed, 12 Jun 2019 04:54:08 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: H5DNLScVM1l5heb0xBT7HURupzJeie6g6OLNXYQIbXu..WCbtPOPqXWKgUFrH5x zsyraFkKd_R8XSEj3HIuMY2gF8cd6U4Qzf57pqTkzkFKZXb403y_JTUBN6L7.YIO7Z7j2WUBTtqs JxWAbMrnOaWnPfqli1NiGMXeqwkREzn1BK6GCS9sK_EzWzQLZtm7dYtlD6OcVCCJqwzJDIOMOZj2 o3svJKjNNEbYCn3joEv1l.GqRtVbfCXoy5DoXqUbzljWss3pNYStuOUUS73sMAheCSQ6TgMW_Bx1 lapP4bmeHNBqKdfyBEJuYKj5y6tGyE.7m3s1aRARO6x3WIRRCMoD1uQwChOwsTkkRTdjetMAhMzs 5ckZOpMyf8aMHEwqCFogC1m9ECYSBCoruSwPU8W4ru3Z3MbSE1T0FpR7HD51jEO0CVkF7acKF04g g_E1BDlweN4S2A.IbaNsHOwMDnM7CMyQaXqbgC5PArn4ax9dosmeJg3x36pE0OKW.liT.4DObvUn 5xblIatTL2C7CgTRgjhGUqC9n9bvARPUiypzdhmtwlooRNbBANHBCNV.QihNMSlDxDbhqgwkJCTQ jJH9d03_ztHA5Ak7ppiE4OEbXnXyzKOEGnnJLrSMYA7MlStiJ0aPX5YdLdDxRuTS0BoxHtlUrs.5 1Xvwr.ShY6HeuVMgV4Oouncavpkk_..xNj9QkB.v3t3wnv7vYXcplfqpwBnzTxXd8o_udJlDJEO6 390QH0IDI_1onyHCoPNWgA2MmUUhjF7hnVZkWThljqOiw4xyd6kHgTcLqz_O__fwuoaBGQ2ZAMyJ hyuIyk9H0HMzwO4Gg79p.HxLgyM916tGdDSxThq92qXfaJcD8Dd.Gd1NaJ8zevOUrZZ34T0f0OBO RjI4RJeuVwUN6n6rohJLCMY8bN.suVyIzgOjbE19wacQtMtT9Zxx2vxqX1EFFtc12r5wMOAeLcXy U1FtqLSuBQ8tsljjbSxPuctv6CpznSF3qvB9h_EV6dARVCENlH8rqqGYhiGwXRilywTfYQ48vuy7 WJuWR7iSzrTboU6WQxTlQW_cGA65YgpcC_KVae5fr.2ClssRmQoKicmyhZlNQ0q0jkcEJg_R.WTG Fncne7.2InJR_.HPZTEuakOQwkESW.Aj.LG7jBu7KjvulBsTOMFTv_2.kyhqcGYnL3ef7YYqp0mV DDjdRkeXeTq8cN_q0WgMiHczdPP44GVEfA7_5V5UIu35Cq1cGCrgn7uKkM3I- Received: from sonic.gate.mail.ne1.yahoo.com by sonic301.consmr.mail.gq1.yahoo.com with HTTP; Wed, 12 Jun 2019 04:54:00 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.115]) ([67.170.167.181]) by smtp416.mail.gq1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID dce79d6d04f209e2c5bbde054d31f817; Wed, 12 Jun 2019 04:53:56 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) Subject: Re: 
kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ? From: Mark Millard In-Reply-To: <86F7C4C4-2BB6-40F0-B5D3-C80ECB4A97CF@yahoo.com> Date: Tue, 11 Jun 2019 21:53:55 -0700 Cc: Alfredo Dal Ava Junior , Justin Hibbits Content-Transfer-Encoding: quoted-printable Message-Id: References: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com> <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com> <47E002B7-D4A1-4C4B-BFFD-D926263D895E@yahoo.com> <48148449-93B0-446C-AA28-F211FFAE1A8B@yahoo.com> <86F7C4C4-2BB6-40F0-B5D3-C80ECB4A97CF@yahoo.com> To: FreeBSD Toolchain , FreeBSD Hackers , freeBSD PowerPC ML X-Mailer: Apple Mail (2.3445.104.11) X-Rspamd-Queue-Id: CFC598D232 X-Spamd-Bar: ++++ X-Spamd-Result: default: False [4.34 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCPT_COUNT_FIVE(0.00)[5]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; SUBJECT_ENDS_QUESTION(1.00)[]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36647, ipnet:98.137.64.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; NEURAL_SPAM_SHORT(0.96)[0.958,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(1.49)[ip: (5.79), ipnet: 98.137.64.0/21(0.95), asn: 36647(0.76), country: US(-0.06)]; NEURAL_SPAM_MEDIUM(0.50)[0.503,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.90)[0.904,0]; RCVD_IN_DNSWL_NONE(0.00)[146.64.137.98.list.dnswl.org : 127.0.5.0]; RCVD_TLS_LAST(0.00)[] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Jun 2019 
04:54:10 -0000 [The garbage after .got up to the page boundary is .comment section strings. The context here is targeting 32-bit powerpc via system-clang-8 and devel/powerpc64-binutils for buildworld and buildkernel . ] On 2019-Jun-11, at 19:55, Mark Millard wrote: > [I have confirmed .sbss not being zero'd out and environ > thereby starting out non-zero (garbage): a > debug.minidump=3D0 style dump.] >=20 >> On 2019-Jun-10, at 16:19, Mark Millard wrote: >>=20 >> . . . (omitted) . . . >=20 > I used debug.minidump=3D0 in /boot/loader.conf for > cusing a dump for the crash and a libkvm modified > enough for my working boot environment to allow me > to examine the the memory-image bytes of such a dump, > with libkvm used via /usr/local/bin/kgdb . (No support > of automatically translating user-space addresses > or other such.) >=20 > For the clang based debug buildworld and debug buildkernel > context with /sbin/init having: >=20 > [16] .got PROGBITS 01956ccc 146ccc 000010 04 WAX = 0 0 4 > [17] .sbss NOBITS 01956cdc 146cdc 0000b0 00 WA = 0 0 4 > [18] .bss NOBITS 01956dc0 146cdc 02ee28 00 WA = 0 0 64 >=20 > I confirmed that .sbss in /sbin/init's address space > is not zeroed (so environ is not assigned by handle_argv ). > I also confirmed that _start was given a good env value > (in %r5) based on where the value was stored on the > stack. It is just that the value was not used. >=20 > The detailed obvious-failure point (crash) can change based > on the garbage in the .sbss and, for the build that I used > this time, that happened in __je_arean_malloc_hard instead > of before _init_tls called _libc_allocate_tls . (I traced > the call chain in the dump.) >=20 >=20 > =46rom what I've seen in the dump there seem to be special > uses of some values (that also have normal uses, of > course): >=20 > 0xfa5005af: as yet invalid page content. > 0x1c000020: as yet unassigned user-space-stack memory for /sbin/init. 
>=20 > These are the same locations that I previously reported as > showing up in the DSI read trap reports for /sbin/init failing. > The specific build here failed with a different value. >=20 > For reference relative to libkvm: >=20 > # svnlite diff /usr/src/lib/libkvm/ > Index: /usr/src/lib/libkvm/kvm_powerpc.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- /usr/src/lib/libkvm/kvm_powerpc.c (revision 347549) > +++ /usr/src/lib/libkvm/kvm_powerpc.c (working copy) > @@ -211,6 +211,53 @@ > if (be32toh(vm->ph->p_paddr) =3D=3D 0xffffffff) > return ((int)powerpc_va2off(kd, va, ofs)); >=20 > + // HACK in something for what I observe in > + // a debug.minidump=3D0 vmcore.* for 32-bit powerpc > + // > + if ( be32toh(vm->ph->p_vaddr) =3D=3D 0xffffffff > + && be32toh(vm->ph->p_paddr) =3D=3D 0 > + && be16toh(vm->eh->e_phnum) =3D=3D 1 > + ) { > + // Presumes p_memsz is either unsigned > + // 32-bit or is 64-bit, same for va . > + > + if (be32toh(vm->ph->p_memsz) <=3D va) > + return 0; // Like powerpc_va2off > + > + // If ofs was (signed) 32-bit there > + // would be a problem for sufficiently > + // large postive memsz's and va's > + // near the end --because of p_offset > + // and dmphdrsz causing overflow/wrapping > + // for some large va values. > + // Presumes 64-bit ofs for such cases. > + // Also presumes dmphdrsz+p_offset > + // is non-negative so that small > + // non-negative va values have no > + // problems with ofs going negative. > + > + *ofs =3D vm->dmphdrsz > + + be32toh(vm->ph->p_offset) > + + va; > + > + // The normal return value overflows/wraps > + // for p_memsz =3D=3D 0x80000000u when va =3D=3D 0 . > + // Avoid this by depending on calling code's > + // loop for sufficiently large cases. > + // This code presumes p_memsz/2 <=3D MAX_INT . 
> +		// 32-bit powerpc FreeBSD does not allow
> +		// using more than 2 GiBytes of RAM but
> +		// does allow using 2 GiBytes on 64-bit
> +		// hardware.
> +		//
> +		if (  (int)be32toh(vm->ph->p_memsz) < 0
> +		   && va < be32toh(vm->ph->p_memsz)/2
> +		   )
> +			return be32toh(vm->ph->p_memsz)/2;
> +
> +		return be32toh(vm->ph->p_memsz) - va;
> +	}
> +
>  	_kvm_err(kd, kd->program, "Raw corefile not supported");
>  	return (0);
>  }
> Index: /usr/src/lib/libkvm/kvm_private.c
> ===================================================================
> --- /usr/src/lib/libkvm/kvm_private.c	(revision 347549)
> +++ /usr/src/lib/libkvm/kvm_private.c	(working copy)
> @@ -131,7 +131,9 @@
>  {
>
>  	return (kd->nlehdr.e_ident[EI_CLASS] == class &&
> -	    kd->nlehdr.e_type == ET_EXEC &&
> +	    (  kd->nlehdr.e_type == ET_EXEC ||
> +	       kd->nlehdr.e_type == ET_DYN
> +	    ) &&
>  	    kd->nlehdr.e_machine == machine);
>  }
>
>
>

The following is what is in the .sbss/.bss up to the page
boundary (after the .got bytes):

(kgdb) x/s 0x2a66cdc
0x2a66cdc:	"$FreeBSD: head/lib/csu/powerpc/crt1.c 326219 2017-11-26 02:00:33Z pfg $"

(kgdb) x/s 0x2a66d24
0x2a66d24:	"$FreeBSD: head/lib/csu/common/crtbrand.c 340701 2018-11-20 20:59:49Z emaste $"

(kgdb) x/s 0x2a66d72
0x2a66d72:	"$FreeBSD: head/lib/csu/common/ignore_init.c 340702 2018-11-20 21:04:20Z emaste $"

(kgdb) x/s 0x2a66dc3
0x2a66dc3:	"FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on LLVM 8.0.0)"

(kgdb) x/s 0x2a66e15
0x2a66e15:	"$FreeBSD: head/lib/csu/powerpc/crti.S 217399 2011-01-14 11:34:58Z kib $"

(kgdb) x/s 0x2a66e5d
0x2a66e5d:	"$FreeBSD: head/sbin/mount/getmntopts.c 326025 2017-11-20 19:49:47Z pfg $"

(kgdb) x/s 0x2a66ea6
0x2a66ea6:	"$FreeBSD: head/lib/libutil/login_tty.c 334106 2018-05-23 17:02:12Z jhb $"

(kgdb) x/s 0x2a66eef
0x2a66eef:	"$FreeBSD: head/lib/libutil/login_class.c 296723 2016-03-12 14:54:34Z kib $"

(kgdb) x/s 0x2a66f83
0x2a66f83:	"$FreeBSD: head/lib/libutil/_secure_path.c 139012 2004-12-18 12:31:12Z ru $"

(kgdb) x/s 0x2a66fce
0x2a66fce:	"$FreeBSD: head/lib/libcrypt/crypt.c 326219 2017-11

(I truncated that last one to avoid the 0xfa5005af's on the next page
in RAM.)

Compare ( from readelf /sbin/init ):

String dump of section '.comment':
  [     0]  $FreeBSD: head/lib/csu/powerpc/crt1.c 326219 2017-11-26 02:00:33Z pfg $
  [    48]  $FreeBSD: head/lib/csu/common/crtbrand.c 340701 2018-11-20 20:59:49Z emaste $
  [    96]  $FreeBSD: head/lib/csu/common/ignore_init.c 340702 2018-11-20 21:04:20Z emaste $
  [    e7]  FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on LLVM 8.0.0)
  [   139]  $FreeBSD: head/lib/csu/powerpc/crti.S 217399 2011-01-14 11:34:58Z kib $
  [   181]  $FreeBSD: head/sbin/mount/getmntopts.c 326025 2017-11-20 19:49:47Z pfg $
  [   1ca]  $FreeBSD: head/lib/libutil/login_tty.c 334106 2018-05-23 17:02:12Z jhb $
  [   213]  $FreeBSD: head/lib/libutil/login_class.c 296723 2016-03-12 14:54:34Z kib $
  [   25e]  $FreeBSD: head/lib/libutil/login_cap.c 317265 2017-04-21 19:27:33Z pfg $
  [   2a7]  $FreeBSD: head/lib/libutil/_secure_path.c 139012 2004-12-18 12:31:12Z ru $
  [   2f2]  $FreeBSD: head/lib/libcrypt/crypt.c 326219 2017-11-26 02:00:33Z pfg $
. . .

Note:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x01800000 0x01800000 0x140ad4 0x140ad4 R E 0x10000
  LOAD           0x140ae0 0x01950ae0 0x01950ae0 0x061fc  0x35108  RWE 0x10000
  NOTE           0x0000d4 0x018000d4 0x018000d4 0x00048  0x00048  R   0x4
  TLS            0x140ae0 0x01950ae0 0x01950ae0 0x00b10  0x00b1d  R   0x10
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000  0x00000  RW  0x10

 Section to Segment mapping:
  Segment Sections...
   00     .note.tag .init .text .fini .rodata .eh_frame
   01     .tdata .tbss .init_array .fini_array .ctors .dtors .jcr .data.rel.ro .data .got .sbss .bss
   02     .note.tag
   03     .tdata .tbss
   04

There are 24 section headers, starting at offset 0x16cec8:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
. . .
  [16] .got              PROGBITS        01956ccc 146ccc 000010 04 WAX  0   0  4
  [17] .sbss             NOBITS          01956cdc 146cdc 0000b0 00  WA  0   0  4
  [18] .bss              NOBITS          01956dc0 146cdc 02ee28 00  WA  0   0 64
  [19] .comment          PROGBITS        00000000 146cdc 0073d4 01  MS  0   0  1

It looks like material after the .got is being copied
(the .comment bytes start at file offset 0x146cdc, which
is where the file-backed part of the second LOAD ends:
0x140ae0+0x061fc), spanning the in-file-empty .sbss and
.bss sections and implicitly initializing (the first part
of) those sections.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)

From owner-freebsd-hackers@freebsd.org Wed Jun 12 13:01:05 2019
Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 744FC15B55A1 for ; Wed, 12 Jun 2019 13:01:05 +0000 (UTC) (envelope-from huangfq.daxian@gmail.com) Received: from mail-yb1-xb36.google.com (mail-yb1-xb36.google.com [IPv6:2607:f8b0:4864:20::b36]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9E210751C1 for ; Wed, 12 Jun 2019 13:01:04 +0000 (UTC) (envelope-from huangfq.daxian@gmail.com) Received: by mail-yb1-xb36.google.com with SMTP id d2so6353274ybh.8 for ; Wed, 12 Jun 2019 06:01:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=J/vCnni50Fqj+2qkKQkc1tAQ6DCGBWSlTNndCVOXgrU=; b=Tu0cagbgX3814sRrpqL/HuiFdIsB/Ez0DJMGbuGTzXCFVan9xEywtSvlGjh3oM4TX7
OKhv4QEEWtaB5ASNfHk0SFKLr47jmeO4/LGIs1VvUDIUNcnFgRIP2Ire5zarCTvXhz16 WQhX8H/nPW44q5f7p83hNtf0Dxf6lzRQsIHatS+EFUR8xR+PhO0rFET39DYkMOR3u1Uv GyBKxrV7KQXXYnKX2V3aKWM0G0UPJZJzDiaNBsCHufWXAM5TOqshMbtTgNYaz1AIbKBo F2csnAIHxjDWAy6fzQCtiybi0rIz6QJXnZE+SOBk6BIPW1TnWStXR9J/kvDk+iwE1e+R cugw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=J/vCnni50Fqj+2qkKQkc1tAQ6DCGBWSlTNndCVOXgrU=; b=MRsceHBNxJTfkWl1q8XsiceOGOBaiVhNiqMfh7DTzBm5HViklB158pqT4bvMNkYNPj vh8yIOJff4OGzrXQ6wU1yK9ZZVm5y0HTDWxLzxVtFK1NPgS0G/9+Sko9ldWPRvoz/OLF ZarVSSLwX7rHqBWw2KNCaV+a2H4DsTw8Y+MZP+2657nkRiD8q31NlvscLpn35uc5tL1l esXIEBPNDyR/UQ8UQ01lN0tEB6JUvkYNMHhGd/fBT8xK9LiicuZpONdMWcwYCt6sl+AV CCEycvVver+M3fiBpvwDQW630/EePrLi+9YWMODXomY2e5+fxhBJpXIDV2iQ5+l459Ih 1PeA== X-Gm-Message-State: APjAAAUcqIzDTlkil+wTaE4vsy/KzDjCOlz8/mUvQwBUKxW1g+BLBKfV Vg8N64Hd9R+dP9Sv1/Q6osbUpiYSiBZhAwYTIN+T+tXjQSk= X-Google-Smtp-Source: APXvYqxnE8/V8znEbbagY+vtwXvUktvZk27RFylcKBsSkH1Coxw5uzmYLwR30dJXPb1VSN60TM4Hq9iaa67FP7ILCBI= X-Received: by 2002:a25:d113:: with SMTP id i19mr40753817ybg.277.1560344463822; Wed, 12 Jun 2019 06:01:03 -0700 (PDT) MIME-Version: 1.0 From: Fuqian Huang Date: Wed, 12 Jun 2019 21:00:52 +0800 Message-ID: Subject: Dev:Ciss: A kernel address leakage in sys/dev/ciss/ciss.c To: freebsd-hackers@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 9E210751C1 X-Spamd-Bar: ------ Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=Tu0cagbg; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of huangfqdaxian@gmail.com designates 2607:f8b0:4864:20::b36 as permitted sender) smtp.mailfrom=huangfqdaxian@gmail.com X-Spamd-Result: default: False [-6.89 / 15.00]; R_SPF_ALLOW(-0.20)[+ip6:2607:f8b0:4000::/36]; FREEMAIL_FROM(0.00)[gmail.com]; TO_DN_NONE(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; MX_GOOD(-0.01)[cached: 
alt3.gmail-smtp-in.l.google.com]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; NEURAL_HAM_SHORT(-0.93)[-0.932,0]; FROM_EQ_ENVFROM(0.00)[]; RCVD_TLS_LAST(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; TAGGED_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-hackers@freebsd.org]; RCPT_COUNT_ONE(0.00)[1]; IP_SCORE(-2.94)[ip: (-9.20), ipnet: 2607:f8b0::/32(-3.17), asn: 15169(-2.30), country: US(-0.06)]; RCVD_IN_DNSWL_NONE(0.00)[6.3.b.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.0.4.6.8.4.0.b.8.f.7.0.6.2.list.dnswl.org : 127.0.5.0]; RCVD_COUNT_TWO(0.00)[2]
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Wed, 12 Jun 2019 13:01:05 -0000

In freebsd/sys/dev/ciss/ciss.c, the function ciss_print_request
prints the address of the kernel object cr, exposing it to user
space. Each time a device is detached, the call chain
ciss_detach -> ciss_free -> ciss_notify_abort -> ciss_print_request
runs, and this ends up dumping a kernel address to user space.

static int
ciss_detach(device_t dev)
{
    struct ciss_softc *sc = device_get_softc(dev);
    ...
    ciss_free(sc);
    return (0);
}

static void
ciss_free(struct ciss_softc *sc)
{
    ...
->  ciss_notify_abort(sc);
    ...
}

static int
ciss_notify_abort(struct ciss_softc *sc)
{
    struct ciss_request *cr;
    ...
    if ((error = ciss_get_request(sc, &cr)) != 0)
        goto out;
    ...
->  ciss_print_request(cr);
    ...
}

static void
ciss_print_request(struct ciss_request *cr)
{
    struct ciss_softc *sc;
    ...
    sc = cr->cr_sc;
    ...
-> ciss_printf(sc, "REQUEST @ %p\n", cr); ciss_printf(sc, " data %p/%d tag %d flags %b\n", cr->cr_data, cr->cr_length, cr->cr_tag, cr->cr_flags, "\20\1mapped\2sleep\3poll\4dataout\5datain\n"); } From owner-freebsd-hackers@freebsd.org Wed Jun 12 19:13:11 2019 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5B58615BF875 for ; Wed, 12 Jun 2019 19:13:11 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic308-2.consmr.mail.bf2.yahoo.com (sonic308-2.consmr.mail.bf2.yahoo.com [74.6.130.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C2C7189E9C for ; Wed, 12 Jun 2019 19:13:09 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: 9KsIYDQVM1lTOuaOOgcDyV586bXcOdOjtwHYpRu97YJa_ZXT1wkBNex9YGPaqVX D3FRVvHL9SBLpE5GOSXbv3S20URAIUDkHZtHWYAOYuM8EEvBV47.zyHnVaqnx1a7rPFYwDwAEMXp pg5vpieQaBlepNYg8oiFbBOGm8MjZD17pL2mMN4wjgksJhafKiwVKMHTeAfp2gfpEdpPTIcGyCip SeuPfuOHD5HGItm6nonxRSho_UUMdPJgGmNnug6UhKXpcvZxuODJdYp7tM_JMlPl3osfX8uW71um Clx3uF38cBOt.4PTMiauu.a4OYoxmwbQUgkfuTea5ShTY2H3USwlnUVk6xOBuiyfmyLgTq9f3cND eg3M22dx.6ejaVccCNfFiVPi4pMuq2c.OAJPU5HXZHu5VmUyBxtUskz742AhnXxADnqmx03MJAjJ pEObTL_fddwumVcbp0dLh4ME.3QXGSESOmTiwtJOFwWEWWJCE.7nWIh1iRXj4W5W3IIC6r7SliYM NLm4Pb2ELGhnGNTI4W8mdGdSqG5fpRujH7_UhZkbZuiuVHuAdFq1de3A3XUNHLeRo2yAD_n23f1A _5shw35Nx2qJPW3Gg96WBCgeD.gmjV4grQEnPiR57pxx3eb.FJi2DPigXVwCAfETiWjydZ9D9iMt AEdAWvLfkRqgCN2s2nPmWcrrJ8EjyfIG.dMX9MzVqKK7SedaqglYFKZPmUCJi8EtITKjyjy9YURo j3zK7nf_A..W6AsSwAE._7dQxTdRiYx6zU7UPcvBYe5Ojjelb0foejhbXVvoFDSja2b_kWWOnqa0 RNJ4q5Drr.cjwDDRtee_dqtl0_MXk_V8MsxuRzw0UzRKck6wLupkXHQMOdkWbG0dk4g9psroHoSL 16CkhmPWXNV_Pw6RUI_YHquMbbx4FCHSAvG579CuNhEcose1QM8Z.GAU.whwPjkLv_mculFhd43j sfv.zTYVZOIPa6_AFMpiJe2S2Umc6EBWxDCYZHGlptVAPjJ8UuYj8fuGkGn6u3.KpobN8rlgHlWR 
tQr37LZ__ZppyCNS57FU18SnugCfa7uwWjqYScV2b1ubbjWr.HfFK2lrXI9CHhg1SUEi.TYQi87d QbauPib_ZYUY31SOmhMNFio5QtG6dckUEwHnzsrx1KflcOkdscssJyGJMQMSKTc1LcaPSJdK5KHO u154H2XgpNH6OAFtCDyk9snOFq7ogNE41ocCW4WcL77Y2lApMKH0Q0pI9jTPIeA-- Received: from sonic.gate.mail.ne1.yahoo.com by sonic308.consmr.mail.bf2.yahoo.com with HTTP; Wed, 12 Jun 2019 19:13:03 +0000 Received: from c-67-170-167-181.hsd1.or.comcast.net (EHLO [192.168.1.115]) ([67.170.167.181]) by smtp415.mail.bf1.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 61b8c36a736c57820c4825611c6738d1; Wed, 12 Jun 2019 19:13:00 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) Subject: Re: kern_execve using vm_page_zero_invalid but not vm_page_set_validclean to load /sbin/init ? From: Mark Millard In-Reply-To: Date: Wed, 12 Jun 2019 12:12:57 -0700 Cc: Alfredo Dal Ava Junior , Justin Hibbits Content-Transfer-Encoding: quoted-printable Message-Id: References: <1464D960-A1D6-404A-BB10-E615E2D14C1D@yahoo.com> <4003198F-C11B-4587-910B-2001DC09F538@yahoo.com> <47E002B7-D4A1-4C4B-BFFD-D926263D895E@yahoo.com> <48148449-93B0-446C-AA28-F211FFAE1A8B@yahoo.com> <86F7C4C4-2BB6-40F0-B5D3-C80ECB4A97CF@yahoo.com> To: FreeBSD Toolchain , FreeBSD Hackers , freeBSD PowerPC ML , Conrad Meyer X-Mailer: Apple Mail (2.3445.104.11) X-Rspamd-Queue-Id: C2C7189E9C X-Spamd-Bar: +++ X-Spamd-Result: default: False [3.35 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; MV_CASE(0.50)[]; FREEMAIL_FROM(0.00)[yahoo.com]; RCPT_COUNT_FIVE(0.00)[6]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[yahoo.com:+]; MX_GOOD(-0.01)[cached: mta6.am0.yahoodns.net]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; SUBJECT_ENDS_QUESTION(1.00)[]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:26101, ipnet:74.6.128.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; 
R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; NEURAL_SPAM_SHORT(0.09)[0.086,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; NEURAL_SPAM_MEDIUM(0.29)[0.293,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; NEURAL_SPAM_LONG(0.96)[0.961,0]; RCVD_IN_DNSWL_NONE(0.00)[41.130.6.74.list.dnswl.org : 127.0.5.0]; IP_SCORE(1.52)[ip: (4.91), ipnet: 74.6.128.0/21(1.52), asn: 26101(1.21), country: US(-0.06)]; RWL_MAILSPIKE_POSSIBLE(0.00)[41.130.6.74.rep.mailspike.net : 127.0.0.17] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Jun 2019 19:13:11 -0000 [Looks to me like the ->valid mask only is used for the last page of the /sbin/init file, not based on the size and alignment of the data requested for the PT_LOAD.] On 2019-Jun-11, at 21:53, Mark Millard wrote: > [The garbage after .got up to the page boundary is > .comment section strings. The context here is > targeting 32-bit powerpc via system-clang-8 and > devel/powerpc64-binutils for buildworld and > buildkernel . ] >=20 > On 2019-Jun-11, at 19:55, Mark Millard wrote: >=20 >> [I have confirmed .sbss not being zero'd out and environ >> thereby starting out non-zero (garbage): a >> debug.minidump=3D0 style dump.] >>=20 >>> On 2019-Jun-10, at 16:19, Mark Millard wrote: >>>=20 >>> . . . (omitted) . . . >>=20 >> I used debug.minidump=3D0 in /boot/loader.conf for >> cusing a dump for the crash and a libkvm modified >> enough for my working boot environment to allow me >> to examine the the memory-image bytes of such a dump, >> with libkvm used via /usr/local/bin/kgdb . (No support >> of automatically translating user-space addresses >> or other such.) 
>>=20 >> For the clang based debug buildworld and debug buildkernel >> context with /sbin/init having: >>=20 >> [16] .got PROGBITS 01956ccc 146ccc 000010 04 WAX = 0 0 4 >> [17] .sbss NOBITS 01956cdc 146cdc 0000b0 00 WA = 0 0 4 >> [18] .bss NOBITS 01956dc0 146cdc 02ee28 00 WA = 0 0 64 >>=20 >> I confirmed that .sbss in /sbin/init's address space >> is not zeroed (so environ is not assigned by handle_argv ). >> I also confirmed that _start was given a good env value >> (in %r5) based on where the value was stored on the >> stack. It is just that the value was not used. >>=20 >> The detailed obvious-failure point (crash) can change based >> on the garbage in the .sbss and, for the build that I used >> this time, that happened in __je_arean_malloc_hard instead >> of before _init_tls called _libc_allocate_tls . (I traced >> the call chain in the dump.) >>=20 >>=20 >> =46rom what I've seen in the dump there seem to be special >> uses of some values (that also have normal uses, of >> course): >>=20 >> 0xfa5005af: as yet invalid page content. >> 0x1c000020: as yet unassigned user-space-stack memory for /sbin/init. >>=20 >> These are the same locations that I previously reported as >> showing up in the DSI read trap reports for /sbin/init failing. >> The specific build here failed with a different value. 
>>=20 >> For reference relative to libkvm: >>=20 >> # svnlite diff /usr/src/lib/libkvm/ >> Index: /usr/src/lib/libkvm/kvm_powerpc.c >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> --- /usr/src/lib/libkvm/kvm_powerpc.c (revision 347549) >> +++ /usr/src/lib/libkvm/kvm_powerpc.c (working copy) >> @@ -211,6 +211,53 @@ >> if (be32toh(vm->ph->p_paddr) =3D=3D 0xffffffff) >> return ((int)powerpc_va2off(kd, va, ofs)); >>=20 >> + // HACK in something for what I observe in >> + // a debug.minidump=3D0 vmcore.* for 32-bit powerpc >> + // >> + if ( be32toh(vm->ph->p_vaddr) =3D=3D 0xffffffff >> + && be32toh(vm->ph->p_paddr) =3D=3D 0 >> + && be16toh(vm->eh->e_phnum) =3D=3D 1 >> + ) { >> + // Presumes p_memsz is either unsigned >> + // 32-bit or is 64-bit, same for va . >> + >> + if (be32toh(vm->ph->p_memsz) <=3D va) >> + return 0; // Like powerpc_va2off >> + >> + // If ofs was (signed) 32-bit there >> + // would be a problem for sufficiently >> + // large postive memsz's and va's >> + // near the end --because of p_offset >> + // and dmphdrsz causing overflow/wrapping >> + // for some large va values. >> + // Presumes 64-bit ofs for such cases. >> + // Also presumes dmphdrsz+p_offset >> + // is non-negative so that small >> + // non-negative va values have no >> + // problems with ofs going negative. >> + >> + *ofs =3D vm->dmphdrsz >> + + be32toh(vm->ph->p_offset) >> + + va; >> + >> + // The normal return value overflows/wraps >> + // for p_memsz =3D=3D 0x80000000u when va =3D=3D 0 . >> + // Avoid this by depending on calling code's >> + // loop for sufficiently large cases. >> + // This code presumes p_memsz/2 <=3D MAX_INT . >> + // 32-bit powerpc FreeBSD does not allow >> + // using more than 2 GiBytes of RAM but >> + // does allow using 2 GiBytes on 64-bit >> + // hardware. 
>> +               //
>> +               if (  (int)be32toh(vm->ph->p_memsz) < 0
>> +                  && va < be32toh(vm->ph->p_memsz)/2
>> +                  )
>> +                       return be32toh(vm->ph->p_memsz)/2;
>> +
>> +               return be32toh(vm->ph->p_memsz) - va;
>> +       }
>> +
>>         _kvm_err(kd, kd->program, "Raw corefile not supported");
>>         return (0);
>>  }
>> Index: /usr/src/lib/libkvm/kvm_private.c
>> ===================================================================
>> --- /usr/src/lib/libkvm/kvm_private.c   (revision 347549)
>> +++ /usr/src/lib/libkvm/kvm_private.c   (working copy)
>> @@ -131,7 +131,9 @@
>>  {
>>
>>         return (kd->nlehdr.e_ident[EI_CLASS] == class &&
>> -           kd->nlehdr.e_type == ET_EXEC &&
>> +           (  kd->nlehdr.e_type == ET_EXEC ||
>> +              kd->nlehdr.e_type == ET_DYN
>> +           ) &&
>>             kd->nlehdr.e_machine == machine);
>>  }
>>
>>
>>
>
> The following is what is in the .sbss/.bss up to the page
> boundary (after the .got bytes):
>
> (kgdb) x/s 0x2a66cdc
> 0x2a66cdc: "$FreeBSD: head/lib/csu/powerpc/crt1.c 326219 2017-11-26 02:00:33Z pfg $"
>
> (kgdb) x/s 0x2a66d24
> 0x2a66d24: "$FreeBSD: head/lib/csu/common/crtbrand.c 340701 2018-11-20 20:59:49Z emaste $"
>
> (kgdb) x/s 0x2a66d72
> 0x2a66d72: "$FreeBSD: head/lib/csu/common/ignore_init.c 340702 2018-11-20 21:04:20Z emaste $"
>
> (kgdb) x/s 0x2a66dc3
> 0x2a66dc3: "FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on LLVM 8.0.0)"
>
> (kgdb) x/s 0x2a66e15
> 0x2a66e15: "$FreeBSD: head/lib/csu/powerpc/crti.S 217399 2011-01-14 11:34:58Z kib $"
>
> (kgdb) x/s 0x2a66e5d
> 0x2a66e5d: "$FreeBSD: head/sbin/mount/getmntopts.c 326025 2017-11-20 19:49:47Z pfg $"
>
> (kgdb) x/s 0x2a66ea6
> 0x2a66ea6: "$FreeBSD: head/lib/libutil/login_tty.c 334106 2018-05-23 17:02:12Z jhb $"
>
> (kgdb) x/s 0x2a66eef
> 0x2a66eef: "$FreeBSD: head/lib/libutil/login_class.c 296723 2016-03-12 14:54:34Z kib $"
>
> (kgdb) x/s 0x2a66f83
> 0x2a66f83: "$FreeBSD: head/lib/libutil/_secure_path.c 139012 2004-12-18 12:31:12Z ru $"
>
> (kgdb) x/s 0x2a66fce
> 0x2a66fce: "$FreeBSD: head/lib/libcrypt/crypt.c 326219 2017-11
>
> (I truncated that last to avoid the 0xfa5005af's on the next page
> in RAM.)
>
> Compare ( from readelf /sbin/init ):
>
> String dump of section '.comment':
>   [     0]  $FreeBSD: head/lib/csu/powerpc/crt1.c 326219 2017-11-26 02:00:33Z pfg $
>   [    48]  $FreeBSD: head/lib/csu/common/crtbrand.c 340701 2018-11-20 20:59:49Z emaste $
>   [    96]  $FreeBSD: head/lib/csu/common/ignore_init.c 340702 2018-11-20 21:04:20Z emaste $
>   [    e7]  FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on LLVM 8.0.0)
>   [   139]  $FreeBSD: head/lib/csu/powerpc/crti.S 217399 2011-01-14 11:34:58Z kib $
>   [   181]  $FreeBSD: head/sbin/mount/getmntopts.c 326025 2017-11-20 19:49:47Z pfg $
>   [   1ca]  $FreeBSD: head/lib/libutil/login_tty.c 334106 2018-05-23 17:02:12Z jhb $
>   [   213]  $FreeBSD: head/lib/libutil/login_class.c 296723 2016-03-12 14:54:34Z kib $
>   [   25e]  $FreeBSD: head/lib/libutil/login_cap.c 317265 2017-04-21 19:27:33Z pfg $
>   [   2a7]  $FreeBSD: head/lib/libutil/_secure_path.c 139012 2004-12-18 12:31:12Z ru $
>   [   2f2]  $FreeBSD: head/lib/libcrypt/crypt.c 326219 2017-11-26 02:00:33Z pfg $
> . . .
>
> Note:
>
> Program Headers:
>   Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
>   LOAD           0x000000 0x01800000 0x01800000 0x140ad4 0x140ad4 R E 0x10000
>   LOAD           0x140ae0 0x01950ae0 0x01950ae0 0x061fc  0x35108  RWE 0x10000
>   NOTE           0x0000d4 0x018000d4 0x018000d4 0x00048  0x00048  R   0x4
>   TLS            0x140ae0 0x01950ae0 0x01950ae0 0x00b10  0x00b1d  R   0x10
>   GNU_STACK      0x000000 0x00000000 0x00000000 0x00000  0x00000  RW  0x10
>
>  Section to Segment mapping:
>   Segment Sections...
>    00     .note.tag .init .text .fini .rodata .eh_frame
>    01     .tdata .tbss .init_array .fini_array .ctors .dtors .jcr .data.rel.ro .data .got .sbss .bss
>    02     .note.tag
>    03     .tdata .tbss
>    04
> There are 24 section headers, starting at offset 0x16cec8:
>
> Section Headers:
>   [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
> . . .
>   [16] .got              PROGBITS        01956ccc 146ccc 000010 04 WAX  0   0  4
>   [17] .sbss             NOBITS          01956cdc 146cdc 0000b0 00  WA  0   0  4
>   [18] .bss              NOBITS          01956dc0 146cdc 02ee28 00  WA  0   0 64
>   [19] .comment          PROGBITS        00000000 146cdc 0073d4 01  MS  0   0  1
>
> It looks like material after the .got is being copied,
> spanning the in-file-empty .sbss and .bss sections and
> implicitly initializing (the first part of) those
> sections.

The ->valid assignments appear to trace to code like:

        /*
         * The last page has valid blocks.  Invalid part can only
         * exist at the end of file, and the page is made fully valid
         * by zeroing in vm_pager_get_pages().
         */
        if (m[count - 1]->valid != 0 && --count == 0) {
                if (iodone != NULL)
                        iodone(arg, m, 1, 0);
                return (VM_PAGER_OK);
        }

independent of if the requested data does not span into the
last page but does not span to the end of a page.

So it appears that the use of:

QUOTE
vm_imgact_map_page uses vm_imgact_hold_page.
vm_imgact_hold_page uses vm_pager_get_pages.
vm_pager_get_pages uses vm_page_zero_invalid
to "Zero out partially filled data"
END QUOTE

simply does not do the right thing for .sbss or .bss
handling. The m->valid related code for zeroing is
basically irrelevant to .sbss and .bss.

Note that the below code requires a m->valid bit
to be asserted in order to do any pmap_zero_page_area
operations. Thus it does not zero out pages that
are completely invalid either. This explains why
I see 0xfa5005af on the full pages in the .sbss/.bss
area for debug builds: nothing is zeroing the full pages
either.

void
vm_page_zero_invalid(vm_page_t m, boolean_t setvalid)
{
        int b;
        int i;

        VM_OBJECT_ASSERT_WLOCKED(m->object);
        /*
         * Scan the valid bits looking for invalid sections that
         * must be zeroed.  Invalid sub-DEV_BSIZE'd areas ( where the
         * valid bit may be set ) have already been zeroed by
         * vm_page_set_validclean().
         */
        for (b = i = 0; i <= PAGE_SIZE / DEV_BSIZE; ++i) {
                if (i == (PAGE_SIZE / DEV_BSIZE) ||
                    (m->valid & ((vm_page_bits_t)1 << i))) {
                        if (i > b) {
                                pmap_zero_page_area(m,
                                    b << DEV_BSHIFT, (i - b) << DEV_BSHIFT);
                        }
                        b = i + 1;
                }
        }

        /*
         * setvalid is TRUE when we can safely set the zero'd areas
         * as being valid.  We can do this if there are no cache consistancy
         * issues.  e.g. it is ok to do with UFS, but not ok to do with NFS.
         */
        if (setvalid)
                m->valid = VM_PAGE_BITS_ALL;
}

This code simply does not do the right thing for .sbss and
.bss handling.

__start in /sbin/init (for example) expects .sbss and .bss to
have already been initialized to zero (and possibly further
adjusted after that for something like environ).

So far I find nothing to cover that.
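
As a cross-check of the scan in vm_page_zero_invalid, here is a minimal
user-space model of its loop. It is only a sketch under stated
assumptions: PAGE_SIZE is taken as 4096 and DEV_BSIZE as 512 (so 8 valid
bits per page), the MODEL_* names and model_zeroed_blocks() are made up
for illustration, and instead of calling pmap_zero_page_area it records
which DEV_BSIZE blocks the loop would have zeroed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones are per-platform. */
#define MODEL_PAGE_SIZE  4096u
#define MODEL_DEV_BSIZE  512u
#define MODEL_NBLOCKS    (MODEL_PAGE_SIZE / MODEL_DEV_BSIZE)

/*
 * Hypothetical helper: same control flow as the scan loop in
 * vm_page_zero_invalid, but it returns a bitmask of the blocks that
 * would be passed to pmap_zero_page_area instead of zeroing memory.
 */
static uint32_t
model_zeroed_blocks(uint32_t valid)
{
	uint32_t zeroed = 0;
	unsigned int b, i, k;

	for (b = i = 0; i <= MODEL_NBLOCKS; ++i) {
		if (i == MODEL_NBLOCKS || (valid & (1u << i))) {
			if (i > b) {
				/* Blocks b .. i-1 form an invalid run. */
				for (k = b; k < i; ++k)
					zeroed |= 1u << k;
			}
			b = i + 1;
		}
	}
	return (zeroed);
}
```

In this model a mask with no bits set reaches the
i == PAGE_SIZE / DEV_BSIZE branch with b == 0, so
model_zeroed_blocks(0) reports all eight blocks as zeroed. If the model
matches the kernel loop, it may be worth checking whether the unzeroed
.sbss/.bss pages ever reach this function at all.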

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-hackers@freebsd.org Wed Jun 12 21:51:54 2019
From: Warner Losh
Date: Wed, 12 Jun 2019 15:51:41 -0600
Subject: Re: Dev:Ciss: A kernel address leakage in sys/dev/ciss/ciss.c
To: Fuqian Huang
Cc: "freebsd-hackers@freebsd.org"

On Wed, Jun 12, 2019 at 7:02 AM Fuqian Huang wrote:

> In freebsd/sys/dev/ciss/ciss.c, function ciss_print_request will dump
> the address of a kernel object cr to user space. Each time when a
> device is detached, it will call
> ciss_free->ciss_notify_abort->ciss_print_request, and this finally
> dump a kernel address to user space.
>

This is, at best, a theoretical concern. ciss_detach isn't called except
when detaching the device. This only happens if you are unloading the
module or using devctl to detach it. Second, the bit you chopped out of
ciss_detach ensures that the controller isn't open. Close is only called
when there are no pending requests from geom to the device, and we get
called for the LAST close, meaning nothing else has it open. This means
there will be no commands to abort when ciss_notify_abort() is called.
Since there are no commands to abort, no commands will be printed, so no
user address will be disclosed.

Having said that, do you have a test case that can trigger this? It would
be most unexpected indeed...

Warner

> static int
> ciss_detach(device_t dev)
> {
>     struct ciss_softc *sc = device_get_softc(dev);
>     ...
>     ciss_free(sc);
>     return (0);
> }
>
> static void
> ciss_free(struct ciss_softc *sc)
> {
>     ...
> ->  ciss_notify_abort(sc);
>     ...
> }
>
> static int
> ciss_notify_abort(struct ciss_softc *sc)
> {
>     struct ciss_request *cr;
>     ...
>     if ((error = ciss_get_request(sc, &cr))
>         goto out;
>     ...
> ->  ciss_print_request(cr);
>     ...
> }
>
> static void
> ciss_print_request(struct ciss_request *cr)
> {
>     struct ciss_softc *sc;
>     ...
>     sc = cr->cr_sc;
>     ...
> ->  ciss_printf(sc, "REQUEST @ %p\n", cr);
>     ciss_printf(sc, " data %p/%d tag %d flags %b\n",
>         cr->cr_data, cr->cr_length, cr->cr_tag, cr->cr_flags,
>         "\20\1mapped\2sleep\3poll\4dataout\5datain\n");
> }

From owner-freebsd-hackers@freebsd.org Thu Jun 13 02:54:59 2019
From: Fuqian Huang
Date: Thu, 13 Jun 2019 10:54:46 +0800
Subject: Re: Dev:Ciss: A kernel address leakage in sys/dev/ciss/ciss.c
To: Warner Losh
Cc: "freebsd-hackers@freebsd.org"

But why will no commands be printed?
'cr' is obtained from ciss_get_request and 'cr->cr_data' is the result of
malloc in ciss_notify_abort, and they are freed after the 'out' label.
At the printing point, some address has already been printed out.
I understand your point that this only happens when detaching the device.
But it seems that some address is printed out before the free
operation. Is it necessary to print the address?

From owner-freebsd-hackers@freebsd.org Thu Jun 13 05:02:31 2019
From: Warner Losh
Date: Wed, 12 Jun 2019 23:02:16 -0600
Subject: Re: Dev:Ciss: A kernel address leakage in sys/dev/ciss/ciss.c
To: Fuqian Huang
Cc: FreeBSD Hackers

Because CISS_FLAG_NOTIFY_OK will be clear in sc->ciss_flags. It's only set
when we have the notifier armed, which is only armed when we have
outstanding commands in the CISS hardware. Since we know that all the
outstanding requests have completed before the close, it should be clear.

But again, this only gets called at the end of ciss_attach (on the error
path before the notifier is set up) or in ciss_detach (which is only called
when the ciss driver is unloaded with the device attached, but unused, or
if the user has administratively put the interface down). So even if I've
misread things and a notifier is running, the only way to provoke it is to
do a root-only operation. Disclosing kernel addresses to root doesn't seem
like a big deal, especially since the dmesg can be secured from non-root
users. This is why I've said this is more theoretical than actual. You need
root permissions to provoke it, and then root permissions to read the dmesg.

I'll commit a #if 0 out of an abundance of caution, but I'm having trouble
seeing how non-root users could provoke it. Then again, the ciss driver is
for older hardware that is somewhat rare these days.

Thanks for the report...

Warner

From owner-freebsd-hackers@freebsd.org Thu Jun 13 06:52:37 2019
From: Fuqian Huang
Date: Thu, 13 Jun 2019 14:52:24 +0800
Subject: dev:md: A kernel address leakage in sys/dev/md/md.c
To: freebsd-hackers@freebsd.org

In freebsd/sys/dev/md/md.c, if the kernel is built with option MD_ROOT,
g_md_init will call md_preload and use mfs_root as the image.
In md_preload, the address of the image is printed; in this case
that is the address of the global object mfs_root, so a kernel
address is leaked.

Patch suggestion: wrap the printf statements in a macro guard such as
#ifdef DEBUG.

u_char mfs_root[MD_ROOT_SIZE*1024] __attribute__ ((section("oldmfs")));

static void
g_md_init(struct g_class *mp __unused)
{
    ...
#ifdef MD_ROOT
    ...
#ifdef MD_ROOT_MEM
    md_preload(mfs_root, mfs_root_size, NULL);
#else
    md_preload(__DEVOLATILE(u_char *, &mfs_root), mfs_root_size, NULL);
#endif
    ...
#endif
}

static void
md_preload(u_char *image, size_t length, const char *name)
{
    ...
    if (name != NULL) {
        printf("%s%d: Preloaded image <%s> %zd bytes at %p\n",
            MD_NAME, sc->unit, name, length, image);
    } else {
        printf("%s%d: Embedded image %zd bytes at %p\n",
            MD_NAME, sc->unit, length, image);
    }
}
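
The patch suggestion could look roughly like the following. This is a
user-space sketch, not the md(4) code: MD_SKETCH_SHOW_ADDR,
SKETCH_MD_NAME and sketch_format_preload() are hypothetical names, the
"md" prefix is an assumption about MD_NAME, and snprintf into a buffer
stands in for the kernel printf so that the resulting message can be
inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed device-name prefix; stands in for MD_NAME. */
#define SKETCH_MD_NAME "md"

/*
 * Hypothetical debug switch, not an existing kernel option.
 * Left undefined by default so that no address is formatted.
 */
/* #define MD_SKETCH_SHOW_ADDR */

/*
 * Format the preload message into buf.  The image address is only
 * included when MD_SKETCH_SHOW_ADDR is defined at build time.
 */
static int
sketch_format_preload(char *buf, size_t buflen, int unit, const char *name,
    size_t length, const void *image)
{
#ifdef MD_SKETCH_SHOW_ADDR
	if (name != NULL)
		return (snprintf(buf, buflen,
		    "%s%d: Preloaded image <%s> %zu bytes at %p\n",
		    SKETCH_MD_NAME, unit, name, length, image));
	return (snprintf(buf, buflen, "%s%d: Embedded image %zu bytes at %p\n",
	    SKETCH_MD_NAME, unit, length, image));
#else
	(void)image;	/* address deliberately not printed */
	if (name != NULL)
		return (snprintf(buf, buflen,
		    "%s%d: Preloaded image <%s> %zu bytes\n",
		    SKETCH_MD_NAME, unit, name, length));
	return (snprintf(buf, buflen, "%s%d: Embedded image %zu bytes\n",
	    SKETCH_MD_NAME, unit, length));
#endif
}
```

With the switch left undefined, the message keeps the unit, name and
size but drops the "at %p" part; a developer who needs the address
would rebuild with the macro defined.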