From owner-freebsd-hackers@FreeBSD.ORG Tue Oct 29 22:25:17 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 893A1C87 for ; Tue, 29 Oct 2013 22:25:17 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 0B86B2CE1 for ; Tue, 29 Oct 2013 22:25:16 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id r9TMP8X3011752; Wed, 30 Oct 2013 00:25:08 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua r9TMP8X3011752 Received: (from kostik@localhost) by tom.home (8.14.7/8.14.7/Submit) id r9TMP8Q9011746; Wed, 30 Oct 2013 00:25:08 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 30 Oct 2013 00:25:08 +0200 From: Konstantin Belousov To: Sebastian Huber Subject: Re: soo_close() vs. filt_soread() Message-ID: <20131029222508.GA59496@kib.kiev.ua> References: <526F8F58.1090307@embedded-brains.de> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="NgLNC/STdjXvE6Rt" Content-Disposition: inline In-Reply-To: <526F8F58.1090307@embedded-brains.de> User-Agent: Mutt/1.5.22 (2013-10-16) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: "freebsd-hackers@freebsd.org" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 29 Oct 2013 22:25:17 -0000 --NgLNC/STdjXvE6Rt Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Oct 29, 2013 at 11:35:04AM +0100, Sebastian Huber wrote: > Hello, >=20 > I port currently the FreeBSD network stack to a real-time operating syste= m.=20 > The problem described below probably does not happen in a real FreeBSD ke= rnel.=20 > I have the following test case: >=20 > static void > test_kqueue_close(test_context *ctx) > { > /* The cfd is some socket with a connected TCP stream */ > int cfd =3D ctx->cfd; > int kq; > struct kevent change; > struct kevent event; > const struct timespec *timeout =3D NULL; > int rv; > ssize_t n; >=20 > puts("test kqueue close"); >=20 > assert(ctx->cfd >=3D 0); >=20 > kq =3D kqueue(); > assert(kq >=3D 0); >=20 > EV_SET(&change, cfd, EVFILT_READ, EV_ADD | EV_ENABLE, 0, 0, > TEST_UDATA); >=20 > rv =3D kevent(kq, &change, 1, NULL, 0, timeout); > assert(rv =3D=3D 0); >=20 > set_non_blocking(cfd, 1); >=20 > do { > errno =3D 0; > n =3D read(cfd, &ctx->buf[0], sizeof(ctx->buf)); > if (n =3D=3D -1) { > assert(errno =3D EAGAIN); > } > } while (n > 0); >=20 >=20 > /* This tells some background entity that we want to close cfd once keve= nt=20 > blocks */ > send_events(ctx, EVENT_CLOSE); >=20 > assert(ctx->cfd >=3D 0); >=20 > rv =3D kevent(kq, NULL, 0, &event, 1, timeout); > assert(rv =3D=3D 1); > assert(event.ident =3D=3D cfd); > assert(event.filter =3D=3D EVFILT_READ); > assert(event.flags =3D=3D 0); > assert(event.fflags =3D=3D 0); > assert(event.data =3D=3D 0); > assert(event.udata =3D=3D TEST_UDATA); >=20 > assert(ctx->cfd =3D=3D -1); >=20 > rv =3D close(kq); > assert(rv =3D=3D 0); > } >=20 > This test registers a read event and then blocks for this event. Once th= e=20 > current thread blocked, someone will delete the corresponding socket. I = have=20 > now a NULL pointer access. The socket close looks like this: >=20 > /* > * API socket close on file pointer. We call soclose() to close the soc= ket > * (including initiating closing protocols). soclose() will sorele() the > * file reference but the actual socket will not go away until the socke= t's > * ref count hits 0. > */ > /* ARGSUSED */ > int > soo_close(struct file *fp, struct thread *td) > { > int error =3D 0; > struct socket *so; >=20 > so =3D fp->f_data; > fp->f_ops =3D &badfileops; > fp->f_data =3D NULL; >=20 > if (so) > error =3D soclose(so); > return (error); > } >=20 > Please note that fp->f_data is set to NULL. >=20 > The close operation will end up in: >=20 > /*ARGSUSED*/ > static int > filt_soread(struct knote *kn, long hint) > { > struct socket *so; >=20 > so =3D kn->kn_fp->f_data; > SOCKBUF_LOCK_ASSERT(&so->so_rcv); >=20 > kn->kn_data =3D so->so_rcv.sb_cc - so->so_rcv.sb_ctl; > if (so->so_rcv.sb_state & SBS_CANTRCVMORE) { > kn->kn_flags |=3D EV_EOF; > kn->kn_fflags =3D so->so_error; > return (1); > } else if (so->so_error) /* temporary udp error */ > return (1); > else if (kn->kn_sfflags & NOTE_LOWAT) > return (kn->kn_data >=3D kn->kn_sdata); > else > return (so->so_rcv.sb_cc >=3D so->so_rcv.sb_lowat); > } >=20 > Here so =3D=3D NULL, due to soo_close(). >=20 > Breakpoint 11, filt_soread (kn=3D0x46ff74, hint=3D0) at=20 > freebsd/sys/kern/uipc_socket.c:3147 > 3147 so =3D kn->kn_fp->f_data; > (gdb) p so > $4 =3D (struct socket *) 0x0 > (gdb) p kn->kn_ptr.p_fp > $5 =3D (struct file *) 0x32cec8 > (gdb) where > #0 filt_soread (kn=3D0x46ff74, hint=3D0) at freebsd/sys/kern/uipc_socket= =2Ec:3147 > #1 0x0010506e in knote (list=3D0x3f1190, hint=3D0, lockflags=3D1) at=20 > freebsd/sys/kern/kern_event.c:1957 > #2 0x00191a60 in sowakeup (so=3D0x3f1134, sb=3D0x3f1188) at=20 > freebsd/sys/kern/uipc_sockbuf.c:191 > #3 0x00127afe in soisdisconnecting (so=3D0x3f1134) at=20 > freebsd/sys/kern/uipc_socket.c:3341 > #4 0x00153cca in tcp_disconnect (tp=3D0x3f75ac) at=20 > freebsd/sys/netinet/tcp_usrreq.c:1508 > #5 0x00152b30 in tcp_usr_disconnect (so=3D0x3f1134) at=20 > freebsd/sys/netinet/tcp_usrreq.c:556 > #6 0x00124754 in sodisconnect (so=3D0x3f1134) at freebsd/sys/kern/uipc_s= ocket.c:816 > #7 0x00124384 in soclose (so=3D0x3f1134) at freebsd/sys/kern/uipc_socket= =2Ec:664 > #8 0x00128886 in soo_close (fp=3D0x32cec8, td=3D0x0) at=20 > freebsd/sys/kern/sys_socket.c:452 >=20 > Is this an illegal kevent() use case? Are there some means that prevent = this=20 > sequence in a real FreeBSD kernel? When file descriptor gets closed, kevent subsystem is notified first and clears knotes related to the filedescriptor (important, not the file) being closed. You could see this in the closefp() function calling knote_fdclose(). Since files are referenced and destroyed only when the last reference vanishes, and filedescriptor holds a reference on the file, file (which is socket in your case) cannot go away until knotes are expunged. There are some corner cases, when knotes are really attached to objects with different life-cycle, like vnodes, but I believe the description above should be accurate enough for sockets. If there is a problem in the FreeBSD code and you could produce a test case, do not hesitate to send it to me. Thank you. --NgLNC/STdjXvE6Rt Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (FreeBSD) iQIcBAEBAgAGBQJScDXDAAoJEJDCuSvBvK1BTT8P/jK5Kfv5IzM3to+MIZ0u8mVR 5BsYD49Gh4HR/l7/ZQs3a4JnmUAuMtiRItTgXFFgEAFWOgFXHlc3H9zxD3HN4a51 8bozYg+3IIh111RqEUYEfpN+SBEr9LggPGl349O2oAVw5ueLxWdWdMV7IIT63prR aSqp+Xb8XDJ8SitIeB7lyetGQfdpn7iQEQyuVu6NsK0ZNRngAIlUCD6cYTWBtEG4 ESCoCj9Jx9WoJhCcJigF2yyX1NWrkgdFaSXiK/Ob1L4kYH2Z/AOmEC9E1fnRdAxJ d7Xcdy4lW+9IQ5syPE1qZn7wJX7AF7rcJUdFWIm+tlMv2Jgybs3LJq9Hdirugws1 IIeCIyw8Yk14xkN8Gh7OGW9shh1tdpQRvXrSt0xoVXbMy3NusnR6yXwAgyCE7Lv1 csGQK5GanC0aoKY47Uq2xrAimd4Op75CT/mxh545SbGjSV6lAKGUS5Z2H6zVvclI 84NujYrqjjj/GtetS34SvaJinSQE20O74OxVd/KT9jkmHNLjBi5LaMAmMJJgFGPE MC08Ef4zcSqNN6WmZmghkGW7gLPZ5O+GvLoIxdKIx8SzQlXX6wTW6pHXuN1Tgv+w /SUEityppiRiDe8Wihb5zNfeQFjbLzMDBdTpZW6CtyyhtNSif89WtmpsK9XCY4Fq p57UxPOxeqyokeiYMf2O =ex/F -----END PGP SIGNATURE----- --NgLNC/STdjXvE6Rt--