From owner-freebsd-current@FreeBSD.ORG Mon Aug 26 19:08:18 2013 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 6B9397A2; Mon, 26 Aug 2013 19:08:18 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 0DF4A28D4; Mon, 26 Aug 2013 19:08:18 +0000 (UTC) Received: from jhbbsd.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id AB4C5B97F; Mon, 26 Aug 2013 15:08:16 -0400 (EDT) From: John Baldwin To: Davide Italiano Subject: Re: Question about socket timeouts Date: Mon, 26 Aug 2013 15:05:06 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p28; KDE/4.5.5; amd64; ; ) References: <201308231129.03990.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Message-Id: <201308261505.06342.jhb@freebsd.org> Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 26 Aug 2013 15:08:16 -0400 (EDT) Cc: Vitja Makarov , freebsd-current X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 19:08:18 -0000 On Monday, August 26, 2013 2:23:44 pm Davide Italiano wrote: > On Fri, Aug 23, 2013 at 5:29 PM, John Baldwin wrote: > > On Friday, August 23, 2013 9:53:12 am Davide Italiano wrote: > >> On Fri, Aug 23, 2013 at 3:45 PM, John Baldwin wrote: > >> > On Friday, August 23, 2013 2:27:58 am Vitja Makarov wrote: > >> >> 2013/8/22 John Baldwin : > >> >> > On Thursday, August 22, 2013 12:18:48 am Vitja Makarov wrote: > >> >> >> 2013/8/21 John Baldwin : > >> >> >> > On Monday, August 19, 2013 11:13:02 pm Daniel Eischen wrote: > >> >> >> >> On Mon, 19 Aug 2013, Adrian Chadd wrote: > >> >> >> >> > >> >> >> >> > Yes! Please file a PR! > >> >> >> >> > >> >> >> >> This sorta implies that both are acceptable (although, > >> >> >> >> the Linux behavior seems more desirable). > >> >> >> >> > >> >> >> >> http://austingroupbugs.net/view.php?id=369 > >> >> >> > > >> >> >> > No, that says "round up", so it does mean that the requested timeout > >> >> >> > should be the minimum amount slept. tvtohz() does this. Really odd > >> >> >> > that the socket code is using its own version of this rather than > >> >> >> > tvtohz(). > >> >> >> > > >> >> >> > Oh, I bet this just predates tvtohz(). Interesting that it keeps > > getting > >> >> >> > bug fixes in its history that simply using tvtohz() would have > > solved. > >> >> >> > > >> >> >> > Try this: > >> >> >> > > >> >> >> > Index: uipc_socket.c > >> >> >> > =================================================================== > >> >> >> > --- uipc_socket.c (revision 254570) > >> >> >> > +++ uipc_socket.c (working copy) > >> >> >> > @@ -2699,21 +2699,16 @@ sosetopt(struct socket *so, struct sockopt > > *sopt) > >> >> >> > if (error) > >> >> >> > goto bad; > >> >> >> > > >> >> >> > - /* assert(hz > 0); */ > >> >> >> > if (tv.tv_sec < 0 || tv.tv_sec > INT_MAX / > > hz || > >> >> >> > tv.tv_usec < 0 || tv.tv_usec >= 1000000) > > { > >> >> >> > error = EDOM; > >> >> >> > goto bad; > >> >> >> > } > >> >> >> > - /* assert(tick > 0); */ > >> >> >> > - /* assert(ULONG_MAX - INT_MAX >= 1000000); > > */ > >> >> >> > - val = (u_long)(tv.tv_sec * hz) + tv.tv_usec > > / tick; > >> >> >> > - if (val > INT_MAX) { > >> >> >> > + val = tvtohz(&tv); > >> >> >> > + if (val == INT_MAX) { > >> >> >> > error = EDOM; > >> >> >> > goto bad; > >> >> >> > } > >> >> >> > - if (val == 0 && tv.tv_usec != 0) > >> >> >> > - val = 1; > >> >> >> > > >> >> >> > switch (sopt->sopt_name) { > >> >> >> > case SO_SNDTIMEO: > >> >> >> > > >> >> >> > >> >> >> That must help. But I want to see the issue solved in the next > >> >> >> release. I can't apply patch to the production system. Btw in > >> >> >> production environment we have kern.hz set to 1000 so it's not a > >> >> >> problem there. > >> >> > > >> >> > Can you test this in some way in a test environment? > >> >> > > >> >> > >> >> Ok, sorry for posting out of the list. > >> >> > >> >> Simple test program is attached. Without your patch timeout expires in > >> >> about 20ms. With it it's ~40ms. > >> >> > >> >> 40 instead of 30 is beacuse of odd tick added by tvtohz(). > >> > > >> > Ok, thanks. tvtohz() will be good to MFC (and I will do that), but for > >> > HEAD I think we can fix this to use a precise timeout. I've cc'd davide@ > >> > so he can take a look at that. > >> > > >> > -- > >> > John Baldwin > >> > >> Hi, > >> I think I can switch this code to new timeout KPI, but this will > >> require the timeout field of 'struct sockbuf' to be changed from 'int' > >> to 'sbintime_t' which breaks binary compatibility. Do you have any > >> strong objections about this? If any, I would like this to happen ABI > >> freeze, so it looks like this is the right moment. > > > > This should be fine to change now, it just can't be MFC'd (which we can't > > do anyway). > > > > -- > > John Baldwin > > Please consider the following patch: > http://people.freebsd.org/~davide/review/socket_timeout.diff > I've tested it and it works OK. I got a timeout which is ~= 25ms using > the testcase provided by the user. > The only doubt I have is about the range check, I've changed a bit > because the 'integer' part of sbintime_t fits in 32-bits, but I'm not > sure it's the best way of doing this. Nice! Bruce actually wants me to adjust the range check a bit (which will fit in well with your changes I think). Please let me get that fix in (so it can be part of the future MFC) and then you can commit this. Thanks! Actually, I think you still need to patch the sogetopt() case to work correctly (it is still doing a manual conversion from 'val' to a timeval assuming it is in 'hz' units). -- John Baldwin