Date: Thu, 20 Dec 2012 01:20:56 +1100 (EST)
From: Bruce Evans
To: Davide Italiano
Subject: Re: API explosion (Re: [RFC/RFT] calloutng)
Message-ID: <20121220010702.B1675@besplex.bde.org>
Cc: Ian Lepore, Alexander Motin, phk@onelab2.iet.unipi.it, Poul-Henning Kamp,
    freebsd-current, "freebsd-arch@freebsd.org"
List-Id: Discussion related to FreeBSD architecture

On Wed, 19 Dec 2012, Davide Italiano wrote:

> On Wed, Dec 19, 2012 at 4:18 AM, Bruce Evans wrote:
>> I would have tried a 32 bit format with a variable named 'ticks'.
>> Something like:
>> - ticks >= 0.  Same meaning as now.  No changes in ABIs or APIs to use
>>   this.  The tick period would be constant but for virtual ticks and
>>   not too small.  hz = 1000 now makes the period too small, and not a
>>   power of 2.  So make the period 1/128 second.  This gives a 1.24.7
>>   binary format.  2**24 seconds is 194 days.
>> - ticks < 0.  The 31 value bits are now a cookie (descriptor) referring
>>   to a bintime or whatever.  This case should rarely be used.  I don't
>>   like it that a tickless kernel, which is needed mainly for power
>>   saving, has expanded into complications to support short timeouts
>>   which should rarely be used.
>
> Bruce, I don't really agree with this.
> The data addressed by cookie should be still stored somewhere, and KBI
> will result broken. This, indeed, is not real problem as long as
> current calloutng code heavily breaks KBI, but if that was your point,
> I don't see how your proposed change could help.

In the old API, it is an error to pass ticks < 0, so only broken old
callers are affected.  Of course, if there are any then it would be
hard to detect their garbage cookies.

Anyway, it's too late to change to this, and maybe also to a 32.32
format.

[32.32 format]
>> This would make a better general format than timevals, timespecs and
>> of course bintimes :-).  It is a bit wasteful for timeouts since
>> its extremes are rarely used.  Malicious and broken callers can
>> still cause overflow at 68 years, so you have to check for it and
>> handle it.
>> The limit of 194 days is just as good for timeouts.
>
> I think the phk's proposal is better. About your overflow objection,
> I think is really unlikely to happen, but better safe than sorry.

It's very easy for applications to cause kernel overflow using valid
syscall args like tv_sec = TIME_T_MAX for a relative time in nanosleep().
Adding TIME_T_MAX to the current time in seconds overflows for all
current times except for the first second after the Epoch.  There is
no difference between the overflow for 32-bit and 64-bit time_t's for
this.  This is now mostly handled so that the behaviour is harmless
although wrong.  E.g., the timeout might become negative, and then
since it is not a cookie it is silently replaced by a timeout of 1
tick.  In nanosleep(), IIRC there are further overflows that result in
returning early instead of retrying the 1-tick timeouts endlessly.

Bruce
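
A minimal userland C sketch of the two overflow checks discussed above.
Everything here is an assumption made for illustration: the names
(vtick_from_rel, sec_add_sat, VTICK_*) are hypothetical and are not
FreeBSD kernel API, the tick format is taken to be the 1.24.7 layout
with 128 virtual ticks per second, and time in seconds is modelled as
int64_t.  The first function clamps a relative timeout so that it can
never wrap into the negative (cookie) range; the second saturates
instead of overflowing when a relative time is added to the current
time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is FreeBSD kernel API. */
#define VTICK_FRAC_BITS	7			/* 1/128 s per virtual tick */
#define VTICK_HZ	(1 << VTICK_FRAC_BITS)	/* 128 ticks per second */
#define VTICK_MAX	INT32_MAX		/* largest non-negative count */

/*
 * Convert a relative timeout (sec, nsec) to 1.24.7 virtual ticks.
 * Rounds up so the sleep is never shorter than requested, and
 * saturates at VTICK_MAX instead of wrapping to a negative value.
 */
static int32_t
vtick_from_rel(int64_t sec, long nsec)
{
	int64_t frac, t;

	assert(nsec >= 0 && nsec < 1000000000L);
	if (sec < 0)
		return (0);
	/* More than 2**24 - 1 whole seconds cannot be represented. */
	if (sec > (VTICK_MAX >> VTICK_FRAC_BITS))
		return (VTICK_MAX);
	/* ceil(nsec * 128 / 10**9), computed in 64 bits. */
	frac = ((int64_t)nsec * VTICK_HZ + 999999999) / 1000000000;
	t = sec * VTICK_HZ + frac;
	return (t > VTICK_MAX ? VTICK_MAX : (int32_t)t);
}

/*
 * Add a relative number of seconds to an absolute time, saturating
 * at the limits of the type instead of overflowing.
 */
static int64_t
sec_add_sat(int64_t now, int64_t rel)
{
	if (rel > 0 && now > INT64_MAX - rel)
		return (INT64_MAX);
	if (rel < 0 && now < INT64_MIN - rel)
		return (INT64_MIN);
	return (now + rel);
}
```

With these definitions, a request of INT64_MAX seconds maps to
VTICK_MAX rather than to a negative tick count, and adding INT64_MAX
to any current time of 1 second or later yields INT64_MAX instead of
wrapping.  Whether saturating is the right policy, as opposed to
returning an error, is a design choice this sketch does not settle.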