From owner-svn-src-all@FreeBSD.ORG Mon Nov 3 17:51:42 2014 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0FCE0722; Mon, 3 Nov 2014 17:51:42 +0000 (UTC) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id B05E91E5; Mon, 3 Nov 2014 17:51:41 +0000 (UTC) Received: from c122-106-147-133.carlnfd1.nsw.optusnet.com.au (c122-106-147-133.carlnfd1.nsw.optusnet.com.au [122.106.147.133]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 20F97426C75; Tue, 4 Nov 2014 04:51:33 +1100 (AEDT) Date: Tue, 4 Nov 2014 04:51:32 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Julian Elischer Subject: Re: svn commit: r274017 - head/sys/kern In-Reply-To: <54573AEE.9010602@freebsd.org> Message-ID: <20141104041847.K1605@besplex.bde.org> References: <201411030746.sA37kpPu037113@svn.freebsd.org> <54573AEE.9010602@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=dMCfxopb c=1 sm=1 tr=0 a=7NqvjVvQucbO2RlWB8PEog==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=6I5d2MoRAAAA:8 a=Qpqu-2hLh1yy2NhiEhkA:9 a=CjuIK1q_8ugA:10 Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Mateusz Guzik X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Nov 2014 17:51:42 -0000 On Mon, 3 Nov 2014, Julian Elischer wrote: > On 11/3/14, 3:46 PM, Mateusz Guzik wrote: >> Author: mjg >> Date: Mon Nov 3 07:46:51 2014 >> New Revision: 274017 >> URL: https://svnweb.freebsd.org/changeset/base/274017 >> >> Log: >> Provide an on-stack temporary buffer for small ioctl requests. > I'm not sure I like this. We don't know how many more levels > of stack may be needed. 128 bytes is nothing. select() always used a 512-byte array of 2048 bits for similar purposes. poll() uses an array of 32 struct pollfd's similarly. (The names of these seem to have rotted. I think select() used to use the name smallbits, and this was a good name for an array of bits. Now select() uses the name s_selbits for its small array of bits, and poll() uses the name smallbits for its small array of non-bits). The struct is subject to uncontrolled bloat, unlike the bit array which is limited by its hard-coded 2048. struct pollfd actually has size just 8 on all arches, so 32 of it is small. > I know that machines are getting faster with more memory, > but the current move towards bloating out the > worries me. we started out with a single page of stack (SHARED > with the U-area!). I think we are now at several pages.. I forget, is it 8? It is still only 2 plus 1 guard page on i386. KSTACK_PAGES is a bogus option (any use of the option to change it breaks it when the options header is not included). KSTACK_GUARD_PAGES isn't a bogus option, so it can only be changed by editing the source code; then changing it only breaks things previously compiled with the old value. On amd64, it is 4 plus 1. The depth of syscalls is indeed horrifying. Similar techniques were used in Slowaris to blow out its register windows on Sparc many years ago. > I'm open to being persuaded but I think we need to have a discussion > about stack usage. We used to say that anything greater that, say > 64 bytes should probably be allocated. The top level of a syscall should be allowed to use more like a full page (leaving 1 page for everything else). Only the top level can resonably control its usage. >> Modified: >> head/sys/kern/sys_generic.c Excessive quoting not deleted. I will complain about style bugs here instead of in a reply to the original mail. >> Modified: head/sys/kern/sys_generic.c >> ============================================================================== >> --- head/sys/kern/sys_generic.c Mon Nov 3 07:18:42 2014 >> (r274016) >> +++ head/sys/kern/sys_generic.c Mon Nov 3 07:46:51 2014 >> (r274017) >> @@ -649,6 +649,7 @@ sys_ioctl(struct thread *td, struct ioct >> u_long com; >> int arg, error; >> u_int size; >> + u_char smalldata[128]; >> caddr_t data; The bogus type caddr_t is still used. >> if (uap->com > 0xffffffff) { >> @@ -680,17 +681,18 @@ sys_ioctl(struct thread *td, struct ioct >> arg = (intptr_t)uap->data; Bogus cast. intptr_t is only valid on void *, but uap->data has the bogus opaque type caddr_t. >> data = (void *)&arg; The cast is correct. Then it is assumed that the opaque type caddr_t is a pointer. If 'data; had type void *, then no assumptions would be needed here (assumptions and conversions would be needed later, since the KPI has some caddr_t mistakes built in). >> size = 0; >> - } else >> - data = malloc((u_long)size, M_IOCTLOPS, M_WAITOK); >> + } else { >> + if (size <= sizeof(smalldata)) >> + data = smalldata; >> + else >> + data = malloc((u_long)size, M_IOCTLOPS, >> M_WAITOK); Style bugs: - old: unnecessary cast to u_long. It was to support unprototyped code. - new: line too long, by blind re-indentation. >> + } >> } else >> data = (void *)&uap->data; Many more bogus casts like this. >> if (com & IOC_IN) { >> error = copyin(uap->data, data, (u_int)size); A more bogus cast to u_int. It was to support unprotyped code, but has been wrong for that for more than 20 years, since copyin()'s KPI was changed to take a size_t. >> - if (error) { >> - if (size > 0) >> - free(data, M_IOCTLOPS); >> - return (error); >> - } >> + if (error != 0) >> + goto out; >> } else if (com & IOC_OUT) { >> /* >> * Zero the buffer so the user always >> @@ -704,7 +706,8 @@ sys_ioctl(struct thread *td, struct ioct >> if (error == 0 && (com & IOC_OUT)) >> error = copyout(data, uap->data, (u_int)size); Another bogus cast for the copyin() family. >> - if (size > 0) >> +out: >> + if (size > 0 && data != (caddr_t)&smalldata) >> free(data, M_IOCTLOPS); 'size' is unsigned, so (size > 0) should be written as (size != 0). Using caddr_t requires another bogus cast here. &smalldata should be written as &smalldata[0] or simply smalldata. The address of the array is subtly different and not wanted here, but reduces to the same in the explicit conversion to caddr_t, assuming that caddr_t is char * like it is, or void *; otherwise, the subtle difference might have an effect. >> return (error); >> } Bruce