Date: Thu, 11 Aug 2011 03:00:33 GMT From: Bruce Evans <brde@optusnet.com.au> To: freebsd-bugs@FreeBSD.org Subject: Re: misc/159654: 46 kernel headers use register_t but don't #include <sys/types.h> Message-ID: <201108110300.p7B30XX5004382@freefall.freebsd.org>
next in thread | raw e-mail | index | archive | help
The following reply was made to PR misc/159654; it has been noted by GNATS. From: Bruce Evans <brde@optusnet.com.au> To: Robert Millan <rmh@debian.org> Cc: freebsd-gnats-submit@freebsd.org, freebsd-bugs@freebsd.org, kib@freebsd.org Subject: Re: misc/159654: 46 kernel headers use register_t but don't #include <sys/types.h> Date: Thu, 11 Aug 2011 12:57:33 +1000 (EST) On Wed, 10 Aug 2011, Robert Millan wrote: >> Description: > The following headers use register_t but don't #include <sys/types.h>: This mostly intentional. <sys/types.h>, or more often both <sys/param.h> and <sys/systm.h>, is a prerequisite for all kernel headers. > arm/include/frame.h > arm/include/profile.h <machine/profile.h> is not a user header. It is an error to include this header directly (except in the kernel). The user interface is <sys/gmon.h>, which also doesn't include <sys/types.h>. The latter should be fixed someday, but <sys/types.h> has always been a documented prerequisite for gmon (see moncontrol(3)). > arm/include/proc.h Just an appendage of <sys/proc.h>, so it should and does assume that <sys/proc.h> has done the necessary including (except for pure MD things). <sys/proc.h> is a kernel header and is polluted with includes of almost everything _except_ <sys/types.h> (except possibly by indirect pollution). > powerpc/include/ucontext.h <machine/ucontext.h> is not a kernel header, but it is not a user header either. It is an error to include this header directly. The only supported includes of it are indirectly via <ucontext.h> in applications and via <sys/ucontext.h> in the kernel (<ucontext.h> is just a copy or symlink of <sys/ucontext.h>). The amd64 and i386 <machine/ucontext.h>'s spell register_t as __register_t, but they don't include <machine/_types.h>, so they still have <sys/ucontext.h> as a prerequisite (<sys/_types.h> or any header that is documented to include that would work too, but only accidentally since no types declared in <sys/ucontext.h are used in <machine/ucontext.h>). The i386 <machine/ucontext.h> used to spell register_t as int, and it now has a rotted comment that depends on this spelling for easy checking. It says that the first 20 of __mcontext fields MUST match the definition of sigcontext, but for sigcontext the fields are spelled with an int. Also, the number '20' is confusing and rotted. It is the first 20 fields of __mcontext that used to have to match the not the first 20 fields of sigcontext, but the second through 21st fields of sigcontext (since the first field of mcontext and sigcontext is not in __mcontext). Both structures have been expanded, and binary compatibility seems to have been perfected, so there are now 28 fields in __mcontext that all seem to be binary compatible with sigcontext, giving perfect binary compatibility of mcontext with sigcontext. The i386 <machine/ucontext.h> also declares struct mcontext4. The magic 20 is correct for it, since binary compatibility is lost at the "new" 21st field (there is mc_len but no sc_len, and the FP data structures have different sizes). But the comment is only attached to struct __mcontext where it doesn't apply at all -- it was easy to implement perfect compatibility of the whole structs when the ABI was changed. The i386 <machine/signal.h> also declares struct osigcontext. This is even older than struct mcontext4. The magic 20 applies to it to, even more so (osigcontext ends after the 1+20'th field in it, where mcontext4 just becomes incompatitible after the 1+20'th field). The amd64 <machine/ucontext.h> has even larger bugs in the comment. amd64 has at least 8 more registers, so it should have at least 8 more fields, but the comment only says 24. The code seems to be correct, giving perfect binary compatibility for all 36 fields. The amd64 <machine/signal.h> spells register_t as long where the i386 version spells it as int. This difference is of course necessary if a basic type is used, but it gives larger diffs than if [__]register_t is used. Another annoying lexical style bug in these files is that the fields are hard to count and hard to see all at once due to vertical whitespace being used to separate blocks that are only of historical interest, but with much the same bugs as the comment -- the 20 special fields used to be nicely blocked off, but now they are nastily blocked off because the magic 20 has no significance in the current ABI. kib@ should look at this. > powerpc/include/pcb.h > powerpc/include/spr.h > powerpc/include/reg.h This is not a kernel header, but I think it is not a user header either. Applications should use <sys/procfs.h> or <sys/ptrace.h> (preferably the latter) and never include this directly. These user headers include <sys/types.h> and much more (all of <sys/param.h> :-(). > powerpc/include/_align.h No way including this directly is supported. It doesn't even use register_t, since it only defines macros that use register_t. OTOH, it should use __register_t in these macros. The existence of these _align.h headers is another bug. Old versions of FreeBSD implemented the alignment macros correctly using ifdefs in <machine/param.h> (<machine/param.h> normally gives lots of pollution, but ifdefs were used to prevent that when only the alignment macros are needed. It's stupid to have lots of little headers like <machine/_align.h> when there are nearby enormous kitchen sink headers like <sys/cdefs.h> and <machine/_types.h> that are almost always needed. The messy ifdefs for this in <machine/param.h were only needed because <machine/param.h> is not properly structured and there is no nearby kitchen sink header to use. When <machine/_types.h was restructured and de-polluted, we lost the nearby <machine/ansi.h> which was more suitable. <machine/ansi.h> mostly got moved to definitions of _FOO_DECLARED in many headers. This move went a bit too far). > powerpc/include/profile.h > powerpc/include/pcpu.h > powerpc/include/pmap.h > powerpc/include/proc.h > sparc64/include/smp.h > sparc64/include/profile.h > sparc64/include/cpufunc.h Mostly as for arm and powerpc. <machine/cpufunc.h> is not supported as a user header, but abusing it as a user header is quite useful. Anyone abusing it must know that it is a kernel header and has a prerequisite of <sys/types.h>. Its APIs are undocumented even in the kernel, so this prerequisite is undocmented too, but all kernel headers have it (or more -- see above). > sparc64/include/proc.h > ia64/include/profile.h > ia64/include/proc.h > mips/include/sigframe.h > mips/include/ucontext.h > mips/include/pcb.h > mips/include/db_machdep.h This is an appendage of <ddb/ddb.h>, but I once cleaned up the ddb headers on i386, so they don't have so many inter-dependencies. This unfortunately gave lots of MD pollution in the i386 db_machdep.h, but not the MI <sys/types.h> pollution (that is intentionally left out of ddb.h). > mips/include/reg.h > mips/include/frame.h > mips/include/proc.h > mips/include/trap.h > ofed/include/linux/sched.h > x86/include/_align.h > i386/include/sigframe.h > i386/include/apicvar.h > i386/include/profile.h > i386/include/cpufunc.h > i386/include/proc.h > amd64/include/pcb.h > amd64/include/reg.h > amd64/include/apicvar.h > amd64/include/frame.h > amd64/include/intr_machdep.h > amd64/include/profile.h > amd64/include/cpufunc.h > amd64/include/pcpu.h > amd64/include/proc.h > sys/sysproto.h Kernel-only, and also machine-generated. The machine already generates lots of pollution. About half of it is just wrong (e.g., use of siginfo_t instead of a forward declaration of struct __siginfo) and most of it is avoided in my version. But the pollution intentionally doesn't include <sys/types.h>. > sys/sysent.h > sys/proc.h > sys/ktrace.h > >> How-To-Repeat: > >> Fix: > Please consider attached patch to add the missing #include. Too much pollution for me. BTW, I have a 42K shell script for checking that headers don't grow _new_ dependencies. It essentially just lists all the dependencies of all important user headers. There are far too many of these -- <sys/types.h> is the least of the problems. Unfortunately, this rewards pollution instead of detecting it, and bugs grep faster than I could fix them, so I stopped maintaining this in 1999. Many improvements were made in core POSIX headers mainly by mike@ in ~2002, but this stalled when mike departed (?) in ~2003. The mess in the i386 <machine> directory in 1999 can be read off from the script: % test $i != ./machine/atomic.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/bootinfo.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/bus.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/cpu.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/cpufunc.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/db_machdep.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/dvcfg.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/elf.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/frame.h || echo "#include <sys/types.h>" >>z.c Almost everything depends on <sys/types.h>, as is its right. `test' is a function that many compiles the header after including it to z.c (after the above has echoed some #include's there) with strict CFLAGS. If <sys/types.h> is not needed, then it is left out of the above echo; this happens rarely. The echo list was built up startng with no echos and adding includes until z.c compiles. Nested pollution makes it very difficult to see which includes are actually needed or actually used. It is easier but more disgusting after fixing thousands of failed compiles when building up the list. <sys/param.h> is often needed, but my test doesn't cover many files that need it. % test $i != ./machine/globaldata.h || % echo "#include <sys/time.h> % #include <machine/segments.h> % #include <machine/tss.h>" >>z.c In the kernel, it is a style bug to include <sys/time.h> directly, so not doing it here is correct. <sys/time.h> is standard pollution in <sys/param.h>. % test $i != ./machine/i4b_debug.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/i4b_ioctl.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/i4b_rbch_ioctl.h || % echo "#include <sys/types.h> % #include <machine/i4b_ioctl.h>" >>z.c % test $i != ./machine/i4b_trace.h || echo "#include <sys/time.h>" >>z.c % test $i != ./machine/if_wavelan_ieee.h || % echo "#include <sys/types.h>" >>z.c % test $i != ./machine/iic.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/in_cksum.h || % echo "#include <sys/types.h> % #include <netinet/in.h> % #include <netinet/in_systm.h> % #include <netinet/ip.h>" >>z.c Messier headers depend on more things. Networking headers are especially bad. % test $i != ./machine/ioctl_bt848.h || % echo "#include <sys/types.h>" >>z.c % test $i != ./machine/md_var.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/npx.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/pc/bios.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/pc/vesa.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/pcb.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/pcb_ext.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/pmap.h || % echo "#include <sys/types.h> % #include <vm/vm.h> % #include <vm/pmap.h>" >>z.c % test $i != ./machine/profile.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/si.h || % echo "#include <sys/types.h> % #include <sys/tty.h>" >>z.c % test $i != ./machine/signal.h || % echo "#include <sys/types.h> % #include <sys/signal.h>" >>z.c % test $i != ./machine/smb.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/specialreg.h || % echo "#include <sys/types.h> % #include <machine/cpufunc.h>" >>z.c % test $i != ./machine/types.h || echo "#include <sys/types.h>" >>z.c % test $i != ./machine/vm86.h || echo "#include <sys/types.h>" >>z.c Example of a bad networking header. % test $i != ./netinet/icmp_var.h || % echo "#include <sys/types.h> % #include <netinet/in.h> % #include <netinet/in_systm.h> % #include <netinet/ip.h> % #include <netinet/ip_icmp.h>" >>z.c Although core POSIX headers have been cleaned up, the pollution and prerequisites for kernel headers have been cleaned down. Everything is more complicated and bloated now. Almost everything uses mutexes, so needs <sys/mutex.h>. Mutexes sometimes use locking of a slightly different type, or lock profiling, so almost everything needs <sys/lock.h> too... However, this complexity is mostly in <sys> and networking headers and .c files, since <machine> files are mostly too primitive to need mutexes. For networking headers, mainly <net/if_var.h> is polluted with an include of <sys/mutex.h>. This header is otherwise disgustingly polluted, and lots of the pollution including kernel mutexes escapes to userland. There are new complexities and pollution involving rwlocks which I haven't kept track of. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201108110300.p7B30XX5004382>