Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 11 Aug 2011 03:00:33 GMT
From:      Bruce Evans <brde@optusnet.com.au>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: misc/159654: 46 kernel headers use register_t but don't #include <sys/types.h>
Message-ID:  <201108110300.p7B30XX5004382@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help
The following reply was made to PR misc/159654; it has been noted by GNATS.

From: Bruce Evans <brde@optusnet.com.au>
To: Robert Millan <rmh@debian.org>
Cc: freebsd-gnats-submit@freebsd.org, freebsd-bugs@freebsd.org,
        kib@freebsd.org
Subject: Re: misc/159654: 46 kernel headers use register_t but don't #include
 <sys/types.h>
Date: Thu, 11 Aug 2011 12:57:33 +1000 (EST)

 On Wed, 10 Aug 2011, Robert Millan wrote:
 
 >> Description:
 > The following headers use register_t but don't #include <sys/types.h>:
 
 This mostly intentional.  <sys/types.h>, or more often both <sys/param.h>
 and <sys/systm.h>, is a prerequisite for all kernel headers.
 
 > arm/include/frame.h
 > arm/include/profile.h
 
 <machine/profile.h> is not a user header.  It is an error to include
 this header directly (except in the kernel).  The user interface is
 <sys/gmon.h>, which also doesn't include <sys/types.h>.  The latter
 should be fixed someday, but <sys/types.h> has always been a documented
 prerequisite for gmon (see moncontrol(3)).
 
 > arm/include/proc.h
 
 Just an appendage of <sys/proc.h>, so it should and does assume that
 <sys/proc.h> has done the necessary including (except for pure MD things).
 <sys/proc.h> is a kernel header and is polluted with includes of almost
 everything _except_ <sys/types.h> (except possibly by indirect pollution).
 
 > powerpc/include/ucontext.h
 
 <machine/ucontext.h> is not a kernel header, but it is not a user
 header either.  It is an error to include this header directly.  The
 only supported includes of it are indirectly via <ucontext.h> in
 applications and via <sys/ucontext.h> in the kernel (<ucontext.h>
 is just a copy or symlink of <sys/ucontext.h>).
 
 The amd64 and i386 <machine/ucontext.h>'s spell register_t as __register_t,
 but they don't include <machine/_types.h>, so they still have
 <sys/ucontext.h> as a prerequisite (<sys/_types.h> or any header that
 is documented to include that would work too, but only accidentally
 since no types declared in <sys/ucontext.h are used in <machine/ucontext.h>).
 
 The i386 <machine/ucontext.h> used to spell register_t as int, and it
 now has a rotted comment that depends on this spelling for easy checking.
 It says that the first 20 of __mcontext fields MUST match the definition
 of sigcontext, but for sigcontext the fields are spelled with an int.
 Also, the number '20' is confusing and rotted.  It is the first 20 fields
 of __mcontext that used to have to match the not the first 20 fields of
 sigcontext, but the second through 21st fields of sigcontext (since the
 first field of mcontext and sigcontext is not in __mcontext).  Both
 structures have been expanded, and binary compatibility seems to have
 been perfected, so there are now 28 fields in __mcontext that all seem
 to be binary compatible with sigcontext, giving perfect binary
 compatibility of mcontext with sigcontext.
 
 The i386 <machine/ucontext.h> also declares struct mcontext4.  The magic
 20 is correct for it, since binary compatibility is lost at the "new"
 21st field (there is mc_len but no sc_len, and the FP data structures
 have different sizes).  But the comment is only attached to struct
 __mcontext where it doesn't apply at all -- it was easy to implement
 perfect compatibility of the whole structs when the ABI was changed.
 
 The i386 <machine/signal.h> also declares struct osigcontext.  This is
 even older than struct mcontext4.  The magic 20 applies to it to, even
 more so (osigcontext ends after the 1+20'th field in it, where
 mcontext4 just becomes incompatitible after the 1+20'th field).
 
 The amd64 <machine/ucontext.h> has even larger bugs in the comment.
 amd64 has at least 8 more registers, so it should have at least 8 more
 fields, but the comment only says 24.  The code seems to be correct,
 giving perfect binary compatibility for all 36 fields.
 
 The amd64 <machine/signal.h> spells register_t as long where the i386
 version spells it as int.  This difference is of course necessary if
 a basic type is used, but it gives larger diffs than if [__]register_t
 is used.  Another annoying lexical style bug in these files is that
 the fields are hard to count and hard to see all at once due to
 vertical whitespace being used to separate blocks that are only of
 historical interest, but with much the same bugs as the comment -- the
 20 special fields used to be nicely blocked off, but now they are
 nastily blocked off because the magic 20 has no significance in the
 current ABI.
 
 kib@ should look at this.
 
 > powerpc/include/pcb.h
 > powerpc/include/spr.h
 > powerpc/include/reg.h
 
 This is not a kernel header, but I think it is not a user header either.
 Applications should use <sys/procfs.h> or <sys/ptrace.h> (preferably the
 latter) and never include this directly.  These user headers include
 <sys/types.h> and much more (all of <sys/param.h> :-().
 
 > powerpc/include/_align.h
 
 No way including this directly is supported.  It doesn't even use
 register_t, since it only defines macros that use register_t.  OTOH,
 it should use __register_t in these macros.
 
 The existence of these _align.h headers is another bug.  Old versions
 of FreeBSD implemented the alignment macros correctly using ifdefs in
 <machine/param.h> (<machine/param.h> normally gives lots of pollution,
 but ifdefs were used to prevent that when only the alignment macros
 are needed.  It's stupid to have lots of little headers like
 <machine/_align.h> when there are nearby enormous kitchen sink headers
 like <sys/cdefs.h> and <machine/_types.h> that are almost always
 needed.  The messy ifdefs for this in <machine/param.h were only needed
 because <machine/param.h> is not properly structured and there is no
 nearby kitchen sink header to use.  When <machine/_types.h was restructured
 and de-polluted, we lost the nearby <machine/ansi.h> which was more suitable.
 <machine/ansi.h> mostly got moved to definitions of _FOO_DECLARED in many
 headers.  This move went a bit too far).
 
 > powerpc/include/profile.h
 > powerpc/include/pcpu.h
 > powerpc/include/pmap.h
 > powerpc/include/proc.h
 > sparc64/include/smp.h
 > sparc64/include/profile.h
 > sparc64/include/cpufunc.h
 
 Mostly as for arm and powerpc.
 
 <machine/cpufunc.h> is not supported as a user header, but abusing it
 as a user header is quite useful.  Anyone abusing it must know that it
 is a kernel header and has a prerequisite of <sys/types.h>.  Its APIs
 are undocumented even in the kernel, so this prerequisite is undocmented
 too, but all kernel headers have it (or more -- see above).
 
 > sparc64/include/proc.h
 > ia64/include/profile.h
 > ia64/include/proc.h
 > mips/include/sigframe.h
 > mips/include/ucontext.h
 > mips/include/pcb.h
 > mips/include/db_machdep.h
 
 This is an appendage of <ddb/ddb.h>, but I once cleaned up the ddb headers
 on i386, so they don't have so many inter-dependencies.  This unfortunately
 gave lots of MD pollution in the i386 db_machdep.h, but not the MI
 <sys/types.h> pollution (that is intentionally left out of ddb.h).
 
 > mips/include/reg.h
 > mips/include/frame.h
 > mips/include/proc.h
 > mips/include/trap.h
 > ofed/include/linux/sched.h
 > x86/include/_align.h
 > i386/include/sigframe.h
 > i386/include/apicvar.h
 > i386/include/profile.h
 > i386/include/cpufunc.h
 > i386/include/proc.h
 > amd64/include/pcb.h
 > amd64/include/reg.h
 > amd64/include/apicvar.h
 > amd64/include/frame.h
 > amd64/include/intr_machdep.h
 > amd64/include/profile.h
 > amd64/include/cpufunc.h
 > amd64/include/pcpu.h
 > amd64/include/proc.h
 > sys/sysproto.h
 
 Kernel-only, and also machine-generated.  The machine already generates
 lots of pollution.  About half of it is just wrong (e.g., use of
 siginfo_t instead of a forward declaration of struct __siginfo) and most
 of it is avoided in my version.  But the pollution intentionally doesn't
 include <sys/types.h>.
 
 > sys/sysent.h
 > sys/proc.h
 > sys/ktrace.h
 >
 >> How-To-Repeat:
 >
 >> Fix:
 > Please consider attached patch to add the missing #include.
 
 Too much pollution for me.
 
 BTW, I have a 42K shell script for checking that headers don't grow
 _new_ dependencies.  It essentially just lists all the dependencies
 of all important user headers.  There are far too many of these --
 <sys/types.h> is the least of the problems.  Unfortunately, this rewards
 pollution instead of detecting it, and bugs grep faster than I could
 fix them, so I stopped maintaining this in 1999.  Many improvements
 were made in core POSIX headers mainly by mike@ in ~2002, but this stalled
 when mike departed (?) in ~2003.
 
 The mess in the i386 <machine> directory in 1999 can be read off from the
 script:
 
 % 	test $i != ./machine/atomic.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/bootinfo.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/bus.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/cpu.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/cpufunc.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/db_machdep.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/dvcfg.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/elf.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/frame.h || echo "#include <sys/types.h>" >>z.c
 
 Almost everything depends on <sys/types.h>, as is its right.  `test'
 is a function that many compiles the header after including it to z.c
 (after the above has echoed some #include's there) with strict CFLAGS.
 If <sys/types.h> is not needed, then it is left out of the above echo;
 this happens rarely.  The echo list was built up startng with no echos
 and adding includes until z.c compiles.  Nested pollution makes it very
 difficult to see which includes are actually needed or actually used.
 It is easier but more disgusting after fixing thousands of failed compiles
 when building up the list.  <sys/param.h> is often needed, but my test
 doesn't cover many files that need it.
 
 % 	test $i != ./machine/globaldata.h ||
 % 	    echo "#include <sys/time.h>
 % 		  #include <machine/segments.h>
 % 		  #include <machine/tss.h>" >>z.c
 
 In the kernel, it is a style bug to include <sys/time.h> directly, so not
 doing it here is correct.  <sys/time.h> is standard pollution in
 <sys/param.h>.
 
 % 	test $i != ./machine/i4b_debug.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/i4b_ioctl.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/i4b_rbch_ioctl.h ||
 % 	    echo "#include <sys/types.h>
 % 		  #include <machine/i4b_ioctl.h>" >>z.c
 % 	test $i != ./machine/i4b_trace.h || echo "#include <sys/time.h>" >>z.c
 % 	test $i != ./machine/if_wavelan_ieee.h ||
 % 	    echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/iic.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/in_cksum.h ||
 % 	    echo "#include <sys/types.h>
 % 		  #include <netinet/in.h>
 % 		  #include <netinet/in_systm.h>
 % 		  #include <netinet/ip.h>" >>z.c
 
 Messier headers depend on more things.  Networking headers are especially
 bad.
 
 % 	test $i != ./machine/ioctl_bt848.h ||
 % 	    echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/md_var.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/npx.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/pc/bios.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/pc/vesa.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/pcb.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/pcb_ext.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/pmap.h ||
 % 	    echo "#include <sys/types.h>
 % 		  #include <vm/vm.h>
 % 		  #include <vm/pmap.h>" >>z.c
 % 	test $i != ./machine/profile.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/si.h ||
 % 	    echo "#include <sys/types.h>
 % 		  #include <sys/tty.h>" >>z.c
 % 	test $i != ./machine/signal.h ||
 % 	    echo "#include <sys/types.h>
 % 		  #include <sys/signal.h>" >>z.c
 % 	test $i != ./machine/smb.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/specialreg.h ||
 % 	    echo "#include <sys/types.h>
 % 		  #include <machine/cpufunc.h>" >>z.c
 % 	test $i != ./machine/types.h || echo "#include <sys/types.h>" >>z.c
 % 	test $i != ./machine/vm86.h || echo "#include <sys/types.h>" >>z.c
 
 Example of a bad networking header.
 
 % 	test $i != ./netinet/icmp_var.h ||
 % 	    echo "#include <sys/types.h>
 % 		  #include <netinet/in.h>
 % 		  #include <netinet/in_systm.h>
 % 		  #include <netinet/ip.h>
 % 		  #include <netinet/ip_icmp.h>" >>z.c
 
 Although core POSIX headers have been cleaned up, the pollution and
 prerequisites for kernel headers have been cleaned down.  Everything is
 more complicated and bloated now.  Almost everything uses mutexes,
 so needs <sys/mutex.h>.  Mutexes sometimes use locking of a slightly
 different type, or lock profiling, so almost everything needs <sys/lock.h>
 too...  However, this complexity is mostly in <sys> and networking
 headers and .c files, since <machine> files are mostly too primitive to
 need mutexes.  For networking headers, mainly <net/if_var.h> is polluted
 with an include of <sys/mutex.h>.  This header is otherwise disgustingly
 polluted, and lots of the pollution including kernel mutexes escapes
 to userland.  There are new complexities and pollution involving rwlocks
 which I haven't kept track of.
 
 Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201108110300.p7B30XX5004382>