Date:      Tue, 27 Feb 2001 16:51:41 -0500
From:      Michael Sinz <msinz@wgate.com>
To:        Ian Dowse <iedowse@maths.tcd.ie>
Cc:        Robert Watson <rwatson@FreeBSD.ORG>, Randell Jesup <rjesup@wgate.com>, arch@FreeBSD.ORG, Alfred Perlstein <bright@wintelcom.net>, Bruce Bauman <bbauman@wgate.com>
Subject:   Re: ELF and diskless boot
Message-ID:  <3A9C216D.11F2ABA7@wgate.com>
References:  <200102251552.aa44515@salmon.maths.tcd.ie>

This is a multi-part message in MIME format.
--------------A42168EE212B4B225B38F0C6
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Ian Dowse wrote:
> 
> In message <Pine.NEB.3.96L.1010225101145.97061A-100000@fledge.watson.org>,
> Robert Watson writes:
> >
> >I won't comment on the symbol stripping issue since I don't know
> >much/anything about that, but I can comment that we're in the process of
> >moving to using sysctl() for top and other kernel-grubbing utilities, when
> >used on a live system.  top has already been changed to do this on
> >-CURRENT, and patches to systat/dmesg/vmstat/... are in the wings.  While
> 
> The reason that these utilities fail when not using loader(8) is
> that they depend on being able to look up static variables within
> the kernel using kldsym(). I think symbols declared as static end
> up with debugging information, so are only made available through
> some loader magic.

This I have figured out, but my question is mainly "Why?"  That is,
how did FreeBSD get to the point where base functionality depends on
non-exported interfaces (symbols)?  (I can understand it for kernel
debugging, but swapon and top?)

> It would be possible to go around the tree, removing the `static'
> from all symbols that are used by libkvm utilities - I've tried
> this, and it certainly fixes the problem with etherbooted kernels.
> It might also be possible to hack libkvm to try the old-style symbol
> lookup mechanism if kldsym fails to find a symbol.
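
As a rough illustration of the fallback idea described above (try the
kldsym-style lookup first, then an older lookup path), here is a minimal
sketch in portable C.  The two lookup functions, their tables, and the
symbol names are hypothetical stand-ins, not the real kldsym() or libkvm
interfaces; only the try-then-fall-back control flow is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup signature: returns 0 and fills *value on success,
 * -1 if the name is unknown. */
typedef int (*sym_lookup_fn)(const char *name, unsigned long *value);

struct sym {
	const char *name;
	unsigned long value;
};

/* Linear search of a small symbol table. */
static int table_lookup(const struct sym *tab, size_t n,
			const char *name, unsigned long *value)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tab[i].name, name) == 0) {
			*value = tab[i].value;
			return 0;
		}
	}
	return -1;
}

/* Stand-in for the kldsym-style path: it only sees exported symbols. */
static int lookup_exported(const char *name, unsigned long *value)
{
	static const struct sym tab[] = {
		{ "exported_var", 0x1000UL },
	};
	return table_lookup(tab, sizeof tab / sizeof tab[0], name, value);
}

/* Stand-in for the old-style path: it sees static symbols as well. */
static int lookup_old_style(const char *name, unsigned long *value)
{
	static const struct sym tab[] = {
		{ "exported_var", 0x1000UL },
		{ "static_var",   0x2000UL },
	};
	return table_lookup(tab, sizeof tab / sizeof tab[0], name, value);
}

/* Try the primary lookup; only if it fails, try the fallback. */
static int lookup_with_fallback(const char *name, unsigned long *value,
				sym_lookup_fn primary, sym_lookup_fn fallback)
{
	if (primary(name, value) == 0)
		return 0;
	return fallback(name, value);
}
```

With these stand-ins, a lookup of "static_var" fails in the first table and
succeeds through the fallback, which is the behaviour the libkvm change
would be after.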

What I have done now is to make a patch to Etherboot (which I have
submitted and have also attached here) that will also load the symbols
for the kernel.  It was a bit more work than I thought it would be due to
some undocumented "value x goes here but we don't use it" type of things.
(And bit y needs to be set but you only find out by looking at the code) :-)
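
For readers who do not want to dig through the diff, here is a minimal
sketch of the two pieces of bookkeeping the attached patch does once the
symbols are in memory: computing the word-aligned end of the symbol image,
and building the list of metadata records.  The MODINFO constants are copied
from the patch; the helper names (round_up_word, symbols_end, put_record,
build_metadata) are made up for this example, and uint32_t is used on the
assumption that long is 32 bits, as it is on i386.  The patch itself writes
the words out by hand rather than through helpers like these.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODINFO_END		0x0000u	/* End of list */
#define MODINFO_NAME		0x0001u	/* Name of module (string) */
#define MODINFO_TYPE		0x0002u	/* Type of module (string) */
#define MODINFO_METADATA	0x8000u	/* Module-specific */
#define MODINFOMD_SSYM		0x0003u	/* start of symbols */
#define MODINFOMD_ESYM		0x0004u	/* end of symbols */

/* Round n up to the next multiple of a 32-bit word. */
static uint32_t round_up_word(uint32_t n)
{
	return (n + 3u) & ~3u;
}

/* The symbol image is laid out as [size][symtab][size][strtab].
 * symstr_load is the address of the second size word, so the image
 * ends one word plus strtab_size bytes later, rounded up. */
static uint32_t symbols_end(uint32_t symstr_load, uint32_t strtab_size)
{
	return round_up_word(symstr_load + 4u + strtab_size);
}

/* Append one record: type, byte length, then the payload padded
 * with zeros to a whole number of words. */
static uint32_t *put_record(uint32_t *t, uint32_t type,
			    const void *data, uint32_t len)
{
	uint32_t words = round_up_word(len) / 4u;

	*t++ = type;
	*t++ = len;
	memset(t, 0, words * 4u);
	memcpy(t, data, len);
	return t + words;
}

/* Build the record list; returns the number of words written. */
static size_t build_metadata(uint32_t *buf, uint32_t ssym, uint32_t esym)
{
	uint32_t *t = buf;

	t = put_record(t, MODINFO_NAME, "kernel", 7);		/* includes NUL */
	t = put_record(t, MODINFO_TYPE, "elf kernel", 11);	/* includes NUL */
	t = put_record(t, MODINFO_METADATA | MODINFOMD_SSYM, &ssym, sizeof ssym);
	t = put_record(t, MODINFO_METADATA | MODINFOMD_ESYM, &esym, sizeof esym);
	*t++ = MODINFO_END;
	*t++ = 0;
	return (size_t)(t - buf);
}
```

Copying the strings with memcpy instead of storing hand-packed constants
avoids baking the x86 byte order into the record contents, at the cost of a
little more code than a boot ROM may want to spend.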

Anyway, the logic in the loader patch to Etherboot could also solve the
problem with the FreeBSD gzip'ed kernel images (which currently don't work
for the same reason).

> However, the move towards using sysctl() instead of libkvm will
> solve this problem completely. Thanks to Thomas Moestl and others
> who have done the work to make this happen!

I agree that moving to defined interfaces is the way to go.  I would not
have guessed that this problem existed had I not run into it.

BTW - I did have a rather nasty hack in which the kernel's kldsym fell back
to sysctl to find the kernel symbols when its current methods did not find
them.  It was nasty because what it returned is not really a symbol entry,
so it only worked for the libkvm users (and could, in fact, cause problems
for other users if they happened to go for one of those symbols).  This
is why I figured that we would just redo the BIOS for our systems.

-- 
Michael Sinz ---- Worldgate Communications ---- msinz@wgate.com
A master's secrets are only as good as
	the master's ability to explain them to others.
--------------A42168EE212B4B225B38F0C6
Content-Type: text/plain; charset=us-ascii;
 name="osloader.c.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="osloader.c.patch"

*** osloader.c.original	Tue Feb 27 14:41:31 2001
--- osloader.c	Tue Feb 27 14:49:12 2001
***************
*** 46,51 ****
--- 46,57 ----
          unsigned long           bi_extmem;
          unsigned long           bi_symtab;
          unsigned long           bi_esymtab;
+ #ifdef  IMAGE_FREEBSD
+ 	/* Note that these are in the FreeBSD headers but were not here... */
+ 	unsigned long           bi_kernend;		/* end of kernel space */
+ 	unsigned long           bi_envp;		/* environment */
+ 	unsigned long           bi_modulep;		/* preloaded modules */
+ #endif
  };
  
  /* a.out */
***************
*** 133,138 ****
--- 139,201 ----
  	Elf32_Size	p_align;	/* Alignment in memory and file. */
  } Elf32_Phdr;
  
+ #ifdef  IMAGE_FREEBSD
+ /*
+  * FreeBSD has this rather strange "feature" of its design.
+  * At some point in its evolution, FreeBSD started to rely
+  * externally on private/static/debug internal symbol information.
+  * That is, some of the interfaces that software uses to access
+  * and work with the FreeBSD kernel are made available not
+  * via the shared library symbol information (the .DYNAMIC section)
+  * but rather the debug symbols.  This means that any symbol, not
+  * just publicly defined symbols can be (and are) used by system
+  * tools to make the system work.  (such as top, swapinfo, swapon,
+  * etc)
+  *
+  * Even worse, however, is the fact that standard ELF loaders do
+  * not know how to load the symbols since they are not within
+  * an ELF PT_LOAD section.  The kernel needs these symbols to
+  * operate so the following changes/additions to the boot
+  * loading of EtherBoot have been made to get the kernel to load.
+  * All of the changes are within IMAGE_FREEBSD such that the
+  * extra/changed code only compiles when FREEBSD support is
+  * enabled.
+  */
+ 
+ /*
+  * Section header for FreeBSD (debug symbol kludge!) support
+  */
+ typedef struct {
+ 	Elf32_Word	sh_name;	/* Section name (index into the
+ 					   section header string table). */
+ 	Elf32_Word	sh_type;	/* Section type. */
+ 	Elf32_Word	sh_flags;	/* Section flags. */
+ 	Elf32_Addr	sh_addr;	/* Address in memory image. */
+ 	Elf32_Off	sh_offset;	/* Offset in file. */
+ 	Elf32_Size	sh_size;	/* Size in bytes. */
+ 	Elf32_Word	sh_link;	/* Index of a related section. */
+ 	Elf32_Word	sh_info;	/* Depends on section type. */
+ 	Elf32_Size	sh_addralign;	/* Alignment in bytes. */
+ 	Elf32_Size	sh_entsize;	/* Size of each entry in section. */
+ } Elf32_Shdr;
+ 
+ /* sh_type */
+ #define SHT_SYMTAB	2		/* symbol table section */
+ #define SHT_STRTAB	3		/* string table section */
+ 
+ /*
+  * Module information subtypes (for the metadata that we need to build)
+  */
+ #define MODINFO_END		0x0000		/* End of list */
+ #define MODINFO_NAME		0x0001		/* Name of module (string) */
+ #define MODINFO_TYPE		0x0002		/* Type of module (string) */
+ #define MODINFO_METADATA	0x8000		/* Module-specific */
+ 
+ #define MODINFOMD_SSYM		0x0003		/* start of symbols */
+ #define MODINFOMD_ESYM		0x0004		/* end of symbols */
+ 
+ #endif	/* IMAGE_FREEBSD */
+ 
  /* The structure of a Multiboot 0.6 parameter block.  */
  struct multiboot_info {
  	unsigned int flags;
***************
*** 203,208 ****
--- 266,280 ----
  
  #ifdef	ELF_IMAGE
  static Elf32_Phdr *phdr;
+ 
+ #ifdef	IMAGE_FREEBSD
+ static Elf32_Shdr *shdr;	/* To support the FreeBSD kludge! */
+ static Address symtab_load;
+ static Address symstr_load;
+ static int symtabindex;
+ static int symstrindex;
+ #endif
+ 
  #endif
  
  #ifdef  IMAGE_FREEBSD
***************
*** 406,411 ****
--- 478,544 ----
  			info.bsdinfo.bi_kernelname = kernel;
  			info.bsdinfo.bi_nfs_diskless = NULL;
  			info.bsdinfo.bi_size = sizeof(info.bsdinfo);
+ 
+ 			/* Check if we have symbols loaded, and if so,
+ 			 * make the metadata needed to pass those to
+ 			 * the kernel. */
+ 			if ((symtab_load != 0) && (symstr_load != 0))
+ 			{
+ 				unsigned long *t;
+ 
+ 				info.bsdinfo.bi_symtab = symtab_load;
+ 
+ 				/* End of symbols (long aligned...) */
+ 				/* Assumes size of long is a power of 2... */
+ 				info.bsdinfo.bi_esymtab = (symstr_load +
+ 					sizeof(long) +
+ 					*((long *)symstr_load) +
+ 					sizeof(long) - 1) & ~(sizeof(long) - 1);
+ 
+ 				/* Where we will build the meta data... */
+ 				t = (unsigned long *)info.bsdinfo.bi_esymtab;
+ 
+ #ifdef	DEBUG_ELF
+ 				printf("Metadata at %X\n",t);
+ #endif
+ 
+ 				/* Set up the pointer to the memory... */
+ 				info.bsdinfo.bi_modulep = (unsigned long)t;
+ 				
+ 				/* The metadata structure is an array of 32-bit
+ 				 * words where we store some information about the
+ 				 * system.  This is critical, as FreeBSD now looks
+ 				 * only for the metadata for the extended symbol
+ 				 * information rather than in the bootinfo.
+ 				 */
+ 				/* First, do the kernel name and the kernel type */
+ 				/* Note that this assumes x86 byte order... */
+ 
+ 				/* 'kernel\0\0' */
+ 				*t++=MODINFO_NAME; *t++= 7; *t++=0x6E72656B; *t++=0x00006C65;
+ 
+ 				/* 'elf kernel\0\0' */
+ 				*t++=MODINFO_TYPE; *t++=11; *t++=0x20666C65; *t++=0x6E72656B; *t++ = 0x00006C65;
+ 
+ 				/* Now the symbol start/end - note that they are
+ 				 * here in local/physical address - the Kernel
+ 				 * boot process will relocate the addresses. */
+ 				*t++=MODINFOMD_SSYM | MODINFO_METADATA; *t++=sizeof(*t); *t++=info.bsdinfo.bi_symtab;
+ 				*t++=MODINFOMD_ESYM | MODINFO_METADATA; *t++=sizeof(*t); *t++=info.bsdinfo.bi_esymtab;
+ 
+ 				*t++=MODINFO_END; *t++=0; /* end of metadata */
+ 
+ 				/* Since we have symbols we need to make
+ 				 * sure that the kernel knows its own end
+ 				 * of memory...  It is not _end but after
+ 				 * the symbols and the metadata... */
+ 				info.bsdinfo.bi_kernend = (unsigned long)t;
+ 
+ 				/* Signal locore.s that we have a valid bootinfo
+ 				 * structure that was completely filled in. */
+ 				freebsd_howto |= 0x80000000;
+ 			}
+ 
  			(*entry)(freebsd_howto, NODEV, 0, 0, 0, &info.bsdinfo, 0, 0, 0);
  			longjmp(jmp_bootmenu, 2);
  		}
***************
*** 521,526 ****
--- 654,670 ----
  				}
  				memcpy((void *)curaddr, data+offset, toread);
  				offset += toread;
+ #ifdef	IMAGE_FREEBSD
+ 				/* Count the bytes read even for the last block
+ 				 * as we will need to know where the last block
+ 				 * ends in order to load the symbols correctly.
+ 				 * (plus it could be useful elsewhere...)
+ 				 * Note that we need to count the actual size,
+ 				 * not just the end of the disk image size.
+ 				 */
+ 				curaddr += toread;
+ 				if (segment) curaddr += (phdr[segment].p_memsz - phdr[segment].p_filesz);
+ #endif
  				toread = 0;
  			}
  		}
***************
*** 547,552 ****
--- 691,858 ----
  			segment = i;
  		}
  		if (phdr[segment].p_type != PT_LOAD) {
+ #ifdef	IMAGE_FREEBSD
+ 			/* No more segments to be loaded - time to start the
+ 			 * nasty state machine to support the loading of
+ 			 * FreeBSD debug symbols due to the fact that FreeBSD
+ 			 * uses/exports the kernel's debug symbols in order
+ 			 * to make much of the system work!  Amazing (arg!)
+ 			 *
+ 			 * We depend on the fact that for the FreeBSD kernel,
+ 			 * there is only one section of debug symbols and that
+ 			 * the section is after all of the loaded sections in
+ 			 * the file.  This assumes a lot but is somewhat required
+ 			 * to make this code not be too annoying.  (Where do you
+ 			 * load symbols when the code has not loaded yet?)
+ 			 * Since this function is actually just a callback from
+ 			 * the network data transfer code, we need to be able to
+ 			 * work with the data as it comes in.  There is no chance
+ 			 * for doing a seek other than forwards.
+ 			 *
+ 			 * The process we use is to first load the section
+ 			 * headers.  Once they are loaded (shdr != 0) we then
+ 			 * look for where the symbol table and symbol table
+ 			 * strings are and setup some state that we found
+ 			 * them and fall into processing the first one (which
+ 			 * is the symbol table) and after that has been loaded,
+ 			 * we try the symbol strings.  Note that the order is
+ 			 * actually required as the memory image depends on
+ 			 * the symbol strings being loaded starting at the
+ 			 * end of the symbol table.  The kernel assumes this
+ 			 * layout of the image.
+ 			 *
+ 			 * At any point, if we get to the end of the load file
+ 			 * or the section requested is earlier in the file than
+ 			 * the current file pointer, we just end up falling
+ 			 * out of this and booting the kernel without this
+ 			 * information.
+ 			 */
+ 
+ 			/* Make sure that the next address is long aligned... */
+ 			/* Assumes size of long is a power of 2... */
+ 			curaddr = (curaddr + sizeof(long) - 1) & ~(sizeof(long) - 1);
+ 
+ 			/* If we have not yet gotten the shdr loaded, try that */
+ 			if (shdr == 0)
+ 			{
+ 				toread = info.elf32.e_shnum * info.elf32.e_shentsize;
+ 				skip = info.elf32.e_shoff - (loc + offset);
+ 				if (toread)
+ 				{
+ #ifdef	DEBUG_ELF
+ 					printf("shdr *, size %X, curaddr %X\n", toread, curaddr);
+ #endif
+ 
+ 					/* Start reading at the curaddr and make that the shdr */
+ 					shdr = (Elf32_Shdr *)curaddr;
+ 
+ 					/* Start to read... */
+ 					continue;
+ 				}
+ 			}
+ 			else
+ 			{
+ 				/* We have the shdr loaded, check if we have found
+ 				 * the indexes where the symbols are supposed to be */
+ 				if ((symtabindex == -1) && (symstrindex == -1))
+ 				{
+ 					/* Make sure that the address is page aligned... */
+ 					/* Symbols need to start in their own page(s)... */
+ 					curaddr = (curaddr + 4095) & ~4095;
+ 
+ 					/* Need to make new indexes... */
+ 					for (i=0; i < info.elf32.e_shnum; i++)
+ 					{
+ 						if (shdr[i].sh_type == SHT_SYMTAB)
+ 						{
+ 							int j;
+ 							for (j=0; j < info.elf32.e_phnum; j++)
+ 							{
+ 								/* Check only for loaded sections */
+ 								if ((phdr[j].p_type | 0x80) == (PT_LOAD | 0x80))
+ 								{
+ 									/* Only the extra symbols */
+ 									if ((shdr[i].sh_offset >= phdr[j].p_offset) &&
+ 									    ((shdr[i].sh_offset + shdr[i].sh_size) <=
+ 									     (phdr[j].p_offset + phdr[j].p_filesz)))
+ 									{
+ 										shdr[i].sh_offset=0;
+ 										shdr[i].sh_size=0;
+ 										break;
+ 									}
+ 								}
+ 							}
+ 							if ((shdr[i].sh_offset != 0) && (shdr[i].sh_size != 0))
+ 							{
+ 								symtabindex = i;
+ 								symstrindex = shdr[i].sh_link;
+ 							}
+ 						}
+ 					}
+ 				}
+ 
+ 				/* Check if we have a symbol table index and have not loaded it */
+ 				if ((symtab_load == 0) && (symtabindex >= 0))
+ 				{
+ 					/* No symbol table yet?  Load it first... */
+ 
+ 					/* This happens to work out in a strange way.
+ 					 * If we are past the point in the file already,
+ 					 * we will skip a *large* number of bytes which
+ 					 * ends up bringing us to the end of the file and
+ 					 * an old (default) boot.  Less code and lets
+ 					 * the state machine work in a cleaner way but this
+ 					 * is a nasty side-effect trick... */
+ 					skip = shdr[symtabindex].sh_offset - (loc + offset);
+ 
+ 					/* And we need to read this many bytes... */
+ 					toread = shdr[symtabindex].sh_size;
+ 
+ 					if (toread)
+ 					{
+ #ifdef	DEBUG_ELF
+ 						printf("db sym, size %X, curaddr %X\n", toread, curaddr);
+ #endif
+ 						/* Save where we are loading this... */
+ 						symtab_load = curaddr;
+ 
+ 						*((long *)curaddr) = toread;
+ 						curaddr += sizeof(long);
+ 
+ 						/* Start to read... */
+ 						continue;
+ 					}
+ 				}
+ 				else if ((symstr_load == 0) && (symstrindex >= 0))
+ 				{
+ 					/* We have already loaded the symbol table, so
+ 					 * now on to the symbol strings... */
+ 
+ 
+ 					/* Same nasty trick as above... */
+ 					skip = shdr[symstrindex].sh_offset - (loc + offset);
+ 
+ 					/* And we need to read this many bytes... */
+ 					toread = shdr[symstrindex].sh_size;
+ 
+ 					if (toread)
+ 					{
+ #ifdef	DEBUG_ELF
+ 						printf("db str, size %X, curaddr %X\n", toread, curaddr);
+ #endif
+ 						/* Save where we are loading this... */
+ 						symstr_load = curaddr;
+ 
+ 						*((long *)curaddr) = toread;
+ 						curaddr += sizeof(long);
+ 
+ 						/* Start to read... */
+ 						continue;
+ 					}
+ 				}
+ 			}
+ #endif	/* IMAGE_FREEBSD */
+ 
  			/* No more segments to be loaded, so just start the
  			 * kernel.  This saves a lot of network bandwidth if
  			 * debug info is in the kernel but not loaded.  */
***************
*** 659,664 ****
--- 965,982 ----
  			loc = 0;
  			skip = 0;
  			toread = 0;
+ #ifdef	IMAGE_FREEBSD
+ 			/* Make sure we have a null to start with... */
+ 			shdr = 0;
+ 
+ 			/* Clear the symbol index values... */
+ 			symtabindex = -1;
+ 			symstrindex = -1;
+ 
+ 			/* ...and the load addresses of the symbols  */
+ 			symtab_load = 0;
+ 			symstr_load = 0;
+ #endif
  		} else
  #endif
  #ifdef	TAGGED_IMAGE

--------------A42168EE212B4B225B38F0C6--

