Date:      Thu, 31 Jul 1997 21:10:12 -0700
From:      John Polstra <jdp@polstra.com>
To:        current@freebsd.org
Subject:   Re: core group topics 
Message-ID:  <199708010410.VAA03660@austin.polstra.com>
In-Reply-To: <199707311142.EAA28471@implode.root.com>
References:  <199707311142.EAA28471@implode.root.com>

In article <199707311142.EAA28471@implode.root.com>,
David Greenman  <dg@root.com> wrote:

>    The only issue I have against ELF is that I'm concerned that the overhead
> for processing the much more sophisticated header at exec time might have a
> serious impact on exec performance (something I'm particularly sensitive to
> since I wrote the a.out exec code for FreeBSD).

This is a common concern, but I'm convinced it's not a problem.  I
think ELF can be loaded as fast as a.out, within the limits of
measurability.  Well, almost as fast, anyway.

People look at the ELF spec and they think, "Gads, all those
different kinds of sections!  It must be hard to load one of these
things."  But to load an ELF program or shared library, you don't
even look at the section table.  Instead, you use the program header
table, which is constructed specifically for this purpose.  Its
entries describe the text, data, and bss segments, just like the
a.out header does.  It's not much different from a.out in complexity.

To illustrate, the loop that does the loading in the ELF bootloader
looks like this:

        printf("segments:");
        for (i = 0; i < head.e_phnum; i++) {
                ph = (Elf32_Phdr*)(phbuf + head.e_phentsize * i);
                if (ph->p_type == PT_LOAD)
                {
                        ph->p_vaddr &= ADDR_MASK;
                        printf(" 0x%x-0x%x", ph->p_vaddr,
                               ph->p_vaddr + ph->p_memsz);
                        if (ph->p_filesz > 0)
                        {
                                poff = ph->p_offset;
                                xread((void *)ph->p_vaddr, ph->p_filesz);
                        }
                        if (ph->p_filesz < ph->p_memsz)
                        {
                                pbzero((void *)(ph->p_vaddr + ph->p_filesz),
                                       ph->p_memsz - ph->p_filesz);
                        }
                }
        }
        printf(" \n");

The loop typically iterates over 6-8 items.  Of those, only 2
satisfy the "(ph->p_type == PT_LOAD)" condition -- the rest do
nothing.  I think it's virtually as fast as loading an a.out file,
and nobody has tried to optimize it yet.
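
To make the "only 2 of 6-8 entries do any work" point concrete, here
is a minimal, self-contained sketch of the same walk over a program
header table.  It is not the bootloader or kernel code.  The names
"struct phdr32", "SKETCH_PT_LOAD" and "count_load_segments" are
hypothetical stand-ins invented for this example; only the field
layout and the PT_LOAD value (1) are taken from the ELF
specification.  It counts the loadable entries and adds up how many
bytes would need zero-filling (p_memsz minus p_filesz), without
reading or mapping anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for Elf32_Phdr: same fields, same order. */
struct phdr32 {
        uint32_t p_type;        /* segment type (PT_LOAD, ...) */
        uint32_t p_offset;      /* offset of segment in the file */
        uint32_t p_vaddr;       /* virtual address to load at */
        uint32_t p_paddr;       /* physical address (often unused) */
        uint32_t p_filesz;      /* bytes present in the file */
        uint32_t p_memsz;       /* bytes occupied in memory */
        uint32_t p_flags;       /* read/write/execute flags */
        uint32_t p_align;       /* alignment */
};

#define SKETCH_PT_LOAD  1u      /* value of PT_LOAD in the ELF spec */

/*
 * Hypothetical helper: walk phnum entries of phentsize bytes each,
 * return the number of PT_LOAD entries, and store in *zero_bytes the
 * total tail (p_memsz - p_filesz) that a loader would have to zero.
 */
static unsigned
count_load_segments(const unsigned char *phbuf, unsigned phnum,
    size_t phentsize, uint32_t *zero_bytes)
{
        unsigned i, nload = 0;
        uint32_t zero = 0;

        assert(phbuf != NULL && phentsize >= sizeof(struct phdr32));
        for (i = 0; i < phnum; i++) {
                struct phdr32 ph;

                /* Copy out the entry to avoid alignment assumptions. */
                memcpy(&ph, phbuf + phentsize * i, sizeof(ph));
                if (ph.p_type != SKETCH_PT_LOAD)
                        continue;       /* non-loadable: nothing to do */
                nload++;
                if (ph.p_filesz < ph.p_memsz)
                        zero += ph.p_memsz - ph.p_filesz;
        }
        if (zero_bytes != NULL)
                *zero_bytes = zero;
        return nload;
}
```

Called on a table of six entries of which two have p_type set to 1,
the helper returns 2; the other four iterations cost one comparison
each, which is the whole per-entry overhead being discussed here.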

The kernel exec code in "imgact_elf.c" looks more complicated.  But
that's only because it contains a whole bunch of debugging cruft.
There's a lot of room for optimization in it.

John
--
   John Polstra                                       jdp@polstra.com
   John D. Polstra & Co., Inc.                Seattle, Washington USA
   "Self-knowledge is always bad news."                 -- John Barth
