Exception Handling
==================

   Note: exception handling in g++ is still under development.

   This section describes how C++ exceptions in the C++ front-end are
mapped onto the back-end exception handling framework.

   The basic mechanism of exception handling in the back-end is
unwind-protect a la elisp.  This is a general, robust, and
language-independent representation for exceptions.

   The C++ front-end maps C++ exceptions onto the unwind-protect
semantics.  The mapping is described below.

   When -frtti is used, rtti is used to do exception object type
checking; when it isn't used, the encoded name for the type of the
object being thrown is used instead.  All code that originates
exceptions (even code that throws exceptions as a side effect, like
dynamic casting) and all code that catches exceptions must be compiled
consistently, either all with -frtti or all with -fno-rtti.  It is not
possible to mix rtti-based exception handling objects with code that
doesn't use rtti.  The exceptions to this are code that doesn't catch
or throw exceptions, catch (...), and code that just rethrows an
exception.

   Currently we use the normal mangling used in building function names
(int is "i", const char * is "PCc") to build the non-rtti-based type
descriptors for exception handling.  These descriptors are just plain
NUL-terminated strings, and internally they are passed around as
char *.
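
   As an illustration only, exact matching on such descriptors comes
down to a string compare.  The helper name descriptors_match below is
hypothetical; this is a sketch of the idea, not the code in cp/except.c.

```cpp
#include <cstring>

// Hypothetical helper, not g++ source.  Assumption: with -fno-rtti a
// type descriptor is a plain NUL-terminated mangled name, so an exact
// type match is a string comparison.
bool descriptors_match(const char *thrown, const char *caught)
{
  return std::strcmp(thrown, caught) == 0;
}
```

   With the manglings quoted above, a thrown int ("i") would match a
handler for int but not one for const char * ("PCc").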

   In C++, all cleanups should be protected by exception regions.  The
region starts just after the reason why the cleanup is created has
ended.  For example, with an automatic variable that has a constructor,
it would be right after the constructor is run.  The region ends just
before the finalization is expanded.  The backend may expand the
cleanup multiple times along different paths: once for normal end of
the region, once for non-local gotos, once for returns, etc.  It must
therefore take special care to protect the finalization expansion if
the expansion is for any reason other than normal region end and it is
`inline' (it is inside the exception region).  The backend can either
move such expansions out of line, or it can create an exception region
over the finalization to protect it.  The handler associated with that
region would not run the finalization as it otherwise would have, but
rather just rethrow to the outer handler, being careful to skip the
normal handler for the original region.

   In Ada, they will use the more runtime-intensive approach of having
fewer regions, at the cost of additional work at run time to keep a
list of things that need cleanups.  When a variable has finished
construction, they add the cleanup to the list; when they come to the
end of the lifetime of the variable, they run the list down.  If they
take a hit before the section finishes normally, they examine the list
for actions to perform.  I hope they add this logic into the back-end,
as it would be nice to get that alternative approach in C++.
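
   The list-based approach can be pictured with a small sketch.  All
names here (CleanupList, add, run_down, log_cleanup) are made up for the
example, and running the list in reverse order of registration is an
assumption; this is not how any Ada runtime is known to be written.

```cpp
#include <utility>
#include <vector>

typedef void (*CleanupFn)(void *);

// Hypothetical runtime list of pending cleanups for one region.
class CleanupList {
  std::vector<std::pair<CleanupFn, void *> > pending_;
public:
  // Called once a variable has finished construction.
  void add(CleanupFn fn, void *obj)
  {
    pending_.push_back(std::make_pair(fn, obj));
  }

  // Called at the normal end of the lifetime, or when an exception
  // hits first: run the recorded cleanups, most recent first.
  void run_down()
  {
    while (!pending_.empty()) {
      std::pair<CleanupFn, void *> c = pending_.back();
      pending_.pop_back();
      c.first(c.second);
    }
  }

  bool empty() const { return pending_.empty(); }
};

// Example cleanup: record the object's id so the order is visible.
static std::vector<int> cleanup_log;
void log_cleanup(void *obj)
{
  cleanup_log.push_back(*static_cast<int *>(obj));
}
```

   The trade-off is the one described above: a single region can cover
many variables, but every construction pays for a list update at run
time.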

   On an rs6000, xlC stores exception objects on the stack, under the
try block.  When it unwinds down into a handler, the frame pointer is
adjusted back to the normal value for the frame in which the handler
resides, and the stack pointer is left unchanged from the time at which
the object was thrown.  This is so that there is always someplace for
the exception object, and nothing can overwrite it once we start
throwing.  The only bad part is that the stack remains large.

   The following are some things that work in g++'s exception
handling.

   All completely constructed temps and local variables are cleaned up
in all unwound scopes.  Completely constructed parts of partially
constructed objects are cleaned up.  This includes partially built
arrays.  Exception specifications are now handled.
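
   A small check of the partially built array case might look like the
following sketch.  Elem, built, live and partial_array_is_cleaned_up
are names invented for the example.

```cpp
#include <stdexcept>

static int built = 0;   // elements whose constructor completed
static int live = 0;    // elements constructed and not yet destroyed

struct Elem {
  Elem()
  {
    // The third element fails to construct.
    if (built == 2)
      throw std::runtime_error("third element fails");
    ++built;
    ++live;
  }
  ~Elem() { --live; }
};

// Returns true if the exception arrived and the two completed elements
// were destroyed during unwinding.
bool partial_array_is_cleaned_up()
{
  built = 0;
  live = 0;
  try {
    Elem a[4];
    (void) a;
  } catch (const std::runtime_error &) {
    return built == 2 && live == 0;
  }
  return false;
}
```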

   The following are some flaws in g++'s exception handling, as it now
stands.

   Only exact type matching or reference matching of throw types works
when -fno-rtti is used.

   Exception handling only works on SPARC (like Suns), i386, arm and
rs6000 machines.  Partial support is in for all other machines, but a
stack unwinder called __unwind_function has to be written, and added to
libgcc2 for them.  See below for details on __unwind_function.

   Don't expect exception handling to work right if you optimize; in
fact, the compiler will probably core dump.

   RTL_EXPRs for EH cond variables for && and || exprs should probably
be wrapped in UNSAVE_EXPRs, and RTL_EXPRs tweaked so that they can be
unsaved.  The UNSAVE_EXPR code should be in the backend, or
alternatively, UNSAVE_EXPR should be ripped out and exactly one
finalization allowed to be expanded by the backend.  I talked with
kenner about this, and we have to allow multiple expansions.

   Pointer conversions on exception matching a la 15.3 p2 case 3 are
only done when -frtti is given: `A handler with type T, const T, T&, or
const T& is a match for a throw-expression with an object of type E if
[3] T is a pointer type and E is a pointer type that can be converted to
T by a standard pointer conversion (_conv.ptr_) not involving
conversions to pointers to private or protected base classes.'
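
   The quoted rule can be exercised with a sketch like the one below,
assuming a compiler that implements it.  Base, Derived and
catch_derived_pointer are hypothetical names.

```cpp
struct Base { virtual ~Base() {} };
struct Derived : public Base {};

// Returns 1 if the Base* handler matched, 2 if only catch (...) did.
int catch_derived_pointer()
{
  static Derived d;
  try {
    throw &d;             // the thrown object has type Derived*
  } catch (Base *) {      // Derived* converts to Base* (public base)
    return 1;
  } catch (...) {
    return 2;
  }
}
```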

   We don't call delete on new expressions that die because the ctor
threw an exception.  See except/18 for a test case.
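
   For reference, the behaviour the language calls for can be checked
with a sketch like the following.  Widget, allocs, frees and
storage_released_after_ctor_throw are names made up for the example; it
is not the except/18 test case.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

static int allocs = 0;
static int frees = 0;

struct Widget {
  Widget() { throw 1; }   // construction always fails

  // Class-specific allocation functions, so the calls can be counted.
  static void *operator new(std::size_t n)
  {
    void *p = std::malloc(n);
    if (!p)
      throw std::bad_alloc();
    ++allocs;
    return p;
  }
  static void operator delete(void *p)
  {
    ++frees;
    std::free(p);
  }
};

// Returns true if the storage obtained by the new expression was
// released again after the constructor threw.
bool storage_released_after_ctor_throw()
{
  allocs = 0;
  frees = 0;
  try {
    Widget *w = new Widget;
    delete w;             // not reached
  } catch (int) {
    return allocs == 1 && frees == 1;
  }
  return false;
}
```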

   15.2 para 13: The exception being handled should be rethrown if
control reaches the end of a handler of the function-try-block of a
constructor or destructor.  Right now, it is not.
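
   A sketch of the behaviour that paragraph requires, using invented
names (Member, Holder, ctor_handler_rethrows):

```cpp
struct Member {
  Member() { throw 42; }
};

struct Holder {
  Member m;
  static int handler_runs;

  Holder()
  try : m() {
  } catch (int) {
    ++handler_runs;
    // No explicit rethrow here: reaching the end of this handler is
    // required to rethrow the exception being handled.
  }
};
int Holder::handler_runs = 0;

// Returns true if the handler ran once and the exception still reached
// the caller with its original value.
bool ctor_handler_rethrows()
{
  Holder::handler_runs = 0;
  try {
    Holder h;
    (void) h;
  } catch (int v) {
    return v == 42 && Holder::handler_runs == 1;
  }
  return false;
}
```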

   15.2 para 12: If a return statement appears in a handler of the
function-try-block of a constructor, the program is ill-formed, but this
isn't diagnosed.

   15.2 para 11: If the handlers of a function-try-block contain a jump
into the body of a constructor or destructor, the program is ill-formed,
but this isn't diagnosed.

   15.2 para 9: Check that the fully constructed base classes and
members of an object are destroyed before entering the handler of a
function-try-block of a constructor or destructor for that object.

   build_exception_variant should sort the incoming list, so that it
implements set compares, not exact list equality.  Type smashing should
smash exception specifications using set union.
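
   The set semantics asked for here can be sketched as follows.  This
is not build_exception_variant: SpecList stands in for the compiler's
list of thrown types, and canonical, same_spec and spec_union are
hypothetical helpers.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Stand-in for a list of thrown types, one type name per entry.
typedef std::vector<std::string> SpecList;

// Sort and drop duplicates, so that equal sets compare equal.
SpecList canonical(SpecList s)
{
  std::sort(s.begin(), s.end());
  s.erase(std::unique(s.begin(), s.end()), s.end());
  return s;
}

// Set comparison rather than exact list equality.
bool same_spec(const SpecList &a, const SpecList &b)
{
  return canonical(a) == canonical(b);
}

// Set union of two specifications, in canonical form.
SpecList spec_union(const SpecList &a, const SpecList &b)
{
  SpecList all(a);
  all.insert(all.end(), b.begin(), b.end());
  return canonical(all);
}
```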

   Thrown objects are usually allocated on the heap, in the usual way,
but they are never deleted.  They should be deleted by the catch
clauses.  If one runs out of heap space, throwing an object will
probably never work.  This could be relaxed some by passing an
__in_chrg parameter to track who has control over the exception object.
Thrown objects are not allocated on the heap when they are pointer to
object types.

   When the backend returns a value, it can create new exception regions
that need protecting.  The new region should rethrow the object in
context of the last associated cleanup that ran to completion.

   The structure of the code that is generated for C++ exception
handling is shown below:

     Ln:					throw value;
             copy value onto heap
             jump throw (Ln, id, address of copy of value on heap)
     
                                             try {
     +Lstart:	the start of the main EH region
     |...						...
     +Lend:		the end of the main EH region
                                             } catch (T o) {
     						...1
                                             }
     Lresume:
             nop	used to make sure there is something before
                     the next region ends, if there is one
     ...                                     ...
     
             jump Ldone
     [
     Lmainhandler:    handler for the region Lstart-Lend
     	cleanup
     ] zero or more, depending upon automatic vars with dtors
     +Lpartial:
     |        jump Lover
     +Lhere:
             rethrow (Lhere, same id, same obj);
     Lterm:		handler for the region Lpartial-Lhere
             call terminate
     Lover:
     [
      [
             call throw_type_match
             if (eq) {
      ] these lines disappear when there is no catch condition
     +Lsregion2:
     |	...1
     |	jump Lresume
     |Lhandler:	handler for the region Lsregion2-Leregion2
     |	rethrow (Lresume, same id, same obj);
     +Leregion2
             }
     ] there are zero or more of these sections, depending upon how many
       catch clauses there are
     ----------------------------- expand_end_all_catch --------------------------
                     here we have fallen off the end of all catch
                     clauses, so we rethrow to outer
             rethrow (Lresume, same id, same obj);
     ----------------------------- expand_end_all_catch --------------------------
     [
     L1:     maybe throw routine
     ] depending upon if we have expanded it or not
     Ldone:
             ret
     
     start_all_catch emits labels: Lresume,

   The __unwind_function takes a pointer to the throw handler, and is
expected to pop the stack frame that was built to call it, as well as
the frame underneath and then jump to the throw handler.  It must
restore all registers to their proper values as well as all other
machine state as determined by the context into which we are
unwinding.  The way I normally start is to compile:

     void *g;
     void foo(void* a) { g = a; }

   with -S, and change the thing that alters the PC (return, or ret
usually) to not alter the PC, making sure to leave all other semantics
(like adjusting the stack pointer, or frame pointers) in.  After that,
replicate the prologue once more at the end, again, changing the PC
altering instructions, and finally, at the very end, jump to `g'.

   It takes about a week to write this routine.  If someone wants to
volunteer to write this routine for any architecture, exception support
for that architecture will be added to g++.  Please send in those code
donations.  One other thing that needs to be done is to double check
that __builtin_return_address (0) works.

Specific Targets
----------------

   For the alpha, the __unwind_function will be something resembling:

     void
     __unwind_function(void *ptr)
     {
       /* First frame */
       asm ("ldq $15, 8($30)"); /* get the saved frame ptr; 15 is fp, 30 is sp */
       asm ("bis $15, $15, $30"); /* reload sp with the fp we found */
     
       /* Second frame */
       asm ("ldq $15, 8($30)"); /* fp */
       asm ("bis $15, $15, $30"); /* reload sp with the fp we found */
     
       /* Return */
       asm ("ret $31, ($16), 1"); /* return to PTR, stored in a0 */
     }

However, there are a few problems preventing it from working.  First of
all, the gcc-internal function `__builtin_return_address' needs to work
given an argument of 0 for the alpha.  As it stands as of August 30th,
1995, the code for `BUILT_IN_RETURN_ADDRESS' in `expr.c' will
definitely not work on the alpha.  Instead, we need to define the
macros `DYNAMIC_CHAIN_ADDRESS' (maybe),
`RETURN_ADDR_IN_PREVIOUS_FRAME', and definitely need a new definition
for `RETURN_ADDR_RTX'.

   In addition (and more importantly), we need a way to reliably find
the frame pointer on the alpha.  The use of the value 8 above to
restore the frame pointer (register 15) is incorrect.  On many systems,
the frame pointer is consistently offset to a specific point on the
stack.  On the alpha, however, the frame pointer is pushed last.  First
the return address is stored, then any other registers are saved (e.g.,
`s0'), and finally the frame pointer is put in place.  So `fp' could
have an offset of 8, but if the calling function saved any registers at
all, they add to the offset.

   The only places the frame size is noted are with the `.frame'
directive, for use by the debugger and the OSF exception handling model
(useless to us), and in the initial computation of the new value for
`sp', the stack pointer.  For example, the function may start with:

     lda $30,-32($30)
     .frame $15,32,$26,0

The 32 above is exactly the value we need.  With this, we can be sure
that the frame pointer is stored 8 bytes less (in this case, at
24(sp)).  The drawback is that there is no way that I (Brendan) have
found to let us discover the size of a previous frame *inside* the
definition of `__unwind_function'.

   So to accomplish exception handling support on the alpha, we need two
things: first, a way to figure out where the frame pointer was stored,
and second, a functional `__builtin_return_address' implementation that
except.c can use.

Backend Exception Support
-------------------------

   The backend must be extended to fully support exceptions.  Right now
there are a few hooks from the backend into the alpha exception
handling backend that resides in the C++ frontend; these hooks are what
allow exception handling to work in g++.  An exception region is a
segment of generated code that has a handler associated with it.  The
exception regions are represented in the generated code as address
ranges, each given by a starting PC value and an ending PC value.  Some
of the limitations with this scheme are:

   * The backend replicates insns for such things as loop unrolling and
     function inlining.  Right now, there are no hooks into the
     frontend's exception handling backend to handle the replication of
     insns.  When replication happens, a new exception region
     descriptor needs to be generated for the new region.

   * The backend expects to be able to rearrange code, for things like
     jump optimization.  Any rearranging of the code needs to have
     exception region descriptors updated appropriately.

   * The backend can eliminate dead code.  Any associated exception
     region descriptor that refers to fully contained code that has
     been eliminated should also be removed, although not doing this is
     harmless in terms of semantics.

   The above is not meant to be exhaustive, but does include all things
I have thought of so far.  I am sure other limitations exist.
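
   To make the address-range scheme concrete, here is a minimal sketch
of a region table and its lookup.  The layout and names (EhRegion,
find_region) are hypothetical and do not reproduce the table emitted by
cp/except.c.  Treating the end PC as exclusive and preferring the
smallest enclosing range are assumptions of the sketch.

```cpp
#include <cstddef>
#include <cstdint>

// One exception region: a PC range with an associated handler.
struct EhRegion {
  std::uintptr_t start;     // first PC covered by the region
  std::uintptr_t end;       // one past the last PC covered (assumed)
  std::uintptr_t handler;   // address of the handler for the region
};

// Return the smallest region containing pc, or 0 if none covers it.
const EhRegion *find_region(const EhRegion *table, std::size_t n,
                            std::uintptr_t pc)
{
  const EhRegion *best = 0;
  for (std::size_t i = 0; i < n; ++i) {
    const EhRegion &r = table[i];
    if (pc >= r.start && pc < r.end
        && (!best || r.end - r.start < best->end - best->start))
      best = &r;
  }
  return best;
}
```

   A table like this is only valid while the ranges match the emitted
code, which is why replication, rearrangement and dead code elimination
all require the descriptors to be updated.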

   Below are some notes on the migration of the exception handling code
backend from the C++ frontend to the backend.

   NOTEs are to be used to denote the start of an exception region, and
the end of the region.  I presume that the interface used to generate
these notes in the backend would be two functions,
start_exception_region and end_exception_region (or something like
that).  The frontends are required to call them in pairs.  When marking
the end of a region, an argument can be passed to indicate the handler
for the marked region.  This can be passed in many ways; currently a
tree is used.  Another possibility would be insns for the handler, or a
label that denotes a handler.  I have a feeling insns might be the
best way to pass it.  Semantics are: if an exception is thrown inside
the region, control is transferred unconditionally to the handler.  If
control passes through the handler, then the backend is to rethrow the
exception, in the context of the end of the original region.  The
handler is protected by the conventional mechanisms; it is the
frontend's responsibility to protect the handler, if special semantics
are required.

   This is a very low level view, and it would be nice if the backend
supported a somewhat higher level view in addition to this view.  This
higher level could include source line number, name of the source file,
name of the language that threw the exception and possibly the name of
the exception.  Kenner may want to rope you into doing more than just
the basics required by C++.  You will have to resolve this.  He may want
you to do support for non-local gotos: first scan for an exception
handler and, if none is found, allow the debugger to be entered without
any cleanups being done.  To do this, the backend would have to know
the difference between a cleanup-rethrower and a real handler.  It would
also have to have a way to know if a handler `matches' a thrown
exception, and this is frontend specific.

   The UNSAVE_EXPR tree code has to be migrated to the backend.  Exprs
such as TARGET_EXPRs, WITH_CLEANUP_EXPRs, CALL_EXPRs and RTL_EXPRs have
to be changed to support unsaving.  This is meant to be a complete list.
SAVE_EXPRs can be unsaved already.  expand_decl_cleanup should be
changed to unsave its argument, if needed.  See
cp/tree.c:cp_expand_decl_cleanup, unsave_expr_now, unsave_expr, and
cp/expr.c:cplus_expand_expr(case UNSAVE_EXPR:) for the UNSAVE_EXPR code.
Now, as to why: kenner already tripped over the exact same problem in
Ada.  We talked about it; he didn't like any of the solutions, yet
didn't like having no solution either.  He was willing to live with the
drawbacks of this solution.  The drawback is unsave_expr_now.  It
should have a callback into the frontend, to allow the unsaving of
frontend special codes.  The callback goes in, in place of the call to
my_friendly_abort.

   The stack unwinder is one of the hardest parts to do.  It is highly
machine dependent.  The form that kenner seemed to like was a couple of
macros that would do the machine dependent grunt work.  One preexisting
function that might be of some use is __builtin_return_address ().  One
macro he seemed to want was __builtin_return_address, and the other
would do the hard work of fixing up the registers, adjusting the stack
pointer, frame pointer, arg pointer and so on.

   The eh archive (~mrs/eh) might be good reading for understanding the
Ada perspective and some of kenner's mindset, and for a detailed
explanation (Message-Id: <9308301130.AA10543@vlsi1.ultra.nyu.edu>) of
the concepts involved.

   Here is a guide to existing backend type code.  It is all in
cp/except.c.  Check out do_unwind and expand_builtin_throw for current
code on how to figure out what handler matches an exception;
emit_exception_table for code on emitting the PC range table that is
built during compilation; expand_exception_blocks for code that emits
all the handlers at the end of a function; end_protect to mark the end
of an exception region; start_protect to mark the start of an exception
region; lang_interim_eh, which is the master hook used by the backend
into the EH backend that now exists in the frontend; and
expand_internal_throw to raise an exception.