Date:      Mon, 20 Oct 2003 13:52:07 -0700 (PDT)
From:      Kip Macy <kmacy@fsmware.com>
To:        hackers@freebsd.org
Subject:   process checkpoint restore facility now in DragonFly BSD
Message-ID:  <20031020134532.B63978@demos.bsdclusters.com>

At BSDCon '03 it was mentioned that a process checkpoint / restore
facility would be a useful addition to FreeBSD. This post is to
announce that Matt and I have added such a facility to DragonFly BSD.
It is noteworthy for -hackers as anyone who is interested could still
port it with relative ease.

To use it, kldload the checkpt.ko module, which should now be built
automatically.  You then send ^E to the program you want to checkpoint,
and use the 'checkpt' utility in /usr/bin to resume it from the
checkpoint file.  The program is *NOT* killed by this signal; it
continues to run after the checkpoint file(s) have been generated.
Alternatively, you can send the program any signal that will cause it
to coredump and exit. You will then be able to restore from the core
dump! In conjunction with a shared file system this can be used for
process migration.
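
For anyone who wants to script the "signal that causes a coredump" path,
here is a minimal sketch in Python.  It is only an illustration and not
part of the checkpt code: the helper name signal_and_wait is made up, and
SIGABRT is just one example of a signal whose default action is to dump
core and exit.  Whether a core file is actually written depends on the
local core size limit and kernel settings.

```python
import signal
import subprocess
import sys


def signal_and_wait(sig=signal.SIGABRT):
    """Start a sleeping child, send it `sig`, and return its exit status.

    Hypothetical helper for illustration only. With the default
    disposition, SIGABRT terminates the child (and dumps core if the
    system allows it).
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    child.send_signal(sig)
    # A negative return value -N means "terminated by signal N".
    return child.wait(timeout=10)


if __name__ == "__main__":
    status = signal_and_wait()
    print("child terminated by signal", -status)
```

The resulting core file, if one is produced, is what you would then hand
to the restore step described above.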

The checkpoint program is currently designed to work only with simple
programs... it will restore signals, descriptor references to regular
files, the VM state (anonymous memory), as well as any nominal file
mappings, but it cannot restore sockets, pipes, or device descriptors.
So, while you can checkpoint a pipe sequence, you can't really restore it.
Pipes, ttys, and common devices (zero, null, bpf) will not be that hard to
add. Stream sockets are an open question.
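
To get a rough idea of whether a program falls within these limits, you
can classify its open descriptors from userland.  The following Python
sketch does that with fstat().  It is an assumption-laden illustration,
not the logic the module uses: the function names descriptor_kind and
restorable are hypothetical, and "restorable" here simply means "regular
file", per the description above.

```python
import os
import stat
import tempfile


def descriptor_kind(fd):
    """Classify an open descriptor by its fstat() file type."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        return "device"
    return "other"


def restorable(fd):
    """Assumption: only regular-file descriptors can be restored today."""
    return descriptor_kind(fd) == "regular file"


if __name__ == "__main__":
    r, w = os.pipe()
    with tempfile.TemporaryFile() as f:
        for name, fd in (("temp file", f.fileno()), ("pipe", r)):
            print(name, "->", descriptor_kind(fd), "restorable:", restorable(fd))
    os.close(r)
    os.close(w)
```

A program whose descriptors all come back as regular files is a
reasonable candidate; anything holding pipes, sockets, or devices is not
yet.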

Please note that there are *SEVERE* security issues with this module.
The module is not loaded into the kernel by default and, when loaded,
can only be used by users in the wheel group.  You can change the group
requirements with a sysctl (see the manual page for checkpt).  The
security issues relate to the restoration of signals and file descriptors
(in particular, the restoration system call will convert file handles
into file descriptors which could potentially allow any file in the system
to be accessed).  Matt has put in some basic security checks but they are
not meant to be all-encompassing!

It is going into the tree now because Matt and I have done enough work on
it that anyone else interested in working on it can theoretically dig in.
Significant debugging is still in place.  We've left it as a module to
facilitate debugging.

It should be usable for scientific applications now. It should already
work considerably better than the Linux equivalent, what with the regular
file descriptor save/restore capability.

Any developer who wishes to work on the checkpointing module and related
code is welcome to!


